Vol. 8, Issue 8, 842-847, August 1998
LETTERS
ABSTRACT
The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes was found. The information will permit the conversion of those markers to STS markers and allow their use in PCR-based mapping experiments. Furthermore, it will facilitate the comparative mapping of genes across distantly related plant species by direct comparison of DNA sequences and map positions.
[cDNA sequence data reported in this paper have been submitted to the EMBL database under accession nos. AA824695-AA825005 and the dbEST_Id database under accession nos. 1546519-1546862.]
INTRODUCTION
Molecular markers used in mapping experiments
should ideally have a known function. This requirement is fulfilled
best by the use of known genes in the form of genomic clones or cDNA
clones. Many genetic maps based on molecular markers, such as RFLPs
(restriction fragment length polymorphisms), include a number of known genes. However,
the number of known genes available for a given organism is usually
very limited. To circumvent this problem, many maps contain a large
number of randomly selected cDNA clones. cDNAs have the additional
advantage that they are frequently single or low copy and thus ideal
for genetic mapping (Bernatzky and Tanksley 1986). Furthermore, they
show a higher level of conservation between species than genomic clones
(Zamir and Tanksley 1988) and allow more efficient cross-mapping in
related species for the determination of synteny (Ahn and Tanksley 1993).
In recent years random cDNA clones have attracted considerable interest, as the sequencing of such clones provides a fast means of cataloging the expressed genes of a given organism without sequencing the entire genome. In plants, major efforts are now under way to sequence random cDNAs in Arabidopsis and rice (Sasaki et al. 1994; Delseny et al. 1997). Many thousands of such fragments have been deposited in the respective databases. For ~30%-40% of the cDNA sequences a possible function can be deduced based on homologies to known genes from bacteria, animals, or plants (Delseny et al. 1997).
For all expressed genes, not only the DNA sequence and the possible
function should be known but also their map position on the
chromosomes. In the long term this will allow merging of the classical
genetic map based on mutations with potential candidate genes for these
mutants. A common repertoire of mapped cDNA clones will, in the future,
enable us to study synteny even between distantly related species
(Paterson et al. 1996), for which studies by cross-hybridization are
very difficult. The availability of sequence information for mapped
cDNA clones should make conclusions from hybridization studies more
firm and testable.
Tomato (Lycopersicon esculentum) has one of the most densely
populated genetic maps among plants. Currently, >1000 RFLP markers have been published by Tanksley et al. (1992), of which >300 are random cDNA clones. An additional 500 RFLP markers have been localized in reference to this map because all markers from potato
can be readily transferred to tomato on the basis of extensive synteny
(Gebhardt et al. 1991; Jacobs et al. 1995). We report here the results
of the sequencing of ~90% of the mapped cDNA clones from the
high-density tomato map.
RESULTS
cDNA Sequencing and Analysis
A total of >300 cDNA clones have been mapped onto the
high-density map of tomato by Tanksley et al. (1992). These clones were derived from two different libraries. CD clones were generated from
leaf tissue and CT clones from epidermal tissue. The clones were in
four different vectors. For sequencing, only those clones were used
that were generated in vectors with M13 or pBluescript primer sites.
This excluded some of the early cDNA clones (Bernatzky and Tanksley
1986) because they were generated in pBR322. A total of 272 cDNA clones
could be sequenced in this way from at least one direction. Sequences
from another 15 clones were not included in the analysis because they
contained obvious cloning artifacts that could not be resolved. A total
of 145 clones were sequenced from both sides, and for 73 clones the entire
insert was sequenced. The largest clone completely sequenced was 896 bp.
Because these clones had already been placed onto the genetic map of tomato, it was anticipated that most should represent different genes. Nevertheless, for some cDNA clones duplicates could be identified (clones CT71 and CT210, CT72 and CT242, CT75 and CT166, CT88 and CT257, CT115 and CT218, CT154 and CT214, CT223 and CD61, CD18 and CD34). In these cases, the cDNA clones show complex hybridization patterns, so it had not previously been possible to determine that they are derived from the same gene.
Sequence Homologies
For 156 of the 275 analyzed clones (57%), significant nucleic acid and/or protein homologies were found with the respective databases. Of those, 125 clones showed matches with known DNA sequences and 141 clones showed matches with protein sequences with a BLAST score of <10^-10. Some sequenced clones showed homology to other plant sequences only at the DNA level and not at the protein level. Because the cDNA clones were not integrated directionally into the cloning vector, sequence information for these clones was obtained only from the untranslated 3' region with the poly(A) tail. When matches were found at both the DNA and protein level, they were considered significant only if both matched the same protein type or corresponding genes. Table 1 shows a summary of the data for the clones with significant homologies. For 25 tomato genes already described we found a direct match with the database.
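The acceptance rule described above can be sketched in Python. This is only an illustration under stated assumptions: the cutoff is read as 10^-10, and the `Hit` record, its `family` label, and the function name are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass

# Assumed reading of the significance cutoff quoted in the text.
THRESHOLD = 1e-10


@dataclass
class Hit:
    clone: str    # clone identifier (hypothetical examples only)
    program: str  # "BLASTN" (DNA level) or "BLASTX" (protein level)
    score: float  # BLAST probability score; smaller is more significant
    family: str   # hypothetical label for the matched gene or protein type


def significant_clones(hits, threshold=THRESHOLD):
    """Return the set of clones that count as having a significant match.

    A clone with hits at only one level is accepted if any hit passes the
    threshold. A clone with passing hits at both levels is accepted only
    if the DNA and protein hits agree on at least one family.
    """
    by_clone = {}
    for h in hits:
        if h.score < threshold:
            levels = by_clone.setdefault(h.clone, {})
            levels.setdefault(h.program, set()).add(h.family)

    accepted = set()
    for clone, levels in by_clone.items():
        dna = levels.get("BLASTN", set())
        protein = levels.get("BLASTX", set())
        if dna and protein:
            if dna & protein:  # both levels point to the same family
                accepted.add(clone)
        else:
            accepted.add(clone)
    return accepted
```

Representing "the same protein type" as an exact match on a label is a simplification; in practice that judgment would be made by inspecting the hit descriptions.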
DISCUSSION
The availability of 272 sequenced cDNA clones from the tomato map, together with other previously mapped genes of known function (Pillen et al. 1996b), creates a framework of >350 markers for the tomato genome for which at least part of the DNA sequence is known. This information will be sufficient for the generation of PCR-based markers for most regions of the tomato genome. Such sequence-tagged sites (STSs) will function as genetic anchor markers (Inoue et al. 1994) and permit fast, high-throughput analysis of loci in large populations (Schumacher et al. 1995). In plant breeding and genetic experiments, STS markers, for example in the form of CAPS (cleaved amplified polymorphic sequences), are very useful for following linked genes economically through generations (Konieczny and Ausubel 1993). Such markers are especially useful for preselecting recombinants in specific regions of the tomato genome for the purpose of high-resolution mapping of genes targeted for map-based cloning (Alpert and Tanksley 1996). At the same time, they provide starting points for the rapid isolation of large-insert clones from tomato DNA libraries, such as yeast artificial chromosomes (YACs) (Martin et al. 1992; Bonnema et al. 1996; Pillen et al. 1996a), bacterial artificial chromosomes (BACs) (Shizuya et al. 1992), and binary BAC (BiBAC) clones (Hamilton et al. 1996).
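As a hedged illustration of how an STS could be scored as a CAPS-type marker, the sketch below digests two alleles of a PCR product in silico and compares the fragment sizes. The amplicon sequences are invented for the example and the function names are hypothetical; the only external fact used is the EcoRI recognition site (GAATTC, cut after the first G).

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the number of bases of the site that stay on the
    upstream fragment (1 for EcoRI, which cuts G^AATTC). Only the top
    strand is modeled, which is enough for comparing fragment sizes.
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(len(seq) - start)
    return fragments


def is_informative(allele_a, allele_b, site="GAATTC", cut_offset=1):
    """True if the two alleles give different fragment patterns."""
    return digest(allele_a, site, cut_offset) != digest(allele_b, site, cut_offset)


# Invented 46-bp amplicons that differ by a single base inside the site.
allele_a = "ACGT" * 5 + "GAATTC" + "TTGCA" * 4
allele_b = "ACGT" * 5 + "GAGTTC" + "TTGCA" * 4
```

With these invented sequences, allele_a is cut once and allele_b is left intact, so the two parents would be distinguishable on a gel after digestion of the PCR product.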
Considerable effort is currently spent on the comparative mapping of highly conserved cDNA clones across a wide range of plant taxa. This has revealed that in limited regions, synteny exists even between distantly related plant species (Paterson et al. 1996). Sequence data from mapped cDNA clones will eventually help to reveal synteny between such plant species. For example, using the information from these mapped tomato clones, it will be possible to identify genes or expressed sequence tags (ESTs) with very high sequence homology at the DNA and/or protein level to sequences from the model plant Arabidopsis thaliana. Such probes are very likely to cross-hybridize between Arabidopsis and tomato. If they hybridize as single-copy sequences in both genomes, they can be mapped comparatively in these species, and it can be determined whether linkage is conserved in certain areas of their genomes. Similarly, such experiments can be expanded to additional plant species for which large numbers of ESTs will be generated in the future. A carefully chosen approach combining sequence comparisons with genetic mapping by hybridization is necessary for the study of synteny between distantly related plant species in order to discriminate between orthologous and paralogous genes (Tatusov et al. 1997). Large numbers of cDNA clones are currently being sequenced and mapped in some plant genomes. For rice, EST sequences were used for the construction of a high-density genetic map (Kurata et al. 1994). Large numbers of ESTs are being mapped onto the genetic map or YAC contig map of A. thaliana (Agyare et al. 1997).
With the sequencing of the entire Arabidopsis and rice genomes
in the foreseeable future, it will be easier to study exclusively
orthologous sequences from those two genomes in comparison to data from
other plant species.
METHODS
cDNA Clones
All cDNA clones from the map of Tanksley et al. (1992) that have
been cloned in pUC vectors, the pBluescript SK vector, or the pCR II
(Invitrogen) vector were used for this study. Insert sizes were
confirmed prior to sequencing by PCR using M13 forward and reverse
sequencing primers.
Plasmid Preparation and Sequencing
Plasmids were prepared from 5-ml cultures according to standard protocols using Qiagen columns (Qiagen, Hilden, Germany). Plasmid DNA was sequenced using commercially available sequencing kits and analyzed on ALF (automated laser fluorescence) sequencers (Pharmacia) and ABI sequencers (Perkin Elmer) using either M13 forward (pUC vectors) or SK primers (pBluescript). Most of the clones were also sequenced from the opposite side using M13 reverse (pUC vectors) or KS primers (pBluescript).
Sequence Analysis
Raw sequences were transferred into the sequence analysis program
DNAsis (Hitachi) and edited for vector sequences, poly(A) tails, and
other cloning artifacts. If sequencing was performed from both sides of
a clone, it was determined whether the two sequences overlap and in
such cases they were edited and merged into a single file. The edited
sequences were analyzed. The DNA sequence and the translated protein
sequences were compared to all available DNA and protein sequences
using the NIH BLAST server (BLASTN and BLASTX). Matches with a score of
<10^-10 were considered to be significant. Accession
numbers correspond to the entries in the GenBank Nucleotide
Sequence and GenBank Protein databases, respectively.
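The editing steps described above (removing poly(A) tails and merging overlapping reads from both sides of a clone) can be sketched as follows. This is a minimal illustration, not the DNAsis procedure used in the study: the function names, the minimum run length of 10 for a poly(A) tail, the minimum overlap of 20 bases, and the requirement for an exact match in the overlap are all assumptions made for the example.

```python
def trim_poly_a(seq, min_run=10):
    """Remove a trailing poly(A) run if it is at least min_run bases long."""
    seq = seq.upper()
    stripped = seq.rstrip("A")
    if len(seq) - len(stripped) >= min_run:
        return stripped
    return seq


def reverse_complement(seq):
    """Reverse-complement a DNA string made of A, C, G, T, or N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def merge_reads(forward, reverse, min_overlap=20):
    """Merge a forward read with a read from the opposite side of the clone.

    The reverse read is reverse-complemented, then the longest exact
    overlap between the end of the forward read and the start of the
    reverse complement is used to join them. Returns None if no overlap
    of at least min_overlap bases is found.
    """
    forward = forward.upper()
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    for k in range(longest, min_overlap - 1, -1):
        if forward[-k:] == rc[:k]:
            return forward + rc[k:]
    return None
```

Real reads contain base-calling errors, so an exact-match overlap is stricter than what an assembly program would apply; the sketch is meant only to make the logic of the merge step concrete.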
ACKNOWLEDGMENTS
We acknowledge the technical support of S. König and S. Gentz in sequencing. The data will also be available through the SolGenes database. Part of this research has been supported by the Deutsche Forschungsgemeinschaft (Ga470/1-2).
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
1 Corresponding author.
E-MAIL ganal{at}ipk-gatersleben.de; FAX 49-39482-5137.
REFERENCES
Received January 13, 1998; accepted in revised form July 9, 1998.