Genome Res. 13:1984-1997, 2003
©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00
Retrotransposons and Their Recognition of pol II Promoters: A Comprehensive Survey of the Transposable Elements From the Complete Genome Sequence of Schizosaccharomyces pombe
1 Section on Eukaryotic Transposable Elements, Laboratory of Gene Regulation and Development, National Institute of Child Health and Human Development (NICHD), National Institutes of Health (NIH), Bethesda, Maryland 20892, USA
2 National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, Maryland 20894, USA
3 Unit on Biologic Computation, NICHD/OSD, NIH, Bethesda, Maryland 20892, USA
4 The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
The complete DNA sequence of the genome of Schizosaccharomyces pombe provides the opportunity to investigate the entire complement of transposable elements (TEs), their association with specific sequences, their chromosomal distribution, and their evolution. Using homology-based sequence identification, we found that the sequenced strain of S. pombe contained only one family of full-length transposons. This family, Tf2, consisted of 13 full-length copies of a long terminal repeat (LTR) retrotransposon. We found that LTR-LTR recombination of previously existing transposons had resulted in extensive populations of solo LTRs. These included 35 solo LTRs of Tf2, as well as 139 solo LTRs from other Tf families. Phylogenetic analysis of solo Tf LTRs reveals that Tf1 and Tf2 were the most recently active elements within the genome. The solo LTRs also served as footprints for previous insertion events by the Tf retrotransposons. Analysis of 186 genomic insertion events revealed a close association with RNA polymerase II promoters. These insertions clustered in the promoter-proximal regions of genes, upstream of protein coding regions by 100 to 400 nucleotides. The association of Tf insertions with pol II promoters was very similar to the preference previously observed for Tf1 integration. We found that the recently active Tf elements were absent from centromeres and pericentromeric regions of the genome containing tandem tRNA gene clusters. In addition, our analysis revealed that chromosome III has twice the density of insertion events compared to the other two chromosomes. Finally we describe a novel repetitive sequence, wtf, which was also preferentially located on chromosome III, and was often located near solo LTRs of Tf elements.
Long terminal repeat (LTR) retrotransposons are structurally as well as phylogenetically related to endogenous and exogenous retroviruses (Xiong and Eickbush 1990; Coffin et al. 1997).
The integration of retroviral and LTR retrotransposon DNA is inherently
mutagenic. Much of what we understand about the molecular mechanisms of
retroelements involved in many pathologies, including malignancies, has come
from the study of the associated retroviruses and oncogenic retroviruses
(Coffin et al. 1997).
The Ty LTR-retrotransposons of Saccharomyces cerevisiae have
provided many molecular clues to the selection of target sites. In each case,
the sites of integration indicate that the transposons have evolved mechanisms
to avoid the disruption of host genes. For instance, the very specific
integration of the Ty3 element one to four nucleotides (nt) upstream of pol
III-transcribed genes avoids damaging pol III genes as well as any other
category of gene (Chalker and Sandmeyer 1992).
LTR retrotransposons and retroviruses are found in at least two integrated
or proviral forms within eukaryotic genomes
(Boeke and Stoye 1997).
In this report, we present a comprehensive analysis of transposon sequences
throughout the genome of S. pombe. Only two families of LTR
retrotransposons, Tf1 and Tf2, are known to exist in S. pombe
(Fig. 1). Tf1 and Tf2 are
self-priming elements belonging to the Ty3/Gypsy group of LTR retrotransposons
(Levin et al. 1990).
The analysis of the genome sequence of S. pombe revealed repeats related to Tf elements and an unrelated element, wtf (Wood et al. 2002).
The Identification of Sequences From Tf Elements
Full-Length Tf Elements
We found two elements in a direct tandem orientation on the distal end of
the right arm of chromosome I. The tandem elements, Tf2-7 and Tf2-8, share a
complete internal LTR and have TSDs flanking the two outer LTRs. It is likely
that this was initially a single insertion event. However, subsequent
misalignment and recombination between the LTRs on sister chromatids or
homologous chromosomes generated the tandem elements. An alternative model is
that prior to integration, two cDNAs recombined in homologous sequences of the
LTR to generate a tandem cDNA that was subsequently integrated. Evidence of
this possibility is that in the absence of the Sgs1p helicase, Ty1 cDNAs
multimerize and insert as tandem elements
(Bryk et al. 2001).
Of the 13 full-length elements found in the genome, two (Tf2-10 and Tf2-13)
contain single-nucleotide deletions that lead to nonsense codons at two
different sites in the C-terminal region of the RT domain. These two mutations
resulted in inactive elements that we designated pseudo-Tf2s. The tandem
elements Tf2-7 and Tf2-8 and the chimeric element Tf2-11 have identical
missense mutations at three positions within their RT domains. Surprisingly,
one of these mutations changes the second aspartic acid of the active site
motif "YMDD" to an asparagine, yielding "YMDN." The
presence of three elements with this motif suggests that this variant of Tf2
may have been active sometime in the past. Previously, one would have
considered this unlikely given that the two aspartic acids of the
"F/YXDD" box were thought to be invariant in all known reverse
transcriptases (Xiong and Eickbush 1988).
In comparison, the 13 full-length elements are roughly fourfold fewer than
the 50 LTR-retrotransposons found in the similarly sized genome of
S. cerevisiae. It is interesting to note that of the 50
full-length elements in S. cerevisiae, 49 belong to the Ty1/Copia
group of LTR-retrotransposons. This class of transposons is only distantly
related to the Ty3/Gypsy group of LTR-retrotransposons. Surprisingly, there
are no representatives of Ty1/Copia elements in the genome of S.
pombe. In this sense, the low copy number of Ty3/Gypsy group elements is
similar between S. cerevisiae and S. pombe. The absence of
Ty1/Copia group elements is also evident in the genomes of C. elegans
and human (Bowen and McDonald 1999).
Tf Fragments
Solo LTRs
There are 28 LTRs in a clade supported by a bootstrap value of 74 that are
closely related to the LTRs from Tf1-107. These LTRs were designated Tf1 LTRs.
In accord with the use of Greek letters as names for the families of LTRs in
S. cerevisiae, we used
There are 60 LTRs found in a clade supported by a bootstrap value of 77
that are closely related to the LTRs from the query Tf2 LTR and to the LTRs of
the endogenous full-length Tf2 elements. We designated this group as Tf2 LTRs,
with the Greek letter
Tf1 and Tf2 appear to represent the largest number of recently active
elements within the genome of S. pombe. Recent activity is indicated
by the short distances of many of the terminal branches of the phylogeny
(Fig. 3). A method for
calculating the relative insertion time of TEs is to calculate the average
pairwise nucleotide identity across the complete LTR sequences of elements
that are very closely related at the phylogenetic level
(Kapitonov and Jurka 1996).
There are several other lineages containing LTRs with identical or near
identical sequences, as indicated by the flat terminal branches
(Fig. 3). Upon closer
examination, we noticed that all of these LTRs are located in subtelomeric
regions of the genome. The sequence identity among these LTRs was not due to
recent transposition, but was the result of duplications of subtelomeric
sequences that contained these LTRs. The telomeric regions of many organisms,
including S. pombe, are known to cluster during meiotic prophase
(Scherthan et al. 1994).
Chromosomal Distribution of Tf Insertions
To determine whether preferences for integration sites existed during the insertion of the 186 Tf sequences in the S. pombe genome, we compared the locations of solo LTRs and full-length Tf2s to the positions of all 4984 predicted ORFs of S. pombe. First, we found that all insertions were located exclusively in intergenic regions of the genome. As this included Tf1 and Tf2 sequences, the data suggested that both elements target intergenic regions for insertion. To further describe these insertions, we determined the type of intergenic regions into which each Tf element had inserted. Adjacent ORFs can be described as being in tandem, divergent, or convergent orientations, depending on the predicted direction of transcription (Fig. 4). The frequency of Tf insertions into each type of intergenic region is shown in Figure 4. There are 80 insertions into intergenic regions between tandem genes, 94 insertions into intergenic regions between divergent genes, and only seven insertions into regions between convergent genes.
The position of the insertions in intergenic sequences and the distribution of these insertions within the three classes of intergenic regions may have resulted from one of two types of insertion mechanisms. In one case, all intergenic sequences per unit length could be recognized with equal probability. This would result in the most insertions in the class of intergenic region that comprises the largest fraction of the genome. A second type of mechanism, based strictly on equal recognition of pol II promoters, would result in more insertions in the class of intergenic region that contains the most pol II promoters. We tested the validity of these two models by calculating for each the expected frequency of insertions into each class of intergenic region and comparing these numbers to the observed number of insertions found in each class (Fig. 4). To calculate the number of insertions assuming that, per unit length, all intergenic sequence was recognized equally, the fraction of the total intergenic sequence that each of the three classes represents was multiplied by the total number of insertions, 181. This resulted in an expected number of insertions of 86 between tandem genes, 68 between divergent genes, and 27 between convergent genes. Even though intergenic regions with convergent orientation comprise less of the genome and are shorter than both the tandem and divergent regions, we expected that 27 insertions would be in this type of intergenic region. However, we observed only seven. The underrepresentation of insertions into regions with convergent orientation indicates a strong bias for insertions into regions predicted to contain promoters for RNA polymerase II. A chi-square calculation indicated that the underrepresentation of insertions in convergent versus tandem plus divergent regions was significant (P < 0.00004; data not shown).
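The comparison described in this paragraph can be checked with a short calculation. The sketch below is an illustration only, not the authors' script: it uses the observed counts (80, 94, and 7) and the rounded expected counts (86, 68, and 27) quoted in the text, collapses tandem and divergent regions into one class, and assumes a goodness-of-fit chi-square test with one degree of freedom, because the original calculation is not shown. The function and variable names are hypothetical.

```python
import math

# Observed and expected insertions per intergenic class, as quoted in the text.
observed = {"tandem": 80, "divergent": 94, "convergent": 7}
expected = {"tandem": 86, "divergent": 68, "convergent": 27}


def chi_square_convergent_vs_rest(obs, exp):
    """Chi-square (1 df) for convergent versus tandem plus divergent regions."""
    obs_pair = (obs["convergent"], obs["tandem"] + obs["divergent"])
    exp_pair = (exp["convergent"], exp["tandem"] + exp["divergent"])
    chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pair, exp_pair))
    # For one degree of freedom, the upper-tail probability of a chi-square
    # statistic x equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p_value = chi_square_convergent_vs_rest(observed, expected)
print(f"chi-square = {chi2:.2f}, P = {p_value:.1e}")
```

Under these assumptions the statistic is about 17.4 and the P value falls below the reported bound of 0.00004. A different test design (for example, three classes with two degrees of freedom) would give a different value.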
To test whether it could be strictly the polymerase II promoters that were responsible for the position of the insertions, we considered only the 174 insertions between tandem and divergent genes and calculated the number of insertions expected for these two types of intergenic sequences based on the assumption that each promoter was recognized with equal probability. There are 2604 promoters located in divergent spaces and 2291 located in the tandem spaces of the S. pombe genome (Supplemental Table 2). Based on the assumption that RNA polymerase II promoters were recognized equally as targets, we multiplied the total number of insertions, 174, by the fraction of the 4895 promoters located in divergent (0.53) and tandem (0.47) regions. By this calculation, we expect 92 insertions in divergent regions and 82 insertions in tandem regions. These numbers were very close to the observed insertions, indicating that polymerase II promoters were a key factor associated with the positions of the insertions.
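The expected counts under the equal-promoter model follow directly from the promoter totals. The sketch below is a minimal illustration that assumes, as the text does, that each fraction is rounded to two decimals before multiplying by the 174 insertions; the function name is hypothetical.

```python
# Promoter counts per intergenic class, as quoted in the text.
promoters = {"divergent": 2604, "tandem": 2291}
observed = {"divergent": 94, "tandem": 80}
total_insertions = sum(observed.values())  # 174


def expected_by_promoter(promoter_counts, n_insertions):
    """Expected insertions if every pol II promoter is an equally likely target."""
    total_promoters = sum(promoter_counts.values())
    expected = {}
    for region, count in promoter_counts.items():
        # The text rounds each fraction to two decimals (0.53 and 0.47).
        fraction = round(count / total_promoters, 2)
        expected[region] = round(n_insertions * fraction)
    return expected


expected = expected_by_promoter(promoters, total_insertions)
for region in promoters:
    print(region, "expected", expected[region], "observed", observed[region])
```

With unrounded fractions the products are about 92.6 and 81.4, so the rounding step matters only at the level of a single insertion.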
To further investigate the exact position of the Tf insertions, we
calculated the distance between the end of each LTR or full-length Tf2 and the
end of the nearest ORF. This distance indicates the position of the
integration site chosen by the Tf elements relative to the nearest ORF. The
results are presented in the form of a histogram in
Figure 5. If the insertion was
closest to the 5' end of an ORF, we placed it 5' of the ORF in
Figure 5. The insertions on the
3' end of the ORF in Figure
5 represent insertions that were closer to the 3' end of an
ORF in the genome. The insertions were binned in intervals of 100 bp from each
end of the nearest ORF. We found a significant bias for insertions associated
with the 5' end of genes. Eighty-three percent were closer to the
5' than to the 3' end of an ORF. A large number of these clustered
between 100 bp and 400 bp from the 5' end of the neighboring ORF. This
places the insertions into the promoter proximal region of these genes. In
four instances, start sites of transcription for a neighboring gene were
predicted using the S. pombe 5' UTR database
(ftp://ftp.sanger.ac.uk/pub/yeast/pombe/UTRs/).
In all four cases, the Tf element inserted upstream of the predicted TATA box,
thus leaving the core promoter intact. This corresponds well with the finding
that of the eight insertions of Tf1 tested, none altered the expression level
of neighboring genes (Behrens et al. 2000).
As indicated above, the large majority of the insertions, 83% (155/186), were found closer to the 5' end of a neighboring ORF. However, this value is biased, because there are no 5' ends between convergent genes and there are two 5' ends between divergent genes. To establish unambiguously whether insertions favored 5' over 3' ends, we examined only those insertions that occurred between genes in tandem orientation. The results were similar to those shown in Figure 5 (data not shown). Of 80 insertions between genes in tandem orientation, 73% (58/80) were found closer to the 5' end of the ORF. Moreover, all of the insertions cluster in a region between 100 and 400 bp from the predicted start of translation for each ORF.
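The distance analysis can be sketched as follows. This is an illustrative reconstruction, not the scripts used for the paper: the `Orf` record, the function names, and the toy coordinates are all hypothetical, and the code assumes every insertion lies in an intergenic region, as reported in the text.

```python
from collections import Counter
from typing import NamedTuple


class Orf(NamedTuple):
    start: int   # leftmost coordinate on the chromosome
    end: int     # rightmost coordinate on the chromosome
    strand: str  # "+" or "-"


def nearest_orf_end(position, orfs):
    """Return (side, distance) for the ORF end closest to an insertion."""
    best = None
    for orf in orfs:
        # On the "+" strand the 5' end is the leftmost coordinate;
        # on the "-" strand it is the rightmost coordinate.
        five_prime = orf.start if orf.strand == "+" else orf.end
        three_prime = orf.end if orf.strand == "+" else orf.start
        for side, coord in (("5'", five_prime), ("3'", three_prime)):
            distance = abs(position - coord)
            if best is None or distance < best[1]:
                best = (side, distance)
    return best


def bin_insertions(positions, orfs, bin_size=100):
    """Count insertions per (side, bin start) in bins of bin_size bp."""
    counts = Counter()
    for position in positions:
        side, distance = nearest_orf_end(position, orfs)
        counts[(side, (distance // bin_size) * bin_size)] += 1
    return counts


# Hypothetical toy data: two tandem ORFs on the "+" strand.
orfs = [Orf(1000, 2500, "+"), Orf(4000, 5200, "+")]
insertions = [3750, 3820, 3650, 2600]
histogram = bin_insertions(insertions, orfs)
for (side, bin_start), n in sorted(histogram.items()):
    print(f"{side} end, {bin_start}-{bin_start + 99} bp: {n}")
```

Restricting `orfs` to tandem pairs before binning corresponds to the unbiased comparison described above, in which each intergenic region contains exactly one 5' end and one 3' end.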
The examination of Tf sequences throughout the genome revealed strong
biases in their position. These specific positions could be the result of
selective pressures, either positive or negative, that favor populations of
S. pombe with each of the patterns observed. Alternatively, the
patterns of the Tf sequences could be strictly the result of biochemical
mechanisms of integration that caused the patterns we observed. Each of the
biases in the position of Tf sequences described above was very similar in
pattern and magnitude to the positions of insertions resulting from the
induction of Tf1 transposition (Behrens et al. 2000).
The Ty LTR-retrotransposons of S. cerevisiae are known to insert upstream of pol III-transcribed genes such as tRNAs. In comparison, we find no association of Tf sequences with the pol III-transcribed genes of S. pombe. Instead, there are several large clusters of tRNA genes around the centromeres of all three S. pombe chromosomes that appear to exclude Tf insertions.
The Tf insertions were also individually mapped onto the contigs that have been assembled for each of the three chromosomes of S. pombe. The positions of the insertions were grouped in bins of 50,000 bp and displayed along the length of each chromosome (Fig. 6). The positions of the full-length copies and fragments of Tf2s are shown beneath the axis of each chromosome. These results indicate that the elements are distributed similarly throughout both arms of all three chromosomes. However, an analysis of all of the insertions revealed a surprising bias for chromosome III. The density of Tf insertions in chromosome III averaged 1.37 insertions/50,000 bp of contig sequence, almost exactly twice that of the other two chromosomes, 0.657/50,000 bp and 0.643/50,000 bp for chromosomes I and II, respectively. Chromosome III is the smallest S. pombe chromosome and also contains approximately 0.5 Mbp of rDNA repeats on each end.
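The binning and density calculation can be sketched as follows. The coordinates and contig length are hypothetical placeholders, since the per-chromosome insertion positions and contig lengths are not given in this section; only the 50,000-bp bin width comes from the text.

```python
from collections import Counter

BIN_SIZE = 50_000  # bp, the bin width used in the text


def bin_positions(positions, bin_size=BIN_SIZE):
    """Count insertions per bin; keys are the bin start coordinates."""
    return Counter((pos // bin_size) * bin_size for pos in positions)


def density_per_bin(n_insertions, contig_length, bin_size=BIN_SIZE):
    """Average number of insertions per bin_size bp of contig sequence."""
    return n_insertions / (contig_length / bin_size)


# Hypothetical insertion coordinates on a hypothetical 200,000-bp contig.
positions = [12_000, 48_500, 51_000, 149_999, 150_000]
contig_length = 200_000

bins = bin_positions(positions)
density = density_per_bin(len(positions), contig_length)
print(dict(sorted(bins.items())))
print(f"{density:.2f} insertions per {BIN_SIZE:,} bp")
```

Applying `density_per_bin` to each chromosome's insertion count and contig length would give values comparable to the densities quoted above.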
To further investigate the higher density of insertions on chromosome III,
we examined the types and sizes of intergenic regions found on this
chromosome. Having established that Tf elements were more often found in
intergenic regions between ORFs that are divergently transcribed, we
considered the possibility that chromosome III may be enriched in divergent
ORFs relative to the other two chromosomes. Divergent intergenic regions
account for 26% (584/2220), 27% (482/1784), and 26% (236/894) of the total
number of intergene regions in chromosomes I, II, and III, respectively (see
Suppl. Table 2). Divergent intergenics account for 37% (Chr I), 40% (Chr II),
and 37% (Chr III) of the total intergenic sequence from each chromosome. This
indicates that there was no overrepresentation of divergent regions in
chromosome III. We also looked at whether the average sizes of the intergenic
regions were larger in chromosome III. When calculated, the average intergenic
lengths are found to be 917 bp in chromosome I, 945 bp in chromosome II, and
1079 bp in chromosome III. This results in a gene density for chromosome III
of one gene every 2790 bp, slightly lower than that found in chromosomes I and
II, which contain one gene every 2483 bp and 2457 bp, respectively. However,
we can think of no simple means by which the slightly larger regions of
divergent intergenics could account for the twofold increase in insertions
into chromosome III. We provide one alternative explanation for the enrichment
of Tf LTRs into chromosome III below. Nevertheless, the enrichment of Tf LTRs
in chromosome III as observed in the genome sequence was likely due to target
preferences, because the same twofold bias for chromosome III was seen with
the 78 insertions generated de novo
(Singleton and Levin 2002).
wtfs
In Table 3 we provide a simplified nomenclature along with the original cosmid annotations for the 25 wtf sequences. Perhaps the most surprising feature of the wtfs was that when mapped onto the chromosomal contigs, 23 of the 25 copies were located on chromosome III. One explanation for this unusual association is that chromosome III of strain 972 may have originated from an isolated population of S. pombe that had wtfs distributed on all three chromosomes. The alternative is that wtfs expanded specifically on chromosome III.
In total, 21 wtfs were flanked by intergenic regions that
contained 28 solo LTRs or LTR fragments, albeit in various numbers, lengths,
and orientations with respect to the wtfs
(Fig. 7). The association of
many LTRs with the wtfs led us to further investigate the nature of
their association. We wondered whether the enrichment of LTRs on chromosome
III could be due to their association with the wtfs that were also
found primarily on chromosome III. When the LTRs adjacent to wtfs
were excluded from consideration, the density of LTRs on chromosome III
relative to the other two chromosomes was reduced from twofold to only
1.2-fold. This suggests that 80% of the enrichment of LTRs on chromosome III
may have been due to the LTRs that are adjacent to the wtfs. If true,
this implies that the association of wtfs with LTRs was perhaps due
to a preference by Tfs for insertion into intergenic sequences that flank
wtfs. However, a preference for insertion next to wtfs was
not observed in the previous analyses of the insertions resulting from the
induction of Tf1 (Behrens et al. 2000).
To investigate regions flanking wtfs and to identify the associated LTRs, we made DNA alignments of the wtf genes and their flanking regions. The DNA alignment of the wtfs (data not shown) indicated that there was a stretch of several hundred nucleotides beginning upstream of the predicted start codon of the wtfs and continuing into the first exon that appeared to be the most conserved region among all of the copies of wtf. This region was found to have an average pairwise identity of 78.6% ± 13.7%. More importantly, we found that 11 of the wtfs had LTRs positioned just upstream of the highly conserved region (Fig. 7). This suggests that this region may be a "hot spot" for the insertion of Tf elements. However, at this point in the analysis, we were unable to rule out the possibility that some of the LTR/wtf associations originated as gene duplication events from an initial LTR/wtf progenitor. If this were the case, one might expect that the LTRs associated with the wtfs would form a well-supported monophyletic group within the LTR phylogeny shown in Figure 3. To test this, we colored in red the labels of the LTRs that are associated with the wtfs (Fig. 3). It can be seen that the red LTRs are distributed throughout the phylogram and do not form a monophyletic clade. In fact, several LTRs that flank wtfs are closely related to the recently active Tf1 and Tf2 families. In one case, identical TSDs were found flanking the LTR, indicating that this was a recent integration event of a Tf element adjacent to a wtf. However, this does not rule out the possibility that some duplication events may have occurred quite some time ago and have subsequently diverged, leaving no hint of a phylogenetic connection.
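An average pairwise identity with its spread, like the one reported for the conserved wtf region, can be computed from a multiple alignment as sketched below. This is a minimal illustration with a hypothetical toy alignment; skipping gapped columns and using the sample standard deviation are assumptions, since the text does not state how gaps were treated or which deviation was reported.

```python
from itertools import combinations
from statistics import mean, stdev


def pairwise_identity(seq_a, seq_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    compared = matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_summary(aligned_seqs):
    """Mean and sample standard deviation of all pairwise identities."""
    values = [pairwise_identity(a, b) for a, b in combinations(aligned_seqs, 2)]
    return mean(values), stdev(values)


# Hypothetical toy alignment (equal-length rows, "-" marks a gap).
alignment = [
    "ATGCATGCAT",
    "ATGCATGGAT",
    "ATGAAT-CAT",
]
avg, sd = identity_summary(alignment)
print(f"{avg:.1f}% +/- {sd:.1f}%")
```

The same function applied to closely related LTRs gives the kind of average pairwise identity used elsewhere in the paper as a proxy for relative insertion time.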
The nature of the mechanism(s) that led to the expansion of the
wtf family is unclear at present. The predicted amino acid sequences
of wtfs have no similarity to any known class of transposable
elements. However, their association with Tf LTRs led to the hypothesis that
they may have been retrotransposed in trans by the endogenous Tf
elements (Wood et al. 2002). The mRNAs of wtfs have been predicted to be multiply spliced, and the proteins are predicted to be membrane-associated. The presence of multiple predicted introns in the wtfs suggests that they have not been retrotransposed in trans by the Tf machinery. One would expect the introns to be lost during reverse transcription, as is the case in many retrotransposed pseudogenes.
In a separate analysis of eukaryotic lineage-specific gene expansions
(LSEs), wtfs were characterized as the largest family of genes
specific for S. pombe and were predicted to be nonglobular proteins
(Lespinet et al. 2002).
It is interesting to speculate whether the higher transcription level of
wtfs may have contributed to the accumulation of Tf insertions
nearby. The insertion pattern of HIV-1 integrations into the human genome was
recently reported (Schroder et al. 2002).
Summary
We have presented evidence that supports previous reports indicating that Tf elements prefer to integrate into the intergenic regions of the S. pombe genome, and we have narrowed down this region to the promoter-proximal region of several pol II-transcribed genes. More importantly, we presented evidence that recognition of polymerase II promoters was a key component of target preference.
It was recently reported that the sole variant of histone H3 in yeast,
H3.3, is deposited near actively transcribing genes in other eukaryotes
(Ahmad and Henikoff 2002).
Tf Sequence Retrieval and Homology Searches
Tf-derived sequences in the S. pombe genome were initially annotated as a part of the genome sequencing project performed at the Wellcome Trust Sanger Institute (Wood et al. 2002).
To identify solo Tf LTR sequences and other potential nucleotide fragments that are not in the coding regions of the Tf genomes, nucleotide sequences of Tf1 (GenBank acc. no. L10324) and Tf2 (GenBank acc. no. M38526) elements were used as queries for local BLASTN searches. Individual Tf1 and Tf2 LTRs as well as the internal regions of the LTRs were used independently to query complete chromosomal contigs of S. pombe. This allowed the number and location of the Tf sequences to be mapped onto the individual chromosomes for interchromosomal comparisons. Virtual S. pombe chromosome contigs (release March 22, 2002) were retrieved from the Wellcome Trust Sanger Institute anonymous FTP site at the address ftp://ftp.sanger.ac.uk/pub/yeast/pombe/Chromosom_contigs/. These sequences were converted to FASTA format and placed in a single list file to use as a database for local homology-based searches with the BLAST package from NCBI, obtained from ftp://ncbi.nlm.nih.gov/toolbox/ncbi_tools/ncbi.tar.gz. The blastall source code was replaced with the PowerMac G4 optimized version obtained from ftp://ftp.apple.com/developer/Tool_Chest/AGBLAST/blastall.gz for use on a PowerPC G4, Macintosh OS X, Version 10.1.5. To increase the sensitivity of BLASTN, parameters that approach the default wublast (W. Gish, 1996-2002; http://blast.wustl.edu/blast/cparms.html) parameters were used for the local searches as follows: -p blastn -q -4 -r 5 -G 10 -E 10 -F F -e 3 -W 9.
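For readers who want to see how the quoted parameters fit into a full command line, the sketch below assembles one without running it. The scoring and search flags are copied from the text; the `-d`, `-i`, and `-o` options are the usual database, query, and output options of the legacy blastall program and are an assumption here, as are all file names. The database would also need to be formatted beforehand (for example with formatdb), a step not described in this section.

```python
import shlex

# Scoring and search parameters quoted in the text for legacy NCBI blastall.
BLASTN_PARAMS = ["-p", "blastn", "-q", "-4", "-r", "5", "-G", "10",
                 "-E", "10", "-F", "F", "-e", "3", "-W", "9"]


def build_blastall_command(database, query, output):
    """Assemble a blastall argument list; file names are placeholders."""
    # -d (database), -i (query file), and -o (output file) are not listed
    # in the text and are assumed here.
    return ["blastall", *BLASTN_PARAMS, "-d", database, "-i", query, "-o", output]


command = build_blastall_command(
    "pombe_contigs.fasta", "tf2_ltr.fasta", "tf2_ltr.blastn"
)
print(shlex.join(command))
```

Keeping the parameters in one list makes it easy to run the same search for each query (Tf1 LTR, Tf2 LTR, and internal regions) against the same contig database.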
Tf Data Assembly
All individual LTRs were then locally searched against the S. pombe cosmid database retrieved from ftp://ftp.sanger.ac.uk/pub/yeast/pombe/Cosmid_sequences/pombe/pombe.dbs in order to assemble Supplemental Table 1. Perl scripts were constructed to retrieve the cosmid coordinates and orientations and to generate the nomenclature. Each individual entry was inspected manually. When identical LTRs were found, correct cosmid coordinates were retrieved by individual BLAST searches using the chromosome coordinates as references for the location of the LTRs.
Statistical Significance of BLASTN Results
Calculation of Intergenic Type and Distances to Nearest ORFs
Nomenclature
Sequence Alignment and Phylogenetic Analysis of Tf Sequences
wtf Analyses
We thank Dr. Daniel Voytas for reading the manuscript and making valuable suggestions. The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1191603. [Supplemental material is available at www.genome.org.]
5 Present address: Department of Genetics, University of Georgia, Athens,
GA 30602, USA.
6 Corresponding author.
Ahmad, K. and Henikoff, S. 2002. The histone variant H3.3 marks active chromatin by replication-independent nucleosome assembly. Mol. Cell 9: 1191-1200.
Behrens, R., Hayles, J., and Nurse, P. 2000. Fission yeast retrotransposon Tf1 integration is targeted to 5' ends of open reading frames. Nucleic Acids Res. 28: 4709-4716.
Berbee, M.L. and Taylor, J.W. 1993. Dating the evolutionary radiations of the true fungi. Canadian J. Botany-Revue Canadienne De Botanique 71: 1114-1127.
Boeke, J.D. and Stoye, J.P. 1997. Retrotransposons, endogenous retroviruses, and the evolution of retroelements. In Retroviruses (eds. J.M. Coffin et al.), pp. 343-436. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Bowen, N.J. and McDonald, J.F. 1999. Genomic analysis of Caenorhabditis elegans reveals ancient families of retroviral-like elements. Genome Res. 9: 924-935.
____. 2001. Drosophila euchromatic LTR retrotransposons are much younger than the host species in which they reside. Genome Res. 11: 1527-1540.
Britten, R.J. 1998. Precise sequence complementarity between yeast chromosome ends and two classes of just-subtelomeric sequences. Proc. Natl. Acad. Sci. 95: 5906-5912.
Bryk, M., Banerjee, M., Conte Jr., D., and Curcio, M.J. 2001. The Sgs1 helicase of Saccharomyces cerevisiae inhibits retrotransposition of Ty1 multimeric arrays. Mol. Cell. Biol. 21: 5374-5388.
Chalker, D.L. and Sandmeyer, S.B. 1992. Ty3 integrates within the region of RNA polymerase III transcription initiation. Genes & Dev. 6: 117-128.
Coffin, J.M., Hughes, S.H., and Varmus, H.E. eds. 1997. Retroviruses. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Costas, J. and Naveira, H. 2000. Evolutionary history of the human endogenous retrovirus family ERV9. Mol. Biol. Evol. 17: 320-330.
Csink, A.K. and McDonald, J.F. 1995. Analysis of copia sequence variation within and between Drosophila species. Mol. Biol. Evol. 12: 83-93.
Ganko, E.W., Fielman, K.T., and McDonald, J.F. 2001. Evolutionary history of Cer elements and their impact on the C. elegans genome. Genome Res. 11: 2066-2074.
Genetics Computer Group. 1999. Wisconsin Package Version 10.0. Madison, WI.
Goodwin, T.J. and Poulter, R.T. 2000. Multiple LTR-retrotransposon families in the asexual yeast Candida albicans. Genome Res. 10: 174-191.
Grewal, S.I. and Elgin, S.C. 2002. Heterochromatin: new possibilities for the inheritance of structure. Curr. Opin. Genet. Dev. 12: 178-187.
Hoff, E.F., Levin, H.L., and Boeke, J.D. 1998. Schizosaccharomyces pombe retrotransposon Tf2 mobilizes primarily through homologous cDNA recombination. Mol. Cell. Biol. 18: 6839-6852.
Jordan, I.K. and McDonald, J.F. 1999. Tempo and mode of Ty element evolution in Saccharomyces cerevisiae. Genetics 151: 1341-1351.
Kapitonov, V. and Jurka, J. 1996. The age of Alu subfamilies. J. Mol. Evol. 42: 59-65.
Kim, J.M., Vanguri, S., Boeke, J.D., Gabriel, A., and Voytas, D.F. 1998. Transposable elements and genome organization: A comprehensive survey of retrotransposons revealed by the complete Saccharomyces cerevisiae genome sequence. Genome Res. 8: 464-478.
Kumar, S., Tamura, K., Jakobsen, I.B., and Nei, M. 2001. MEGA2: Molecular evolutionary genetics analysis software. Bioinformatics 17: 1244-1245.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.
Lespinet, O., Wolf, Y.I., Koonin, E.V., and Aravind, L. 2002. The role of lineage-specific gene family expansion in the evolution of eukaryotes. Genome Res. 12: 1048-1059.
Levin, H.L. 1995. A novel mechanism of self-primed reverse transcription defines a new family of retroelements. Mol. Cell. Biol. 15: 3310-3317.
____. 1996. An unusual mechanism of self-primed reverse transcription requires the RNase H domain of reverse transcriptase to cleave an RNA duplex. Mol. Cell. Biol. 16: 5645-5654.
Levin, H.L. and Boeke, J.D. 1992. Demonstration of retrotransposition of the Tf1 element in fission yeast. EMBO J. 11: 1145-1153.
Levin, H.L., Weaver, D.C., and Boeke, J.D. 1990. Two related families of retrotransposons from Schizosaccharomyces pombe. Mol. Cell. Biol. 10: 6791-6798.
Lipman, D.J., Wilbur, W.J., Smith, T.F., and Waterman, M.S. 1984. On the statistical significance of nucleic acid similarities. Nucleic Acids Res. 12: 215-226.
Malik, H.S. and Eickbush, T.H. 1999. Modular evolution of the integrase domain in the Ty3/Gypsy class of LTR retrotransposons. J. Virol. 73: 5186-5190.
Mata, J., Lyne, R., Burns, G., and Bahler, J. 2002. The transcriptional program of meiosis and sporulation in fission yeast. Nat. Genet. 32: 143-147.
Medstrand, P., van de Lagemaat, L.N., and Mager, D.L. 2002. Retroelement distributions in the human genome: Variations associated with age and proximity to genes. Genome Res. 12: 1483-1495.
Roeder, G.S. and Fink, G.R. 1983. Transposable elements in yeast, pp. 299-326. Academic Press, New York.
Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M.A., and Barrell, B. 2000. Artemis: Sequence visualization and annotation. Bioinformatics 16: 944-945.
SanMiguel, P., Tikhonov, A., Jin, Y.K., Motchoulskaia, N., Zakharov, D., Melake-Berhan, A., Springer, P.S., Edwards, K.J., Lee, M., Avramova, Z., et al. 1996. Nested retrotransposons in the intergenic regions of the maize genome. Science 274: 765-768.
Scherthan, H., Bahler, J., and Kohli, J. 1994. Dynamics of chromosome organization and pairing during meiotic prophase in fission yeast. J. Cell Biol. 127: 273-285.
Schroder, A.R., Shinn, P., Chen, H., Berry, C., Ecker, J.R., and Bushman, F. 2002. HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110: 521-529.
Schuler, G.D., Altschul, S.F., and Lipman, D.J. 1991. A workbench for multiple alignment construction and analysis. Proteins 9: 180-190.
Singleton, T.L. and Levin, H.L. 2002. A long terminal repeat retrotransposon of fission yeast has strong preferences for specific sites of insertion. Eukaryotic Cell 1: 44-55.
Sipiczki, M. 2000. Where does fission yeast sit on the tree of life? Genome Biol. 1: REVIEWS1011.
Stajich, J.E., Block, D., Boulez, K., Brenner, S.E., Chervitz, S.A., Dagdigian, C., Fuellen, G., Gilbert, J.G.R., Korf, I., Lapp, H., et al. 2002. The Bioperl toolkit: Perl modules for the life sciences. G |