Published online before print
March 5, 2007, 10.1101/gr.5826307 Genome Res. 17:422-432, 2007 ©2007 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/07 $5.00
Letter
The evolutionary history of human DNA transposons: Evidence for intense activity in the primate lineage
Department of Biology, University of Texas at Arlington, Arlington, Texas 76019, USA
Class 2, or DNA transposons, make up 3% of the human genome, yet the evolutionary history of these elements has been largely overlooked and remains poorly understood. Here we carried out the first comprehensive analysis of the activity of human DNA transposons over the course of primate evolution using three independent computational methods. First, we conducted an exhaustive search for human DNA transposons nested within L1 and Alu elements known to be primate specific. Second, we assessed the presence/absence of 794 human DNA transposons at orthologous positions in 10 mammalian species using sequence data generated by The ENCODE Project. These two approaches, which do not rely upon sequence divergence, allowed us to classify DNA transposons into three different categories: anthropoid specific (40-63 My), primate specific (64-80 My), and eutherian wide (81-150 My). Finally, we used this data to calculate the substitution rates of DNA transposons for each category and refine the age of each family based on the average percent divergence of individual copies to their consensus. Based on these combined methods, we can confidently estimate that at least 40 human DNA transposon families, representing 98,000 elements (33 Mb) in the human genome, have been active in the primate lineage. There was a cessation in the transpositional activity of DNA transposons during the later phase of the primate radiation, with no evidence of elements younger than 37 My. This data points to intense activity of DNA transposons during the mammalian radiation and early primate evolution, followed, apparently, by their mass extinction in an anthropoid primate ancestor.
Transposable elements (TEs) are mobile repetitive sequences that make up large fractions of mammalian genomes, including at least 45% of the human genome (Lander et al. 2001
Information on human DNA transposons is currently very scarce. This type of element makes up 3% of our genome (Lander et al. 2001
In a seminal study dedicated to human DNA transposons, Smit and Riggs (1996)
The most comprehensive age analysis of human DNA transposons published to date appeared in the initial analysis of the human genome sequence. This study concluded that, "there is no evidence for DNA transposon activity in the past 50 My in the human genome" (Lander et al. 2001
Here we present the first detailed analysis of the age of the 125 DNA transposon families currently recognizable in the human genome. In particular, we sought to evaluate which human DNA transposon families were actively transposing during primate evolution. To this end, we used a combination of three independent computational methods, two of which do not rely upon sequence divergence. We estimate that at least 40 families of DNA transposons were active during the primate radiation. We conclude that
Census of DNA transposons in the human genome
We began our investigation by assessing the diversity and copy number of all DNA transposon families currently recognized in the human genome. Copy numbers were calculated from the RepeatMasker tables of the May 2004 assembly of the human genome, available through the UCSC Genome Browser (http://genome.ucsc.edu). In agreement with previous reports (Smit and Riggs 1996
Analysis of DNA transposons nested into other elements
In order to obtain a first assessment of human DNA transposon families that were active during the primate radiation, we took advantage of the fine-scale evolutionary histories of L1 and Alu elements in the primate lineage produced by others. We used these primate-specific families as historical markers for dating DNA transposons. We reasoned that any DNA transposon inserted or nested within a primate-specific L1 element (Khan et al. 2006
This analysis revealed the presence of elements representing 10 distinct DNA element families that were inserted into primate-specific L1s (Table 2). Each of these insertions was validated by visual inspection of the expected target-site duplications (TSDs) flanking the DNA transposon insertion, based on alignment to the family consensus sequence (one example per family is shown in Table 2). The only exception was MER107 elements, whose insertion into L1 created a short deletion (from 11 to 19 bp) at the integration site accompanied by the coinsertion of unrelated "filler" DNA (11-27 bp), instead of the 8-bp TSD canonical of hAT elements (see Supplemental Fig. 1). The youngest L1 elements that suffered a DNA transposon insertion belong to the L1PA8A family, estimated to be between 42 and 50 My old (Khan et al. 2006
In each case, we found that the length of each nested DNA transposon was similar to the length of its respective consensus sequence and that the nucleotide divergence of the L1 copy to its consensus sequence was congruent with the age of the corresponding L1 subfamily. For example, four distinct MER85 elements were found inserted into L1PA10 elements (one example is reported in Table 2). A TSD of the tetranucleotide sequence TTAA, a hallmark of piggyBac transposons, could be associated with each insertion. The length of the nested MER85 elements ranged from 126 to 130 bp, which is in good agreement with the length of the family consensus sequence (140 bp). The nested MER85 copies differ by 6.3% to 9.3% from their consensus sequence, while the average divergence for the family is 7.3% (as calculated from the May 2004 RepeatMasker files from the UCSC Genome Browser, see Methods). The L1PA10 elements that suffered insertions by MER85 elements were 8.5% to 23.9% divergent from their consensus (with three of the four being 15.3% diverged or less), which is consistent with the average 14.7% pairwise divergence of the L1PA10 subfamily (Khan et al. 2006
We found that six of the 10 DNA element families that had copies inserted into primate-specific L1 elements also comprise copies nested into dimeric Alu elements (Table 2), all of which are known to be primate specific (Kapitonov and Jurka 1996
Though the evolutionary history of primate LTR elements has not been analyzed as fully as those of L1 and Alu elements, we drew upon the available literature to provide further validation for the nested insertion analysis. Smit (1993)
Cross-species genomic analysis of orthologous insertions
Taking advantage of the ongoing ENCODE Project and of other genomic resources accessible through the UCSC Genome Browser, we assessed the presence/absence of 794 human DNA transposon copies at orthologous positions in at least eight (and up to 10) other mammalian species, including three primate species (see Methods). These elements represent 111 of 125 families known in the human genome.
We found members of 11 DNA element families that were present at orthologous positions in human, rhesus macaque (Macaca mulatta), and marmoset (Callithrix jacchus), but absent in the galago (Otolemur garnettii), a prosimian primate. We were able to identify "empty sites" in the respective orthologous galago loci for members of each DNA element family (one example per family is shown in Table 3) except for MER75B, since there is only one copy of MER75B within an ENCODE region and there is a large deletion within the galago lineage at that locus. These were clear "empty sites," with only one copy of the TSD and no additional sequences indicative of transposon excision. DNA transposon excisions are typically imprecise in that they leave behind one of the two TSDs and/or a few terminal nucleotides of the transposon at the excision site (for example, see Plasterk 1991
Members of the remaining 100 human DNA transposon families represented in the ENCODE regions were found at orthologous positions in human, marmoset, and galago. To investigate which of these families were primate specific, each of the 316 copies, along with at least 100 bp flanking both the 5' and 3' ends, was visually inspected in the UCSC Genome Browser for its presence/absence at orthologous positions in at least five nonprimate mammals using the ENCODE Comparative Genomics tracks. Phylogenetically, these five species form two separate outgroups to the primates, with the mouse/rat/rabbit lineage being closer to the primates than the cow/dog lineage (Murphy et al. 2004
The remaining 77 families were represented by elements present at orthologous sites in all the primates and at least one of the other eutherian mammals; thus, these families were classified as eutherian-wide. It should be noted, however, that several families (such as Charlie1 and Charlie1a) included some copies that were apparently primate specific, as well as copies present at orthologous positions in primates and at least one of the nonprimate mammals. It could well be that the activity of these families began prior to the emergence of the primates, but continued in a primate ancestor, generating primate-specific insertions. Alternatively, lineage-specific sorting of ancient alleles with or without the insertions could also account for these patterns. In the absence of further evidence to distinguish between these possibilities, we adopted a conservative classification of such families as eutherian-wide.
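The presence/absence logic described in this section can be summarized in a short sketch. It is only an illustration under stated assumptions, not the procedure actually used (the inspection was done visually in the UCSC Genome Browser): the species labels, the function names `classify_copy` and `classify_family`, and the rule that the oldest category wins for any mixed family are all hypothetical simplifications. Only the eutherian-wide case of that conservative rule is stated in the text. Loci with missing data (for example, a large deletion in galago) are not handled.

```python
# Hypothetical species labels; the text names five nonprimate mammals
# (mouse, rat, rabbit, cow, dog) and galago as the prosimian primate.
PROSIMIAN = "galago"
NON_PRIMATES = {"mouse", "rat", "rabbit", "cow", "dog"}

# Ordered from oldest to youngest category.
CATEGORIES = ("eutherian-wide", "primate specific", "anthropoid specific")


def classify_copy(present_in):
    """Classify one human copy from the set of species in which the
    insertion is found at the orthologous position."""
    if present_in & NON_PRIMATES:
        return "eutherian-wide"
    if PROSIMIAN in present_in:
        return "primate specific"
    # Present only in anthropoids (human, macaque, marmoset), empty site in galago.
    return "anthropoid specific"


def classify_family(copies):
    """Conservative family-level call: the oldest category seen in any copy wins
    (an assumption generalizing the eutherian-wide rule stated in the text)."""
    if not copies:
        raise ValueError("family has no informative copies")
    labels = {classify_copy(c) for c in copies}
    for category in CATEGORIES:
        if category in labels:
            return category


# Made-up example: one apparently primate-specific copy, one shared with dog.
mixed_family = [
    {"human", "macaque", "marmoset", "galago"},
    {"human", "macaque", "marmoset", "galago", "dog"},
]
family_call = classify_family(mixed_family)
```

Under this rule a family with both primate-specific and older copies, as described for Charlie1 and Charlie1a, is reported as eutherian-wide.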
To determine whether the other 14 DNA element families not represented within the ENCODE regions were primate specific, we searched the finished mouse, rat, and dog genomes for the presence or absence of TEs at orthologous positions to the human DNA elements (see Methods). We found that six families of elements (MER6C, Ricksha, Ricksha_b, Ricksha_c, Tigger5a, and Tigger5b) were present in primates, but clearly absent in the mouse, rat, and dog. These families were additionally classified as primate specific. The other eight families were found to be present in the primates and in the mouse, rat, or dog genomes. These families were classified as eutherian-wide. Together, the cross-species analysis of orthologous insertions suggests that a minimum of 40 distinct DNA transposon families, which account for 98,300 DNA elements currently fixed in the human genome, were active in the primate lineage, i.e., within the last
Age of DNA transposons based on sequence divergence
This approach allowed us to calculate a substitution rate per million years that takes into account possible fluctuations in evolution rates during the different time periods and among the three types of elements (see Methods; Supplemental Table 2). The results reveal that DNA transposons, L1s, and Alus each have different average substitution rates in the different evolutionary periods (Table 4). There are statistically significant differences among the three types of TEs within the same period. For example, within the anthropoid-specific lineage (41-63 My), which was the only period for which mutation rates could be estimated for all three types of TEs, the differences were statistically significant (P < 0.05, ANOVA). These differences may be attributed to several factors, such as biased genomic distributions (e.g., Alus preferentially accumulate in GC-rich regions), compositional biases, replication mechanisms (reverse transcriptase for RNA elements, DNA polymerase for DNA transposons), and amplification dynamics (subfamily structure). Regardless, this data provides a rationale for treating DNA transposons separately from the other types of TEs in the human genome in these calculations, as opposed to using substitution rates estimated for other types of elements or for other neutrally evolving sequences such as pseudogenes (for example, see Robertson and Martos 1997
We next applied the substitution rate of each type of element for each period to estimate the age of individual families based on their average nucleotide divergence to the respective consensus sequence, excluding CpG sites (see Methods). This method differed slightly from other datings of TEs in which a single, constant substitution rate for the entire span of primate evolution was used (for examples, see Price et al. 2004
This data was used to generate a plot of the age of DNA transposons and L1s as a function of their copy number (Fig. 1). This representation shows that our dating of DNA transposons is in good agreement with the nested insertion analysis, but provides a better resolution of the age of an individual family. For example, MER85 and MADE1 are among the youngest DNA elements, with estimated ages of 37 and 46 My, respectively, and both families included members nested within L1PA10 elements (Table 2), a subfamily that we dated as 51-My old (in agreement with Khan et al. 2006
Our dating of individual DNA transposon families based on sequence divergence and calculated age is also largely congruent with the cross-species analysis. All elements classified as eutherian-wide from this analysis were found to be older than 65 My based on sequence divergence, with all but one family (MER53) being older than 70 My. MER53 is an outlier due to its unusually high 70% AT content (Table 5). All DNA transposon families classified as primate specific by the cross-species analysis were estimated to be between 57 and 78 My based on sequence divergence.
Figure 1 reveals that there were two bursts of DNA transposon activity in the time period between the mammalian radiation and the split of New World Monkeys from the primate ancestor. The first peak is the most pronounced and involves primarily members of the hAT superfamily and spanned a period of
While the evolutionary history of human Alu and L1 retrotransposons has been studied intensively, the history of DNA transposons has largely been overlooked. In this study we have combined three different approaches to determine the average age of all 125 DNA transposon families known in the human genome. The results of the three approaches converge to reveal that a substantial fraction of human DNA transposon families (at least 40 and up to 69 families, see Tables 5, 6), representing 98,000 elements in our genome, were transpositionally active in the primate lineage. Below we first discuss the value of combining various methods for estimating the age of TEs, then turn to the specific implications of our findings for primate genome evolution and for understanding the forces underlying the amplification dynamics of TEs in mammalian genomes.
Combined methods provide a detailed estimate of the age of human DNA transposons
As new genome sequences are released, different methods for dating transposable elements are being developed that allow greater accuracy in estimating the age of TEs (Price et al. 2004
The first method, nested insertion analysis, capitalizes on the well-characterized history of L1 and Alu elements and provides an estimation of the relative age of TE families. This method does not rely on the molecular clock and can be performed even in the absence of genomic sequences for other closely related species. The shortcoming of this approach is that not every TE family member will necessarily suffer an insertion from every other TE family, thereby leading to gaps in the data, especially for TE families with low copy numbers. The second method (cross-species analysis of orthologous insertions) takes advantage of the large amount of sequence data recently generated for several mammalian species and is also independent of the molecular clock. These first two methods deliver a rough, yet robust evaluation of the time periods when the elements were active. This, in turn, provided us the means to calibrate the molecular clock at three different evolutionary time points, which allowed the refinement of the age of each human DNA transposon family based on nucleotide divergence of individual copies to their ancestral consensus sequence. There is only partial overlap between the results gathered by the three methods (Fig. 2), which emphasizes the value of combining the three methods to derive a reliable history of the entire population of human DNA transposons.
Of the 125 DNA element families currently recognizable in the human genome, a total of 11 families could be classified as primate specific by all three methods (Fig. 2). Since nested insertion analysis does not allow us to examine all DNA transposon families, but only a subset of them, one cannot expect a complete overlap between the results produced by the three different methods. Sixty-nine families of DNA transposons were predicted to be primate specific based on sequence divergence data and thus, calculated age, alone (Table 5), and 40 of these families were confirmed to be primate specific by at least one of the two alternative methods. Hence, we believe that this set of 40 families provides a reliable, yet conservative estimate of DNA transposons that were integrated during primate evolution. The corresponding families range in age from MER85 (37 My) to Tigger5c (78 My) and have contributed 98,300 elements (totaling 33 Mb of sequence) to the human genome (Fig. 2). Furthermore, the results of divergence and cross-species and/or nested insertion analysis confirm that nearly one-fourth of these transposons (23,462 elements, 5 Mb) have likely been inserted since the split of anthropoid primates from prosimian primates, or within the last 63 My (according to Goodman 1999
Thirty families were predicted to be primate specific based on their sequence divergence and calculated age but were shown to be eutherian-wide by cross-species analysis. However, for some of these families, such as MER53, Charlie5, and MER33 (65, 72, 72 My, respectively), we could detect copies present at orthologous positions in all eutherian species examined, which strongly indicates that at least a subset of family members inserted prior to the divergence of placental mammals. Thus, sequence divergence alone may not always be an accurate measure of the age of TEs. It could be that, for various reasons, members of these families evolve more slowly overall than other families. A non-mutually exclusive explanation is that these families include a subfamily of primate-specific elements as well as older elements. Further analyses are required to distinguish between these possibilities.
A general extinction of DNA transposon activity in the anthropoid lineage
What could have provoked the extinction of DNA transposons that would not have affected the propagation of L1, which continued to thrive even after the emergence of new world monkeys (see Fig. 1; Khan et al. 2006
Contribution of DNA transposons to primate genome evolution
Prior to this study, the history of human DNA transposons has been largely neglected relative to those of retroelements (Alus and L1s). One reason for this is the common belief that active DNA transposon families have long been extinct and that they are currently only represented by very ancient molecular "fossils" immobilized in the genome. Indeed, unlike Alus and L1s (Deininger et al. 2003
Calculation of average percent divergence from RepeatMasker output
The average percent divergence of each transposable element family was calculated using the RepeatMasker rmsk files from the UCSC Genome Browser for the May 2004 assembly. Because currently recognized copies of the same family differ vastly in size, the percent divergence (milliDiv) of each copy was weighted by the length of that copy: the percent divergence calculated by RepeatMasker was multiplied by the length of the element, these products were summed over all elements in the family, and the sum was divided by the total length of all elements in the family.
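The length-weighted average described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `weighted_divergence`, the input layout (family name, milliDiv, copy length in bp), and the example values are assumptions. It relies on the UCSC rmsk convention that milliDiv is the number of mismatches per thousand bases, so dividing by 10 gives a percentage.

```python
from collections import defaultdict


def weighted_divergence(records):
    """Return {family: length-weighted mean percent divergence}.

    records: iterable of (family, milli_div, length_bp) tuples, one per copy.
    """
    div_times_len = defaultdict(float)  # sum of (percent divergence x length)
    total_len = defaultdict(int)        # sum of copy lengths
    for family, milli_div, length_bp in records:
        percent_div = milli_div / 10.0  # milliDiv is per thousand bases
        div_times_len[family] += percent_div * length_bp
        total_len[family] += length_bp
    return {fam: div_times_len[fam] / total_len[fam] for fam in total_len}


# Made-up copies for two hypothetical families.
example_copies = [
    ("FamA", 70, 130),   # 7.0% diverged, 130 bp
    ("FamA", 90, 70),    # 9.0% diverged, 70 bp
    ("FamB", 150, 200),  # 15.0% diverged, 200 bp
]
family_divergence = weighted_divergence(example_copies)
```

For FamA the weighted mean is (7.0 x 130 + 9.0 x 70) / 200 = 7.7%, lower than the unweighted mean of 8.0% because the longer copy is less diverged.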
Nested insertions
Cross-species genomic analysis of orthologous insertions
Calculation of substitution rates and dating according to sequence divergence
To calculate the substitution rate, the corrected number of substitutions per site was divided by the median age for the class (anthropoid specific, primate specific, or eutherian-wide). The rate for each TE within the age class was weighted by the fraction of the total base length of the class contributed by that TE. These weighted rates were then summed, giving a corrected substitution rate for the entire class. The age of each family was calculated by dividing the corrected percent divergence for the family by the corrected substitution rate.
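A minimal sketch of this calculation is given below, under stated assumptions. The function names, the input layout, and all numbers are hypothetical, and the correction applied to the raw divergence is taken as already done. The sketch also assumes that a family's age is its corrected divergence divided by the class rate, which is what makes the units come out in My.

```python
def class_substitution_rate(families, median_age_my):
    """Base-weighted substitution rate (substitutions per site per My) for one age class.

    families: list of (corrected_subs_per_site, total_bases) tuples,
              one per TE family assigned to the class.
    median_age_my: median age of the class in My.
    """
    class_bases = sum(bases for _, bases in families)
    rate = 0.0
    for subs_per_site, bases in families:
        family_rate = subs_per_site / median_age_my  # rate for this family
        rate += family_rate * (bases / class_bases)  # weight by share of bases
    return rate


def family_age_my(corrected_divergence, rate):
    """Age in My, assuming age = corrected divergence / substitution rate."""
    return corrected_divergence / rate


# Made-up class with two families and a hypothetical median age of 50 My.
example_class = [(0.10, 3000), (0.20, 1000)]
example_rate = class_substitution_rate(example_class, median_age_my=50.0)
example_age = family_age_my(0.10, example_rate)
```

In this made-up case the class rate is 0.75 x 0.002 + 0.25 x 0.004 = 0.0025 substitutions per site per My, so a family with a corrected divergence of 0.10 would be dated at about 40 My.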
We thank Ellen Pritham for critical reading of the manuscript, Don Hucks for support with PAML, and members of the Genome Biology Group at UTA for stimulating discussions. This work was supported by funds from the University of Texas at Arlington.
1 Corresponding author.
E-mail cedric{at}uta.edu; fax (817) 272-2855. [Supplemental material is available online at www.genome.org.] Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5826307
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403-410.
Auge-Gouillou, C., Bigot, Y., Pollet, N., Hamelin, M.H., Meunier-Rotival, M., and Periquet, G. 1995. Human and other mammalian genomes contain transposons of the mariner family. FEBS Lett. 368: 541-546.
Caceres, M., Ranz, J.M., Barbadilla, A., Long, M., and Ruiz, A. 1999. Generation of a widespread Drosophila inversion by a transposable element. Science 285: 415-418.
Caspi, A. and Pachter, L. 2006. Identification of transposable elements using multiple alignments of related genomes. Genome Res. 16: 260-270.
Cordaux, R., Udit, S., Batzer, M.A., and Feschotte, C. 2006. Birth of a chimeric primate gene by capture of the transposase gene from a mobile element. Proc. Natl. Acad. Sci. 103: 8101-8106.
Craig, N.L., Craigie, R., Gellert, M., and Lambowitz, A.M. 2002. Mobile DNA II. American Society for Microbiology Press, Washington, D.C.
Deininger, P.L., Moran, J.V., Batzer, M.A., and Kazazian Jr., H.H. 2003. Mobile elements and mammalian genome evolution. Curr. Opin. Genet. Dev. 13: 651-658.
Demattei, M.V., Auge-Gouillou, C., Pollet, N., Hamelin, M.H., Meunier-Rotival, M., and Bigot, Y. 2000. Features of the mammal mar1 transposons in the human, sheep, cow, and mouse genomes and implications for their evolution. Mamm. Genome 11: 1111-1116.
Eichler, E.E. and Sankoff, D. 2003. Structural dynamics of eukaryotic chromosome evolution. Science 301: 793-797.
Eickbush, T.H. and Malik, H.S. 2002. Origins and evolution of retrotransposons. In Mobile DNA II (eds. N.L. Craig et al.), pp. 1111-1144. ASM Press, Washington, D.C.
Emerman, M. 2006. How TRIM5
Feschotte, C. 2004. Merlin, a new superfamily of DNA transposons identified in diverse animal genomes and related to bacterial IS1016 insertion sequences. Mol. Biol. Evol. 21: 1769-1780.
Feschotte, C., Zhang, X., and Wessler, S.R. 2002. Miniature inverted-repeat transposable elements and their relationship to established DNA transposons. In Mobile DNA II (eds. N.L. Craig et al.), pp. 1147-1158. ASM Press, Washington, D.C.
Feuk, L., Carson, A.R., and Scherer, S.W. 2006. Structural variation in the human genome. Nat. Rev. Genet. 7: 85-97.
Goodman, M. 1985. Rates of molecular evolution: The hominoid slowdown. Bioessays 3: 9-14.
Goodman, M. 1999. The genomic record of humankind's evolutionary roots. Am. J. Hum. Genet. 64: 31-39.
Gray, Y.H. 2000. It takes two transposons to tango: Transposable-element-mediated chromosomal rearrangements. Trends Genet. 16: 461-468.
Hagemann, S. and Pinsker, W. 2001. Drosophila P transposons in the human genome? Mol. Biol. Evol. 18: 1979-1982.
Hartl, D.L., Lohe, A.R., and Lozovskaya, E.R. 1997. Modern thoughts on an ancyent marinere: Function, evolution, regulation. Annu. Rev. Genet. 31: 337-358.
Hickey, D.A. 1982. Selfish DNA: A sexually-transmitted nuclear parasite. Genetics 101: 519-531.
Inoue, K. and Lupski, J.R. 2002. Molecular mechanisms for genomic disorders. Annu. Rev. Genomics Hum. Genet. 3: 199-242.
Jurka, J., Kapitonov, V.V., Pavlicek, A., Klonowski, P., Kohany, O., and Walichiewicz, J. 2005. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 110: 462-467.
Kapitonov, V. and Jurka, J. 1996. The age of Alu subfamilies. J. Mol. Evol. 42: 59-65.
Kapitonov, V.V. and Jurka, J. 2004. Harbinger transposons and an ancient HARBI1 gene derived from a transposase. DNA Cell Biol. 23: 311-324.
Kazazian Jr., H.H. 2004. Mobile elements: Drivers of genome evolution. Science 303: 1626-1632.
Khan, H., Smit, A., and Boissinot, S. 2006. Molecular evolution and tempo of amplification of human LINE-1 retrotransposons since the origin of primates. Genome Res. 16: 78-87.
Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16: 111-120.
Kulpa, D.A. and Moran, J.V. 2006. Cis-preferential LINE-1 reverse transcriptase activity in ribonucleoprotein particles. Nat. Struct. Mol. Biol. 13: 655-660.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.
Lim, J.K. and Simmons, M.J. 1994. Gross chromosome rearrangements mediated by transposable elements in Drosophila melanogaster. Bioessays 16: 269-275.
Lindblad-Toh, K., Wade, C.M., Mikkelsen, T.S., Karlsson, E.K., Jaffe, D.B., Kamal, M., Clamp, M., Chang, J.L., Kulbokas III, E.J., Zody, M.C., et al. 2005. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 438: 803-819.
Morgan, G.T. 1995. Identification in the human genome of mobile elements spread by DNA-mediated transposition. J. Mol. Biol. 254: 1-5.
Murphy, W.J., Pevzner, P.A., and O'Brien, S.J. 2004. Mammalian phylogenomics comes of age. Trends Genet. 20: 631-639.
Oosumi, T., Belknap, W.R., and Garlick, B. 1995. Mariner transposons in humans. Nature 378: 672.
Plasterk, R.H. 1991. The origin of footprints of the Tc1 transposon of Caenorhabditis elegans. EMBO J. 10: 1919-1925.
Price, A.L., Eskin, E., and Pevzner, P.A. 2004. Whole-genome analysis of Alu repeat elements reveals complex evolutionary history. Genome Res. 14: 2245-2252.
Robertson, H.M. 1996. Members of the pogo superfamily of DNA-mediated transposons in the human genome. Mol. Gen. Genet. 252: 761-766.
Robertson, H.M. 2002. Evolution of DNA transposons in eukaryotes. In Mobile DNA II (eds. N.L. Craig et al.), pp. 1093-1110. ASM Press, Washington, D.C.
Robertson, H.M. and Martos, R. 1997. Molecular evolution of the second ancient human mariner transposon, Hsmar2, illustrates patterns of neutral evolution in the human genome lineage. Gene 205: 219-228.
Robertson, H.M. and Zumpano, K.L. 1997. Molecular evolution of an ancient mariner transposon, Hsmar1, in the human genome. Gene 205: 203-217.
Salem, A., Ray, D.A., Hedges, D.J., Jurka, J., and Batzer, M.A. 2005. Analysis of the human Alu Ye lineage. BMC Evol. Biol. 5: 1827.
Silva, J.C. and Kidwell, M.G. 2000. Horizontal transfer and selection in the evolution of P elements. Mol. Biol. Evol. 17: 1542-1557.
Sarkar, A., Sim, C., Hong, Y.S., Hogan, J.R., Fraser, M.J., Robertson, H.M., and Collins, F.H. 2003. Molecular evolutionary analysis of the widespread piggyBac transposon family and related "domesticated" sequences. Mol. Genet. Genomics 270: 173-180.
Smit, A.F.A. 1993. Identification of a new, abundant superfamily of mammalian LTR-transposons. Nucleic Acids Res. 21: 1863-1872.
Smit, A.F. 1999. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Opin. Genet. Dev. 9: 657-663.
Smit, A.F.A. and Riggs, A.D. 1996. Tiggers and DNA transposon fossils in the human genome. Proc. Natl. Acad. Sci. 93: 1443-1448.
Smit, A.F.A., Toth, G., Riggs, A.D., and Jurka, J. 1995. Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences. J. Mol. Biol. 246: 401-417.
Szak, S.T., Pickeral, O.K., Makalowski, W., Boguski, M.S., Landsman, D., and Boeke, J.D. 2002. Molecular archeology of L1 insertions in the human genome. Genome Biol. 3: 118.
Tavare, S. 1986. Some probabilistic and statistical problems on the analysis of DNA sequences. In Lectures in mathematics in the life sciences, Vol. 17, pp. 57-86. American Mathematical Society, Providence, RI.
Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., An, P., et al. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562.
Wei, W., Gilbert, N., Ooi, S.L., Lawler, J.F., Ostertag, E.M., Kazazian, H.H., Boeke, J.D., and Moran, J.V. 2001. Human L1 retrotransposition: cis preference versus trans complementation. Mol. Cell. Biol. 21: 1429-1439.