Published online before print
September 20, 2001, 10.1101/gr.198301
Vol. 11, Issue 10, 1677-1685, October 2001
LETTER
ABSTRACT
Unlike that of human L1 retrotransposons, the 5' UTR of mouse L1 elements contains tandem repeats of ~200 bp in length called monomers. Multiple L1 subfamilies exist in the mouse, which are distinguished by their monomer sequences. We previously described a young subfamily, called the TF subfamily, which contains ~1800 active elements among its 3000 full-length members. Here we characterize a novel subfamily of mouse L1 elements, GF, which has a unique monomer sequence and unusual patterns of monomer organization. A majority of these GF elements also have a unique length polymorphism in ORF1. Polymorphism analysis of GF elements in various mouse subspecies and laboratory strains revealed that, like TF, the GF subfamily is young and expanding. About 1500 full-length GF elements exist in the diploid mouse genome and, based on the results of a cell culture assay, ~400 GF elements are potentially capable of retrotransposition. We also tested 14 A-type subfamily elements in the assay and estimate that about 900 active A elements may be present in the mouse genome. Thus, it is now known that there are three large active subfamilies of mouse L1s, TF, A, and GF, and that in total ~3000 full-length elements are potentially capable of active retrotransposition. This number greatly exceeds the number of L1 elements thought to be active in the human genome.
INTRODUCTION
LINE1 (L1) elements are non-long terminal repeat (non-LTR) retrotransposons that are capable of autonomous retrotransposition and have expanded to large copy numbers in mammalian genomes. The mouse genome contains >100,000 L1s comprising ~10% of the genomic DNA (Hutchison et al. 1989). Most L1 elements are inactive, a consequence of 5'-end truncation, inversion, or mutation. The consensus sequence for full-length mouse L1s is ~7 kb and, like human L1s, has two open reading frames (ORFs) and a 3' poly(A) tail, and is flanked by short target site duplications (TSDs). Unlike human L1s, mouse L1s have a bipartite 5' UTR consisting of tandemly repeated sequences of ~200 bp called monomers, which are situated upstream of single-copy, nonmonomeric sequence. By linking monomers to reporter genes, it has been shown that they possess promoter activity and that increasing the number of monomers increases the level of transcription (Severynse et al. 1992; DeBerardinis and Kazazian 1999).
Phylogenetic analyses by Adey et al. (1994a) suggest that mouse L1 evolution has been dominated by a single lineage which has spawned several subfamilies of L1s differing in sequence at their 5' ends. A-type subfamily and F-type subfamily members have monomer lengths of 208 bp and 206 bp, respectively (Fanning 1983; Loeb et al. 1986; Padgett et al. 1988). Members of the more ancient V subfamily probably lack a 5'-repeated sequence (Jubier-Maurin et al. 1992). For some time it was thought that all F elements and most A elements had been rendered inactive by mutation, but that a small subset of A elements remained active in the genome. Evidence supporting this hypothesis includes the findings that some A-type elements are highly similar to each other, possess two intact ORFs, and are transcribed. However, their capacity for retrotransposition was not directly assayed (Shehee et al. 1987; Schichman et al. 1992; Martin 1995).
Unexpectedly, two disease-causing insertions, L1spa and L1orl, were found to be members of a large, young, and expanding subfamily of mouse L1s (Kingsmore et al. 1994; Takahara et al. 1996; Naas et al. 1998). The monomer sequence belonging to members of this TF subfamily was on average 72% identical to the consensus monomer sequence of F-type L1s. Based on the results of a retrotransposition assay of 11 randomly cloned TF-type L1s in cell culture, we estimated that a majority of full-length TF elements are potentially active (DeBerardinis et al. 1998).
Here we present evidence that a previously unreported subfamily of L1 elements exists in high copy number in the mouse genome and that many members of this novel subfamily are likely capable of retrotransposition.
RESULTS
A Unique 5' UTR Distinguishes a New Mouse L1 Subfamily
Examination of the GenBank database revealed the presence of a previously undescribed subfamily of mouse L1 elements remarkable for their 5' UTR regions. Like those of the previously described TF elements (DeBerardinis et al. 1998), the 5' monomer arrays of L1s from this new subfamily are related to, but clearly distinct from, the monomers of F subfamily members. We have called these GF elements and have extracted from the database 17 members for characterization (Table 1).
The 17 GF elements contain a variable number of monomers, averaging 2.4 with a maximum of 6. We aligned the sequences of 34 full-length monomers to obtain a consensus sequence 206 bp in length and 69% identical to the reconstructed F monomer consensus sequence described by Adey et al. (1994b) and 67% identical to the 212-bp TF monomer consensus (DeBerardinis and Kazazian 1999). Individual monomers ranged in size from 204 to 207 bp. The 3' end of the GF consensus monomer sequence shown in Figure 1A corresponds to the point at which the monomer array joins the nonmonomeric 5' UTR region 250 bp upstream of the start of GF ORF1. The monomer array of a TF element begins 258 bp upstream of TF ORF1. The nonmonomeric 5' UTRs of the GF consensus and TF consensus sequences are 85% identical.
The organization of the 5' UTR of GF elements is atypical of mouse L1s. The nonmonomeric region of the 5' UTR of all GF elements contains a 47-nt sequence which is 82% identical to a portion of the F monomer consensus (nts 132-178) (Fig. 1B). The monomer arrays are organized in four distinct patterns. At the 3' end of many arrays are 5' truncated monomers which exist singly (patterns II and III) or as a pair (pattern I) and which consist of the last 64 bps (nts 143-206) of a full-length monomer. Pattern III also has two additional truncated monomers upstream of the first full-length monomer. All of the abbreviated monomers are the same length and only 77%-84% identical to the consensus GF monomer (69%-75% identical to the F consensus). Most GF elements fall within patterns I and II whose full-length monomers are greater than 95% identical to the GF consensus (~70% identical to the F consensus).
DeBerardinis and Kazazian (1999) noted that many TF L1s were truncated at their 5' ends in the vicinity of a predicted binding site for the transcription factor YY1 (GCCATCTT, Fig. 1A). A YY1 site is also present in F monomers and in the human L1 5' UTR (Becker et al. 1993). Twenty-six of the 31 TF sequences examined were truncated within 25 bp upstream or downstream of the monomer YY1 binding sequence (DeBerardinis and Kazazian 1999; J.L. Goodier, unpubl.). All GF monomers have a single nucleotide change which alters this sequence (GCCTTCTT), and this may explain why GF elements do not tend to truncate near this site (only 3 of 17 elements truncated within 25 nts of this sequence).
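The single-nucleotide difference between the predicted YY1 site and its GF counterpart can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and only the two 8-nt strings are taken from the text.

```python
# Illustrative sketch; function names are hypothetical.
# The two motifs are the only data taken from the text.
YY1_SITE = "GCCATCTT"    # predicted YY1 binding site (TF and F monomers)
GF_VARIANT = "GCCTTCTT"  # corresponding sequence in GF monomers


def mismatches(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every exact occurrence of motif."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


diffs = mismatches(YY1_SITE, GF_VARIANT)
print(diffs)
```

An exact-match search such as `find_motif` would therefore locate the site in a TF or F monomer but not in a GF monomer, which is consistent with the altered site described above; whether the change actually abolishes YY1 binding is not something this comparison can show.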
ORF1 Length Polymorphism Region (LPR)
All mouse L1 subfamilies contain an LPR within the N-terminal one-third of ORF1, consisting of tandemly repeated blocks of 66 bp and/or 42 bp that do not interrupt the reading frame of the protein. These LPRs were defined by Schichman et al. (1992) and Adey et al. (1994a), and the system of nomenclature proposed by Mears and Hutchison (2001) is summarized in Figure 2A. Most A-type elements have 66-42 bp (Group I) or 66-42-42 bp (Group II) repeat structures, although some also have a 66-66 bp (Group III) pattern. F-type elements belong to Groups I-IV. Although almost all TF elements have a Group II repeat structure (DeBerardinis et al. 1998; J.L. Goodier, unpubl.), Mears and Hutchison (2001) identified a novel TF Group I element. Thirty-five percent (6/17) of GF L1s also belong to Group II. However, the majority of GF elements have an extra 42-nt repeat (i.e., 66-42-42-42 nt), a pattern not previously reported for any mouse L1 element. We have therefore placed these GF elements in a new LPR group, Group V. The four LPR repeats of the GF consensus sequence are aligned in Figure 2B.
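The repeat structures listed above can be written down as data to confirm that each pattern keeps ORF1 in frame. This is only a sketch under stated assumptions: the dictionary and function names are hypothetical, and Group IV is omitted because its block structure is not spelled out in the text.

```python
# Illustrative sketch; names are hypothetical.
# Block structures are those given in the text (Group IV is not listed there).
LPR_GROUPS = {
    "I": (66, 42),
    "II": (66, 42, 42),
    "III": (66, 66),
    "V": (66, 42, 42, 42),
}


def lpr_length(group: str) -> int:
    """Total LPR length in nucleotides for a named group."""
    return sum(LPR_GROUPS[group])


def preserves_frame(group: str) -> bool:
    """True if every repeat block is a whole number of codons."""
    return all(block % 3 == 0 for block in LPR_GROUPS[group])


def classify(blocks: tuple[int, ...]) -> str:
    """Map an observed block pattern to a group name, if it is a known one."""
    for name, pattern in LPR_GROUPS.items():
        if pattern == tuple(blocks):
            return name
    return "unclassified"


for name in LPR_GROUPS:
    # group, length in nt, length in codons, in-frame?
    print(name, lpr_length(name), lpr_length(name) // 3, preserves_frame(name))
```

Because both 66 and 42 are multiples of three, any combination of these blocks leaves the reading frame intact; the Group V pattern is 42 nt (14 codons) longer than Group II.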
Features of the Body of GF Elements
We derived a nucleotide consensus sequence for the bodies of the 17 GF members of our dataset beginning at the nonmonomeric 5' UTR. The bodies of the 17 GF elements are 92.6%-99.6% identical to their consensus sequence (Table 1). In contrast, 11 randomly cloned TF elements were on average 99.8% identical to their consensus sequence (DeBerardinis et al. 1998). This suggested that different subgroups of GF elements exist and that at least some of these are older than TF elements. We therefore performed neighbor-joining phylogenetic analyses which included both the GF members of our dataset and F, A, and TF elements of the test set used in the analysis by Mears and Hutchison (2001). Results using a portion of the nonmonomeric 5' UTR, a region defined by Mears and Hutchison (2001) and corresponding to nucleotides 1539-1740 of L1MdA2 (Loeb et al. 1986), are shown in Figure 3A.
Three distinct GF clades (labeled G-I, G-II, and G-III) are evident and are supported by 1000 rounds of bootstrap analysis. Most of the elements (9 of 17) belong to clade G-I and are on average 99.1% identical to each other and 94.0% identical to the TF consensus (DeBerardinis et al. 1998). These elements also have monomers closest in sequence to the consensus and 5' UTR structures as depicted in Figure 1B, patterns I and II. The existence of a subset of clade G-I, designated G-I2, is weakly supported across the length of the element. Members of clade G-II are on average 97.9% identical to each other, and members of clade G-III are most dissimilar (94.8%) and have 5' UTR patterns III and IV (Fig. 1B). Only GF68 (clade G-II) and members of clade G-III belong to ORF1 LPR Group II; all others belong to the novel LPR Group V (66-42-42-42 bp). Two members of clade G-III, GF36 (Fnb1) and GF26 (Fmu1), were previously defined as representatives of a young clade of F elements (F-II2) by Mears and Hutchison (2001). While these authors did not describe the monomer structure of these elements, it is now evident that the monomers are more closely related to the GF (87% identity) than the F (74% identity) monomer consensus.
Phylogenetic analysis of a portion of the 3' UTR (nts 6701-7333 of L1MdA2, a region defined by Mears and Hutchison 2001) supports clades G-I and G-II, which were established by the 5' UTR analysis (Fig. 3B). However, clade G-III splits into two subgroups whose members have undergone separate recombination events. As proposed by Mears and Hutchison (2001), GF36 (Fnb1) and GF26 (Fmu1), together with GF251, have been formed by recombination with either an A-type Group II LPR element or a young F-type element donating the 3' end. On the other hand, GF84 and GF253 group with older F-type elements at their 3' ends. Phylogenetic analysis of the set of subsequences spanning the entire L1 used by Mears and Hutchison (2001) reveals that the former recombination breakpoint occurred in the central portion of ORF2 (within the region spanning nts 2798-4115 or the region spanning nts 4116-5455), whereas the latter occurred closer to the N-terminus of ORF2. Clades G-I and G-II are supported by analyses of regions spanning the length of the L1, and these clades do not contain recombinant members. Included in the analysis of one of these regions are L1s designated as members of a new "Z subfamily" by Hardies et al. (2000). These L1s are not GF elements and, as previously reported by Mears and Hutchison (2001), are in fact members of the TF subfamily.
GF Elements Transduce 3' Flanking DNA
Retrotransposing L1s can occasionally transduce non-L1 DNA flanking their 3' ends to new genomic locations. This is a consequence of the inherent weakness of the L1 polyadenylation signal, which during transcription is occasionally bypassed in favor of a stronger signal downstream. We previously identified (in the mouse genome database) two transduction events involving A-type elements which in both instances mobilized 3 kb of sequence (Goodier et al. 2000). Two GF elements in our dataset have transduced flanking DNA (Fig. 4). Element GF71 transduced 558 nts, while the 3' flank of element GF27 is the final product of two consecutive transduction events, an initial movement of 24 nts followed by a second retrotransposition event and the transduction of an additional 1033 nts.
GF Elements Have Been Recently Active in the Mouse
The high degree of sequence similarity among some of the GF elements implies that these elements may have recently dispersed within the mouse genome. To test this, we examined the genomes of Mus subspecies and laboratory strains for the presence or absence of three GF elements (Fig. 5). We designed PCR primers flanking the elements which would amplify a 7.5-kb filled site or a 450-bp empty site. None of the GF elements were detected in Mus spretus or Mus musculus musculus genomes. GF13 was detected in laboratory strains only, whereas GF21 was present in M. m. castaneus but absent from some laboratory strains. Surprisingly, GF46 was absent from all samples tested. However, it was possible to amplify GF46 DNA from the bacmid library clone sequenced for the GenBank entry. The library was generated from embryonic stem cells of strain 129/Sv. We confirmed the presence of the L1 sequence in this clone by PCR amplification and sequencing.
Some A and GF Subfamily Members Have Retrotransposition Capability
We previously tested A-type L1s in the cell culture retrotransposition assay (DeBerardinis et al. 1998). From phage library clones we PCR-amplified ten A-type elements without their monomers and cloned these into a retrotransposition vector under transcriptional control of a CMV promoter (Moran et al. 1996). One of the ten cloned elements supported retrotransposition. However, the possibility existed that PCR errors introduced during cloning could have inactivated otherwise potentially active elements. Therefore, we directly isolated eight A-type elements (including four elements tested in the previous experiment and four new elements) from bacteriophage and inserted each without its monomers into a retrotransposition vector in which both the monomer array of the TF element L1spa and the CMV promoter directed expression. In all, two (A101 and A102) of a total of 14 different A elements from both experiments (14%) were capable of supporting retrotransposition (Table 2). Although transcription of these A-type elements was not driven by their endogenous promoters, Severynse et al. (1992) showed that a single A monomer is sufficient to direct transcription. Since A101 possesses 3.5 monomers and A102 possesses at least two monomers, it is probable that these two elements are capable of transcription and retrotransposition in vivo. Saxton and Martin (1998) estimated that the diploid genome of mouse strain 129/Sv contains 6500 full-length A-type elements. If 14% are active, then over 900 potentially active A elements reside in the genome.
Seven of the 14 A-type elements tested in the retrotransposition assay have a Group II ORF1 LPR (66-42-42 nts). Both active elements have a Group I (66-42 nts) LPR and belong to a younger transcribed A-type subset whose members possess >99.5% identity with each other in their 5' UTR and ORF1 regions (Schichman et al. 1992). RNP-58A, a cDNA copy of an L1 RNA isolated by Martin (1995) from a ribonucleoprotein (RNP) particle, is also a member of this subset. L1 RNPs are composed of ORF1 protein bound to L1 RNA and are thought to be intermediates in the retrotransposition process. The ORF1 sequences of the active elements A101 and A102 are identical, while RNP-58A differs at a single position (R22C). Only four residues in the ORF2 of RNP-58A differ from those of both A101 and A102: S86L in the endonuclease domain, T359K situated between the endonuclease and reverse transcriptase domains, and R761W and V/A928D, which are not conserved among mammalian species. Therefore, it is likely that RNP-58A is also an active L1 element.
Using PCR, we directly amplified DNA of three GF elements with two complete ORFs from the genomes of several Mus m. domesticus strains. We also tested GF27, which has a frameshift mutation 16 amino acid residues from the end of ORF1, but found this element to be inactive in human 143B cells and barely active when driven by a CMV promoter in HeLa cells. All elements belonged to clade G-I (Fig. 3A). They were cloned with their entire 5' UTRs intact into retrotransposition vectors both with and without an exogenous CMV promoter and tested for retrotransposition activity. All three of the cloned elements with intact ORFs were capable of retrotransposition in cell culture when linked with a CMV promoter (Table 2). GF21 and GF62 were clearly active when driven by their endogenous promoters alone, while GF13 may be active at a very low level. GF21 and the A-type element, A102, are the most active mouse L1s tested in the retrotransposition assay to date. After screening a mouse strain 129/OlaHsd genomic library with a GF monomer probe, we estimated that ~1500 full-length GF elements reside in the diploid mouse genome. This number was confirmed by probing GenBank's high-throughput genomic sequence (htgs) mouse database with amino acid sequence from the N-terminus of the GF consensus ORF1. It is likely that 440 full-length GF elements have two complete ORFs (since 5 of the 17 members of our dataset have intact ORFs). These data suggest that ~400 potentially active full-length GF elements may exist in the mouse genome.
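The two extrapolations in this section (active A elements, and GF elements with two complete ORFs) follow the same proportional reasoning. The sketch below restates that arithmetic; the function name is hypothetical, and the inputs are the counts quoted in the text. It is a simple proportion and does not attempt to express the uncertainty that comes with such small samples.

```python
# Illustrative sketch; the function name is hypothetical.
# Inputs are the counts quoted in the text.
def extrapolate(n_positive: int, n_sampled: int, population: float) -> float:
    """Scale the fraction observed in a sample up to the whole population."""
    if n_sampled <= 0 or not 0 <= n_positive <= n_sampled:
        raise ValueError("need 0 <= n_positive <= n_sampled and n_sampled > 0")
    return population * n_positive / n_sampled


# 2 of 14 tested A elements were active; ~6500 full-length A elements.
active_a = extrapolate(2, 14, 6500)

# 5 of 17 GF elements in the dataset have intact ORFs; ~1500 full-length GF.
intact_gf = extrapolate(5, 17, 1500)

print(round(active_a), round(intact_gf))
```

Since all three GF elements with intact ORFs that were tested retrotransposed when driven by a CMV promoter, the figure of roughly 440 intact elements is what underlies the rounded estimate of ~400 potentially active GF elements.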
DISCUSSION
GF elements belong to a novel subfamily of mouse L1s with unique monomeric 5' UTR sequence. At different times in evolution, murine L1s have acquired novel 5' sequences, perhaps by recombination or by the capture of upstream promoter sequence. The shuffling of L1 fragments, whether by recombination or template switching, is a leitmotif in the evolution of murine subfamilies. In Rattus norvegicus, the youngest subfamily, L1mlvi2, acquired its ORF1 from an ancestral L1 (Hayward et al. 1997). Flood et al. (1998) identified unusual hybrid L1s in the mouse which possess fused 3' sequence homologous to a fragment in the first intron of C immunoglobulin. Five hybrid elements have poly(A) signals and A-rich 3' ends, and one has TSDs, evidence of retrotransposition. (Since the C-terminus of ORF2 was displaced by the fusion, transcomplementation by ORF2 protein from an intact L1 may have allowed this hybrid element to multiply.) In the mouse, Saxton and Martin (1998) and Mears and Hutchison (2001) have shown that TF elements are recombinant, deriving 5' UTR sequence from an "older" F-type element and 3' sequence from a "young" F-type or an A-type element. Similarly, by phylogenetic reconstruction we show that 5 of the 17 GF elements in our dataset are recombinant. Three of these elements, GF36, GF26, and GF251, are members of a subclade which Mears and Hutchison (2001) designated F-II2 and estimated to be 1.4-2 million years old.
In addition to a novel 5' UTR, a majority of GF elements also contain a unique length polymorphism in ORF1. The LPR is contained within the N-terminal one-third of ORF1, a region which is highly divergent in sequence among different mammalian L1 families. In this region, human L1 ORF1 contains a putative leucine zipper domain implicated in protein-protein interaction (Holmes et al. 1992; Hohjoh and Singer 1996). Mouse elements do not contain a leucine zipper. However, Martin et al. (2000) predicted that the N-terminal region of mouse ORF1 is capable of forming a coiled-coil structure, and they demonstrated that this region mediates protein multimerization. Furthermore, employing two-hybrid assays, those authors showed that ORF1 protein of the TF element L1spa dimerizes more strongly than does protein of an A-type element, and they speculated that this might be due to the longer LPR region of the TF ORF. It would be interesting to determine whether those GF elements with the long Group V LPR show even stronger ORF1 interactions. However, the finding that two very active A-type elements have the short Group I LPR indicates that LPR length does not correlate with retrotransposition capacity.
Three ORF1 protein variants, p41, p43, and p43.5, have been detected in mouse F9 cells by Kolosha and Martin (1995). A-type and TF-type elements have been identified with the first two forms, respectively. Phosphatase treatment suggested that the 43.5 kD protein was a phosphorylated form of p43. Furthermore, the 43.5 kD form is too small to represent ORF1 protein expressed from GF elements with the Group V LPR (expected size 44.5 kD). Previous RT-PCR analyses also failed to detect ORF1 sequence with a Group V LPR (Schichman et al. 1992; Kolosha and Martin 1995), although the primers used in those studies likely would not have detected GF RNA.
It would appear that the dominant active elements in the mouse genome are members of the TF subfamily, which have accounted for at least five of the seven known mutagenic insertions. These include L1spa (Kingsmore et al. 1994; Mulhardt et al. 1994), L1orl (Takahara et al. 1996), and insertions into the beige (Perou et al. 1997), black-eyed white (Yajima et al. 1999), and disabled1 (Kojima et al. 2000) genes. The disabled gene L1 lacks a poly(A) tail, 3' end, and TSDs, and thus may not be a bona fide retrotransposition event. The remaining insertions into the sodium channel gene Scn8a (Kohrman et al. 1996) and the copper-transporting ATPase gene Atp7a (Cunliffe et al. 2001) are too short to assign to a subfamily (Kohrman et al. 1996). However, the number of elements characterized to date is still small, and we expect that A and GF insertions will eventually be detected. Phylogenetic analysis of several GF elements reveals their relatively recent expansion in the Mus genus. All three GF elements tested here were absent from the M. spretus and M. m. musculus (CZECH II/Ei) genomes. M. spretus diverged from the M. musculus lineage 1-3 million years ago (Thaler 1986), and M. musculus subspecies shared a common ancestor 350,000 years ago (She et al. 1990). We have also shown that element GF46 is a de novo insertion specific to a particular mouse substrain or perhaps even a single individual.
Three distinct L1 subfamilies possessing unique promoter regions are simultaneously capable of retrotransposition in the mouse genome. Approximately 400 GF elements and 900 A elements are potentially active in the mouse. DeBerardinis et al. (1998) estimated that mouse strain 129/OlaHsd contains ~4800 full-length TF elements, while Saxton and Martin (1998) reported that the diploid genome of strain 129/Sv contains 2500 full-length elements. Examination of GenBank's mouse htgs database with amino acid sequence from the N-terminus of ORF1 suggested ~3000 as the number of TF elements in the mouse genome. Assuming that 64% of TF elements are retrotranspositionally competent (DeBerardinis et al. 1998), ~1800 of these elements may be present in the diploid genome. Thus the total number of potentially active L1 retrotransposons from the A, TF, and GF subfamilies in the diploid genome is roughly 3000. This is in sharp contrast to 40-70 active L1s in the human genome, most belonging to a single subfamily, Ta (Skowronski et al. 1988; Sassaman et al. 1997).
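The mouse total and its comparison with the human figure can be tallied directly from the rounded per-subfamily estimates given above. This is a minimal sketch: the variable names are hypothetical, and the inputs are the approximate values stated in the text, so the output inherits their imprecision.

```python
# Illustrative sketch; variable names are hypothetical.
# Values are the rounded estimates stated in the text.
active_estimates = {"TF": 1800, "A": 900, "GF": 400}
human_active_range = (40, 70)

mouse_total = sum(active_estimates.values())

# Fold excess of mouse over human, at either end of the human range.
fold_low = mouse_total / human_active_range[1]
fold_high = mouse_total / human_active_range[0]

print(mouse_total, round(fold_low), round(fold_high))
```

The per-subfamily estimates sum to 3100, which the text rounds to roughly 3000, and the mouse figure exceeds the human range by somewhere between about 44-fold and 78-fold.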
The large number of potentially active elements in the mouse accounts for the much higher rate of mutation due to L1 retrotransposition. It has been estimated that L1 insertions make up ~2.5% of spontaneous mutations in the mouse, and only 0.07% in humans (Kazazian and Moran 1998). Therefore, it is probable that retrotransposition has been a factor in driving the high rate of evolution in the mouse (She et al. 1990).
METHODS
Isolation of L1s, Subcloning and Retrotransposition Assay
Using GF monomer sequence to search GenBank entries, we examined the highest scoring contigs in both the nonredundant (score >150) and high-throughput genomic sequence (score >200) databases and extracted from these 17 GF elements (Table 1). Only full-length elements having obvious TSDs were considered.
To isolate full-length A elements, we screened a phage genomic library of the strain 129/OlaHsd embryonic stem cell line E14TGa with a monomer from the A-type element L1Md-A2 (Loeb et al. 1986). A-type elements were isolated from the phage and cloned by two methods. First, we amplified ten elements by PCR (primers: L1ANOT5P, CGTACGCGGCCGCTGGTTCGAACACCAGATATCTGGG; L1ABSTZ3P, ATACGTATACATTTCCAATGCTATACCAAAAG). These were directly cloned into our retrotransposition vector under the control of a CMV promoter (Moran et al. 1996). Secondly, we excised eight A-element sequences from purified phage DNA (four were duplicates of those cloned by PCR amplification and four were unique) and inserted them into the retrotransposition construct in the same manner as described for cloning TF elements (DeBerardinis et al. 1998). Briefly, we swapped SmaI-SfiI fragments (including 296 bp of the A-type nonmonomeric 5' UTR and 495 bp of the 3' UTR) for the corresponding fragment of the TF element L1spa. Since the resulting constructs contained both a CMV promoter and L1spa monomers, we could assay A-type element ORF protein activity only. Complete sequences of the two active A elements have been deposited in GenBank (accession numbers AY053455 and AY053456).
To test GF elements, we selected from our dataset four elements, three with two intact ORFs, and we PCR-amplified these from genomic DNA of several mouse strains using primers closely flanking the TSDs. These elements were cloned into the retrotransposition vector with or without a CMV promoter. All cell culture retrotransposition assays were performed in either human 143B TK- osteosarcoma (ATCC# CRL-8303) or HeLa cells as described (Moran et al. 1996). In two previous papers (Naas et al. 1998; DeBerardinis et al. 1998) we reported testing TF element retrotransposition in mouse LTK- cells. It recently came to our attention that the cells assayed were not mouse LTK- cells but rather human TK- 143B cells. We have determined that L1s are also active in LTK- cells but at a level two orders of magnitude lower than reported.
Analysis of GF Polymorphism
Mouse genomic DNA was obtained from the Jackson Laboratories. Primers used for genomic amplification were as follows. GF13: GTCTGCGTAAGGCCTGTGCTTGC (1AF1465P) and GCAAGTTTGATCTTCACCATCAGG (2AF1463P); GF21: TTCCTGATATGAAGCCTATGTACC (3AC02163P5) and TCTCTGAATGTTACATGATTTGGC (4AC02163P3); GF46: GCCTGTGCTCTAAATCGCCAACAC (5AL049ES5P) and AGAGAAGTACCTGCGTGGCCCACC (6AL049ES3P). The M. spretus sample is SPRET/Ei, the M. m. castaneus sample is CAST/Ei, the M. m. musculus sample is CZECH II/Ei, and the M. m. domesticus sample is WSB/Ei. Amplifications were performed with the Expand Long PCR system (Roche), and PCR conditions were optimized for the generation of both empty site (~450 bp) and filled site (7.1-7.6 kb) products using an MJ Research PTC-200 Peltier Thermal Cycler. DNA mixing studies showed that our PCR conditions could detect in a single reaction both empty and filled sites (data not shown). Despite this ability, no individual mice heterozygous for the presence and absence of an L1 were discovered.
Many PCR products were gel purified and directly sequenced to confirm the presence or absence of the L1. The L1 GF46 was also amplified from the bacmid clone originally used to generate the GenBank sequence entry (CITB/Research Genetics clone #437P9, originating from mouse strain 129/Sv).
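The logic of calling an insertion present or absent from product size can be sketched as below. This is an illustration only, not the authors' procedure: the function names and the 100-bp tolerance are assumptions, and the expected sizes are the approximate empty-site (~450 bp) and filled-site (roughly 7.1-7.6 kb) products described in this paper.

```python
# Illustrative sketch; names and the tolerance are assumptions,
# not values from the paper.
EMPTY_SITE_BP = 450            # approximate empty-site product
FILLED_SITE_BP = (7100, 7600)  # approximate filled-site product range


def call_product(size_bp: int, tolerance_bp: int = 100) -> str:
    """Classify one PCR product by size: 'empty', 'filled', or 'unexpected'."""
    if abs(size_bp - EMPTY_SITE_BP) <= tolerance_bp:
        return "empty"
    low, high = FILLED_SITE_BP
    if low - tolerance_bp <= size_bp <= high + tolerance_bp:
        return "filled"
    return "unexpected"


def call_genotype(product_sizes_bp: list[int]) -> str:
    """Summarize all products from one reaction into a single call."""
    calls = {call_product(size) for size in product_sizes_bp}
    if not calls or "unexpected" in calls:
        return "uninterpretable"
    if calls == {"filled"}:
        return "L1 present"
    if calls == {"empty"}:
        return "L1 absent"
    return "heterozygous"


print(call_genotype([7500]), call_genotype([450]), call_genotype([450, 7500]))
```

In practice a size-based call is only provisional, which is why the products here were also confirmed by sequencing.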
Estimation of GF Copy Number
We screened the mouse genomic phage library with the monomer array from element GF46. The array was PCR-amplified from the bacmid clone DNA (CITB/Research Genetics clone #437P9) using the 5' primer, TAAGGAATTCCATCTATTTCGAGGGGGTAAAG (3AL049ECOR), and 3' primer, TAAGCTCGAGTCCCAGAAGCTGTGTTGCTTTG (2PROBEGXHO). The GenBank entry (GI# 7630118) shows GF46 to have 5.1 monomers. However, our PCR product was 600 bp longer than expected, suggesting that GF46 in reality contains about eight monomers and that the GenBank sequence was misassembled. We confirmed that our probe contained no non-L1 DNA by end-sequencing and by analysis with restriction enzymes cutting only once within each monomer. The PCR product was cloned into XhoI/EcoRI sites of pBS KS and reexcised for use as a hybridization probe to screen nylon filters lifted from five plates containing about 5 × 10^4 plaques each. Hybridization and washing were performed at high stringency under standard conditions (Sambrook and Russell 2001). We confirmed by slot blot analysis that the probe would not cross-hybridize to the TF element L1spa, to an A-monomer probe from L1Md-A2 (accession #M13002; Loeb et al. 1986), or to an F-monomer probe from Padgett et al. (1988) but would hybridize to clones GF21 and GF62. Knowing the average insert size of the library (16 kb) and the mouse diploid genome size (6 × 10^9 bp), we estimated the GF copy number to be ~1500 elements.
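Two calculations are implicit in this paragraph: the revised monomer count for GF46 and the conversion of positive plaques into a genomic copy number. The sketch below reconstructs both under explicit assumptions. The function names are hypothetical; the plaque count is read as 5 × 10^4 per plate and the genome size as 6 × 10^9 bp; the 206-bp monomer length is the consensus length reported in the Results; and the number of positive plaques is not given in the text, so the value of 1000 used here is a hypothetical input chosen only because it reproduces the reported estimate.

```python
# Illustrative sketch; function names are hypothetical and the
# positive-plaque count is NOT from the paper (see lead-in).
def monomers_from_excess(annotated: float, excess_bp: float,
                         monomer_bp: float = 206) -> float:
    """Add the monomers implied by extra PCR product length."""
    return annotated + excess_bp / monomer_bp


def copies_per_genome(positive_plaques: int, plaques_screened: int,
                      insert_bp: float, genome_bp: float) -> float:
    """Convert library hits to copies per genome.

    Assumes each positive plaque carries exactly one element, which
    ignores elements split between inserts or clustered in one insert.
    """
    genome_equivalents = plaques_screened * insert_bp / genome_bp
    return positive_plaques / genome_equivalents


gf46_monomers = monomers_from_excess(annotated=5.1, excess_bp=600)

plaques = 5 * 50_000  # five plates of about 5 x 10^4 plaques each
copies = copies_per_genome(positive_plaques=1000,  # hypothetical input
                           plaques_screened=plaques,
                           insert_bp=16_000,
                           genome_bp=6e9)

print(round(gf46_monomers, 1), round(copies))
```

Under these assumptions the screen covers about two-thirds of a diploid genome equivalent (2.5 × 10^5 plaques × 16 kb = 4 × 10^9 bp), so roughly 1000 positive plaques would correspond to ~1500 copies.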
We also examined GenBank's high-throughput genomic (htgs) mouse database to confirm our estimate of the number of full-length GF or TF elements, by using the first 63 amino acid residues of consensus ORF1 as query sequence in a TBLASTN search. This region of ORF1 was used because it contains both GF and TF subfamily-specific residues and because L1s that contain the 5' end of the ORF1 coding region are likely to represent full-length elements. At the time the searches were performed, the htgs database represented 9.6% of the total mouse genome. After counting the number of sequence hits, we extrapolated to determine the total number of GF and TF elements in the diploid genome (1500 and 3000 elements, respectively).
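The database extrapolation described above amounts to dividing the hit count by the fraction of the genome represented. A minimal sketch follows, with the caveat that the hit counts themselves are not reported in the text: the function name is hypothetical, and the values 144 and 288 are hypothetical inputs chosen only to be consistent with the stated totals at 9.6% coverage.

```python
# Illustrative sketch; the function name and the hit counts are
# hypothetical (the paper reports only the extrapolated totals).
def extrapolate_hits(hits: int, fraction_covered: float) -> float:
    """Scale database hits up to the whole genome."""
    if not 0 < fraction_covered <= 1:
        raise ValueError("fraction_covered must be in (0, 1]")
    return hits / fraction_covered


HTGS_FRACTION = 0.096  # htgs coverage of the mouse genome, per the text

gf_total = extrapolate_hits(144, HTGS_FRACTION)  # hypothetical hit count
tf_total = extrapolate_hits(288, HTGS_FRACTION)  # hypothetical hit count

print(round(gf_total), round(tf_total))
```

This assumes that the sequenced fraction is a representative sample of the genome and that each hit marks one distinct full-length element; neither assumption is tested here.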
DNA Sequence and Phylogenetic Analyses
Sequences were aligned with MacVector 6.53 (Oxford Molecular Group) or ClustalW 1.8, and consensus sequences were determined with MacVector. Phylogenetic analyses were performed with the ClustalW program using the neighbor-joining algorithm of Saitou and Nei (1987) with exclusion of gaps. Significance was determined by 1000 bootstrap replicates. Unrooted maximum parsimony analyses were also performed as confirmation (using the DNAPARS program in PHYLIP ver. 3.5c; Felsenstein 1993). Trees produced by the phylogenetic analyses were viewed and manipulated with TreeView.
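Percent identity values of the kind reported throughout this paper can be computed from a pairwise alignment once gap columns are excluded. The sketch below is a generic illustration, not a reproduction of how MacVector or ClustalW compute identity; the function name is hypothetical, and the example sequences are the two 8-nt YY1-region motifs from the Results plus a made-up gapped pair.

```python
# Illustrative sketch; the function name is hypothetical and this is
# not the identity calculation used by MacVector or ClustalW.
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # exclude any column containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


print(percent_identity("GCCATCTT", "GCCTTCTT"))  # one mismatch in 8 columns
print(percent_identity("ACG-TA", "ACGTTA"))      # gap column is ignored
```

Excluding gap columns means that insertions and deletions do not lower the score, so identity computed this way reflects substitutions only.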
ACKNOWLEDGMENTS
We thank R.J. DeBerardinis for critical reading of the manuscript, E. Luning-Prak for mouse genomic DNA, and K. Kaestner for supplying the mouse genomic library. The work was supported by a grant from the NIH to H.H.K and a Howard Hughes Medical Institute Predoctoral Fellowship to E.M.O.
FOOTNOTES
1 Corresponding authors.
E-MAIL jgoodier{at}mail.med.upenn.edu, kazazian{at}mail.med.upenn.edu; FAX (215) 573-7760.
Article published on-line before print: Genome Res., 10.1101/gr.198301.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.198301.
Received May 24, 2001; accepted in revised form July 25, 2001.