Vol. 11, Issue 12, 2066-2074, December 2001
LETTER
ABSTRACT
We report the results of sequence analysis and chromosomal distribution of all distinguishable long terminal repeat (LTR) retrotransposons (Cer elements) in the Caenorhabditis elegans genome. Included in this analysis are all readily recognizable full-length and fragmented elements, as well as solo LTRs. Our results indicate that there are 19 families of Cer elements, some of which display significant subfamily structure. Cer elements can be clustered based on their tRNA primer binding sites (PBSs). These clusters are in concordance with our reverse transcriptase- and LTR-based phylogenies. Although we find that most Cer elements are located in the gene depauperate chromosome ends, some elements are located in or near putative genes and may contribute to gene structure and function. The results of RT-PCR analyses are consistent with this prediction.
INTRODUCTION
Retrotransposons are an abundant and widely distributed class of mobile repetitive elements that transpose through an RNA intermediate (Berg and Howe 1989). Retrotransposons make up a significant portion of the eukaryotic genomes examined to date. For example, they account for more than half of the maize genome (>50%, SanMiguel et al. 1996), most of the wheat genome (>90%, Flavell 1986), and ~40% of the human genome (Yoder et al. 1997). Long recognized as a major source of mutation (Green 1988) and disease (Miki 1998), retrotransposons have also been implicated in the evolution of genome structure and function (e.g., McDonald 1995a,b; Britten 1997; Brosius 1999).
Genome sequencing of a variety of organisms is providing an unprecedented opportunity to study the evolutionary history of retrotransposons and their contribution to genome structure and function. For example, recent surveys of retrotransposons within the Caenorhabditis elegans genome have revealed the presence of no fewer than 19 families of long terminal repeat (LTR) retrotransposons (Bowen and McDonald 1999; Malik et al. 2000; Frame et al. 2001), including two families (Cer7 and Cer13) that display features characteristic of infectious retroviruses (Bowen and McDonald 1999).
We extend these findings by analyzing the sequence and chromosomal distribution of all distinguishable Cer LTR retrotransposon sequences present in the C. elegans genome. In our analysis we group all distinguishable Cer elements into three distinct types: (1) full-length elements containing all of the characteristic features of LTR retrotransposons, including putative gag, pol and, in some cases, env genes flanked by LTRs; (2) partially deleted or fragmented elements, which are missing one or more of the characteristic features of full-length elements; and (3) solo LTRs, which are believed to be the products of recombination events between the flanking LTRs of full-length elements (Berg and Howe 1989). Our results indicate that there are 19 Cer families represented within the sequenced (N2) C. elegans genome. All 19 families belong to either the gypsy/Ty3 or the Bel class of retrotransposons. No copia/Ty1-type elements are present in the C. elegans genome.
Although some full-length Cer elements were found to be members of extended families with well-defined evolutionary histories, others appear to be single-element families with no detectable lineage within the N2 genome. In contrast, several families of Cer elements consist exclusively of fragmented elements or solo LTRs. Cer elements can be grouped according to their tRNA primer binding sites into multiple clusters that are consistent with our reverse transcriptase (RT)- and LTR sequence-based phylogenies. We have also analyzed the inter- and intrachromosomal distribution of Cer elements in the N2 genome. Although most Cer elements are located in the gene-depauperate chromosomal ends, some elements are located in or near putative genes and may have contributed to gene structure and function. The results of RT-PCR analyses are consistent with this prediction: products consistent with processing of these transcripts and removal of predicted introns were observed. Cer LTR sequence could account for at least 12% to as much as 54% of the coding region within mRNAs transcribed from these loci.
RESULTS
The C. elegans Genome Consists of at Least 19 Families of LTR Retrotransposons
Closely related groups of full-length Cer LTR retrotransposons display >90% amino acid similarity among their respective reverse transcriptases (RTs) and have been designated as families (Bowen and McDonald 1999). Using this criterion, full-length LTR retrotransposons representing 12 distinct families, Cer1-Cer12, have been described previously in C. elegans (Bowen and McDonald 1999). By searching for homology to envelope (ENV) proteins, Malik et al. (2000) discovered two additional families (Cer13 and Cer14). More recently, Frame et al. (2001) identified six additional putative families. In the present study, we include fragmented elements and solo LTRs in our analysis to add substructure to the Cer phylogenetic tree. Using this approach, we independently identified a total of 19 families (Cer1-Cer19) of Cer elements within the essentially complete (>99%) N2 C. elegans genome (C. elegans Sequencing Consortium 1998).
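The family criterion described above (>90% RT similarity) can be illustrated with a minimal sketch. It is not the procedure used in this study: it assumes pre-aligned RT protein sequences, substitutes simple percent identity for amino acid similarity, and joins elements by single linkage. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def group_into_families(aligned, threshold=90.0):
    """Single-linkage grouping: two elements join a family when identity > threshold."""
    parent = {name: name for name in aligned}

    def find(x):
        # Follow parent links to the group representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(aligned, 2):
        if percent_identity(aligned[a], aligned[b]) > threshold:
            parent[find(a)] = find(b)

    families = {}
    for name in aligned:
        families.setdefault(find(name), set()).add(name)
    return sorted(families.values(), key=lambda members: sorted(members))


# Toy, made-up aligned fragments (not real Cer RT sequences).
toy = {
    "elemA": "MKVLAAGIVGLLLAQRTSDE",
    "elemB": "MKVLAAGIVGLLLAQRTSDQ",
    "elemC": "MRTLPPGIVSLHLAQKTNDE",
}
families = group_into_families(toy)
```

In this toy case elemA and elemB differ at one of 20 positions and fall into one family, whereas elemC forms a single-element family.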
The number of Cer elements within families varies considerably (Table 1). In general, full-length Cer elements are in relatively low abundance within the C. elegans genome. Only two of the 19 families (Cer9 and Cer20) contain three full-length elements, whereas three families (Cer8, 15, and 16) contain two, and 12 families (Cer1-Cer7, and Cer10, 12, 13, 17, and 19) contain only one. Two families (Cer11 and Cer14) contain no full-length elements, and another two families consist of only a single full-length element (Cer4 and Cer17). Fragmented elements are also in relatively low abundance (<4 per family) in 15 of the 19 families (Cer2, 3, 5-7, 9-16, 19, and 20). Solo LTRs were detected in 13 of the 19 families (they are lacking in Cer4, 7, 11, 13, 14, and 17), ranging in number from 1 to 12 per family. Five of the families displayed subfamily structure. Whereas members of Cer element families share >90% RT sequence identity, within-family sequence identity values among the more rapidly evolving LTRs are more variable, ranging from 60% to 100% (Fig. 1; Tables 1 and 2).
Six of the eight solo LTRs or LTR-containing fragments within the Cer9 family were found to contain an ~100-bp sequence inserted into the center of their 3' LTRs (c56g3/c07d8, y57a10a, and k09e3 contain a 108-bp insert; f15a2, y59a8b, and c13b9 contain a 106-bp insert). Interestingly, none of the three Cer9 full-length elements contains either insert within its LTRs. The ~100-bp inserts in these LTRs share 85% identity among themselves but display no significant homology to other sequences within the C. elegans (N2) genome. Aside from this size polymorphism, all of the Cer9 family members share a remarkable 95% LTR nucleotide sequence identity with one another.
Whereas the slowly evolving RT-encoding region of LTR retrotransposons is ideal for quantitating evolutionary distances among even distantly related families of retroelements (Flavell 1986; Xiong and Eickbush 1990), analysis of differences among the more rapidly evolving LTRs is better suited for the identification of phylogenetic substructure within families of LTR retrotransposons. Phylogenetic trees based on Cer element LTR sequences reveal the presence of significant substructure within several Cer element families. Both neighbor-joining and parsimony criteria support the existence of distinct subgroups in the Cer2, 3, 12, 15, and 16 families of elements. For example, the Cer12 family comprises 15 elements (primarily solo LTRs) falling into two distinct subfamilies, whereas the 14 elements of the Cer16 family fall into three distinct subfamilies (Fig. 2B).
Cer Element Families Share tRNA Primers
RT requires a primer strand to initiate minus-strand DNA synthesis. Host-encoded tRNA is the primer used by most retroviruses and LTR retrotransposons analyzed to date (Telesnitsky and Goff 1997). In the process of priming, the native tRNA molecule is partially unfolded such that 18 nucleotides at its 3' terminus are free to base pair with a complementary sequence, termed the primer binding site (PBS), on the retroviral or LTR retrotransposon RNA. Different tRNA primers are known to be used by different families of retroviruses and LTR retrotransposons, and primer usage has been used as an indicator of evolutionary relationships (Vogt 1997).
Primer binding sequences are located just 3' of the proviral 5' LTR. Utilizing the C. elegans tRNA gene database (http://rna.wustl.edu/GtRDB/Ce/), we identified putative Cer primer binding sites by FASTA searches of 100 nucleotides downstream of the 5' LTR of full-length Cer elements. Consistent with the recent observations of Frame et al. (2001), we found that full-length elements representing Cer families in the Cer7/BEL clade share a binding site for either Gly-GCC-type tRNA (Cer7-10, 12, 15, 16, and 19) or Arg-type tRNAs (Cer13, 17, and 20). We also confirm the observation of Frame et al. (2001) that Cer7 encodes its own 71-bp Gly-GCC-type tRNA (CE-CHRV-1298_TRNA5-GLYGCC; Fig. 1).
Extending our alignment of putative PBSs and FASTA searches of the tRNA database to the Ty3/gypsy clade revealed that Cer2 and Cer3 share a PBS for Gly-ACC tRNA. In contrast, we found that Cer4, Cer5, and Cer6 share a PBS for Ser-GCT tRNA. Cer1 displays weak homology to the PBS for Thr-GGT-type tRNA.
Most Cer Elements are Located at Chromosome Ends
The chromosomal position of each Cer element was used to analyze the distribution of Cer elements throughout the genome. To test for interchromosomal clustering of Cer elements, we employed the Kolmogorov-Smirnov goodness-of-fit test (Zar 1999) to look for a deviation from a random distribution of elements among chromosomes. The results indicate no significant deviation from the null hypothesis (P = 0.91). The distributions of individual families of Cer elements (Cer3, 5, 9, and 12) and family groups (Cer8 and Cer9, P = 0.046; Cer12 and Cer16, P = 0.51; Cer2 and Cer3, P = 0.13) were tested separately and were also found to be random among chromosomes.
Tests were carried out to determine whether the distribution of Cer elements on individual chromosomes was also random. Our analysis rejected the random distribution hypothesis for all chromosomes except chromosome III (Fig. 3). Chromosomes I, II, IV, V, and X were found to display nonrandom clustering of Cer elements on their chromosomal ends. This is consistent with previous reports that DNA transposable elements in C. elegans are clustered at chromosome ends (C. elegans Sequencing Consortium 1998; Surzycki and Belknap 2000) and with the observation that the middle third of each C. elegans chromosome is "gene rich." The ends of C. elegans chromosomes display a lower gene density and are associated with relatively high rates of recombination (Barnes et al. 1995; Wilson 1999).
Cer Elements May Contribute to C. elegans Gene Function
The results of our genomic positioning of Cer elements indicate that a number of these elements lie within or proximal to genes. Previous studies of LTR retrotransposons in a variety of plant and animal species have revealed that these elements may be coopted for a variety of host gene functions, including promoter, splicing, and terminator activities (e.g., Britten 1997; Medstrand et al. 2001). In an initial effort to determine whether Cer elements contribute to gene function, we screened C. elegans EST databases (dbEST-C. elegans) for homology to Cer elements. ESTs with significant homology to Cer LTRs were identified. The complete sequences of these ESTs were BLASTed against the C. elegans genome database to identify the clones containing the Cer LTRs and associated putative genes (F20B4, C56G3, 6R55, and F53E10).
The specific region of Cer element identity within the four clones (F20B4.6, C56G3.2, 6R55.2, and F53E10.5) was overlaid on the existing annotation of each region. Our results indicate that these Cer elements are part of putative genes (Fig. 4). Although all four gene regions are putative in nature, they retain strong predictive computational support. In addition, multiple ESTs were found to map to the exon regions of these putative genes, adding further support. The results of TBLASTN searches indicate that two of the sites (F20B4.6 and C56G3.2) display significant homology (outside the Cer element sequence) to previously characterized genes. F20B4.6 exhibits homology with genes encoding ceramide glucosyl transferases; C56G3.2 displays homology with genes encoding aldo/keto reductases. The putative genes contained in regions 6R55.2 and F53E10.5 show no homology with genes thus far characterized (Fig. 4).
Transcribed and Processed mRNAs Contain Cer LTR Sequence
A series of reverse transcriptase polymerase chain reaction (RT-PCR) experiments was performed to test the hypothesis that Cer elements contribute to the structure and function of some C. elegans genes. Sets of primers were designed to amplify predicted gene transcripts containing Cer element sequences. Because nascent RNA transcripts are typically in low abundance in standard RNA preparations, they are often underrepresented or undetectable among RT-PCR products. For this reason, PCR of genomic DNA was also carried out for each set of primers as a positive control.
Primers designed for the 6R55.2 gene yielded RT-PCR products consistent with the expected sizes of the nascent (1514 bp) and processed (429 bp) transcripts (Fig. 5). A 6R55.2 transcript fully processed according to its predicted gene structure (Fig. 4A) would contain 16% LTR sequence from a Cer16-2 element in its coding region. If all exons represented by EST alignments (Fig. 4A) were present in the final processed transcript, 54% of its coding region would be LTR sequence.
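The percentages quoted for LTR-derived coding sequence follow from simple interval arithmetic: the summed overlap between exons and the LTR, divided by the summed exon length. The sketch below illustrates that calculation only. The coordinates are invented for the example and are not those of 6R55.2 or any other locus in Figure 4, and the function names are hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def ltr_fraction(exons, ltr):
    """Fraction of total exon length that lies within the LTR interval."""
    total = sum(end - start for start, end in exons)
    if total == 0:
        raise ValueError("exons must have non-zero total length")
    ltr_bases = sum(overlap(start, end, ltr[0], ltr[1]) for start, end in exons)
    return ltr_bases / total


# Hypothetical gene model: three exons and an LTR spanning parts of two of them.
example_exons = [(0, 100), (300, 500), (800, 900)]
example_ltr = (450, 850)
fraction = ltr_fraction(example_exons, example_ltr)  # 100 of 400 exon bases
```

Recomputing the fraction with a different exon list (for example, adding exons supported only by EST alignments) shows how alternative gene models can give different LTR percentages for the same locus.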
Primers designed for the C56G3.2 gene yielded RT-PCR products consistent with the expected sizes of the nascent (634 bp) and processed (569 bp) transcripts (Fig. 5). The smaller RT-PCR product is consistent with excision of the intron predicted within the Cer9 LTR. It is intriguing to note that the position of the predicted intron within the Cer9 LTR overlaps with an ~100-bp sequence missing in some of the solo LTRs identified in this study. A C56G3.2 transcript fully processed according to its predicted gene structure (Fig. 4B) would have first and second exons composed of 100% and 40% Cer9 LTR sequence, respectively. Thus, within its coding region the mRNA would be 36% LTR sequence. Alternate intron/exon structures (Fig. 4B) could generate transcripts ranging from 20% to 48% Cer9 LTR as mRNA coding sequence.
Primers designed for the F20B4.6 gene yielded a preferentially amplified RT-PCR product of ~213 bp. This product is consistent with excision of the Cer16-1 LTR from intron 1 (Fig. 4C, Fig. 5), although potential enhancer activity of the LTR cannot be excluded by this analysis. Two bands at ~380 and ~430 bp may represent unpredicted processing products or nonspecific priming, although they were also apparent in reactions performed at temperatures 10°C higher than the predicted optimum for the pair (data not shown).
Primers designed for the F53E10.5 gene yielded two RT-PCR products consistent with predicted processing of the nascent transcript (Fig. 5). A weakly amplified product at ~520 bp is consistent with mRNA processing and removal of intron 9 (Fig. 4D). The preferentially amplified product at ~449 bp is consistent with removal of introns 8 and 9. Exon 10, derived entirely from Cer2 LTR DNA, would contribute 12% of the coding sequence if the mRNA were fully processed as predicted.
In summary, RT-PCR analyses demonstrated that the inserted Cer elements were part of each gene transcript, thus providing molecular confirmation of our computational results (Fig. 5). Polyadenylated transcripts composed of retroelement sequence were produced from the three genes in which elements were part of the coding region. Furthermore, products consistent with processing of these transcripts and removal of predicted introns were observed.
DISCUSSION
The C. elegans Genome Contains Relatively Few Families of LTR Retrotransposons with Unusual Subfamily Structure
Nucleotide sequence divergence among LTR retrotransposons can be used to establish phylogenetic relationships and other information relevant to retrotransposon evolution. Our approach has been to utilize RT sequence to establish families (defined as groups of LTR retrotransposons sharing at least 90% RT sequence homology) and subsequently to utilize the divergence among the more rapidly evolving LTRs to establish subfamily structure. An alternative approach, recently employed by Frame et al. (2001) to characterize the BEL-like class of C. elegans LTR retrotransposons, is to base phylogenetic relationships primarily on LTR sequences. A priori, both approaches might be expected to give similar results. However, because the C. elegans genome contains relatively few full-length elements and relatively more fragmented elements and solo LTRs lacking RT sequences, the former approach will tend to identify fewer families of elements with more substructure than the latter approach. For example, the Cer16 and Cer18 families described by Frame et al. (2001) are collapsed in our analysis to a single family (Cer16) with detailed subfamily structure. As more data become available on the diversity of LTR retrotransposons present in other strains of C. elegans, the results should converge on a single picture of the evolutionary history of Cer elements.
Although our view of the phylogenetic structure of Cer elements differs somewhat from that described by Frame et al. (2001), we find that many of the general features of the Cer7/BEL class of C. elegans LTR retrotransposons described by those authors hold true for the Ty3/gypsy class as well. In general, the C. elegans genome appears to have a relatively low tolerance for LTR retrotransposons (<1%). Whereas we have identified 124 full-length, fragmented, or solo LTR Cer elements in the sequenced (N2) C. elegans genome, >350 LTR retrotransposon elements have been described in the yeast Candida albicans (Goodwin and Poulter 2000) and >300 in Saccharomyces cerevisiae (Kim et al. 1998), both species with genomes nearly an order of magnitude smaller than that of C. elegans (C. elegans Sequencing Consortium 1998).
Single-element groups add to the puzzle. Families represented by only one element (Cer4, Cer11, and Cer17) have no detectable history in the C. elegans (N2) genome, suggesting that they may have been introduced by horizontal transfer. The fact that the Cer7 and Cer14 elements encode a putative env gene is consistent with the hypothesis that at least some Cer elements may have entered the N2 genome via horizontal transfer. However, additional information on the diversity of elements in other C. elegans strains and related Caenorhabditis species will be necessary to definitively test the horizontal transfer hypothesis.
A number of solo LTRs and LTR-containing fragments are nearly identical in sequence despite the fact that related full-length putative progenitor elements are not present in the genome. For example, the Cer3-1 subfamily consists of 10 solo LTRs and one LTR-containing fragment with >94% identity. Similarly, the Cer16-1 subfamily consists of six solo LTRs with >94% identity. Despite the sequence similarity among these and other subfamily LTRs, the sequences of Cer16-1 LTRs are distinctly different from those of their most closely related full-length elements. One possible explanation of this apparent paradox is that some mechanism exists in C. elegans to rapidly remove full-length transposable elements, as has been postulated in Drosophila (Petrov et al. 1996). Under this scenario, solo LTRs and LTR-containing fragments are remnants of degraded full-length elements. Alternatively, the high sequence similarity existing among families of solo LTRs and LTR-containing fragments may be the product of gene conversion.
A third possible explanation is that at least some of the families of solo LTRs and LTR-containing fragments represent footprints of double-strand break (DSB) repair events (Garfinkel 1997; Haber 2000). Teng et al. (1996) and Yu and Gabriel (1999) reported that a variety of Ty1 LTR transcription intermediates have been used to repair double-strand breaks in Saccharomyces cerevisiae. If such a mechanism exists in C. elegans, it is possible that at least some subfamilies of LTRs displaying high sequence similarity were copied from the same master element during the process of DSB repair.
The Presence of tRNA Genes in Cer Elements May Be of Adaptive Significance
Putative tRNA PBSs have been identified for most full-length Cer elements. Matching tRNAs consist predominantly of glycine (TCC and ACC) types. The distribution of these different types of tRNA binding sites was found to be consistent with our RT-based phylogeny (Fig. 1).
It is interesting to speculate about the significance of the surprising finding that a complete tRNA-Gly gene is located within the untranslated leader region of Cer7. The observation that LTR retrotransposons are common in heterochromatic regions of genomes (Dimitri and Junakovic 1999) has led to the speculation that the evolutionary origin of heterochromatin was as a defense mechanism against transposable elements (e.g., McDonald 1999; Henikoff 2000). tRNA genes are known to exclude nucleosomes and limit the spread of heterochromatin (Morse 2000). Thus, the inclusion of a tRNA gene in an LTR retrotransposon may provide a selective advantage to an element located in heterochromatic regions by preventing nucleosome positioning. The consequent exclusion of surrounding chromatin may permit access of transcription factors to promoter sequences within the LTR and adjacent leader regions that would otherwise be inaccessible. Although the C. elegans genome does not contain constitutive heterochromatin, transient heterochromatin-like structures occur during development (e.g., Jedrusik and Schulze 2001). As analyses of LTR retrotransposons are extended to additional plant and animal species, it will be interesting to see if the presence of complete tRNA genes in untranslated leader regions is a general feature of some families of LTR retrotransposons.
Cer Elements May Contribute to C. elegans Gene Structure and Function
There is a growing body of evidence that transposable elements play an important role in genome evolution by contributing to the structure and/or function of genes (e.g., McDonald 1995a,b; Britten 1997; Medstrand et al. 2001). For example, there are >100 reported examples of essential gene structures and functions in mammals that are attributable to retrotransposons or retrotransposon-derived sequences (Brosius 1999; see also http://www.ncbi.nlm.nih.gov/Makalowski/ScrapYard/). LTRs are known to possess promoter, polyadenylation, and enhancer functions (e.g., Medstrand et al. 2001; Britten 1997). For this reason, LTR retrotransposon insertions in or near genes have been postulated to be a significant factor in regulatory evolution in both plants and animals (e.g., McDonald 1993, 1995a). The insertion of transposable elements in or near introns can result in alternative splicing patterns. Such events are also believed to have contributed to gene evolution (e.g., Kapitonov and Jurka 1999). The insertion of transposable elements into the coding region of genes is typically associated with loss of gene function (Green 1988). However, occasionally such events are associated with alterations in gene sequence that may contribute to the evolution of new gene functions (e.g., Banki et al. 1994).
In an initial effort to address the possible contribution of Cer elements to C. elegans gene evolution, we screened C. elegans EST databases for the presence of Cer element LTRs. We identified four genes in which Cer elements may be involved in gene function. In three cases, LTR sequences appear to be incorporated into coding regions (Fig. 4A, B, D). In addition, we found that Cer LTRs map to putative gene splice acceptor/donor sequences and termination regions of genes (Fig. 4A, B, C). These results are intriguing and suggest that Cer LTRs may influence gene regulation and expression in the C. elegans N2 strain.
RT-PCR analyses confirmed that mRNAs containing Cer LTR sequence are actively transcribed from these loci. In three of the four loci, Cer element sequences mapped to coding regions of the genes. For each of these cases, polyadenylated transcripts containing the expected Cer LTR were shown to be produced (Fig. 5). Furthermore, products consistent with processing of these transcripts and removal of predicted introns were also observed. Cer LTR sequences could account for at least 12% to as much as 54% of the coding region within mRNAs transcribed from these loci.
Detailed molecular analyses are currently underway in our laboratory to precisely define the contribution of Cer elements to the function of these genes in the N2 strain and to examine the functional significance of Cer element insertional polymorphisms at these and other loci among C. elegans strains.
METHODS
Sequence Identification and Retrieval
Sequence retrieval was initiated by performing BLASTN searches (default parameters; Altschul et al. 1997) against the Wormbase (http://www.wormbase.org) and GenBank (http://www.ncbi.nlm.nih.gov) databases using LTRs representing each previously identified family of Cer elements (Bowen and McDonald 1999; Malik et al. 2000). To ensure that all families of Cer LTRs were identified, we employed an iterative approach whereby LTR sequences with relatively low homology (~70%) were used as query sequences in subsequent BLAST searches to identify putative distantly related subfamilies of LTRs. To be considered an LTR in this study, a sequence had to display >60% sequence homology to the LTR query sequence in a pairwise comparison test (Tatusova and Madden 1999) and have a size no smaller than 40% of the LTR query sequence. Each Cer LTR identified by these criteria was given the name of the Cer family to which it was most homologous, followed by the number of the clone in which it was found. For full-length elements with two LTRs, the 3' LTR was labeled by a lowercase "b" following the clone number.
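The two acceptance criteria stated above (>60% homology to the query LTR and a size of at least 40% of the query) amount to a simple filter on each pairwise comparison. The following is a minimal sketch of such a filter. The function name and parameter names are hypothetical, and percent identity is assumed here as the measure of homology.

```python
def is_candidate_ltr(percent_identity, hit_length, query_length,
                     min_identity=60.0, min_length_fraction=0.40):
    """Apply the two stated criteria to one pairwise comparison.

    percent_identity: identity (%) between the hit and the query LTR (assumed measure)
    hit_length, query_length: lengths in bp
    """
    long_enough = hit_length >= min_length_fraction * query_length
    similar_enough = percent_identity > min_identity
    return similar_enough and long_enough


# Hypothetical comparisons against a 400-bp query LTR.
accepted = is_candidate_ltr(72.5, 180, 400)        # passes both criteria
too_divergent = is_candidate_ltr(60.0, 300, 400)   # identity not above 60%
too_short = is_candidate_ltr(85.0, 150, 400)       # shorter than 160 bp
```

Applied iteratively, with accepted low-homology hits reused as queries, a filter of this kind corresponds to the search strategy described in this section.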
Alignments and Phylogenetic Analysis
Using the clone coordinates from the BLAST search, the Cer LTR sequences were copied and placed into individual files. Alignments were created with ClustalW and edited with MacVector 7.0 (http://www.gcg.com). Both ClustalX 1.8 (Thompson et al. 1997) and PAUP 4.03b (Swofford 1999) were used to generate neighbor-joining (NJ) trees with bootstrap values. Trees were viewed with TreeView 1.5.3 (Page 1996).
tRNA Identification
The C. elegans tRNA database was downloaded (http://rna.wustl.edu/tRNAdb/; Lowe and Eddy 1997) for use as a local FASTA database in conjunction with the GCG software package (http://www.gcg.com) maintained by the Research Computing Resource (RCR) at the University of Georgia. One hundred and one nucleotides downstream of each 5' LTR (including the last nucleotide of the LTR) were used as query sequences in FASTA searches (default parameters) run against the tRNA database to identify matching tRNA 3' ends complementary to putative Cer PBSs (Goodwin and Poulter 2000).
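The logic of the PBS search can be sketched as follows: the expected PBS is the reverse complement of the 3'-terminal 18 nucleotides of the tRNA, and the region downstream of the 5' LTR is scanned for the best match to it. This sketch uses ungapped matching rather than FASTA, so it is an illustration of the idea and not a reimplementation of the search used here. It also assumes, as an option, that the CCA tail of the mature tRNA is not encoded in the tRNA gene and must be appended. All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def expected_pbs(trna_gene, length=18, add_cca=True):
    """PBS expected on the element's sense strand for a given tRNA gene.

    add_cca=True appends the 3' CCA tail before taking the terminal
    `length` nucleotides (an assumption about the input database).
    """
    mature = trna_gene.upper() + ("CCA" if add_cca else "")
    return reverse_complement(mature[-length:])


def best_pbs_match(downstream, trna_gene, length=18, add_cca=True):
    """Slide the expected PBS along the window; return (offset, matches) of the best hit."""
    pbs = expected_pbs(trna_gene, length, add_cca)
    window = downstream.upper()
    best = (-1, -1)
    for i in range(len(window) - length + 1):
        matches = sum(a == b for a, b in zip(window[i:i + length], pbs))
        if matches > best[1]:  # keep the first offset reaching the top score
            best = (i, matches)
    return best
```

A call such as `best_pbs_match(window, trna)` on the 101-nucleotide query window would be repeated for every tRNA in the database, and the tRNA giving the highest number of matches taken as the candidate primer.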
Chromosomal Position Analyses
The chromosomal position of the 5' end of each clone found to contain one or more Cer elements was obtained from Wormbase (www.wormbase.org). Endpoints of elements within clones were averaged to obtain a "position value" for each element within a clone. Combining position values of elements within a clone with the position of clones on chromosomes allowed us to assign a chromosomal location to each Cer element. The Kolmogorov-Smirnov goodness-of-fit test was used to test the randomness of the distribution of Cer elements among chromosomes and within individual chromosomes. An exponential distribution was used to represent a random dispersal of elements within each chromosome. The observed distribution was calculated based on the base pair distance between sequential element positions along the chromosome.
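The within-chromosome test described above can be sketched in a few lines: distances between sequential element positions are compared with an exponential distribution using the Kolmogorov-Smirnov statistic. The sketch below is illustrative only. It assumes the exponential rate is estimated from the mean spacing (which makes the standard P value conservative) and uses the asymptotic Kolmogorov series with a common small-sample correction. The function names are hypothetical, and the original analysis may have differed in these details.

```python
import math


def position_value(start, end):
    """Midpoint of an element's endpoints, used as its position."""
    return (start + end) / 2


def kolmogorov_p(d, n):
    """Approximate P value for KS statistic d with n observations."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:  # series converges poorly here; P is essentially 1
        return 1.0
    total = 0.0
    for k in range(1, 101):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


def ks_exponential(positions):
    """KS statistic and approximate P for spacings versus an exponential distribution."""
    ordered = sorted(positions)
    gaps = sorted(b - a for a, b in zip(ordered, ordered[1:]))
    n = len(gaps)
    if n == 0:
        raise ValueError("need at least two positions")
    mean_gap = sum(gaps) / n
    if mean_gap == 0:
        raise ValueError("all positions are identical")
    d = 0.0
    for i, gap in enumerate(gaps, start=1):
        cdf = 1.0 - math.exp(-gap / mean_gap)  # fitted exponential CDF
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d, kolmogorov_p(d, n)


# Perfectly regular spacing is as non-exponential as clustering is.
regular = list(range(0, 210, 10))
d_regular, p_regular = ks_exponential(regular)
```

For elements clustered at chromosome ends, the spacings contain many short gaps and a few very long ones, which likewise departs from the exponential expectation and yields a small P value.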
Gene Annotation
The C. elegans EST database (dbEST-C. elegans) was BLASTed for homology to each Cer LTR sequence. ESTs with significant homology (E < 0.0001) to Cer LTRs were identified. The complete sequence of each EST was then BLASTed against the NCBI C. elegans genome database to identify the corresponding clone containing the LTR and associated gene. TBLASTN searches (default parameters) of these LTR-associated genes were run to identify homology to previously characterized genes. GeneFinder (dot.imgen.bcm.tmc.edu) was used to delineate the exon boundaries of the putative genes.
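The E-value cutoff used to select ESTs can be illustrated with a short filter. The sketch assumes search results in the 12-column tabular layout written by current BLAST+ releases (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E value, bit score), which postdates this study. The function name and the identifiers in the example are hypothetical.

```python
import csv
import io


def significant_hits(tabular_text, max_evalue=1e-4):
    """Return (query, subject, evalue) for rows with E value below the cutoff.

    Assumes tab-separated rows with the E value in the 11th column.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        evalue = float(row[10])
        if evalue < max_evalue:
            hits.append((row[0], row[1], evalue))
    return hits


# Two made-up result rows: one strong hit and one weak hit.
example = (
    "Cer9_LTR\tEST_A\t96.5\t230\t8\t0\t1\t230\t15\t244\t1e-105\t380\n"
    "Cer9_LTR\tEST_B\t81.0\t42\t8\t0\t10\t51\t3\t44\t0.52\t30.1\n"
)
kept = significant_hits(example)
```

Only the first row passes the cutoff; the retained subject identifiers would then be used to retrieve complete EST sequences for the genome search.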
RT-PCR
Total RNA was extracted with Tri Reagent (Molecular Research Center) from C. elegans cultured under standard conditions on mixed life stage agar plates (Wood 1988). DNA contamination was removed using DNA-free (Ambion). Oligo dT20-primed reverse transcription (RT) was performed on 1 µg of total RNA using the ThermoScript RT-PCR system and protocol from Gibco BRL. RT(−) control reactions to detect DNA contamination contained an equivalent volume of sterile distilled water in lieu of reverse transcriptase.
PCR primers designed with MacVector 7.0 and synthesized by Integrated DNA Technologies were: 6R55.2 F 5'-ATGACGATGAGCGGTGC-3', R 5'-AAAGTGAGATGTGATTGGGG-3'; C56G3.2 F 5'-CAGCAACCTTCCTACACGG-3', R 5'-CGCAACTCAGATGGAGCAG-3'; F20B4.6 F 5'-AAGGGTTGGGTTTGGTTGGAC-3', R 5'-TCAAGAACAGAACGCCTCGTCG-3'; and F53E10.5 F 5'-GCGATAGCGTTCTGCTCTTGTG-3', R 5'-GGCGAATAAATGAAATCACGGAGG-3' (Fig. 4). Within a locus, PCRs on genomic DNA and cDNAs were performed using the same primer set. The 25 µL PCRs contained 2 µL RT reaction or C. elegans genomic DNA, 30 pmol of each primer, 0.5 U Taq polymerase (Pierce Chemical), 200 µM each dNTP, 1.5 mM MgCl2, 50 mM KCl, and 10 mM Tris HCl, pH 9.0. DNA(−) PCR controls to detect potential DNA contamination contained an equivalent volume of sterile distilled water in lieu of genomic DNA. Following an initial denaturation at 95°C for 5 min, 35 cycles of 95°C for 30 sec, 52°C to 56°C (primer dependent) for 30 sec, and 72°C for 1-2 min (depending on maximum expected product length) were performed, followed by a final step at 72°C for 10 min, on a Hot Top-equipped RoboCycler Gradient 96 (Stratagene). Reaction products (15 µL) and a 100-bp ladder (0.25 µg; New England Biolabs) were separated on a 1.3% agarose gel in 0.5× TBE running buffer containing 0.25 µg/mL ethidium bromide. Gel images were visualized by UV transillumination and scanned for image processing.
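As a rough check on primer-dependent annealing temperatures, basic properties of the primers listed above (with any internal spaces removed) can be computed directly from their sequences. The sketch below reports length, GC content, and a melting temperature estimated with the Wallace rule, Tm = 2(A+T) + 4(G+C). This rule of thumb is intended for short oligonucleotides and is an assumption of this example; it is not the method by which MacVector selected the primers.

```python
PRIMERS = {
    "6R55.2": ("ATGACGATGAGCGGTGC", "AAAGTGAGATGTGATTGGGG"),
    "C56G3.2": ("CAGCAACCTTCCTACACGG", "CGCAACTCAGATGGAGCAG"),
    "F20B4.6": ("AAGGGTTGGGTTTGGTTGGAC", "TCAAGAACAGAACGCCTCGTCG"),
    "F53E10.5": ("GCGATAGCGTTCTGCTCTTGTG", "GGCGAATAAATGAAATCACGGAGG"),
}


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace-rule melting temperature estimate in degrees C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def summarize(primers):
    """Length, GC content, and Wallace Tm for each forward/reverse primer."""
    summary = {}
    for locus, (forward, reverse) in primers.items():
        for label, seq in (("F", forward), ("R", reverse)):
            summary[(locus, label)] = (len(seq), round(gc_percent(seq), 1), wallace_tm(seq))
    return summary


summary = summarize(PRIMERS)
```

The resulting table gives one row per primer; large differences in estimated Tm within a pair would be one reason for the primer-dependent annealing temperature used in the cycling program.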
ACKNOWLEDGMENTS
We thank King Jordan for statistical advice and Nathan Bowen for reading and commenting on earlier versions of this manuscript. Our laboratory is supported by grants from the National Institutes of Health and the National Science Foundation.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
1 Corresponding author.
E-MAIL mcgene{at}arches.uga.edu; FAX (706) 542-3910.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.196201.
Received May 17, 2001; accepted in revised form October 10, 2001.