Published online before print
February 15, 2002, 10.1101/gr.210802
Vol. 12, Issue 3, 400-407, March 2002
LETTER
ABSTRACT
We analyzed the distribution of 54 families of transposable elements (TEs; transposons, LTR retrotransposons, and non-LTR retrotransposons) in the chromosomes of Drosophila melanogaster, using data from the sequenced genome. The density of LTR and non-LTR retrotransposons (RNA-based elements) was high in regions with low recombination rates, but there was no clear tendency to parallel the recombination rate. However, the density of transposons (DNA-based elements) was significantly negatively correlated with recombination rate. The accumulation of TEs in regions of reduced recombination rate is compatible with selection acting against TEs, as selection is expected to be weaker in regions with lower recombination. The differences in the relationship between recombination rate and TE density that exist between chromosome arms suggest that TE distribution depends on specific characteristics of the chromosomes (chromatin structure, distribution of other sequences), the TEs themselves (transposition mechanism), and the species (reproductive system, effective population size, etc.), that have differing influences on the effect of natural selection acting against the TE insertions.
INTRODUCTION
Transposable elements (TEs) have been found in organisms as different as bacteria, nematodes, yeast, plants, fishes, and mammals including humans. Evidence is accumulating that they are agents of genome restructuring and, as such, appear to be a major constituent of genomes (Kidwell and Lisch 1997; Shapiro 1999; Tomilin 1999). TEs have a transposition capacity that enables them to invade the genome, leading to insertional mutations and chromosomal rearrangements. Therefore, organisms have developed various mechanisms to limit their number. However, the relative importance of the forces that control the dynamics of TEs in natural populations is still controversial (Biémont et al. 1997; Charlesworth et al. 1997; Kidwell and Evgen'ev 1999; Nuzhdin 1999). It has been proposed that containing the number of TE copies must involve either selection against rearrangements caused by ectopic recombination between TE insertions (Langley et al. 1988; Charlesworth et al. 1994, 1997; Zhang and Peterson 1999; Gray 2000) or selection against TE-induced mutations (Biémont et al. 1997). There are data consistent with both of
these hypotheses. For example, in humans, ectopic exchange between Alu sequences seems to be more important in producing deleterious mutations (0.2%-0.3% of diseases) than insertional mutagenicity (0.1%; Roy et al. 1999), whereas ~50% of mutations in Drosophila melanogaster are attributable to TE insertions (Finnegan 1992). If the ectopic exchange in a region is proportional to the meiotic exchange in that region (Langley et al. 1988; Petes and Hill 1988; Montgomery et al. 1991; Goldman and Lichten 1996, 2000), then the number of TE insertions should be negatively correlated with the recombination rate. The same is true for selection against insertional mutations, because selection is weaker in regions of low recombination (Hill and Robertson 1966; Charlesworth et al. 1993; Kliman and Hey 1993). Consistent with this, the density of full-length (FL) L1 elements has been found to be negatively correlated with recombination rate in humans (Boissinot et al. 2001), suggesting that purifying selection against TEs is occurring. However, the distribution
of TE insertion sites over the chromosomes of D. melanogaster shows no evident relationship between the frequency of recombination and TE density (Hoogland and Biémont 1996), although it is well known that TEs accumulate in heterochromatic regions such as the chromocenter and the bases of chromosomes, which are characterized by low recombination (Charlesworth et al. 1992a,b). The TEs of Drosophila also seem to be more abundant on chromosome 4 and within some inversions, both of which have low recombination rates (Montgomery et al. 1987; Langley et al. 1988; Eanes et al. 1992; Sniegowski and Charlesworth 1994). In plants, many elements are located in clusters at the paracentric heterochromatin (Brandes et al. 1997), the copia-like elements are concentrated in the centromeric regions (Heslop-Harrison et al. 1997), and the regions flanking the centromeres are densely populated by TEs (Copenhaver et al. 1999). It is becoming increasingly evident, however, that TEs are major constituents of the centromeric regions in Drosophila (Pimpinelli et al. 1995; Zuckerkandl and Hennig 1995; Pardue et al. 1996; Eissenberg and Hilliker 2000), and that this is not merely the result of passive accumulation in such regions caused by the absence of strong forces tending to eliminate them.
A recent study in Caenorhabditis elegans produced even more puzzling results: the recombination rate was positively correlated with the amount of transposons (DNA-based elements), but not with the amount of LTR and non-LTR retrotransposons (RNA-based elements; Duret et al. 2000). The importance of the forces regulating TE distribution may thus vary according to the genome, indicating the need for comparative studies (Kidwell and Evgen'ev 1999). The difference between the findings from studies of C. elegans and Drosophila may also be attributable to the type of data used: population data in the case of Drosophila, and data from the genome of a single individual in that of C. elegans. Because we now possess the sequence of the entire D. melanogaster genome, we analyzed the distribution of its TEs in relation to the recombination rate. LTR and non-LTR retrotransposons tend to accumulate in regions of low recombination, but with no clear tendency to parallel the recombination rate along the chromosomes. There is, however, a negative correlation between the recombination rate and the density of transposons.
RESULTS
Table 1 shows the reference used for each element and the number of sequences retrieved from the D. melanogaster genome. Among the 54 TE families, 10 were transposons, 28 LTR retrotransposons, and 16 non-LTR retrotransposons. We thus collected 1007 insertions: 185 transposons (DNA-based elements), and 572 LTR retrotransposons and 250 non-LTR retrotransposons (RNA-based elements). No copies of P and HeT-A elements were identified in the genome. Figure 1 reveals that TEs accumulate mainly in pericentromeric regions and in chromosome 4, but not in the telomeric regions, which is consistent with what is usually observed (Charlesworth et al. 1992a,b).
TE Density According to Recombination Rate
Because the accumulation of TEs in Drosophila heterochromatin is likely to be a consequence of the peculiar properties of the heterochromatic material (Pimpinelli et al. 1995), it is not possible to establish a direct relationship between the accumulation of TEs in heterochromatin and the recombination rate (Garcia Guerreiro and Fontdevila 2001). As the pericentromeric regions and chromosome 4 are heterochromatic regions, the relationship between TE density and recombination rate was studied both with and without these regions to avoid statistical bias. Moreover, as telomeric regions have peculiar evolutionary dynamics (Marais et al. 2001), we also performed the analyses with and without these regions.
Figure 2 shows the relationship between the mean density of TEs and the recombination rate for each TE type. The density of LTR retrotransposons (Fig. 2a) was high for low values of recombination, but was homogeneously distributed afterward, and the density of non-LTR retrotransposons did not seem to follow any clear tendency (Fig. 2b). However, the Spearman rank correlations were significant (ρ = -0.13, P < 0.01 for LTR retrotransposons; ρ = -0.15, P < 0.01 for non-LTR retrotransposons), and the χ² values, calculated for four classes of recombination each containing 25% of the values, were also significant (χ² = 113.12, P < 0.0001 for LTR retrotransposons; χ² = 52.46, P < 0.001 for non-LTR retrotransposons), with accumulations of insertions in the lower recombination rate class. We did the calculation again, this time without the pericentromeric and telomeric regions and chromosome 4. The Spearman rank correlations were no longer significant (ρ = -0.02, P > 0.05 for LTR retrotransposons; ρ = 0.08, P > 0.05 for non-LTR retrotransposons), although the χ² statistics were still highly significant for LTR retrotransposons (χ² = 34.42, P < 0.0001) and, to a lesser degree, significant for non-LTR retrotransposons (χ² = 8.94, P = 0.03), as a result of accumulations of insertions in the lowest recombination class. Therefore, we conclude that the densities of LTR and non-LTR retrotransposons do not parallel the recombination rate along the chromosomes, although these TEs do tend to accumulate in regions with low recombination rates.
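The two statistics used in this comparison can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function names are hypothetical, the input is assumed to be one recombination rate, one insertion count, and one usable sequence length per genome fragment, and the expected counts in the χ² are assumed to be proportional to the sequence length in each of the four equal-sized recombination classes (the text does not state how expected values were obtained).

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def quartile_chi2(rates: Sequence[float], counts: Sequence[int],
                  lengths: Sequence[float]) -> float:
    """Chi-square (3 df) of insertion counts over four recombination classes.

    Fragments are sorted by recombination rate and split into four classes of
    equal size. ASSUMPTION: expected counts are proportional to the total
    usable sequence length of each class.
    """
    order = sorted(range(len(rates)), key=lambda i: rates[i])
    n = len(order)
    bounds = [round(q * n / 4) for q in range(5)]
    total_count, total_len = sum(counts), sum(lengths)
    chi2 = 0.0
    for q in range(4):
        idx = order[bounds[q]:bounds[q + 1]]
        observed = sum(counts[i] for i in idx)
        expected = total_count * sum(lengths[i] for i in idx) / total_len
        chi2 += (observed - expected) ** 2 / expected
    return chi2
```

A weak, nonsignificant ρ together with a large χ² is what one would expect when insertions pile up in the lowest class but are evenly spread across the other three, which is the pattern described above for retrotransposons.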
There was a significant negative correlation between the density of transposons and the recombination rate (Spearman rank correlation coefficient, ρ = -0.29, P < 0.0001; Fig. 2c), which remained significant after eliminating the pericentromeric, telomeric, and chromosome 4 regions (ρ = -0.19, P < 0.0001). The χ² tests on transposon density over the four recombination classes defined above were highly significant (χ² = 197.84, P < 0.0001) with an accumulation of transposons in the lowest recombination rate class when all of the genomic regions were taken into consideration. The χ² remained highly significant after eliminating the telomeric and pericentromeric regions and chromosome 4 (χ² = 71.78, P < 0.0001). We thus conclude that the density of transposons decreases when the recombination rate increases.
We performed a detailed analysis for the LTR retrotransposons on individual chromosomes. Such analysis was not done for transposons or non-LTR retrotransposons because there were too few data for a reliable statistical analysis. The telomeric, pericentromeric, and chromosome 4 regions were not included in the analysis. No significant linear correlation between the recombination rate and the density of LTR retrotransposons was found for any of the chromosome arms (Spearman rank correlation, ρ = -0.13, P > 0.05 for 2L; ρ = -0.12, P > 0.05 for 2R; ρ = -0.16, P > 0.05 for 3L; ρ = -0.09, P > 0.05 for 3R), although all values were negative. These negative correlation values were due to an accumulation of LTR retrotransposon copies in regions of low recombination, which was significant for the 2L arm (Fig. 3a: χ² = 43.33, P < 0.0001) and the 3R arm (Fig. 3d: χ² = 14.56, P < 0.01), but not for the 2R or 3L arms. The X chromosome gave different results. Because of the narrow range of recombination values in the middle portion of this chromosome, the distribution of LTR retrotransposon insertions was analyzed along the entire chromosome by use of nonparametric statistics. The X arm was split into 5000-bp fragments, which were coded 1 if they contained at least one TE insertion and 0 if no insertion was detected. This allowed us to calculate the Variance of Ranks, VR, which detects aggregation in the middle of the sequence of 0s and 1s (low variance value) or at its ends (high variance value), and the Multiple Pool, MP, which detects aggregation at various regions in the sequence (see Aulard et al. 1995 for details of these statistics). These tests (VR = 3.5, P < 0.001; MP = 3.73, P < 0.001) showed that copies of LTR retrotransposons were at least grouped near the X centromere. A one-sample Kolmogorov-Smirnov test, which detects aggregation in a given region, revealed a significant group of LTR retrotransposon copies around 3.33 Mb, outside but not far from the telomeric region (KS = 0.15, P < 0.01). Therefore, there was no tendency for LTR retrotransposons to accumulate in the X telomere despite its low recombination rate.
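The 0/1 coding of the X arm and a one-sample Kolmogorov-Smirnov distance can be illustrated as follows. This is a hedged sketch only: the function names are hypothetical, insertion positions are assumed to be 0-based coordinates smaller than the arm length, and the null distribution is assumed to be uniform along the arm. The VR and MP statistics are defined in Aulard et al. (1995) and are not reproduced here.

```python
from typing import List, Sequence


def code_fragments(insert_positions: Sequence[int], chrom_length: int,
                   frag_size: int = 5000) -> List[int]:
    """Code each nonoverlapping fragment 1 if it holds >= 1 insertion, else 0."""
    n_frag = (chrom_length + frag_size - 1) // frag_size
    coded = [0] * n_frag
    for pos in insert_positions:
        coded[pos // frag_size] = 1
    return coded


def ks_uniform(coded: Sequence[int]) -> float:
    """One-sample KS distance between occupied fragments and a uniform spread.

    ASSUMPTION: each occupied fragment is represented by the relative position
    of its midpoint, and the reference distribution is uniform on [0, 1].
    """
    n_frag = len(coded)
    occupied = [i for i, c in enumerate(coded) if c == 1]
    n = len(occupied)
    d = 0.0
    for k, i in enumerate(occupied):
        u = (i + 0.5) / n_frag
        # Compare the empirical CDF just after and just before this point.
        d = max(d, (k + 1) / n - u, u - k / n)
    return d
```

Under these assumptions, a cluster of occupied fragments at one end of the arm gives a larger distance than the same number of occupied fragments spread evenly.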
To make it easier to compare our data with those from the population analysis (Hoogland and Biémont 1996), we calculated the Spearman rank correlation between recombination rates and the mean density found in each recombination rate class of the elements roo/B104, mdg3, mdg1, copia, 412, 297, I, and hobo, for which we had reliable data from both studies. Once again, the pericentromeric, telomeric, and chromosome 4 regions were excluded. The ρ values were not significant, and fell within the range of values of the population study (roo/B104, ρ = -0.14, P > 0.05; mdg3, ρ = -0.25, P > 0.05; mdg1, ρ = 0.16, P > 0.05; copia, ρ = 0.25, P > 0.05; 412, ρ = -0.10, P > 0.05; 297, ρ = -0.21, P > 0.05; I, ρ = -0.04, P > 0.05; hobo, ρ = -0.14, P > 0.05), suggesting that there was no great difference between the two approaches.
Gene Density According to Recombination Rate
No significant correlation was detected between gene density and recombination rate (Spearman rank correlation, ρ = -0.09, P > 0.05), suggesting that the relationship between TE density and recombination rate was not biased by the number of genes.
TE Density on Autosomes and on the X Chromosome
We compared the TE density on autosomes and the X chromosome for each class of TEs and for all of the TEs taken globally. When all regions were considered, the χ² statistics were significant for transposons (χ² = 8.40, P < 0.01) and non-LTR retrotransposons (χ² = 5.48, P = 0.02), both of which showed a deficit in the number of copies on the X chromosome. The χ² was not significant for LTR retrotransposons (χ² = 3.69, P > 0.05) or for all of the TEs pooled (χ² = 0.92, P > 0.05). Eliminating the pericentromeric, telomeric, and chromosome 4 regions rendered the tests nonsignificant (χ² = 1.53, P > 0.05 for non-LTR retrotransposons; χ² = 1.63, P > 0.05 for transposons; χ² = 1.58, P > 0.05 for all TEs pooled) except for LTR retrotransposons (χ² = 9.41, P = 0.002), which showed an accumulation of copies on the X chromosome.
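A comparison of this kind reduces to a χ² with one degree of freedom. The sketch below is illustrative rather than the authors' procedure: the function name is hypothetical, and the expected counts are assumed to be proportional to the sequence length analyzed on the X chromosome and on the autosomes. With one degree of freedom, values above 3.84 correspond to P < 0.05.

```python
from typing import Tuple


def x_vs_autosome_chi2(count_x: int, count_auto: int,
                       len_x: float, len_auto: float) -> Tuple[float, str]:
    """Chi-square (1 df) for insertion counts on the X versus the autosomes.

    ASSUMPTION: under the null hypothesis, insertions are distributed in
    proportion to the sequence length of each compartment.
    Returns the statistic and whether the X shows a deficit or an excess.
    """
    total = count_x + count_auto
    expected_x = total * len_x / (len_x + len_auto)
    expected_auto = total - expected_x
    chi2 = ((count_x - expected_x) ** 2 / expected_x
            + (count_auto - expected_auto) ** 2 / expected_auto)
    direction = "deficit" if count_x < expected_x else "excess"
    return chi2, direction
```

The direction label matters as much as the statistic here, because the text reports a deficit on the X for some TE classes and an excess for another.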
DISCUSSION
Like previous studies, our study confirms the accumulation of TEs in centromeric and pericentromeric regions, with a low TE density in subtelomeric regions. However, after eliminating these specific regions, the densities of LTR and non-LTR retrotransposons appeared to be high in regions with low recombination rates, but to have no direct, linear relationship with the recombination rate along the chromosome arms. In contrast, a negative correlation was found between transposon density and recombination rate. The accumulation of TEs in regions of reduced recombination rate is compatible both with selection acting against deleterious mutations caused by TE insertions, and with selection acting against chromosomal rearrangements caused by ectopic recombination between TE copies, because, in both cases, selection is expected to be weaker in regions of reduced recombination. The nonlinearity in the relationship between recombination and LTR and non-LTR retrotransposon densities might result from nonlinearity in how Hill-Robertson effects and ectopic exchanges scale with meiotic recombination. However, the observations that the transposon density linearly decreased with recombination rate and that the accumulation of LTR retrotransposons was statistically significant only for a few chromosome arms suggest that selection is not sufficient to explain the distribution of TEs along chromosomes in the D. melanogaster genome.
An absence of linear correlation between retrotransposon density and recombination rate when centromeric and pericentromeric regions were eliminated from the calculation is congruent with previous population studies (Hoogland and Biémont 1996), in which, however, no negative correlation was detected between transposon density and recombination rate. A major difference between the two studies is that the population approach involved only two transposons, P and hobo. The present study relied on 10 transposons, without the P element, which was not detected in the sequenced genome. The significant negative correlation observed between transposons and recombination rate strongly supports the hypothesis that selection acts against these TEs. But if this is so, then why is the relationship between transposon density and recombination negative in Drosophila and positive in the nematode (Duret et al. 2000)? Differences in meiotic pairing and recombination mechanisms could account for the contrasting relationships between TEs and recombination rate in these two species. But Drosophila and C. elegans use the same recombination-independent mechanisms to align homologs (McKim et al. 1998), and ectopic recombination is several orders of magnitude less frequent than allelic recombination in both organisms (Virgin and Bailey 1998). Yeast, however, initiates homolog colocalization and alignment by homology-dependent DNA-DNA interactions (Kleckner and Weiner 1993), and shows only small, but significant, differences between ectopic and allelic recombination frequencies (Kupiec and Petes 1988; Goldman and Lichten 1996). TEs are not randomly distributed in yeast, but are mainly located in genes transcribed by RNA polymerase III, such as tRNA genes (Kim et al. 1998). It is, however, difficult to explain how the non-uniform homolog pairing in Saccharomyces cerevisiae (Kleckner and Weiner 1993), as opposed to the close homolog alignment in Drosophila and C. elegans, could account for the differences between TE distribution in these two species and in the yeast genome.
Could a difference in breeding system account for the reverse correlations between transposon density and recombination rates in Drosophila and the nematode? If the deleterious effects of TEs are mostly recessive, then selection against TEs should be most effective in populations with high levels of homozygosity (Wright and Schoen 1999). In contrast, because ectopic exchanges occur preferentially between heterozygous TE insertions (Montgomery et al. 1991; Charlesworth and Charlesworth 1995), according to this selective model, selection should be most effective in out-crossing populations. Because the C. elegans breeding system is presumed to be mostly inbreeding (Baird et al. 1992), unlike that of Drosophila, the effects of selection can be expected to differ in these two species. Moreover, because self-fertilization can theoretically be expected to reduce the recombination rate (Morgan 2001), studying the relationship between recombination rate and TE density would not easily detect selection against TE-induced mutations in C. elegans. However, the present sequenced Drosophila genome is derived from a laboratory strain that is undoubtedly homozygous, whereas the genomes of individuals from natural populations are likely to be highly heterozygous. Therefore, homozygosity may have interfered with the mechanisms controlling the TE copy number in this specific genome during its long history in the laboratory, during which the loss or mobilization of specific TEs cannot be excluded (Biémont et al. 1987).
The contrasting relationships between recombination rate and transposon density in the nematode and Drosophila could be attributable to the very large population size in Drosophila. A large population means that selection against TE insertions is likely to outweigh drift (Charlesworth and Charlesworth 1995), so that selection can interfere with recombination, leading to fewer transposon insertions in regions of high recombination in which selection is strongest. If selection is a weaker force in the nematode, then it should be possible to detect the preference for transposons to become inserted in regions of high recombination rate, as a result of their mechanisms of transposition in this species (Duret et al. 2000). Although it is difficult to compare effective population sizes between two species, the specific reproductive system of the nematode, which is assumed to be mostly inbreeding (Baird et al. 1992), may have an effect similar to that of a smaller population. This means that the interaction between selection, effective population size, and recombination is of great importance in the structuring of genomes.
The presence of high densities of TEs in regions of low recombination and the significant negative correlation between transposons and recombination rate suggest that selection may act against the insertion of TEs. According to this hypothesis, fewer TE insertions can be expected on the X chromosome than on the autosomes. Because the males are hemizygous in Drosophila, deleterious TE insertions on the X should be selected against to a greater extent than insertions on the autosomes (Montgomery et al. 1987; Langley et al. 1988; Charlesworth et al. 1994). We did not detect any reduction in TE density on the X chromosome compared with the autosomes for the TEs of the sequenced genome. This observation and the fact that differences in TE amount between the autosomes and the X chromosomes have been observed for some elements, but not for others, in studies of populations of Drosophila (Montgomery et al. 1987; Biémont 1992; Charlesworth et al. 1992a,b, 1994; Biémont et al. 1997), suggest that selection against the insertional effects of TEs is not the main force controlling TE copy number. This is also consistent with the data on C. elegans, which show no evidence that there are fewer TE insertions on the X chromosome than on the autosomes (Duret et al. 2000). We must, however, consider the possibility that the TEs may have only been mobilized recently within the genome of the Drosophila stock used for sequencing, and that selection had not yet reduced the TE copy number on the X chromosome.
We cannot rule out the possibility that the distribution of TEs is not directly associated with recombination, but depends on other factors that could themselves be associated, possibly fortuitously, with recombination, and so depends on the individual genome considered and the species. This is illustrated by the observations that LTR retrotransposons aggregated in a small region of the X chromosome, but were homogeneously distributed in regions of middle and high recombination rates, and accumulated in regions of low recombination other than pericentromeric regions only in the 2L and 3R chromosome arms.
The distribution of target sites for TE insertions could vary with the DNA base composition (Sharp and Matassi 1994), and this would account partly for the distribution of TEs along the chromosomes. In C. elegans and Drosophila, the G+C content is positively correlated with the recombination rate, both in noncoding regions and in synonymous positions of codons (Marais et al. 2001). This might lead to a link between the distribution of target sites for TE insertions and the recombination rate. Many TEs seem to be inserted in AT-rich, late-replicating DNA regions (Le et al. 2000), and their target insertion sites are often a succession of A and T. For example, the human L1 elements show target specificity for TTTTAA, which leads to a linear negative relationship between L1 density and GC richness. This has also been shown for the LTR retrotransposons 1731 and 17.6 in D. melanogaster, TRIP in sea urchin, Mag of Bombyx mori (Springer et al. 1995), and certain retroviruses (Bernardi et al. 1985), but it has also been shown that TEs are globally AT-rich (Sharp and Matassi 1994; Lerat et al. 2000, 2002), so that insertions of numerous TE copies in a region lead to a low GC value. On the other hand, TEs could accumulate in low gene density regions, as reported for the Arabidopsis genome (The Arabidopsis Genome Initiative 2000), and could be associated with low GC content (Kumar and Bennetzen 1999; for review, see Lin et al. 1999; Adams et al. 2000; Jabbari and Bernardi 2000), and reduced recombination rate (for review, see Kliman and Hey 1993; Charlesworth 1994; Fullerton et al. 2001; Marais et al. 2001). In the present study, however, we did not detect any relationship between gene density and recombination rate. Moreover, the GC-rich SINE elements are located in GC-rich regions (Korenberg and Rykowski 1988; Boyle et al. 1990; Jurka 1997), and the regions around P insertion sites in Drosophila are GC-rich (Liao et al. 2000). In humans, the distributions of young and old copies of Alu elements have been found to be different (Smit 1999), suggesting that Alus integrate randomly but are preferentially fixed in GC-rich DNA as the result of some force of selection. These data do not allow us to conclude that there is any specific or general relationship between base composition and TE distribution.
TEs may insert preferentially in regions in which other sequences are already inserted, making the correlation between TE density and recombination rate merely fortuitous and variable from genome to genome and species to species, especially if such sequences could be recombinogenic, as postulated for the CeRep sequences of C. elegans (Cangiano and La Volpe 1993; Barnes et al. 1995). If the specific sequences in which the TEs are inserted vary according to the TE family, then the association between recombination rate and TE density will vary with the TE considered. For example, microsatellites accumulate preferentially in genome regions in which recombination is infrequent (Charlesworth et al. 1994). However, the density of microsatellites is not influenced by the recombination rate in D. melanogaster (Bachtrog et al. 1999), and there is no strong evidence for the overall insertion of different TEs in specific sequences, such as micro- and minisatellites. We have evidence, however, that some of the microsatellite sequences may, in fact, result from the TEs themselves (Nadir et al. 1996; Jarne et al. 1998; Toth et al. 2000), although a high density of microsatellites does not always coincide with a high density of TEs, especially in Arabidopsis thaliana (Schlötterer 2000), even though retrotransposons have been found near microsatellites in barley (Kalendar et al. 1999).
These findings suggest that various features (local genomic composition and structure, chromatin conformation, DNA nick repair, number of DNA replications, effective population size, reproductive system, and history of the host) could variously influence and even blur the impact of natural selection acting against the TE insertions along the chromosome.
METHODS
Sequence Data and Locations of Transposable Elements
The sequences of the D. melanogaster chromosome arms X, 2L, 2R, 3L, 3R, and 4 were retrieved from the unannotated version 1 of the genome (Adams et al. 2000; BDGP 2000). The retrieved data (all tracts of N omitted) thus totaled 114.5 Mb, corresponding to 64% of the whole genome sequence (the actual sequence represented 95% of the total euchromatin). Therefore, all of the TE sequences collected were from the euchromatic part of the genome, because most of the heterochromatin, including the Y chromosome, was not sequenced.
Heterochromatin is composed of many transposable elements, mostly in an
inactive state and in the form of defective sequences. Hence, the TE
sequences studied here represent only a fraction of all the TE
sequences in the Drosophila genome. Our analysis is thus
pertinent for comparison with data on TE chromosomal locations obtained
from population analyses in which the in situ hybridization technique
used gives information on the copy number of the TEs that are inserted
in the euchromatin, along polytene chromosomes.
A bank of reference sequences for TE families was constituted with
sequences retrieved from the FlyBase database (FlyBase 1998;
http://flybase.bio.indiana.edu/). When the sequences from this database
corresponded to incomplete copies of LTR retrotransposons, we searched
for full-length copies with the BLAST program in the
high-quality genomic clone sequences from BDGP/EDGP
(http://www.fruitfly.org). The list was completed with sequences
retrieved from EBI (The European Bioinformatics Institute 2001;
http://www.ebi.ac.uk/index.html, sequence set:
ftp://ftp.ebi.ac.uk/pub/databases/edgp/sequence_sets) and with the
pilgrim sequence (Costas et al. 2001). The distributions of TE insertions in the whole shot-gunned genome sequence
were then analyzed by use of the above reference sequences and by the program RepeatMasker (A.F.A. Smit and P. Green, unpubl.; http://repeatmasker.genome.washington.edu/cgi-bin/RM2_req.pl). Because many TE insertions were consensus or mixed sequences, we only
worked on localization data of TEs. For the LTR retrotransposons, we
considered as insertions only the retrieved sequences with at least one
complete or one incomplete typical LTR to avoid wrongly attributing
very divergent copies to a given element. Each match was then checked
to decide whether different copies of a given LTR retrotransposon were
present, or only one copy detected by different matches. For non-LTR
retrotransposons and transposons, we considered those retrieved
sequences that were >400 bp, so as to eliminate short, deleted, highly
divergent sequences. Each match was also checked as above.
The TE density was estimated from the number of TE insertions per base pair in the sequence fragments considered, excluding the number of N. For each class of recombination rate (see below), all genome regions corresponding to a given range of recombination rate were pooled. To analyze the relationship between TE density and recombination rate, the Drosophila genome was cut into nonoverlapping fragments of 0.25 Mb. The sequences corresponding to the telomeric, centromeric, and chromosome 4 regions were defined by use of the Gadfly annotations (http://hedgehog.lbl.gov:7081/annot/). When these regions were to be excluded from the analysis, the genome was cut into nonoverlapping fragments after removing the corresponding sequences.
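The density calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name and the return layout are hypothetical, insertion positions are assumed to be 0-based start coordinates on the same sequence, and each insertion is counted once in the fragment that contains its start.

```python
from typing import List, Optional, Sequence, Tuple


def window_densities(sequence: str, insert_positions: Sequence[int],
                     window: int = 250_000
                     ) -> List[Tuple[int, int, int, Optional[float]]]:
    """Insertions per usable base pair in nonoverlapping fragments.

    Returns one tuple per fragment: (start, usable_bp, n_insertions, density).
    Bases reported as N are excluded from the denominator; a fragment made
    only of N has no defined density (None).
    """
    results = []
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window]
        usable = len(chunk) - chunk.upper().count("N")
        n_ins = sum(1 for p in insert_positions
                    if start <= p < start + len(chunk))
        density = n_ins / usable if usable > 0 else None
        results.append((start, usable, n_ins, density))
    return results
```

To exclude the telomeric, pericentromeric, and chromosome 4 regions as described in the text, the corresponding sequence would be removed before calling the function, so that the fragments are cut from the remaining sequence only.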
Estimation of Gene Densities along the Chromosomes
Because DNA sequences on each chromosome arm were not yet fully annotated, we searched for the chromosomal location of the 13376 known and predicted genes of the Drosophila genome by using BLAST and the data on transcribed gene sequences of release 1 (na.gadfly.dros.RELEASE1 at the Berkeley Drosophila Genome Project site: http://www.fruitfly.org). Gene density along the chromosomes was thus determined as described above for the TEs. To analyze the relationship between gene density and recombination rate, the Drosophila genome was cut into nonoverlapping fragments of 0.25 Mb.
Estimation of Recombination Rate
The rate of recombination along the chromosomes was determined by use of a procedure similar to that described by Kliman and Hey (1993). The D. melanogaster genetic map data were taken from FlyBase (FlyBase 1998). We selected the 892 loci that had been located in both the genetic map and the genomic sequence. The recombination rate was estimated for each chromosome arm by taking the derivative of the best-fitting polynomial function of the genetic distance versus the nucleotide coordinate in the genomic sequence. Second-degree polynomial curves fitted the data set well (r² > 0.97) for all chromosome arms. The fitting by the polynomial function is clearly correct for the points in the center of the curve, but deviates for the subtelomeric (chromosomal sections 1, 21, 60-61) and pericentromeric (chromosomal sections 20, 40-41, 80-81) regions, in which the recombination rates are low (Kliman and Hey 1993). To analyze the relationships between recombination rate and TE density, and between recombination rate and gene density, for each genomic fragment of 0.25 Mb (see above), the recombination rate was estimated from the value of the derivative of the polynomial curve at the middle position of the fragment. The 0.0 cM/Mb value was assigned to the fragments of the telomeric and pericentromeric regions and of chromosome 4. We also defined 49 classes of recombination rate from 0.0 to 4.9 cM/Mb at intervals of 0.1 cM/Mb. The telomeric and pericentromeric regions and chromosome 4 were assigned to the 0.0-0.1 cM/Mb class of recombination.
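The estimation procedure can be sketched as a least-squares fit followed by a derivative. The code below is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, physical positions are assumed to be expressed in Mb and genetic positions in cM, and negative derivative values are clamped to the lowest class.

```python
from typing import Sequence, Tuple


def fit_quadratic(x: Sequence[float],
                  y: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares (a, b, c) for y = a + b*x + c*x**2 via normal equations."""
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    # Gauss-Jordan elimination with partial pivoting on the 3x3 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return (m[0][3] / m[0][0], m[1][3] / m[1][1], m[2][3] / m[2][2])


def recombination_rate(coeffs: Tuple[float, float, float],
                       pos_mb: float) -> float:
    """Derivative of the fitted curve at pos_mb, in cM/Mb."""
    _, b, c = coeffs
    return b + 2.0 * c * pos_mb


def rate_class(rate: float, width: float = 0.1) -> int:
    """Index of the recombination class (0 for 0.0-0.1 cM/Mb, 1 for 0.1-0.2...).

    ASSUMPTION: negative estimates are placed in class 0. The small offset
    guards against floating-point error at class boundaries.
    """
    return int(max(rate, 0.0) / width + 1e-9)
```

For a 0.25-Mb fragment starting at coordinate `s` (in Mb), the rate would be evaluated at `s + 0.125`. Fragments in the telomeric and pericentromeric regions and on chromosome 4 would bypass the fit and be assigned 0.0 cM/Mb directly, as stated in the text.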
WEB SITE REFERENCES
http://flybase.bio.indiana.edu/; the FlyBase database of the Drosophila Genome Projects and community literature. FlyBase 1998.
http://hedgehog.lbl.gov:7081/annot/; the genome annotation database. GadFly 2000.
http://repeatmasker.genome.washington.edu/cgi/RM2_req.pl; RepeatMasker Smit, A.F.A. and Green, P. 2000.
http://www.ebi.ac.uk/index.html; the European Bioinformatics Institute 2001.
http://www.fruitfly.org/index.html; the Berkeley Drosophila Genome Project 2000.
ACKNOWLEDGMENTS
We thank D. Chessel, L. Duret, and C. Vieira for their comments. This work was funded by the Centre National de la Recherche Scientifique (Programme génome, GDR 2157) and the Association pour la Recherche sur le Cancer (contract 5428 to C.B.).
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
1 Corresponding author.
E-MAIL biemont{at}biomserv.univ-lyon1.fr; FAX 33 4 78 89 27 19.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.210802. Article published online before print in February 2002.
REFERENCES
Chromosome karyotyping by fluorescence in situ hybridization. Proc. Natl. Acad. Sci. 87: 7757-7761.
Received August 15, 2001; accepted in revised form December 14, 2001.