Vol. 12, Issue 1, 57-66, January 2002
LETTER
ABSTRACT
Genome evolution entails changes in the DNA sequence of genes and intergenic regions, changes in gene numbers, and also changes in gene order along the chromosomes. Genes are reshuffled by chromosomal rearrangements such as deletions/insertions, inversions, translocations, and transpositions. Here we report a comparative study of genome organization in the main African malaria vector, Anopheles gambiae, relative to the recently determined sequence of the Drosophila melanogaster genome. The ancestral lines of these two dipteran insects are thought to have separated ~250 Myr ago, a long period that makes this genome comparison especially interesting. Sequence comparisons have identified 113 pairs of putative orthologs of the two species. Chromosomal mapping of orthologous genes reveals that each polytene chromosome arm has a homolog in the other species. Between 41% and 73% of the known orthologous genes remain linked in the respective homologous chromosomal arms, with the remainder translocated to various nonhomologous arms. Within homologous arms, gene order is extensively reshuffled, but a limited degree of conserved local synteny (microsynteny) can be recognized.
INTRODUCTION
Modern genomics has revolutionized genetics and, consequently,
biology. The enormous acceleration of data
acquisition, in fields such as whole genome sequence determination and
genome-wide gene expression profiling, has opened novel possibilities
for the study of model organisms and organisms for which, until
recently, only rudimentary biological knowledge was available (orphan
organisms). For example, until a decade ago only a few tens of genes
had been identified in important insect disease vectors such as
Anopheles gambiae or Aedes aegypti; these species now
account for ~24,000 and 1700 entries, respectively, in the nucleic acid
databases. Many of these represent partial genomic sequences,
sequence tagged sites (STSs), and anonymous cDNAs or
expressed sequence tags (ESTs; for review, see Louis 1999). Such
genetic and molecular genetic information may prove helpful in
designing new schemes to fight the diseases transmitted by these
mosquitoes, such as malaria and dengue (James et al. 1999
). Progress in
elucidating the genomic information of formerly orphan insect organisms
can be considerably accelerated by using the closest available model
organism, in this case Drosophila melanogaster, as a guide.
A. gambiae s.s. (sensu stricto) is a member of the African
A. gambiae species complex that consists of six distinct
sibling species and itself can be distinguished into a series of taxa or incipient species (Coluzzi et al. 1985
), all differing in vectorial capacity (see Touré et al. 1998
). The pioneering studies of
Coluzzi and his collaborators on the construction of polytene maps for this species complex and the documentation of both fixed and
polymorphic inversions can be viewed as the start of genomic research
on the malaria mosquito.
Over the past decade, knowledge of the molecular biology and genetics
of A. gambiae s.s. has vastly improved. For example, numerous
molecular studies of the interactions between Anopheles and
Plasmodium have radically improved our understanding of this vector-parasite system (for review, see Sinden 1999
). The molecular study of the genome was initiated with the construction of a first low-resolution physical map, linked to the polytene chromosomes (Zheng
et al. 1991
), followed by the construction of a detailed, microsatellite-based recombination map (Zheng et al. 1993
, 1996
). Integration of the genetic (recombinational), cytogenetic (polytene), and molecular (clone and sequence) maps has progressed rapidly; it
entails the genetic and cytogenetic mapping of random
amplified polymorphic DNA (RAPD) markers (Dimopoulos et al. 1996a
), the recombinational mapping of microsatellites, and the assignment of both
microsatellites and anonymous DNA markers to specific chromosomal
locations, using in situ hybridization to polytene chromosomes (della
Torre et al. 1996
; Dimopoulos et al. 1996a
; Zheng et al. 1996
; Wang et
al. 1999
). Microsatellites have been used successfully both for gene
mapping (Collins et al. 1997
; Zheng et al. 1997
; Ranson et al. 2000
)
and for studies of population biology (e.g., see Lanzaro et al. 1998
;
Kamau et al. 1999
; Wang et al. 1999
, 2001
). Finally, routine germline
transformation and thus reverse genetic studies of A. gambiae
can be expected soon, judging by the recent success in transforming
both anopheline (A. stephensi; Catteruccia et al. 2000
) and
aedine mosquitoes (Ae. aegypti; Coates et al. 1998
;
Jasinskiene et al. 1998
).
Important additional tools for comparative genomic studies of A. gambiae have become available recently. They include a collection of ESTs that may represent ~10% of the mosquito genes (Dimopoulos et
al. 2000), and ~17,500 sequence-tagged ends of a bacterial artificial chromosome (BAC) library representing 14.5 Mb
or 7% of the expected euchromatic DNA sequence
(http://bioweb.pasteur.fr/BBMI; C. Roth and F.H. Collins, pers.
comm.). An experimental strategy that combines the identification of
orthologs by sequence similarity searches and their mapping to the
chromosomes or linkage groups of different species has proven to be
very informative in comparative genomic studies of both animals
(O'Brien et al. 1999
) and plants (Terryn et al. 1999
). An important
type of information derived from such studies is the degree of
conserved synteny: to what extent the chromosomal dynamics in evolution
permit linkage group conservation, that is, persistent linkage of most
genes in a given chromosome between compared species (long-range
synteny in homologous chromosomes). A second important issue is to what
extent originally neighboring genes remain clustered (local conserved
synteny, or microsynteny) rather than becoming randomized in terms of
their order within the homologous chromosome.
Here we used the essentially complete sequence information on the
D. melanogaster genome (Adams et al. 2000
), together with the
available A. gambiae genomic resources, to address the
questions of sequence conservation, long-range synteny, and local
microsynteny between the genomes of the mosquito and the fruit fly, two
distantly related diptera.
RESULTS
Chromosomal Distribution of A. gambiae Orthologs of Genes From Two D. melanogaster Chromosomal Regions
In a first set of experiments aimed at exploring long-range synteny and microsynteny, we identified, among the currently available A. gambiae sequences, putative orthologs of genes that in D. melanogaster are clustered within two well-studied chromosomal regions, each nearly 3 Mb long. We then determined the genomic locations of these putative orthologs by hybridization to the mosquito polytene chromosomes.
The fruit fly genomic regions that were chosen for these experiments
have been completely sequenced and annotated, both in clone-by-clone
sequencing projects and as part of whole-genome shotgun sequencing. One
of these Drosophila regions is the autosomal Adh
region, covering 2.9 Mb on both sides of the Adh gene, in divisions 34B-35F of chromosomal arm 2L (Ashburner et al.
1999
). The other is the tip of chromosome X,
encompassing 2.6 Mb in polytene divisions 1-3 (Benos et al. 2000
,
2001
). Both of these regions were also covered by whole-genome shotgun
sequencing (Adams et al. 2000
).
The 256 genes from the tip of the X and the 219 genes from the
Adh region of Drosophila were used to query, by
TBLASTN, collections of both STSs and ESTs of
Anopheles: the 17,506 STSs representing end sequences of BAC
clones, and the 6012 ESTs that correspond to 2380 potential genes (cDNA
clone clusters from a subtracted normalized library; Dimopoulos et al.
2000). To define genes as putative orthologs, hits that satisfied the
criteria of a high score of >40, a probability P(N) of <$1 \times 10^{-10}$, and a percentage of identical amino acid residues >30
over a long range were selected in a first round. From them, all
spurious hits that were caused by the presence of low complexity
segments were eliminated, and the remaining hits were confirmed by
BLASTX analysis as best bidirectional hits against a
database of 14,080 amino acid sequences of known and predicted
Drosophila genes (release 1.0; Adams et al. 2000
). Those that
passed this test were further verified by direct comparison to the
corresponding Drosophila entry, taking into account potential
intron-exon boundaries. Henceforth, these validated genes will be
referred to as orthologs for convenience (see also Discussion). These
procedures (see also Methods) identified 19 mosquito orthologs of
unique genes found in the tip region of the Drosophila X
chromosome and 31 orthologs of unique genes found in the
Drosophila Adh region. For greater accuracy, we eliminated from consideration additional probable orthologs (18 showing hits to
X-tip and nine showing hits to Adh region genes),
because they belong to chromosomally dispersed multigene families. This
was necessary because the true ortholog cannot be chosen among the different members of a given gene family until both genomes are fully sequenced.
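The selection procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline actually used: the `Hit` record, its field names, and the gene and clone identifiers are hypothetical, the "long range" requirement on the alignment is not modeled, and the best bidirectional hit is simplified to "highest score in each direction".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One similarity hit between a fly protein and a mosquito sequence (hypothetical record)."""
    fly_gene: str
    mosquito_seq: str
    score: float
    p_value: float
    identity_pct: float
    low_complexity: bool = False


def passes_first_round(hit):
    """Thresholds quoted in the text: score >40, P(N) <1e-10, identity >30%."""
    return (hit.score > 40 and hit.p_value < 1e-10
            and hit.identity_pct > 30 and not hit.low_complexity)


def best_by(hits, key):
    """Keep the highest-scoring hit for each value of key(hit)."""
    best = {}
    for h in hits:
        k = key(h)
        if k not in best or h.score > best[k].score:
            best[k] = h
    return best


def putative_orthologs(forward_hits, reverse_hits):
    """Pairs that pass the thresholds and are each other's best hit.

    forward_hits: fly protein queried against mosquito sequences (TBLASTN-like).
    reverse_hits: mosquito sequence queried against fly proteins (BLASTX-like).
    """
    candidates = [h for h in forward_hits if passes_first_round(h)]
    best_forward = best_by(candidates, key=lambda h: h.fly_gene)
    best_reverse = best_by(reverse_hits, key=lambda h: h.mosquito_seq)
    pairs = []
    for gene, hit in best_forward.items():
        back = best_reverse.get(hit.mosquito_seq)
        if back is not None and back.fly_gene == gene:
            pairs.append((gene, hit.mosquito_seq))
    return sorted(pairs)


if __name__ == "__main__":
    forward = [
        Hit("geneA", "sts1", 120, 1e-30, 55),
        Hit("geneA", "sts2", 45, 1e-11, 31),
        Hit("geneB", "sts3", 38, 1e-12, 40),   # fails the score threshold
        Hit("geneC", "sts4", 90, 1e-20, 45),   # sts4 matches geneD better
    ]
    reverse = [
        Hit("geneA", "sts1", 130, 1e-32, 55),
        Hit("geneD", "sts4", 200, 1e-60, 70),
        Hit("geneC", "sts4", 95, 1e-21, 45),
    ]
    print(putative_orthologs(forward, reverse))
```

Members of dispersed multigene families would still have to be removed by hand, as described in the text, because the reciprocal test alone cannot decide among paralogs.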
The 50 orthologs that were retained for further analysis were present
in 33 BAC and 37 cDNA clones (a number of them were detected by both
STSs and ESTs). Representative clones were used as probes for in situ
hybridization analysis to A. gambiae polytene chromosomes.
Tables 1 and 2
include the results of this analysis for the X-tip orthologs
and Adh region orthologs, respectively. Notably, the Tables
show cytogenetic and molecular locations of the 50 Drosophila
genes and the sequence identifiers and cytogenetic locations of the
corresponding A. gambiae orthologs. The distribution of the
X-tip and Adh region orthologs among the five
polytene chromosome arms of A. gambiae is tabulated in Table
3, together with the results of
statistical analysis of these distributions using the binomial test of
significance, confirmed by the $\chi^2$ test.
For the statistical analysis, we compared the number of orthologs
corresponding to each Drosophila region that were observed in
each chromosomal arm of the mosquito to the number expected if the
association were random according to chromosomal arm length. To
calculate the expected numbers, the lengths of the five mosquito chromosomal arms were estimated according to the number of their lettered subdivisions, as recognized in the map of Coluzzi and associates (22 subdivisions for X, 54 for 2R, 40 for
2L, 37 for 3R, and 31 for 3L; or 12.0%,
29.35%, 21.7%, 20.1%, and 16.85% of the total, respectively; the
map is accessible at http://www.anodb.gr/AnoDB/Cytomap/). The binomial
test is an exact probability test that is used to examine the
distribution of a single dichotomy in conditions when only a relatively
small sample is available, as is the case here. It provides a
one-sample test of the difference between the sampled distribution and
a given distribution. In this case, the given distribution is based on
the null hypothesis that the genes of each Drosophila
chromosomal region are randomly redistributed across all five
chromosome arms of A. gambiae according to their lengths. As
shown in Table 3 for the gene probes derived from the tip of chromosome
X, all P values are >0.05, and thus, the null
hypothesis cannot be rejected. Similarly, the $\chi^2$ statistic
(not shown) is equal to 2.415, lower than the critical value
$\chi^2_4[0.05] = 9.49$; therefore, the null
hypothesis cannot be rejected. By these criteria, none of the five
mosquito chromosomal arms is significantly enriched for orthologs of
the X-tip genes of Drosophila.
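The expected numbers under the null hypothesis follow directly from the subdivision counts quoted above. The short sketch below reproduces the arm-length percentages and derives expected ortholog counts for the 19 X-tip orthologs; the function names are illustrative only, and the observed counts of Table 3 are not reproduced here.

```python
# Lettered subdivisions per A. gambiae polytene arm, as quoted in the text.
SUBDIVISIONS = {"X": 22, "2R": 54, "2L": 40, "3R": 37, "3L": 31}


def arm_fractions(subdivisions):
    """Fraction of the cytogenetic map contributed by each arm."""
    total = sum(subdivisions.values())
    return {arm: n / total for arm, n in subdivisions.items()}


def expected_counts(n_orthologs, subdivisions=SUBDIVISIONS):
    """Expected orthologs per arm if genes were redistributed at random by arm length."""
    return {arm: n_orthologs * f for arm, f in arm_fractions(subdivisions).items()}


if __name__ == "__main__":
    fractions = arm_fractions(SUBDIVISIONS)
    expected = expected_counts(19)  # 19 X-tip orthologs
    for arm in SUBDIVISIONS:
        print(f"{arm}: {100 * fractions[arm]:.2f}% of map, "
              f"expected {expected[arm]:.2f} of 19 orthologs")
```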
In contrast, the results of the same analysis strongly indicate that
the Drosophila arm 2L (at least its Adh
region) corresponds to the chromosome arm 3R of A. gambiae ($P = 9.939 \times 10^{-12}$). That mosquito arm
includes nearly fourfold as many genes as expected: It contains 24 (77%) of the currently available orthologs of the Drosophila
Adh region genes, whereas only 7 (23%) orthologs are scattered
over three other mosquito autosomal arms. Furthermore, in three out of
four remaining mosquito chromosomal arms (2R, 2L, and
3L), the prevalence of orthologs of Drosophila 2L
genes is statistically significantly lower than expected. Thus, the binomial test clearly rejects the null hypothesis of random
redistribution of Adh region genes, in terms of both positive
and negative correlations. Rejection is also supported by the
$\chi^2$ analysis, in which the statistic (not shown) is equal to
64.12 with the same critical value as before
($\chi^2_4[0.05] = 9.49$).
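The enrichment on arm 3R can be checked with an exact binomial calculation using only the numbers given in the text: 24 of 31 Adh region orthologs on 3R, against an expected fraction of 37/184. The sketch below computes the one-tailed probability of observing at least 24 orthologs on that arm. Which convention the reported P value follows (point probability, one tail, or two tails) is not stated, so this is an order-of-magnitude check and is not expected to match the published figure digit for digit.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_upper_tail(k, n, p):
    """P(X >= k): chance of seeing k or more orthologs on an arm."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def binom_lower_tail(k, n, p):
    """P(X <= k): chance of seeing k or fewer orthologs on an arm."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


if __name__ == "__main__":
    n_orthologs = 31        # Adh region orthologs
    observed_3r = 24        # of which on mosquito arm 3R
    p_3r = 37 / 184         # 3R share of lettered subdivisions
    print(f"expected on 3R: {n_orthologs * p_3r:.2f}")
    print(f"P(X >= {observed_3r}) = {binom_upper_tail(observed_3r, n_orthologs, p_3r):.3e}")
```

The lower-tail function serves the complementary question raised in the text, namely whether an arm carries significantly fewer orthologs than expected.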
It should be noted from Table 3 that the mosquito orthologs of the Adh region genes are not evenly distributed within the mosquito 3R arm: Half of them are located within four chromosomal subdivisions (29C, 31C, 32B, and 33A), whereas the other half are scattered among the other 33 subdivisions of 3R. This apparent clustering may correspond to microsynteny, as will be discussed below.
Distribution of Randomly Selected A. gambiae Sequences and Their D. melanogaster Orthologs
A similar but reverse method was used in a second experiment addressing the question of long-range synteny. In this case, we started with random A. gambiae STSs mapped on the polytene chromosomes and determined their orthologs and the respective cytogenetic locations in D. melanogaster.
Randomly selected BAC clones of A. gambiae from the library
that had been used to determine STS end sequences
(http://bioweb.pasteur.fr/BBMI; C. Roth and F.H. Collins, pers. comm.)
were mapped by in situ hybridization to mosquito polytene chromosomes.
A total of 1217 STSs were available from 720 cytogenetically mapped
clones, and they were used for a BLASTX search of the
protein sequences corresponding to the 14,080 known and predicted
D. melanogaster genes (release 1.0; Adams et al. 2000). This
search led to the identification of 49 mapped STSs that were putative
orthologs of unique D. melanogaster genes. In addition,
A. gambiae genes of known cytogenetic location were used to
search the same Drosophila database, yielding 21 additional
hits. This number also included cecropin and ADP/ATP,
two A. gambiae genes, each of which is homologous to a
corresponding small multigene family in Drosophila, clustered at a single cytogenetic location. Table
4 lists these 70 mosquito gene sequences by cytogenetic location, together with their
Drosophila orthologs and their locations. Table
5 summarizes and correlates the chromosomal
locations of corresponding sequences in the two species. As in the
previous experiment, the binomial test and the confirming
$\chi^2$ analysis (not shown) used the numbers of orthologs
expected on each Drosophila chromosomal arm, in this case
according to a random distribution calculated on the basis of the
respective known DNA content of the Drosophila arms (Adams et
al. 2000
).
The data from this second experiment (Table 5) completely confirm and
extend the conclusions from the first experiment. They identify
statistically significant and unique chromosomal arm homologies with
the P values ranging from 0.0193 to 0.0009, as follows:
$X_{Ag}/X_{Dm}$, $2R_{Ag}/3R_{Dm}$, $2L_{Ag}/3L_{Dm}$, $3R_{Ag}/2L_{Dm}$, and $3L_{Ag}/2R_{Dm}$. Except for these, no
other pairs even approach statistical significance as homologs.
However, the dot chromosome 4 of Drosophila does not
exist in the mosquito, and the single known Anopheles homolog of a chromosome 4 gene is found on the mosquito X
chromosome. In this second experiment, as much as in the first, the
relative order of orthologous genes within the corresponding
chromosomal arms of the two species appeared to be scrambled. Again,
however, some residual microsynteny was detected (see below). For an
additional statistical analysis of the same data, we took as a starting
point the chromosomal distribution of the Drosophila orthologs
and compared the observed and expected distributions of
Anopheles genes; this inverse comparison corresponds to that
of the first experiment. As shown in Table
6, the inverse P values are all
significant, convincingly confirming the chromosomal arm homologies
established from Table 5.
Local Synteny of Adh Region Orthologs
As noted above, many genes are scrambled within the respective homologous chromosomal arms. However, a careful analysis of gene order between genes of the Adh region in the D. melanogaster 2L and their orthologs in the A. gambiae 3R gave a clear indication that a significant proportion, ~30%, remain locally clustered with the same neighboring gene. This local synteny may also be called microsynteny, in that it apparently only entails two or three genes at a time. The patterns of both gene scrambling and microsynteny are best displayed graphically, as in Figure 1. It should be noted that because of the availability of the genome sequence, the Drosophila Adh region genes are placed on both cytogenetic and DNA sequence scales; their orthologs in Anopheles can only be placed on the cytogenetic scale for now.
Of the 31 recognized mosquito orthologs of Adh region genes, 24 map to the Anopheles 3R chromosome, and 13 of these are found clustered in just four subdivisions, forming four cytogenetic clusters that are at least partially microsyntenic. In contrast, the remaining 11 mosquito orthologs are scattered individually amongst the remaining 33 chromosomal subdivisions of the Anopheles 3R chromosome.
The two distal-most mosquito cytogenetic clusters, on divisions 29C and
31C, are both derived from a tight cluster of 27 Drosophila genes that are located within ~150 kb at cytogenetic location 35F6-11
(Ashburner et al. 1999
). Of these 27 genes, 10 have known mosquito
orthologs, and seven of these map to the mosquito chromosome arm
3R; five are microsyntenic. The latter include two adjacent genes (the CG5861 and Sed5 orthologs) that map to the
31C cytogenetic cluster. The 29C cytogenetic cluster includes two
adjacent genes (the cact and l(2)35Fe orthologs) plus
one outlier (the BG:DS02740.4 ortholog). Each of these
clusters additionally encompasses one ortholog of a distant
Adh region gene (BG:BACR48E04.2 and lace, respectively).
Similarly, the mosquito 32B cytogenetic cluster includes three
Anopheles orthologs of genes BG:DS00797.7,
b, and Sop2 that in Drosophila are part of
a 16-gene cluster located within ~65 kb at 34D1-4 (Ashburner et al.
1999). Two orthologs of other genes from the same cluster,
adat and RpII33, are known in the mosquito but do not
map at 32B; the orthologs of the 11 remaining genes in the 34D1-4
Drosophila cluster are as yet unknown.
Finally, the fourth mosquito cytogenetic cluster at 33C includes the
orthologs of adat from the Drosophila 34D1-4 region
(see above) plus two genes, beat-B and beat from the
Drosophila 35E1-F1 region. In Drosophila, the latter
two genes are paralogs with the same exon-intron structure and show
53% identity at the amino acid level. They are separated by ~100 kb,
a region that encompasses three other genes, BG:DS07486.2,
beat-C (also a paralog of beat-B and beat),
and Bic-C (Ashburner et al. 1999
); the orthologs of these three genes are not yet known in the mosquito. Interestingly, the
orthologs of beat-B and beat are from the STSs at the
two ends of the same mosquito BAC clone (03I12), and thus are also separated by ~120 kb. It would be interesting to sequence this clone
and thus discover whether the mosquito orthologs of the BG:DS07486.2, beat-C, and Bic-C genes are
also located in this interval.
DISCUSSION
The analysis presented here was made possible by the availability of
the essentially complete sequence of the D. melanogaster genome (Adams et al. 2000
) and is a clear example of comparative genomic research. It illustrates how full genomic information from a
model species can help provide considerable insight into the genomic
structure of even a rather distantly related and little-studied orphan
organism, when combined with bioinformatics analysis of partial
sequence information and physical mapping of clones representing ESTs
and STSs. It should be recalled that the fruit fly and the mosquito are
estimated to have diverged ~250 Myr ago (Yeates and Wiegmann 1999). The
study addresses three main questions.
The question of sequence divergence between orthologous genes of Drosophila and Anopheles relates to our ability to detect such genes. We have used rather stringent similarity criteria to accept genes as orthologs, and thus we expect that our reported collection of orthologs includes few if any false positives and excludes some widely divergent orthologs. Consistent with these expectations, the STS resource of BAC ends represents ~7% of the estimated euchromatic DNA of A. gambiae and yielded 26 (5.5%) orthologs of the 475 Drosophila genes present at the tip of the X and the Adh region of Drosophila. The EST resource includes 2380 cDNA clone clusters, but it is difficult to say how many actual genes are represented, because of the possibility of undetected overlaps. The EST resource yielded 24 of the orthologs or 5.1% of the genes in the Adh region and the tip of the X in Drosophila. Accepting the orthology of all genes shown in Tables 1, 2, and 4, we note that the detected orthologous exons show a range of 26% to 97% sequence identity at the amino acid level, with an average of 61.6% identity. If we consider only the most similar available exons, the orthologous genes have 31% to 97% local sequence identity, or 65.4% on average. This indicates that in most future cases, it should be possible to recognize orthologous genes in the two species using our criteria or to clone them by sequence homology.
The second question concerns the gross homology of chromosomes between the fruit fly and the mosquito. It is striking that both species have two major metacentric autosomes as well as an apparently telocentric X chromosome in the euchromatic polytene genome (five chromosomal arms in total). Only the very minor chromosome 4 (~1% of the genome in Drosophila) is absent from Anopheles. Taken together, our data show unequivocally that the five A. gambiae chromosome arms can be assigned a distinct homolog in the chromosomal complement of the fruit fly, and vice versa.
From Table 6, it can be seen that in different chromosomal arms,
between 27 and 59% of the genes have undergone interchromosomal translocation to nonhomologous arms since the last common ancestor of
D. melanogaster and A. gambiae. The extent to which
translocations occur varies for different arms (Table 6) and also
apparently for different chromosomal regions. Comparison between Tables
3 and 5 indicates that translocations have occurred more frequently for
genes that are now at the X-tip of Drosophila than
for the X as a whole; whereas translocations have occurred
less frequently for the Adh region than for that arm as a
whole. Overall, using Muller's definition of the chromosomal elements
of Drosophila (Muller 1940
), the A. gambiae
chromosome arms X, 2R, 2L, 3R, and 3L are homologous to the Drosophila elements
A, E, D, B, and C, respectively. Interestingly, in both species the arrangement of paired
elements is the same (A, B + C, D + E). The A. gambiae chromosomes 2 and 3 are homologous to the D. melanogaster chromosomes 3 and 2 respectively.
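The arm correspondences stated above can be written out as a small lookup and checked for internal consistency. In the sketch below, the A. gambiae assignments are those given in the text; the D. melanogaster assignments (X = A, 2L = B, 2R = C, 3L = D, 3R = E) are the standard Muller nomenclature and are supplied here as background rather than taken from this study. The dot chromosome (element F) is omitted because it has no counterpart in the mosquito.

```python
# Muller elements per polytene arm.
AG_ARM_TO_ELEMENT = {"X": "A", "2R": "E", "2L": "D", "3R": "B", "3L": "C"}
DM_ARM_TO_ELEMENT = {"X": "A", "2L": "B", "2R": "C", "3L": "D", "3R": "E"}


def arm_homologs():
    """Map each A. gambiae arm to the D. melanogaster arm sharing its Muller element."""
    element_to_dm = {element: arm for arm, element in DM_ARM_TO_ELEMENT.items()}
    return {ag: element_to_dm[element] for ag, element in AG_ARM_TO_ELEMENT.items()}


def element_pairing(arm_to_element):
    """Group Muller elements by the chromosome that carries them."""
    chromosomes = {}
    for arm, element in arm_to_element.items():
        chrom = arm.rstrip("LR")  # "2R" -> "2", "X" -> "X"
        chromosomes.setdefault(chrom, set()).add(element)
    return {frozenset(elements) for elements in chromosomes.values()}


if __name__ == "__main__":
    print(arm_homologs())
    same = element_pairing(AG_ARM_TO_ELEMENT) == element_pairing(DM_ARM_TO_ELEMENT)
    print("same pairing of elements in both species:", same)
```

The derived arm pairs agree with the homologies obtained from Table 5, and the pairing check corresponds to the (A, B + C, D + E) arrangement noted in the text.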
A dense collection of DNA markers from Aedes aegypti
(restriction fragment length polymorphisms) was used by Severson et al. (1994)
to evaluate genetic diversity and synteny among aedine mosquitoes and A. gambiae; however, synteny with
Drosophila was not examined. In a valuable earlier study,
Matthews and Munsterman (1994)
used 29 enzyme loci to study linkage
conservation amongst lower diptera (13 species of mosquitoes, not
including A. gambiae) and higher diptera (D. melanogaster). In different mosquito species five to 19 loci were
mapped. The investigators concluded that mosquito chromosomes are
modified by paracentric inversions and interchromosomal translocations.
They also noted that several small groups of two to four enzyme loci
have been conserved in linkage in both mosquitoes and the fruit fly,
"although most traces of homology between the two dipteran linkages
have disappeared." In the present study, a much larger number of
orthologous gene sequences, mapped by in situ hybridization to polytene
chromosomes, permitted firmer conclusions: pairwise identification of
homologous polytene chromosomes in A. gambiae and D. melanogaster and quantification of the extent of nonhomologous arm
translocations between the fruit fly and the mosquito.
The third and final issue is the distribution of genes within broadly
homologous chromosomal arms, and the length of locally syntenic regions
conserved between these two dipteran species. Previous studies have
compared different distant Drosophila species to one another
by in situ hybridization of gene-specific probes or larger genomic
fragments usually derived from D. melanogaster. These studies
included a cross-comparison of D. melanogaster (as a reference
species) and several other species, including D. obscura, D. madeirensis, D. virilis, D. repleta,
D. buzzattii, and D. hydei (Loukas and Kafatos 1988
;
Whiting et al. 1989
; Segarra and Aguade 1992
; Lozovskaya et al. 1993
;
Segarra et al. 1995
; Nurminsky et al. 1996
; Vieira et al. 1997
; Ranz et
al. 1999
, 2000
; Gonzales et al. 2000
). These Drosophila
species were separated from D. melanogaster 25 to 60 Myr ago
(Beverley and Wilson 1984; Russo et al. 1995). The homologous
chromosome arms are usually easily identified by their gene content,
but the relative order and distances of the genes are considerably
reshuffled in the different species. Observed sizes of chromosomal
fragments conserved between species range from 20 to 600 kb (Ranz et
al. 1999
, 2000
; Gonzales et al. 2000
), although one cannot exclude
undetected small rearrangements within the larger fragments.
Calculations that take into consideration the number of inversion
breakpoints in several selected genomic regions and the divergence time
between species indicate that the frequency of breakpoints occurring in
the genus Drosophila may be as high as 0.05 to 0.08 per
megabase of sequence per million years (Ranz et al. 2000
). The lower
estimate of this frequency would imply that in the genome of A. gambiae, calculated to have a size of ~260 Mb, we may expect
microsyntenic regions conserved relative to Drosophila to have
an average length of 50 to 80 kb. This is in striking
contrast to the frequency of breakpoints computed for a mouse-human
comparison (divergence time ~112 Myr; Kumar and Hedges 1998
), which
is about two orders of magnitude lower (Ranz et al. 2000
). The sizes of
conserved segments in these two species are estimated to be 24 kb to
90.5 Mb in length, averaging 15.6 Mb (Lander et al. 2001
). We have detected microsyntenic blocks of two to three genes each by cytological co-localization of these genes in the same Anopheles polytene chromosome lettered subdivision. It must be stressed that this evidence
neither establishes nor excludes that the genes are located next to
each other in the genome. As yet, we have a DNA distance estimate for
only one microsyntenic pair, beat and beat-B: 100 kb
in Drosophila and a BAC length (~120 kb average) in
Anopheles. However, our evidence strongly argues that locally
syntenic regions between the mosquito and the fruit fly are not long.
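The expected length of conserved segments quoted above can be reproduced with simple arithmetic. The sketch below assumes that breakpoints accumulate at a constant rate over the ~250 Myr divergence time and that the mean conserved segment is the reciprocal of the breakpoint density. These are simplifying assumptions made here for illustration; the exact calculation behind the published range is not given in the text, although this reading does yield 50 to 80 kb.

```python
def breakpoints_per_mb(rate_per_mb_per_myr, divergence_myr):
    """Breakpoints accumulated per megabase over the divergence time."""
    return rate_per_mb_per_myr * divergence_myr


def mean_segment_kb(rate_per_mb_per_myr, divergence_myr):
    """Mean conserved segment length in kb, taken as 1 Mb / breakpoints per Mb."""
    return 1000.0 / breakpoints_per_mb(rate_per_mb_per_myr, divergence_myr)


if __name__ == "__main__":
    divergence_myr = 250      # fruit fly / mosquito divergence (from the text)
    genome_mb = 260           # estimated A. gambiae genome size (from the text)
    for rate in (0.05, 0.08):  # breakpoints per Mb per Myr (Ranz et al. 2000)
        density = breakpoints_per_mb(rate, divergence_myr)
        print(f"rate {rate}: {density:.1f} breakpoints/Mb, "
              f"~{density * genome_mb:.0f} genome-wide, "
              f"mean segment ~{mean_segment_kb(rate, divergence_myr):.0f} kb")
```

Under these assumptions the lower rate gives segments of about 80 kb and the higher rate about 50 kb, far shorter than the megabase-scale segments reported for the mouse-human comparison.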
Microsynteny between Anopheles and Drosophila was
also detected by Romans et al. (1999)
, who isolated and characterized a 4.2-kb genomic fragment containing the Anopheles Bb,
TU37B2, and Dox-A2 genes. These are orthologs of the
Drosophila genes CG10655, CG10470, and
Dox-A2, respectively, all located within a 4.5-kb genome
region in the fruit fly (Adams et al. 2000
). Analysis of the molecular
organization of two mosquito chromosomal regions indicated the
occurrence of several rearrangements that changed both the position and
orientation of Bb and TU37B2 in comparison to their
Drosophila orthologs. We have confirmed these results and
found that the syntenic area does not extend much beyond the genes
mentioned (data not shown).
Taking these results together, the degree of observed microsynteny between Drosophila and Anopheles is not high and may be even lower than predicted. The degree of microsynteny is an important parameter for future efforts to use the D. melanogaster gene order to identify mosquito orthologs definitively, leading to functional hypotheses and to assays of these proposed functions in the genetically tractable fruit fly. Firm elucidation of the degree of microsynteny will be one of the major benefits expected from full sequencing of the A. gambiae genome, which is expected to begin shortly.
METHODS
Source of Sequence Data
Amino acid sequences of the genes in divisions 1-3 of chromosome
X of D. melanogaster can be obtained by anonymous FTP
from ftp://ftp.ebi.ac.uk/pub/databases/edgp/misc/ashburner/EG_genes.991229.pep.fa.gz (Benos et al. 2000
, 2001
), whereas amino acid sequences of the genes
identified in the Adh region are found in http://www.fruitfly.org/sequences/aa_Adh.dros (Ashburner et al. 1999
). Amino acid
sequences of all genes identified through the whole genome sequence
(release 1.0) are available at
http://www.fruitfly.org/sequence/dlMfasta.html (Adams et al. 2000
). For
A. gambiae, nucleotide sequences of ESTs from immune-competent cell line cDNA libraries (Dimopoulos et al. 2000
) and STSs from the BAC
genomic library (C. Roth and F.H. Collins, pers. comm.), as well as
other mosquito sequences with known cytological location, can be
BLAST-searched at AnoDB, the Anopheles database (http://konops.anodb.gr/cgi-bin/blast2.pl).
Computational Methods and Analysis of Results
For similarity searches, a locally installed WU-BLAST,
version 2.0a, suite of programs (Altschul et al. 1990
; W. Gish,
unpubl.) was used. D. melanogaster amino acid sequences of
genes from selected regions were compared to A. gambiae STS and EST databases using TBLASTN with standard default parameters. STS and EST sequences showing similarity with a high score
of >40, a probability P(N) of <$1 \times 10^{-10}$, and a
percentage of identical amino acids >30, were selected and checked as
best bidirectional hits after confirming the hit using
BLASTX with standard default parameters against a database
of 14,080 amino acid sequences of known and predicted Drosophila genes (release 1.0, http://www.fruitfly.org/sequence/dlMfasta.html#rel1; Adams et al.
2000
). Only STSs and ESTs that passed these criteria were selected, and
their alignments were further verified using the available exon-intron
structure of the corresponding D. melanogaster genes, as shown
in the National Center for Biotechnology Information (NCBI) version of
the D. melanogaster database
(http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/7227.html). The same
BLASTX search criteria were also used in the reciprocal
experiment, comparing A. gambiae nucleotide sequences of known
cytological location to protein-encoding genes of D. melanogaster. The names and cytological locations of D. melanogaster genes were taken from FlyBase
(http://flybase.bio.indiana.edu; The FlyBase Consortium 1999
);
additional information and literature references on genes can also be
found there.
In Situ Hybridization to A. gambiae Polytene Chromosomes
BAC and cDNA clones were hybridized to preparations of A. gambiae polytene chromosomes essentially as described in Kumar and Collins (1994)
. The hybridization signals were localized according to
the cytological map of M. Coluzzi, A. Sabbatini, M.A. Di Deco, and V. Petrarca (unpubl., accessible at http://www.anodb.gr/AnoDB/Cytomap/).
ACKNOWLEDGMENTS
We are indebted to Drs. Frank Collins and Charles Roth for submitting their data to public databases before publication and to Drs. Mario Coluzzi and Igor Zhimulev for their support of the participation of their laboratories in the in situ hybridization analysis of A. gambiae sequences. We would also like to acknowledge the invaluable assistance of Drs. Poulikos Prastakos and Yannis Kamarianakis in the statistical analysis. This research was supported by grants from the UNDP/World Bank/World Health Organization Special Program for Research and Training in Tropical Diseases (TDR), the INCO programme of the European Union, the National Institutes of Health, the Hellenic Secretariat General for Research and Technology, and the John D. and Catherine T. MacArthur Foundation.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
6 Corresponding author.
E-MAIL louis{at}imbb.forth.gr; FAX 30-81-391104.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.196101.
REFERENCES
Received May 11, 2001; accepted in revised form October 26, 2001.