Vol. 12, Issue 11, 1792-1801, November 2002
METHODS
ABSTRACT
It is expected that one of the merits of comparative genomics lies in the transfer of structural and functional information from one genome to another. This is based on the observation that, although the number of chromosomal rearrangements that occur in genomes is extensive, different species still exhibit a certain degree of conservation regarding gene content and gene order. It is in this respect that we have developed a new software tool for the Automatic Detection of Homologous Regions (ADHoRe). ADHoRe was primarily developed to find large regions of microcolinearity, taking into account different types of microrearrangements such as tandem duplications, gene loss and translocations, and inversions. Such rearrangements often complicate the detection of colinearity, in particular when comparing more anciently diverged species. Application of ADHoRe to the complete genome of Arabidopsis and a large collection of concatenated rice BACs yields more than 20 regions showing statistically significant microcolinearity between both plant species. These regions comprise from 4 up to 11 conserved homologous gene pairs. We predict the number of homologous regions and the extent of microcolinearity to increase significantly once better annotations of the rice genome become available.
INTRODUCTION
Comparative genome analysis has demonstrated that across different plant species, which diverged from a common ancestor but currently tend to vary largely in genome size, gene content and order are often conserved. In particular, comparative genetic mapping in the grasses revealed a high degree of conservation of markers within large chromosomal segments (for reviews, see Gale and Devos 1998; Keller and Feuillet 2000). Because, in general, different plant species use homologous genes for similar functions, these observations have great potential. Comparative genome mapping experiments can be a powerful and efficient tool to transfer biological information from a well-studied reference genome to related plant species. However, there are some serious drawbacks when using comparative genetic maps based on recombinational mapping of DNA markers. First, when the marker density is low, small exceptions to colinearity will not be observed, and second, the fact that most genes are organized in multigene families makes it difficult to determine whether real orthologous loci are being compared. Consequently, one can imagine that many experiments suffer from a bias toward promoting colinear regions and miss exceptions to colinearity (Bennetzen 2000a).
The various sequencing efforts over the last few years, such as the complete genome sequence of the model plant Arabidopsis thaliana (Arabidopsis Genome Initiative 2000), the YAC and BAC insert libraries of several grass genomes (Panstruga et al. 1998; Feuillet and Keller 1999) and the International Rice Genome Sequencing Project (Sasaki and Burr 2000), make it possible to investigate whether the degree of colinearity found in comparative genetic mapping experiments is also observed at the gene level. The existence of colinearity between model species and other plant species, even in a limited number of small regions, could provide the opportunity to use these model systems to identify candidate genes in other plants. Comparative sequence analysis at the submegabase level indicates that microcolinearity is abundant between closely related plant species, although exceptions do appear (Chen and Bennetzen 1996; Kilian et al. 1997; Tikhonov et al. 1999; Tarchini et al. 2000). A high degree of conservation of gene content and order between orthologous loci of rice, maize, and sorghum has been reported (Chen et al. 1997). These grass species diverged from a common ancestor ~50 million years ago. Also, within related dicots, microcolinearity can be observed. For example, conserved gene content and order have been demonstrated between tomato and Arabidopsis, which diverged ~112 million years ago (Ku et al. 2000), between Arabidopsis and soybean (Grant et al. 2000), and between tomato, Arabidopsis and Capsella (Rossberg et al. 2001). All of these comparative studies revealed that rearrangements, such as inversions, deletions, insertions, and tandem duplications, are an important mechanism responsible for breaking up colinearity, and consequently, make it hard to detect the remnants of colinearity. In addition, these rearrangement processes appear to be more active in some plant lineages than in others (Devos et al. 1993; Devos and Gale 1997; Schmidt 2000).
When comparing more anciently diverged plant species, such as monocots and dicots, more rearrangements are expected to have occurred and, consequently, gene content and order to be less conserved. Recent DNA sequence analysis seems to confirm this assumption and several lines of evidence result in a plastic model in which the modern plant genome is characterized by a series of nested duplications in addition to the species-specific levels of rearrangements (Arabidopsis Genome Initiative 2000; Vision et al. 2000; Wendel 2000). Whether these currently observed large-scale gene duplications are the result of polyploidization, successive hyperploidizations, or a large number of iteration events (entire genome duplication, entire chromosome duplication, and generic duplications of unspecific DNA regions within the same or between two chromosomes, respectively) is still highly debated. Nevertheless, all of the different actors identified so far in playing a role in the evolution of plant nuclear genomes make the picture rather complicated. Consequently, solid conclusions about genetic colinearity between Arabidopsis and rice, both expected to have a great value as a model system for dicots and monocots, respectively, are still missing, although several examples showing traces of microcolinearity have been reported (Devos et al. 1999; Van Dodeweerd et al. 1999; Liu et al. 2001; Mayer et al. 2001).
To carefully study genome evolution using the massive amount of sequence data that becomes available, we have developed a flexible tool, called ADHoRe (Automatic Detection of Homologous Regions), that detects genomic regions with statistically significant conserved gene content and order. Particularly, ADHoRe was developed to find large regions of colinearity, taking into account phenomena such as gene loss, inversions, and tandem duplications. This general concept makes it possible to use ADHoRe for analysis within one genome, that is, to look for paralogous regions with duplicated genes (Raes et al. 2002), or for comparisons between genomes of different organisms, that is, to look for synteny.
RESULTS
In this study, we applied a new tool to estimate the frequency and significance of microcolinearity between distantly related plant species such as Arabidopsis and rice. To this end, publicly available rice genomic sequences (as a series of BACs) from seven different chromosomes were compared with the complete Arabidopsis genome sequence. For both plant species, gene annotation was retrieved from public resources (see Methods). Importantly, no prior information on macrocolinearity was incorporated into this analysis.
In total, using ADHoRe, we detected 105 cases of microcolinearity between Arabidopsis and rice before removing nonsignificant colinear regions, of which 75 are between individual rice BACs and a segment of the Arabidopsis genome and 30 are between overlapping rice clones and an Arabidopsis genomic segment. Applying the default 99% cut-off level, which retains all colinear regions that have a probability to be generated by chance of <1%, 26 segments showing conserved gene content and order between Arabidopsis and rice remain (listed in Table 1). Of these statistically significant regions, 18 (69%) show colinearity between an individual rice BAC and an Arabidopsis genomic segment, whereas 8 (31%) show colinearity between Arabidopsis and overlapping rice BACs. The distributions of the number of conserved genes within these homologous regions between Arabidopsis and rice for the different significance levels are shown in Figure 1.
As expected, for the classes of colinear regions characterized by a small number of conserved genes and a large number of nonhomologous intervening genes, the probability that they were generated by chance is the highest. Consequently, applying more stringent conditions reduces the number of these colinear regions. For all significance levels, most of the statistically significant colinear segments are characterized by four conserved genes (referred to as anchor points hereafter).
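The cut-off used above rests on the statistical test described under Postprocessing in Methods: reshuffled data sets are sampled and the probability of finding a comparable colinear region by chance is estimated. The following is a minimal sketch of that idea, not the ADHoRe implementation. The function names, the choice to shuffle the gene order of one genome only, and the use of a maximal gap (rather than the average gap size mentioned in Methods) are all assumptions made for illustration.

```python
import random

def longest_chain(points, max_gap):
    """Length of the longest run of points rising along a diagonal, where
    consecutive points are at most max_gap + 1 positions apart on both axes.
    (Hypothetical helper; only same-orientation diagonals are considered.)"""
    pts = sorted(points)
    best = [1] * len(pts)
    for j, (xj, yj) in enumerate(pts):
        for i in range(j):
            xi, yi = pts[i]
            if 0 < xj - xi <= max_gap + 1 and 0 < yj - yi <= max_gap + 1:
                best[j] = max(best[j], best[i] + 1)
    return max(best, default=0)

def chance_probability(points, n_genes_y, n_anchors, max_gap,
                       n_shuffles=1000, seed=0):
    """Fraction of reshuffled data sets that contain a chain with at least
    n_anchors anchor points (assumed permutation scheme: the gene order of
    the second genome is shuffled)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_shuffles):
        order = list(range(n_genes_y))
        rng.shuffle(order)
        shuffled = [(x, order[y]) for x, y in points]
        if longest_chain(shuffled, max_gap) >= n_anchors:
            hits += 1
    return hits / n_shuffles

# Toy region: six anchor points on a perfect diagonal, 200 genes on the y axis.
region = [(i, i) for i in range(6)]
p = chance_probability(region, n_genes_y=200, n_anchors=6, max_gap=3,
                       n_shuffles=200)
print("retain" if p < 0.01 else "discard")  # 99% cut-off: keep if p < 1%
```

With a 99% significance level, a region is retained only when the estimated chance probability is below 0.01; more anchor points or fewer intervening genes make such a chain rarer in shuffled data.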
The largest homologous segment between Arabidopsis and rice that ADHoRe could detect contained 11 conserved genes and is shown in Figure 2A. Detailed analysis showed that within this rice region on chromosome 1 (326.8 kb), originally 64 genes have been predicted, resulting in a gene density of one gene per 5.1 kb. The homologous Arabidopsis segment on chromosome 3 shows a gene density of one gene per 3.4 kb. However, validating the automatic rice gene prediction using Expressed Sequence Tag (EST) information and comparisons with putative homologs (see Methods) shows that only ~32 genes are present, resulting in a gene density of one gene per 10 kb. As a result, the number of nonhomologous intervening genes between the anchor points drastically decreases, and consequently, the biological significance or quality of the colinear region to be homologous increases (see Methods). An analogous approach was applied to determine whether all nonhomologous intervening Arabidopsis genes were real genes. If not, removing genes in the Arabidopsis genome could also result in a higher degree of conservation within a colinear area. However, no indications were found that some of these intervening nonhomologous Arabidopsis genes were falsely predicted.
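The gene densities quoted for the rice region in Figure 2A follow directly from the segment length and the gene counts given in the text. As a small worked example (the function name is ours, and ~32 is treated as exactly 32 for the arithmetic):

```python
def kb_per_gene(region_kb, n_genes):
    """Average number of kilobases per gene in a region."""
    return region_kb / n_genes

region_kb = 326.8                            # rice segment on chromosome 1
predicted = kb_per_gene(region_kb, 64)       # automatic gene prediction
validated = kb_per_gene(region_kb, 32)       # after EST/homolog validation (~32 genes)

print(f"{predicted:.1f} kb per gene, {validated:.1f} kb per gene")
# prints "5.1 kb per gene, 10.2 kb per gene"
print(f"fraction of predictions retained: {32 / 64:.0%}")
# prints "fraction of predictions retained: 50%"
```

Halving the number of accepted genes roughly halves the number of intervening nonhomologous genes between anchor points, which is why the validated annotation makes the region look more strongly colinear.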
Careful analysis of the long stretch of genomic sequence within the rice BAC clone P0414E03, characterized by a low gene density and no conservation with Arabidopsis, showed that multiple transposable elements have been integrated into this particular region (Fig. 2A). Analysis of putative genes and ORFs revealed high similarities with proteins encoded by transposable elements (e.g., gag protein, reverse transcriptase, integrases, RNaseH). In addition, different sets of long repetitive elements were discovered, which allowed us to reconstruct a number of distinct transposable elements involved in plant gene and genome evolution (Grandbastien 1992; Vicient et al. 2001). On the basis of the organization of the proteins encoded in these transposons, three gypsy-like LTR-retrotransposons (Bennetzen 2000b) and one Mutator (Lisch et al. 2001) transposable element could be identified, together with other transposon-like remnants. In the homologous Arabidopsis genome segment, no retrotransposable elements were detected.
Figure 2B shows another colinear region between rice chromosome 1 and Arabidopsis chromosome 3, characterized by eight anchor points. Removing dubiously predicted rice genes results in a gene density of one gene per 7.3 kb (or 42 genes on the stretch of 305.1-kb rice genomic sequence). The probability of this colinear region to be generated by chance is <1%. Several rearrangements can be clearly observed; since the divergence of rice and Arabidopsis, two genes have undergone tandem duplications in Arabidopsis, whereas other genes have been inverted in Arabidopsis or in rice.
A more drastic rearrangement event is shown in Figure 3. This colinear region between rice chromosome 1 and Arabidopsis chromosome 5 is characterized by five pairs of homologs (anchor points). Within the rice genomic fragment, a gypsy-like LTR-retrotransposon has been inserted, resulting in a much longer rice segment (96.8 kb) compared with the homologous Arabidopsis segment (39.8 kb). Next to the local gene inversions observed in a number of colinear regions, this example shows a more complex inversion event. Genes 03 and 06 located on rice BAC B1088C09 are part of a segment colinear with Arabidopsis chromosome 5, although their gene order and orientation are not conserved compared with the other anchor points. Therefore, a chromosomal segment encoding these two genes (or their Arabidopsis orthologs) seems to have been inverted after both species diverged from each other. However, reconstructing the history leading to this configuration requires an additional inversion event. Because for gene 06, in contrast to all other genes conserved within this homologous region, the orientation compared with the homologous Arabidopsis gene is different (see twisted black band in Fig. 3), one extra gene inversion is required to explain the current gene organization between these two genomic fragments. Finally, gene 06 experienced a tandem duplication resulting in gene 07, or vice versa.
DISCUSSION
It is estimated that rice and Arabidopsis diverged ~200 million years ago (Yang et al. 1999; Wikström et al. 2001). Nevertheless, applying our newly developed tool to detect homologous regions between both plants revealed numerous examples of significant microcolinearity. On the other hand, of the total set of colinear regions present between rice and Arabidopsis, probably only a subset can be considered as genuine orthologous regions that originated from a common ancestral region. The major cause of this phenomenon is the fact that many genes are organized in multigene families, and consequently, the discrimination between paralogous and orthologous gene sequences is extremely difficult. Therefore, we incorporated a routine in the ADHoRe algorithm to determine whether a colinear region could have been generated by chance out of homologous gene couples. In other words, it was tested whether a particular colinear region is a homologous region or purely consists of homologous gene couples organized in a colinear way by chance. Analysis of a number of colinear regions characterized by a high probability to be generated by chance showed that low overall-similarity signals, such as similarities between DNA-binding sites, or badly conserved gene content and order were detected (data not shown).
Combining numerous rice BACs resulted in a set of long genomic rice stretches that could be investigated for colinearity with Arabidopsis. Although only a small fraction of the final rice genome sequence was used in this study (~38%, for which 62 Mb was organized in overlapping BACs), already >20 regions between rice and Arabidopsis were found with biologically relevant colinearity, consisting of 4 up to 11 conserved genes. Because a large number of short colinear regions are found between individual rice BAC clones and an Arabidopsis genome segment, a major fraction of these regions were removed because they could represent colinear regions generated by chance. However, with more rice genomic sequence data rapidly becoming freely accessible, we expect that concatenation of additional BACs will generate longer colinear stretches with Arabidopsis. Therefore, a number of colinear regions currently not retained in our final results could become statistically significant when analyzed over longer distances. Consequently, the actual number of rice regions showing genuine microcolinearity with Arabidopsis will most probably be higher than presented here. Preliminary results on the draft sequence of the rice genome show that larger colinear segments may exist between Arabidopsis and rice (Goff et al. 2002). However, as the annotation of the draft sequence is not yet publicly available, a comparison with the results described here remains difficult.
Detailed analysis of some colinear regions indicates that the quality of the rice annotation used in this comparison is not outstanding. Although the RiceGAAS system (Sakata et al. 2002) tries to benefit from combining a number of different gene prediction programs, a large number of errors still seem to be present. The crude quality assessment performed here to determine whether a predicted gene is a real gene (i.e., specificity) revealed that a major fraction of the protein-encoding genes were falsely predicted. Consequently, the initial gene density determined by the gene prediction system decreased drastically when removing unreliably predicted genes. In addition, a number of genes were split (one gene predicted as two separate genes) and some exons or complete genes were missing, which could be demonstrated by incorporating EST information. The large number of ORFs predicted as genes is especially problematic: a small number of these ORFs are confirmed by EST information, but the major fraction is not. All of these annotation inaccuracies will definitely have their repercussions on the correct interpretation of the rice genome sequence, in a way similar to that faced in annotating the Arabidopsis genome sequence (Pavy et al. 1999). Therefore, further improvement and retraining of rice gene prediction programs, together with newly developed extrinsic gene prediction methods, seem inevitable for fully exploiting the rice genome sequence (Rouzé et al. 1999; Bennetzen 2002).
Next to the incorrectly predicted protein-encoding genes, a subset of these erroneously predicted genes seems to correspond with transposable elements. Although detailed analyses can unambiguously identify these elements, the presence of these elements annotated as protein-encoding genes is a major problem when performing genome-wide analyses such as described here. Although in the Arabidopsis genome 2109 Class I transposable elements have been described already (Arabidopsis Genome Initiative 2000), an additional screening reveals that within the Arabidopsis proteome nearly 600 predicted protein-encoding genes are present with high similarity to some retrotransposable elements (data not shown). Furthermore, it should be noted that the largest fraction of these genes resembling retrotransposable elements has been identified on chromosomes 2 and 4. Because chromosomes 2 and 4 were sequenced and analyzed first within the Arabidopsis sequencing project, an imperfect annotation protocol for transposons at that moment could be an explanation for this observation. For ~36% of these detected genes, an EST matches the structural annotation, which could explain why these genes have been allocated as protein-encoding genes in the automatic annotation protocols. Nevertheless, additional efforts are most likely needed to increase the quality of the current annotation of transposable elements on a full-genome level in both rice and Arabidopsis (Le et al. 2000).
Although transposable elements integrate and retrotransposons amplify within plant genomes, when correctly annotated, they should not interfere with the presented algorithm to detect homologous regions. Consequently, this level of complexity generated by transposable elements can be masked in our method, if all transposable elements are defined as such and not as protein-encoding genes in the genomic sequence. Analysis of multiple colinear regions showed that the number of retrotransposable elements in rice was considerably higher than in the homologous Arabidopsis segments, although the actual number of retrotransposable elements in Arabidopsis is probably higher than described so far (Arabidopsis Genome Initiative 2000). Accumulation of retrotransposons in plant genomes clearly seems to be dependent on both the evolutionary lineage and the efficiency of mechanisms repressing this activity (Bennetzen and Kellogg 1997; Fedoroff 2000).
It is clear that all sorts of rearrangements have occurred since rice and Arabidopsis diverged from each other ~200 million years ago. Detailed analysis of colinearity between Arabidopsis and rice identified tandem duplications and gene loss, as well as gene and block inversions, although the frequency of these detectable events is rather low. In other words, it is not possible to trace all rearrangements that are responsible for the nonhomologous genes present in colinear regions. The main driving force responsible for degrading colinearity is seemingly a complex evolutionary mechanism, consisting of species-specific levels of large and small rearrangements (due to duplications, inversions, insertions, and deletions), transposon activity, and perhaps other unknown mechanisms. Ideally, the continuous improvement of data sets, methods, and additional genome sequences from intervening species will give us further insight into these mechanisms and their frequencies within different species.
Finally, the question remains whether, after detecting colinearity between genomes, the functions of the genes in one genome may be transferred to the homologous genes of the other genome. One major problem lies in the fact that a particular region of a chromosome can be duplicated in rice as well as in Arabidopsis. Even more drastically, complete genome duplication events may have occurred in both Arabidopsis (e.g., Arabidopsis Genome Initiative 2000; Vision et al. 2000) and rice (e.g., Goff et al. 2002; Yu et al. 2002). Because after such a duplication event, all genes are present in duplicate, one copy may degenerate through loss-of-function mutations, or both duplicates may remain redundant, experience subfunctionalization, or diverge in function through positive Darwinian selection (e.g., Ohno 1970; Force et al. 1999; Hughes 1999; Van de Peer et al. 2001). This results in a situation in which one genomic segment of one species maps with two or more different segments in the other genome, or vice versa. Transferring functional annotations from one genome to the other genome, thus, has to be done with caution, as genes belonging to paralogous regions may have considerably diverged in function.
METHODS
The ADHoRe Algorithm
Detection of Homologous Genes
To detect chromosomal locations of colinear genes, one has to look for regions that can be paired up because they contain sets of similar genes. Therefore, a data set containing all gene products, their absolute or relative position on a genomic sequence, and their orientation is required. The whole procedure is controlled by two parameters: the gap size G, which describes the maximal number of intervening, nonhomologous genes tolerated between two homologous genes within a colinear segment, and Q, the quality of the colinear regions (see below). Figure 4 presents a flowchart of the algorithm. For all gene products on two genomic fragments for which gene colinearity is to be detected, initially an all-against-all sequence similarity search is performed, using BLASTP (Altschul et al. 1990).
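One possible way to hold the data set described above (gene, position, orientation) and to turn a list of homologous gene pairs into matrix elements is sketched below. The `Gene` record, the function name, and the +1/-1 encoding of the orientation class are illustrative assumptions, not part of ADHoRe itself; the list of homolog pairs is assumed to come from an already filtered similarity search.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    name: str       # gene identifier (hypothetical names in the example)
    position: int   # relative position (rank) of the gene on its fragment
    strand: int     # +1 or -1

def homology_points(genes_a, genes_b, homolog_pairs):
    """Return a sparse matrix {(pos_a, pos_b): orientation_class}.

    orientation_class is +1 when both genes lie on the same strand and -1
    otherwise (an assumed encoding). Nonhomologous pairs are simply absent,
    which mirrors the idea of empty matrix elements."""
    by_name_a = {g.name: g for g in genes_a}
    by_name_b = {g.name: g for g in genes_b}
    points = {}
    for name_a, name_b in homolog_pairs:
        a, b = by_name_a[name_a], by_name_b[name_b]
        points[(a.position, b.position)] = 1 if a.strand == b.strand else -1
    return points

# Toy example with two short fragments and two homologous pairs.
fragment_a = [Gene("a1", 0, +1), Gene("a2", 1, -1), Gene("a3", 2, +1)]
fragment_b = [Gene("b1", 0, +1), Gene("b2", 1, +1)]
pairs = [("a1", "b1"), ("a2", "b2")]
print(homology_points(fragment_a, fragment_b, pairs))
```

A dictionary keyed by position pairs keeps the representation sparse, which matters because most gene pairs between two fragments are not homologous.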
Preprocessing of the Data
As discussed above, during the preprocessing step, the two genomic fragments are compared, and homologous gene pairs are determined using BLAST and HSSP, after which, these are stored in a matrix. The orientation of the two genes determines the value in the matrix, whereas nonhomologous pairs are represented as empty elements in the matrix. The next step during the preprocessing is the removal of irrelevant data points, which we designate negative filtering. During this step, all elements that cannot belong to a cluster because they are too far away from other elements in the matrix, are removed. The last step in the preprocessing is to remap tandem duplicated blocks. Because we are looking for diagonal regions in the matrix, purely horizontal or vertical regions due to tandem duplications are remapped. This is done by collapsing all tandem duplications of a gene with the same orientation and within a distance G. This way, it is easier to detect diagonal regions, as they are no longer interrupted by horizontal or vertical elements. At the end of the preprocessing, the elements in the matrix are separated according to their orientation, yielding the two orientation classes (see Figs. 4 and 5). This separation is made to facilitate the clustering and is based on the observation that colinear regions consist primarily of elements with the same orientation class. At the end of the process, both orientation classes are again combined, enabling the reconstruction of duplicated regions that have been subjected to small gene inversions.
Clustering of Genes and Blocks of Genes
A colinear region is defined in the matrix representation as a number of points showing diagonal proximity. Therefore, a special distance function was used, yielding a shorter distance for points that are in diagonally closer proximity than points that are in horizontal or vertical proximity. The formula for this function is: [...]
[...] one of the parameters of the algorithm has been reached. During each iteration, the gap size represents the maximal distance between two points in a cluster. In each iteration, new clusters can be formed and existing clusters can be extended. The algorithm details of the clustering step are depicted in Figure 7. Starting with the elements of either one of the two orientation classes (a set of singletons, i.e., elements not yet clustered), the DPD function is used to cluster the elements according to the initial gap size. By default, the initial gap size is set to 3 and is then increased in 10 exponential steps until the final gap size G has been reached. This results in a set of clusters and a set of singletons.
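The clustering step can be illustrated with a small sketch. The exact DPD formula is not reproduced in the text above, so the distance used here (twice the larger coordinate difference minus the smaller one) is an assumption chosen only because it has the stated property of favouring diagonal over horizontal or vertical neighbours. The geometric interpretation of "10 exponential steps" and the plain single-linkage grouping are likewise assumptions; the quality parameter Q and the direction of the diagonal are ignored.

```python
def dpd(p, q):
    """Assumed diagonal-favouring distance between two matrix points."""
    dx, dy = abs(q[0] - p[0]), abs(q[1] - p[1])
    return 2 * max(dx, dy) - min(dx, dy)

def gap_schedule(final_gap, initial_gap=3, steps=10):
    """Gap sizes growing geometrically from initial_gap to final_gap
    (one possible reading of 'increased in 10 exponential steps')."""
    if final_gap <= initial_gap:
        return [final_gap]
    ratio = (final_gap / initial_gap) ** (1 / steps)
    return sorted({round(initial_gap * ratio ** i) for i in range(steps + 1)})

def cluster(points, gap):
    """Single-linkage grouping: points whose dpd is at most `gap` end up in
    the same cluster. Returns (clusters, singletons)."""
    parent = {p: p for p in points}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    pts = sorted(parent)
    for i, p in enumerate(pts):
        for q in pts[i + 1:]:
            if dpd(p, q) <= gap:
                parent[find(q)] = find(p)
    groups = {}
    for p in pts:
        groups.setdefault(find(p), []).append(p)
    clusters = [g for g in groups.values() if len(g) > 1]
    singletons = [g[0] for g in groups.values() if len(g) == 1]
    return clusters, singletons

# Toy orientation class: a short diagonal, a distant diagonal point, an outlier.
points = [(0, 0), (1, 1), (2, 2), (6, 6), (20, 3)]
for gap in gap_schedule(8):
    clusters, singletons = cluster(points, gap)
    print(gap, clusters, singletons)
```

In this toy run the point (6, 6) joins the diagonal cluster once the gap reaches 4, while (20, 3) remains a singleton at every gap size. Starting with a small gap and enlarging it gradually means tight diagonals are found first and only later extended across larger gaps.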
Postprocessing
When all clusters have been compiled as described above, the fraction of colinear regions (clusters) that are not significant needs to be removed. The goal of this procedure is to determine the fraction of colinear regions that could have occurred purely by chance, and therefore are not biologically significant. This is implemented as a statistical test, sampling a large number of reshuffled data sets and calculating the probability that a colinear region, characterized by a number of conserved genes and an average gap size, can be found by chance. Using a default significance level of 99%, all regions with a probability to be generated by chance smaller than 1% are retained. The second step during postprocessing is to combine the results for the two sets of clusters with different orientations. First, we try to enrich clusters from one orientation class with singletons from the other orientation class. This step is similar to the third step in the clustering algorithm, in which clusters are extended without badly affecting the quality. Second, it is tested whether clusters from the two different orientation classes can be merged. By combining the results of both orientation classes, it is possible to reconstruct larger colinear regions that might have been subjected to one or more inversion events.
The Rice Data Set
For rice chromosomes 1, 2, 4, 6, 7, 8, and 10 (a set of chromosomes for which a large fraction of the chromosome was already sequenced), the public data of the different centers was collected (status January 14, 2002). All BAC sequences for which map position information was available and that were linked to one chromosome only were downloaded from the different consortia websites, for which an overview can be found at http://www.tigr.org/tdb/e2k1/osa1/BACmapping/description.shtml.
Concatenation of Rice BACs
To obtain large stretches of genomic rice sequence to compare with Arabidopsis, we used a simple strategy to build rice contigs. Initially, for all BAC clones, the BAC extremities were compared with BAC ends of neighboring BACs using BLASTN (Altschul et al. 1990).
Annotation
For all rice BACs, gene annotation was performed using RiceGAAS (Sakata et al. 2002).
Annotation of Transposable Elements
Initially, the genomic BAC sequence was screened for repetitive elements using REPuter (Kurtz and Schleiermacher 1999).
Arabidopsis Data Set
Genomic sequences and gene annotation for the complete Arabidopsis genome were downloaded from the TIGR Arabidopsis thaliana Database (version August 2001, http://www.tigr.org/tdb/e2k1/ath1/) and processed with in-house Perl scripts.
WEB SITE REFERENCES
http://ricegaas.dna.affrc.go.jp/; RiceGAAS, Rice Genome Automated Annotation System.
http://www.psb.rug.ac.be/; Department homepage.
http://www.sanger.ac.uk/Software/Pfam/; PFAM, a collection of protein families and domains.
http://www.tigr.org/; The Institute for Genomic Research.
ACKNOWLEDGMENTS
We thank Stephane Rombauts and Pierre Rouzé for helpful discussions, and Martine De Cock for help in preparing the manuscript. K.V. and C.S. thank the Vlaams Instituut voor de Bevordering van het Wetenschappelijk-Technologisch Onderzoek in de Industrie for a predoctoral fellowship.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
1 These authors contributed equally to this work.
2 Corresponding author.
E-MAIL yvdp{at}gengenp.rug.ac.be; FAX 32-9-264-5349.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.400202.
REFERENCES
Received May 8, 2002; accepted in revised form August 30, 2002.