Published online before print February 8, 2001, doi:10.1101/gr.GR-1619R
Vol. 11, Issue 3, 356-372, March 2001
Genome Alignment, Evolution of Prokaryotic Genome Organization, and Prediction of Gene Function Using Genomic Context
Yuri I. Wolf, Igor B. Rogozin, Alexey S. Kondrashov, and Eugene V. Koonin1
National Center for Biotechnology Information, National Library of
Medicine, National Institutes of Health,
Bethesda, Maryland 20894, USA
Gene order in prokaryotes is conserved to a much lesser extent than
protein sequences. Only a few operons, primarily those that code for
physically interacting proteins, are conserved in all or most of the
bacterial and archaeal genomes. Nevertheless, even the limited
conservation of operon organization that is observed can provide
valuable evolutionary and functional clues through multiple genome
comparisons. A program for constructing gapped local alignments of
conserved gene strings in two genomes was developed. The statistical
significance of the local alignments was assessed using Monte Carlo
simulations. Sets of local alignments were generated for all pairs of
completely sequenced bacterial and archaeal genomes, and for each
genome a template-anchored multiple alignment was constructed. In most
pairwise genome comparisons, <10% of the genes in each genome
belonged to conserved gene strings. When closely related pairs of
species (i.e., two mycoplasmas) were excluded, the total coverage of
genomes by conserved gene strings ranged from <5% for the
cyanobacterium Synechocystis sp. to 24% for the minimal genome
of Mycoplasma genitalium and 23% for Thermotoga maritima.
The coverage of the archaeal genomes was only slightly
lower than that of bacterial genomes. The majority of the conserved
gene strings are known operons, with the ribosomal superoperon being
the top-scoring string in most genome comparisons. However, in some of
the bacterial-archaeal pairs, the superoperon is rearranged to the
extent that other operons, primarily those subject to horizontal
transfer, show the greatest level of conservation, such as the
archaeal-type H+-ATPase operon or ABC-type transport cassettes. The
level of gene order conservation among prokaryotic genomes was compared
to the cooccurrence of genomes in clusters of orthologous genes (COGs)
and to the conservation of protein sequences themselves. Only limited
correlation was observed between these evolutionary variables. Gene
order conservation shows a much lower variance than the cooccurrence of
genomes in COGs, which indicates that intragenome homogenization via
recombination occurs in evolution much faster than intergenome
homogenization via horizontal gene transfer and lineage-specific gene
loss. The potential of using template-anchored multiple-genome
alignments for predicting functions of uncharacterized genes was
quantitatively assessed. Functions were predicted or significantly
clarified for ~90 COGs (~4% of the total of 2414 analyzed COGs).
The most significant predictions were obtained for the poorly
characterized archaeal genomes; these include a previously
uncharacterized restriction-modification system, a nuclease-helicase
combination implicated in DNA repair, and the probable archaeal
counterpart of the eukaryotic exosome. Multiple genome alignments are a
resource for studies on operon rearrangement and disruption, which are
central to our understanding of the evolution of prokaryotic genomes.
Because of the rapid evolution of gene order, the potential of
genome alignment for prediction of gene functions is limited;
nevertheless, such predictions significantly complement
the results obtained through protein sequence and structure analysis.
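
The two computational steps named above, gapped local alignment of gene strings and Monte Carlo assessment of significance, can be illustrated with a minimal sketch. It is not the authors' program: it assumes that each genome is reduced to a list of ortholog-family labels in chromosomal order, uses a generic Smith-Waterman-style recurrence with a linear gap penalty, and ignores strand. The function names, scoring values (`match`, `mismatch`, `gap`), trial count, and toy gene labels are all assumptions made for illustration.

```python
import random


def local_align(a, b, match=2, mismatch=-1, gap=-1):
    """Best gapped local alignment score between two gene strings.

    a and b are sequences of ortholog-family labels in chromosomal
    order; two genes match when they carry the same label.
    Scoring values are illustrative, not taken from the paper.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def monte_carlo_p(a, b, trials=200, seed=0):
    """Estimate how often a shuffled gene order in b scores as well as b."""
    rng = random.Random(seed)
    observed = local_align(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if local_align(a, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (trials + 1)


# Toy genomes (hypothetical labels): genome_b carries an 8-gene run
# that is collinear with genes g5..g12 of genome_a.
genome_a = [f"g{i}" for i in range(30)]
genome_b = ([f"h{i}" for i in range(10)] + genome_a[5:13]
            + [f"h{i}" for i in range(10, 20)])

score, p_value = monte_carlo_p(genome_a, genome_b)
print(score, p_value)
```

Shuffling preserves the gene content of a genome while destroying its order, so the empirical p-value reflects conservation of order rather than shared gene content. A fuller treatment would also report the aligned coordinates and account for gene orientation.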
1 Corresponding author.
11:356-372 ©2001 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/01 $5.00
