Genome Research


Vol. 11, Issue 8, 1404-1409, August 2001

METHODS
Capturing Whole-Genome Characteristics in Short Sequences Using a Naïve Bayesian Classifier

Rickard Sandberg,1,2,3 Gösta Winberg,1,2 Carl-Ivar Bränden,1 Alexander Kaske,2 Ingemar Ernberg,1 and Joakim Cöster2

1 Microbiology and Tumor Biology Center, Karolinska Institute, S-171 77 Stockholm, Sweden; 2 Virtual Genetics Laboratory AB, S-171 77 Stockholm, Sweden

Bacterial genomes have diverged during evolution, resulting in clear-cut differences in their nucleotide composition, such as their GC content. The analysis of complete sequences of bacterial genomes also reveals the presence of nonrandom sequence variation, manifest in the frequency profile of specific short oligonucleotides. These frequency profiles constitute highly specific genomic signatures. Based on these differences in oligonucleotide frequency between bacterial genomes, we investigated the possibility of predicting the genome of origin for a specific genomic sequence. To this end, we developed a naïve Bayesian classifier and systematically analyzed 28 eubacterial and archaeal genomes. We found that sequences as short as 400 bases could be correctly classified with an accuracy of 85%. We then applied the classifier to the identification of horizontal gene transfer events in whole-genome sequences and demonstrated the validity of our approach by correctly predicting the transfer of both the superoxide dismutase (sodC) and the bioC gene from Haemophilus influenzae to Neisseria meningitidis, correctly identifying both the donor and recipient species. We believe that this classification methodology could be a valuable tool in biodiversity studies.
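
As an illustration only, the general idea of a naïve Bayesian classifier over oligonucleotide frequency profiles might be sketched as follows. This is not the authors' implementation: the abstract does not state the motif length, the smoothing scheme or the priors, so the choice of k, the add-one smoothing, the uniform prior, the function names (kmer_counts, train, classify) and the toy sequences are all assumptions made for the example. Sequences are assumed to contain only A, C, G and T.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping oligonucleotides of length k in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def train(genomes, k):
    """Build one frequency profile per genome.

    Each profile stores the raw motif counts and a denominator that
    includes add-one smoothing over all 4**k possible motifs
    (the smoothing scheme is an assumption, not taken from the paper).
    """
    profiles = {}
    for name, seq in genomes.items():
        counts = kmer_counts(seq, k)
        total = sum(counts.values()) + 4 ** k
        profiles[name] = (counts, total)
    return profiles


def classify(fragment, profiles, k):
    """Return the best-scoring genome and all log-likelihood scores.

    Motifs are treated as independent (the "naive" assumption) and
    every genome gets the same prior, so the prior term is omitted.
    """
    observed = kmer_counts(fragment, k)
    scores = {}
    for name, (counts, total) in profiles.items():
        scores[name] = sum(
            n * math.log((counts[motif] + 1) / total)
            for motif, n in observed.items()
        )
    return max(scores, key=scores.get), scores


# Toy data, invented for the example; real use would load whole genomes.
toy_genomes = {
    "at_rich": "ATATATTTAAATATATAT" * 20,
    "gc_rich": "GCGCGGCCGCGCGGGCCC" * 20,
}
profiles = train(toy_genomes, k=2)
best, scores = classify("ATATTTAA", profiles, k=2)
print(best)  # at_rich
```

Summing log-probabilities rather than multiplying probabilities avoids numerical underflow on fragments of several hundred bases. A fragment that scores better against another genome than against the genome it was taken from would be a candidate for horizontal transfer under this kind of scheme.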


3 Corresponding author.


11:1404-1409 ©2001 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/01 $5.00






