Genome Research

Erratum (v11, p1315)

Vol. 11, Issue 5, 817-832, May 2001

LETTER
Evaluation of Gene-Finding Programs on Mammalian Sequences

Sanja Rogic,1,5 Alan K. Mackworth,2 and Francis B.F. Ouellette3

1 Computer Science Department, The University of California at Santa Cruz, Santa Cruz 95064, California; 2 Computer Science Department, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada; 3 Centre for Molecular Medicine and Therapeutics, Vancouver, BC V5Z 4H4, Canada

We present an independent comparative analysis of seven recently developed gene-finding programs: FGENES, GeneMark.hmm, Genie, Genscan, HMMgene, Morgan, and MZEF. For evaluation purposes we developed a new, thoroughly filtered, and biologically validated dataset of mammalian genomic sequences that does not overlap with the training sets of the programs analyzed. Our analysis shows that the new generation of programs has substantially better results than the programs analyzed in previous studies. The accuracy of the programs was also examined as a function of various sequence and prediction features, such as G + C content of the sequence, length and type of exons, signal type, and score of the exon prediction. This approach pinpoints the strengths and weaknesses of each individual program as well as those of computational gene-finding in general. The dataset used in this analysis (HMR195) as well as the tables with the complete results are available at http://www.cs.ubc.ca/~rogic/evaluation/.
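As an illustration of examining accuracy as a function of G + C content, the sketch below computes the G + C fraction of each sequence and averages a per-sequence accuracy score within G + C bins. It is a minimal sketch, not the procedure used in the study: the function names, the bin edges, and the form of the accuracy score are all hypothetical.

```python
from collections import defaultdict


def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Ambiguous symbols such as N are ignored; an empty sequence yields 0.0.
    return gc / total if total else 0.0


def mean_accuracy_by_gc(records, edges=(0.40, 0.50, 0.60)):
    """Average accuracy per G + C bin.

    records: iterable of (sequence, accuracy) pairs.
    edges:   hypothetical bin boundaries, NOT the ones used in the study.
    Bin i holds sequences whose G + C fraction is >= exactly i of the edges.
    """
    bins = defaultdict(list)
    for seq, accuracy in records:
        gc = gc_content(seq)
        index = sum(gc >= edge for edge in edges)
        bins[index].append(accuracy)
    return {index: sum(vals) / len(vals) for index, vals in sorted(bins.items())}
```

Called with pairs of a genomic sequence and that sequence's accuracy score for one program, `mean_accuracy_by_gc` returns one average per occupied bin, so the results of several programs can be compared bin by bin.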


5 Corresponding author.


11:817-832 ©2001 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/01 $5.00






