Genome Res. 15:1777-1786, 2005
©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05 $5.00

Perspective

Genome annotation past, present, and future: How to define an ORF at each locus

Michael R. Brent

Laboratory for Computational Genomics and Department of Computer Science, Washington University, St. Louis, Missouri 63130, USA

Driven by competition, automation, and technology, the genomics community has far exceeded its ambition to sequence the human genome by 2005. By analyzing mammalian genomes, we have shed light on the history of our DNA sequence, determined that alternatively spliced RNAs and retroposed pseudogenes are incredibly abundant, and glimpsed the apparently huge number of non-coding RNAs that play significant roles in gene regulation. Ultimately, genome science is likely to provide comprehensive catalogs of these elements. However, the methods we have been using for most of the last 10 years will not yield even one complete open reading frame (ORF) for every gene—the first plateau on the long climb toward a comprehensive catalog. These strategies—sequencing randomly selected cDNA clones, aligning protein sequences identified in other organisms, sequencing more genomes, and manual curation—will have to be supplemented by large-scale amplification and sequencing of specific predicted mRNAs. The steady improvements in gene prediction that have occurred over the last 10 years have increased the efficacy of this approach and decreased its cost. In this Perspective, I review the state of gene prediction roughly 10 years ago, summarize the progress that has been made since, argue that the primary ORF identification methods we have relied on so far are inadequate, and recommend a path toward completing the Catalog of Protein Coding Genes, Version 1.0.


E-mail brent@cse.wustl.edu; fax (314) 935-7302.

Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3866105.







