Genome Research

Published online before print November 7, 2007
Genome Research, DOI: 10.1101/gr.7128207
All versions of this article: gr.7128207v1; 17/12/1763 (most recent)

Methods

Targeted discovery of novel human exons by comparative genomics

Adam Siepel (1,9), Mark Diekhans (2), Brona Brejová (1), Laura Langton (3), Michael Stevens (3), Charles L.G. Comstock (3), Colleen Davis (4), Brent Ewing (4), Shelly Oommen (5), Christopher Lau (5), Hung-Chun Yu (5), Jianfeng Li (5), Bruce A. Roe (5), Phil Green (4), Daniela S. Gerhard (6), Gary Temple (7), David Haussler (2,8), and Michael R. Brent (3)

1 Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853, USA; 2 Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California 95064, USA; 3 Laboratory for Computational Genomics, Washington University, Saint Louis, Missouri 63130, USA; 4 Howard Hughes Medical Institute and Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; 5 Departments of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73109, USA; 6 National Cancer Institute, Bethesda, Maryland 20892, USA; 7 National Human Genome Research Institute, Bethesda, Maryland 20892, USA; 8 Howard Hughes Medical Institute, University of California, Santa Cruz, California 95064, USA

A complete and accurate set of human protein-coding gene annotations is perhaps the single most important resource for genomic research after the human-genome sequence itself, yet the major gene catalogs remain incomplete and imperfect. Here we describe a genome-wide effort, carried out as part of the Mammalian Gene Collection (MGC) project, to identify human genes not yet in the gene catalogs. Our approach was to produce gene predictions by algorithms that rely on comparative sequence data but do not require direct cDNA evidence, then to test predicted novel genes by RT–PCR. We have identified 734 novel gene fragments (NGFs) containing 2188 exons with, at most, weak prior cDNA support. These NGFs correspond to an estimated 563 distinct genes, of which >160 are completely absent from the major gene catalogs, while hundreds of others represent significant extensions of known genes. The NGFs appear to be predominantly protein-coding genes rather than noncoding RNAs, unlike novel transcribed sequences identified by technologies such as tiling arrays and CAGE. They tend to be expressed at low levels and in a tissue-specific manner, and they are enriched for roles in motor activity, cell adhesion, connective tissue, and central nervous system development. Our results demonstrate that many important genes and gene fragments have been missed by traditional approaches to gene discovery but can be identified by their evolutionary signatures using comparative sequence data. However, they suggest that hundreds—not thousands—of protein-coding genes are completely missing from the current gene catalogs.
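As a rough reading aid, the counts quoted in the abstract can be turned into a few ratios. The sketch below uses only the four figures stated above; the constant and function names are illustrative assumptions, not part of the authors' pipeline, and the ">160" figure is treated as a lower bound.

```python
# Figures quoted in the abstract; names are illustrative, not the authors'.
NGF_COUNT = 734          # novel gene fragments (NGFs)
EXON_COUNT = 2188        # exons contained in those NGFs
GENE_ESTIMATE = 563      # estimated number of distinct genes
ABSENT_GENES_MIN = 160   # abstract says ">160", so this is a lower bound


def summarize():
    """Return simple ratios derived only from the abstract's numbers."""
    return {
        "exons_per_ngf": EXON_COUNT / NGF_COUNT,
        "ngfs_per_gene": NGF_COUNT / GENE_ESTIMATE,
        "min_fraction_absent": ABSENT_GENES_MIN / GENE_ESTIMATE,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

On these numbers an NGF contains about three exons on average, and more than a quarter of the estimated genes are completely absent from the major catalogs.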


9 Corresponding author.

E-mail acs4{at}cornell.edu; fax (607) 255-4698.

[Supplemental material is available online at www.genome.org.]

Article published online before print. Article and publication date are online at http://www.genome.org/cgi/doi/10.1101/gr.7128207




This article has been cited by other articles:


M. Stanke, M. Diekhans, R. Baertsch, and D. Haussler
Using native and syntenically mapped cDNA alignments to improve de novo gene finding
Bioinformatics, March 1, 2008; 24(5): 637 - 644.




Copyright © 2007 by Cold Spring Harbor Laboratory Press.