Published online before print November 7, 2007
Genome Research, DOI: 10.1101/gr.6679507
All versions of this article: gr.6679507v1; 17/12/1823 (most recent)

12 Drosophila Genomes/Letter

Revisiting the protein-coding gene catalog of Drosophila melanogaster using 12 fly genomes

Michael F. Lin (1), Joseph W. Carlson (2), Madeline A. Crosby (3), Beverley B. Matthews (3), Charles Yu (2), Soo Park (2), Kenneth H. Wan (2), Andrew J. Schroeder (3), L. Sian Gramates (3), Susan E. St. Pierre (3), Margaret Roark (3), Kenneth L. Wiley, Jr. (4), Rob J. Kulathinal (3), Peili Zhang (3), Kyl V. Myrick (4), Jerry V. Antone (4), Susan E. Celniker (2), William M. Gelbart (3,4), and Manolis Kellis (1,5,6)

1 Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02139, USA; 2 Berkeley Drosophila Genome Project, Department of Genome Biology, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA; 3 FlyBase, The Biological Laboratories, Harvard University, Cambridge, Massachusetts 02138, USA; 4 Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA; 5 MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts 02139, USA

The availability of sequenced genomes from 12 Drosophila species has enabled the use of comparative genomics for the systematic discovery of functional elements conserved within this genus. We have developed quantitative metrics for the evolutionary signatures specific to protein-coding regions and applied them genome-wide, resulting in 1193 candidate new protein-coding exons in the D. melanogaster genome. We have reviewed these predictions by manual curation and validated a subset by directed cDNA screening and sequencing, revealing both new genes and new alternative splice forms of known genes. We also used these evolutionary signatures to evaluate existing gene annotations, resulting in the validation of 87% of genes lacking descriptive names and identifying 414 poorly conserved genes that are likely to be spurious predictions, noncoding, or species-specific genes. Furthermore, our methods suggest a variety of refinements to hundreds of existing gene models, such as modifications to translation start codons and exon splice boundaries. Finally, we performed directed genome-wide searches for unusual protein-coding structures, discovering 149 possible examples of stop codon readthrough, 125 new candidate ORFs of polycistronic mRNAs, and several candidate translational frameshifts. These results affect >10% of annotated fly genes and demonstrate the power of comparative genomics to enhance our understanding of genome organization, even in a model organism as intensively studied as Drosophila melanogaster.


6 Corresponding author.

E-mail manoli@mit.edu; fax (617) 262-6121.

[Supplemental material is available online at www.genome.org. Additional supplemental materials are available online at http://compbio.mit.edu/fly/genes/. Full-length cDNA sequence data from this study have been submitted to GenBank under accession nos. BT029554–BT029635, BT029637–BT029727, BT029940–BT029957, BT030133–BT030144, BT030416–BT030421, and BT030448–BT030452. RT-PCR amplicon and primer sequence data have been submitted to GenBank under accession nos. ES439769–ES439782.]

Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6679507



Copyright © 2007 by Cold Spring Harbor Laboratory Press.