Genome Research


Vol. 10, Issue 4, 529-538, April 2000

METHODS
Genie---Gene Finding in Drosophila melanogaster

Martin G. Reese,1,2,4 David Kulp,2 Hari Tammana,2 and David Haussler2,3

1 Berkeley Drosophila Genome Project, Department of Molecular and Cell Biology, University of California, Berkeley, California 94720-3200 USA; 2 Neomorphic, Inc., Berkeley, California 94710 USA; 3 Computer Science Department, University of California, Santa Cruz, California 95064 USA

A hidden Markov model-based gene-finding system called Genie was applied to the genomic Adh region of Drosophila melanogaster as part of the Genome Annotation Assessment Project (GASP). Predictions from three versions of Genie were submitted: one based on statistical properties of coding genes, a second that added EST alignment information, and a third that integrated protein sequence homology information. All three versions were trained on the provided Drosophila training data. In addition, promoter assignments from an integrated neural network were submitted. The gene assignments overlapped >90% of the 222 annotated genes, and 26 possibly novel genes were predicted, some of which might be overpredictions. The system correctly identified the exon boundaries of 70% of the exons in cDNA-confirmed genes, and of 77% of the exons with the addition of EST sequence alignments. The best of the three Genie submissions predicted 19 of the 43 annotated gene structures entirely correctly (44%). In the promoter category, only 30% of the transcription start sites could be detected, but integrating this program as a sensor into Genie reduced the false-positive rate to 1/16,786 (0.006%). The results of the experiment on the long contiguous genomic sequence revealed some problems with gene assembly in Genie, and these results were used to improve the system. We show that Genie is a robust hidden Markov model system that allows for a generalized integration of information from different sources, such as signal sensors (splice sites, start codon, etc.), content sensors (exons, introns, intergenic regions), and alignments of mRNA, EST, and peptide sequences. The assessment showed that Genie could be used effectively for the annotation of complete genomes from higher organisms.
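The two percentages quoted with raw counts in the abstract can be checked directly. The Python sketch below is an illustration only: the helper name `as_percent` is hypothetical, and the abstract does not state the unit behind the 1/16,786 ratio, so the code treats it as a plain fraction.

```python
def as_percent(numerator: int, denominator: int, digits: int = 0) -> float:
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100.0 * numerator / denominator, digits)


# 19 of the 43 annotated gene structures were predicted entirely correctly.
gene_level = as_percent(19, 43)

# False-positive rate of 1/16,786 for the promoter sensor integrated into Genie
# (the unit of the denominator is not given in the abstract).
promoter_fp = as_percent(1, 16786, digits=3)

print(f"gene structures entirely correct: {gene_level:.0f}%")
print(f"promoter false-positive rate: {promoter_fp:.3f}%")
```

Both values agree with the figures reported above (44% and 0.006%) once rounded to the same precision.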


4 Corresponding author.


10:529-538 ©2000 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/00 $5.00




