Genome Research

Vol 13, Issue 1, 108-117, January 2003

METHODS

Comparative Gene Prediction in Human and Mouse

Genís Parra1, Pankaj Agarwal2, Josep F. Abril1, Thomas Wiehe3, James W. Fickett4 and Roderic Guigó1,5

1Grup de Recerca en Informàtica Biomèdica. Institut Municipal d'Investigació Medica / Universitat Pompeu Fabra / Centre de Regulació Genòmica 08003 Barcelona, Catalonia, Spain; 2GlaxoSmithKline, King of Prussia, Pennsylvania 19406, USA; 3Freie Universität Berlin and Berlin Center for Genome Based Bioinformatics (BCB), 14195 Berlin, Germany; 4AstraZeneca R&D Boston, Waltham, Massachusetts 02451, USA

The completion of the sequencing of the mouse genome promises to help predict human genes with greater accuracy. While current ab initio gene prediction programs are remarkably sensitive (i.e., they predict at least a fragment of most genes), their specificity is often low, predicting a large number of false-positive genes in the human genome. Sequence conservation at the protein level with the mouse genome can help eliminate some of those false positives. Here we describe SGP2, a gene prediction program that combines ab initio gene prediction with TBLASTX searches between two genome sequences to provide both sensitive and specific gene predictions. The accuracy of SGP2 when used to predict genes by comparing the human and mouse genomes is assessed on a number of data sets, including single-gene data sets, the highly curated human chromosome 22 predictions, and entire genome predictions from ENSEMBL. Results indicate that SGP2 outperforms purely ab initio gene prediction methods. Results also indicate that SGP2 works about as well with 3x shotgun data as it does with fully assembled genomes. SGP2 provides a high enough specificity that its predictions can be experimentally verified at a reasonable cost. SGP2 was used to generate a complete set of gene predictions on both the human and mouse by comparing the genomes of these two species. Our results suggest that another few thousand human and mouse genes currently not in ENSEMBL are worth verifying experimentally.
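The trade-off between sensitivity and specificity described above can be made concrete with a small sketch. It assumes the convention common in gene-finding evaluations, where sensitivity is the fraction of annotated exons that are predicted exactly and specificity is the fraction of predicted exons that are correct. The exon coordinates, the `conserved` set, and the function name `exon_accuracy` are hypothetical and chosen only for illustration. This is not SGP2's algorithm, only a toy picture of how discarding predictions that lack conservation support can raise specificity.

```python
from typing import Set, Tuple

# (start, end) of an exon; the coordinates below are made up for illustration.
Exon = Tuple[int, int]


def exon_accuracy(predicted: Set[Exon], annotated: Set[Exon]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the exon level.

    sensitivity = correct predictions / annotated exons
    specificity = correct predictions / predicted exons
    (gene-finding convention; an assumption, not taken from the abstract)
    """
    true_pos = len(predicted & annotated)
    sensitivity = true_pos / len(annotated) if annotated else 0.0
    specificity = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Hypothetical annotation with three exons.
annotated = {(100, 200), (300, 400), (500, 650)}

# Hypothetical ab initio output: every real exon is found, plus three false positives.
predicted = {(100, 200), (300, 400), (500, 650),
             (900, 950), (1200, 1300), (2000, 2100)}

# Hypothetical set of predictions that overlap a conserved region in the other genome.
conserved = {(100, 200), (300, 400), (500, 650), (900, 950)}

sn_raw, sp_raw = exon_accuracy(predicted, annotated)

# Keep only predictions that have conservation support.
filtered = predicted & conserved
sn_filtered, sp_filtered = exon_accuracy(filtered, annotated)

print(f"ab initio only: sensitivity={sn_raw:.2f} specificity={sp_raw:.2f}")
print(f"with filter:    sensitivity={sn_filtered:.2f} specificity={sp_filtered:.2f}")
```

In this toy data the filter removes two of the three false positives without losing any annotated exon, so specificity rises while sensitivity is unchanged. Real data would not behave so cleanly, since a filter can also discard true exons that are poorly conserved.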


5 Corresponding author. E-MAIL rguigo@imim.es; FAX 34 93 224-0875.

Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.871403.







