Genome Research

Vol. 11, Issue 5, 803-816, May 2001

LETTER
Computational Inference of Homologous Gene Structures in the Human Genome

Ru-Fang Yeh,1 Lee P. Lim,1,2 and Christopher B. Burge1,3

1  Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; 2  Center for Cancer Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

With the human genome sequence approaching completion, a major challenge is to identify the locations and encoded protein sequences of all human genes. To address this problem, we have developed a new gene identification algorithm, GenomeScan, which combines exon-intron and splice signal models with similarity to known protein sequences in an integrated model. Extensive testing shows that GenomeScan can accurately identify the exon-intron structures of genes in finished or draft human genome sequence with a low rate of false positives. Application of GenomeScan to 2.7 billion bases of human genomic DNA identified at least 20,000-25,000 human genes out of an estimated 30,000-40,000 present in the genome. The results show an accurate and efficient automated approach for identifying genes in higher eukaryotic genomes and provide a first-level annotation of the draft human genome.


3 Corresponding author.


11:803-816 ©2001 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/01 $5.00
