Genome Research

Genome Res. 13:2541-2558, 2003
©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00

Letter

Millions of Years of Evolution Preserved: A Comprehensive Catalog of the Processed Pseudogenes in the Human Genome

Zhaolei Zhang, Paul M. Harrison, Yin Liu and Mark Gerstein1

Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520-8114, USA

Processed pseudogenes were created by reverse-transcription of mRNAs; they provide snapshots of ancient genes existing millions of years ago in the genome. To find them in the present-day human, we developed a pipeline using features such as intron-absence, frame-disruption, polyadenylation, and truncation. This has enabled us to identify in recent genome drafts ~8000 processed pseudogenes (distributed from http://pseudogene.org). Overall, processed pseudogenes are very similar to their closest corresponding human gene, being 94% complete in coding regions, with sequence similarity of 75% for amino acids and 86% for nucleotides. Their chromosomal distribution appears random and dispersed, with the numbers on chromosomes proportional to length, suggesting sustained "bombardment" over evolution. However, it does vary with GC-content: Processed pseudogenes occur mostly in intermediate GC-content regions. This is similar to Alus but contrasts with functional genes and L1-repeats. Pseudogenes, moreover, have age profiles similar to Alus. The number of pseudogenes associated with a given gene follows a power-law relationship, with a few genes giving rise to many pseudogenes and most giving rise to few. The prevalence of processed pseudogenes agrees well with germ-line gene expression. Highly expressed ribosomal proteins account for ~20% of the total. Other notables include cyclophilin-A, keratin, GAPDH, and cytochrome c.
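As an illustration of the power-law relationship described above, the following minimal sketch estimates an exponent from per-gene pseudogene counts by ordinary least squares on a log-log histogram. The function name `fit_power_law`, the fitting method, and the toy counts are assumptions made for this example; they are synthetic and are not the authors' procedure or data.

```python
import math
from collections import Counter


def fit_power_law(pseudogene_counts):
    """Fit P(k) ~ c * k**(-gamma) by least squares in log-log space.

    pseudogene_counts: one integer per gene, the number of processed
    pseudogenes attributed to that gene. Genes with zero pseudogenes
    are ignored because log(0) is undefined.
    Returns (gamma, c).
    """
    hist = Counter(k for k in pseudogene_counts if k > 0)
    if len(hist) < 2:
        raise ValueError("need at least two distinct positive counts")
    total = sum(hist.values())

    # x = log(k), y = log(fraction of genes with exactly k pseudogenes)
    points = [(math.log(k), math.log(n / total)) for k, n in hist.items()]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic toy input (hypothetical, not from the paper): the number of
# genes with k pseudogenes is proportional to k**-2.
toy_counts = []
for k in (1, 2, 4, 8):
    toy_counts.extend([k] * (6400 // k**2))

gamma, c = fit_power_law(toy_counts)
print(f"estimated exponent gamma = {gamma:.3f}")
```

A log-log regression is the simplest way to show the idea; for real count data a maximum-likelihood estimator is usually preferred, since binned least-squares fits are sensitive to the sparse tail of genes with many pseudogenes.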


Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1429003.

1 Corresponding author.
E-MAIL Mark.Gerstein@yale.edu; FAX (360) 838-7861.







