Genome Res. 13:2559-2567, 2003
©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00
Letter
A Genome-Wide Survey of Human Pseudogenes
David Torrents1, Mikita Suyama1, Evgeny Zdobnov, and Peer Bork2
EMBL, Heidelberg 69117, Germany
We screened all intergenic regions in the human genome to identify pseudogenes with a combination of homology searches and a functionality test using the ratio of replacement to silent nucleotide substitutions (KA/KS). We identified 19,724 regions, of which 95% ± 3% are estimated to evolve neutrally and thus are likely to encode pseudogenes. Half of these have no detectable truncation in their pseudocoding regions and therefore are not identifiable by methods that require the presence of truncations to prove nonfunctionality. A comparative analysis with the mouse genome showed that 70% of these pseudogenes have a retrotranspositional origin (processed), and the rest arose by segmental duplication (nonprocessed). Although the spread of both types of pseudogenes correlates with chromosome size, nonprocessed pseudogenes appear to be enriched in regions with high gene density. It is likely that the human pseudogenes identified here represent only a small fraction of the total, which probably exceeds the number of genes.
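The idea behind a KA/KS functionality test can be illustrated with a minimal sketch. Here KA is the rate of replacement (nonsynonymous) substitutions and KS the rate of silent (synonymous) substitutions; a ratio well below 1 suggests purifying selection, while a ratio near 1 is what neutral evolution would produce. The function names and the `cutoff` value are hypothetical and chosen only for illustration; the abstract does not state the actual threshold or statistical procedure used in the survey.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return KA/KS for one aligned region.

    ka: replacement (nonsynonymous) substitutions per replacement site.
    ks: silent (synonymous) substitutions per silent site.
    """
    if ks <= 0:
        # The ratio is undefined when no silent substitutions are observed.
        raise ValueError("KS must be positive to form the ratio")
    return ka / ks


def looks_neutral(ka: float, ks: float, cutoff: float = 0.5) -> bool:
    """Flag a region as a pseudogene candidate.

    ASSUMPTION: `cutoff` is an illustrative value, not the paper's.
    Ratios at or above it are treated as consistent with neutral
    evolution; ratios below it as evidence of selective constraint.
    """
    return ka_ks_ratio(ka, ks) >= cutoff


if __name__ == "__main__":
    # A region with almost as many replacement as silent changes.
    print(looks_neutral(ka=0.09, ks=0.10))
    # A region where replacement changes are strongly suppressed.
    print(looks_neutral(ka=0.01, ks=0.10))
```

A single fixed cutoff is the simplest possible decision rule; a real analysis would more likely compare the observed ratio against distributions for known genes and known pseudogenes.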
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1455503.
1 These authors contributed equally to this work.
2 Corresponding author. E-MAIL bork@embl-heidelberg.de; FAX 11-49-6221-387-517.
[Supplemental information as well as the sequences identified in this work can be found at http://www.bork.embl-heidelberg.de/Docu/Human_Pseudogenes/.]

This article has been cited by other articles:

L. L. Espey, R. A. Garcia, H. Kondo, B. Ishizuka, S. Yoshioka, S. Fujii, S. Hampton, and J. S. Richards. Expression of paralogs of cytochrome P45021a1 pseudogene (Cyp21a1-ps) and endogenous retrovirus SC1 (SC1) in the rat ovary during the ovulatory process. J. Endocrinol., July 1, 2008; 198(1): 231-241.
J. Weischenfeldt, I. Damgaard, D. Bryder, K. Theilgaard-Monch, L. A. Thoren, F. C. Nielsen, S. E. W. Jacobsen, C. Nerlov, and B. T. Porse. NMD is essential for hematopoietic stem and progenitor cells and for eliminating by-products of programmed DNA rearrangements. Genes & Dev., May 15, 2008; 22(10): 1381-1396.
Z. D. Zhang, P. Cayting, G. Weinstock, and M. Gerstein. Analysis of Nuclear Receptor Pseudogenes in Vertebrates: How the Silent Tell Their Stories. Mol. Biol. Evol., January 1, 2008; 25(1): 131-143.
P. H. Maxwell and M. J. Curcio. Retrosequence formation restructures the yeast genome. Genes & Dev., December 15, 2007; 21(24): 3308-3318.
H. Zheng, J. Shi, X. Fang, Y. Li, S. Vang, W. Fan, J. Wang, Z. Zhang, W. Wang, K. Kristiansen, et al. FGF: A web tool for Fishing Gene Family in a whole genome database. Nucleic Acids Res., July 13, 2007; 35(suppl_2): W121-W125.
C. D. Smith, S. Shu, C. J. Mungall, and G. H. Karpen. The Release 5.1 Annotation of Drosophila melanogaster Heterochromatin. Science, June 15, 2007; 316(5831): 1586-1591.
M. B. Gerstein, C. Bruce, J. S. Rozowsky, D. Zheng, J. Du, J. O. Korbel, O. Emanuelsson, Z. D. Zhang, S. Weissman, and M. Snyder. What is a gene, post-ENCODE? History and updated definition. Genome Res., June 1, 2007; 17(6): 669-681.
T. R. Gingeras. Origin of phenotypes: Genes and transcripts. Genome Res., June 1, 2007; 17(6): 682-690.
D. Zheng, A. Frankish, R. Baertsch, P. Kapranov, A. Reymond, S. W. Choo, Y. Lu, F. Denoeud, S. E. Antonarakis, M. Snyder, et al. Pseudogenes in the ENCODE regions: Consensus annotation, analysis of transcription, and evolution. Genome Res., June 1, 2007; 17(6): 839-851.
H. Rompler, C. Staubert, D. Thor, A. Schulz, M. Hofreiter, and T. Schoneberg. G Protein-Coupled Time Travel: Evolutionary Aspects of GPCR Research. Mol. Interv., February 1, 2007; 7(1): 17-25.
J. E. Karro, Y. Yan, D. Zheng, Z. Zhang, N. Carriero, P. Cayting, P. Harrison, and M. Gerstein. Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res., January 12, 2007; 35(suppl_1): D55-D60.
A. Yao, R. Charlab, and P. Li. Systematic identification of pseudogenes through whole genome expression evidence profiling. Nucleic Acids Res., September 11, 2006; 34(16): 4477-4485.
G. Drouin. Processed Pseudogenes Are More Abundant in Human and Mouse X Chromosomes than in Autosomes. Mol. Biol. Evol., September 1, 2006; 23(9): 1652-1655.
W. Wang, H. Zheng, C. Fan, J. Li, J. Shi, Z. Cai, G. Zhang, D. Liu, J. Zhang, S. Vang, et al. High Rate of Chimeric Gene Origination by Retroposition in Plant Genomes. PLANT CELL, August 1, 2006; 18(8): 1791-1802.
M. Suyama, D. Torrents, and P. Bork. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res., July 1, 2006; 34(Web Server issue): W609-W612.
M. J. van Baren and M. R. Brent. Iterative gene prediction and pseudogene removal improves genome annotation. Genome Res., May 1, 2006; 16(5): 678-685.
S.-H. Shiu, J. K. Byrnes, R. Pan, P. Zhang, and W.-H. Li. Role of positive selection in the retention of duplicate genes in mammalian genomes. PNAS, February 14, 2006; 103(7): 2232-2236.
E. Birney, D. Andrews, M. Caccamo, Y. Chen, L. Clarke, G. Coates, T. Cox, F. Cunningham, V. Curwen, T. Cutts, et al. Ensembl 2006. Nucleic Acids Res., January 1, 2006; 34(suppl_1): D556-D561.
M. R. Brent. Genome annotation past, present, and future: How to define an ORF at each locus. Genome Res., December 1, 2005; 15(12): 1777-1786.
M. Csuros and I. Miklos. Statistical Alignment of Retropseudogenes and Their Functional Paralogs. Mol. Biol. Evol., December 1, 2005; 22(12): 2457-2471.
N. Juretic, D. R. Hoen, M. L. Huynh, P. M. Harrison, and T. E. Bureau. The evolutionary fate of MULE-mediated duplications of host gene fragments in rice. Genome Res., September 1, 2005; 15(9): 1292-1297.
Y. Zhang, Y. Wu, Y. Liu, and B. Han. Computational Identification of 69 Retroposons in Arabidopsis. Plant Physiology, June 1, 2005; 138(2): 935-948.
E. Fernandez, D. Torrents, A. Zorzano, M. Palacin, and J. Chillaron. Identification and Functional Characterization of a Novel Low Affinity Aromatic-preferring Amino Acid Transporter (arpAT): ONE OF THE FEW PROTEINS SILENCED DURING PRIMATE EVOLUTION. J. Biol. Chem., May 13, 2005; 280(19): 19364-19372.
P. M. Harrison, D. Zheng, Z. Zhang, N. Carriero, and M. Gerstein. Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability. Nucleic Acids Res., April 28, 2005; 33(8): 2374-2383.
K. Adel, D. Laurent, and M. Dominique. HOPPSIGEN: a database of human and mouse processed pseudogenes. Nucleic Acids Res., January 1, 2005; 33(suppl_1): D59-D66.
K. Vandepoele and Y. Van de Peer. Exploring the Plant Transcriptome through Phylogenetic Profiling. Plant Physiology, January 1, 2005; 137(1): 31-42.
L. Ding, A. Sabo, N. Berkowicz, R. R. Meyer, Y. Shotland, M. R. Johnson, K. H. Pepin, R. K. Wilson, and J. Spieth. EAnnot: A genome annotation tool using experimental evidence. Genome Res., December 1, 2004; 14(12): 2503-2509.
O. Podlaha and J. Zhang. Nonneutral Evolution of the Transcribed Pseudogene Makorin1-p1 in Mice. Mol. Biol. Evol., December 1, 2004; 21(12): 2202-2209.
S. Nisole, C. Lynch, J. P. Stoye, and M. W. Yap. A Trim5-cyclophilin A fusion protein found in owl monkey kidney cells can restrict HIV-1. PNAS, September 7, 2004; 101(36): 13324-13328.
T. R. Gingeras. RNA Reference Materials for Gene Expression Studies: Point: Difficult First Steps. Clin. Chem., August 1, 2004; 50(8): 1289-1290.