Published online before print August 9, 2007
Genome Research, DOI: 10.1101/gr.6406307
Perspective
Raising the estimate of functional human sequences
Michael Pheasant and John S. Mattick¹
ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland 4072, Australia
While less than 1.5% of the mammalian genome encodes proteins, it is now evident that the vast majority is transcribed, mainly into non-protein-coding RNAs. This raises the question of what fraction of the genome is functional, i.e., composed of sequences that yield functional products, are required for the expression (regulation or processing) of these products, or are required for chromosome replication and maintenance. Many of the observed noncoding transcripts are differentially expressed, and, while most have not yet been studied, increasing numbers are being shown to be functional and/or trafficked to specific subcellular locations, as well as to exhibit subtle evidence of selection. On the other hand, analyses of conservation patterns indicate that only 5% (3%–8%) of the human genome is under purifying selection for functions common to mammals. However, these estimates rely on the assumption that reference sequences (usually ancient transposon-derived sequences) have evolved neutrally. This may not be the case, and if it is not, the fraction of the genome under evolutionary constraint would be underestimated. These analyses also do not detect functional sequences that are evolving rapidly and/or have acquired lineage-specific functions. Indeed, many regulatory sequences and known functional noncoding RNAs, including many microRNAs, are not conserved over significant evolutionary distances, and recent evidence from the ENCODE project suggests that many functional elements show no detectable level of sequence constraint. Thus, it is likely that much more than 5% of the genome encodes functional information, and although the upper bound is unknown, it may be considerably higher than currently thought.
1 Corresponding author.
E-mail j.mattick{at}imb.uq.edu.au; fax 61-7-3346-2111.
Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6406307
