Published online before print November 12, 2003
Genome Research, DOI: 10.1101/gr.1317703
Letter
Nucleotide Frequency Variation Across Human Genes
Elizabeth Louie, Jurg Ott, and Jacek Majewski¹
The Rockefeller University, New York, New York 10021, USA
The frequencies of individual nucleotides exhibit significant fluctuations across eukaryotic genes. In this paper, we investigate nucleotide variation across an averaged representation of all known human genes. Such a representation allows us to average out random fluctuations that constitute noise and uncover remarkable systematic trends in nucleotide distributions, particularly near boundaries between genetic elements—the promoter, exons, and introns. We propose that such variations result from differential mutational pressures and from the presence of specific regulatory motifs, such as transcription and splicing factor binding sites. Specifically, we observe significant GC and TA biases (excess of G over C and T over A) in noncoding regions of genes. Such biases are most probably caused by transcription-coupled mismatch repair, an effect that has recently been detected in mammalian genes. Subsequently, we examine the distribution of all hexanucleotides and identify motifs that are overrepresented within regulatory regions. By clustering and aligning such sequences, we recognize families of putative regulatory elements involved in exonic and intronic splicing control, and 3' mRNA processing. Some of our motifs have been identified in prior theoretical and experimental studies, thus validating our approach, but we detect several novel sequences that we propose as candidates for future functional assays and mutation screens for genetic disorders.
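The two quantities the abstract relies on, strand bias (excess of G over C, or T over A) and hexanucleotide frequency, can be illustrated with a minimal sketch. This is not the authors' pipeline: the normalized form (x - y) / (x + y) is one common convention for expressing an excess of one base over another and is assumed here, and the function names `skew` and `hexamer_counts` are hypothetical.

```python
from collections import Counter


def skew(seq, x, y):
    """Normalized excess of base x over base y: (x - y) / (x + y).

    Assumed convention, not taken from the paper. Returns 0.0 when
    neither base occurs in the sequence.
    """
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    total = nx + ny
    return (nx - ny) / total if total else 0.0


def hexamer_counts(seq, k=6):
    """Count every overlapping k-mer (k=6 by default) in seq.

    Windows containing anything other than A, C, G, or T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


if __name__ == "__main__":
    example = "GGTGAGTTTGGCT"  # made-up sequence, for illustration only
    print("GC skew:", skew(example, "G", "C"))
    print("TA skew:", skew(example, "T", "A"))
    print(hexamer_counts(example).most_common(3))
```

A positive value of `skew(seq, "G", "C")` corresponds to the excess of G over C described above. Averaging such values position by position over many aligned genes would be one way to approximate the averaged gene representation, though the paper's exact procedure is not given in this abstract.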
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1317703. Article published online before print in November 2003.
¹Corresponding author. E-mail: majewski{at}complex.rockefeller.edu; fax: (212) 327-7996.

This article has been cited by other articles:

P. Polak and P. F. Arndt. Transcription induces strand-specific mutations at the 5' end of human genes. Genome Res. 18(8): 1216-1223 (August 1, 2008).
S. Ke, X. H.-F. Zhang, and L. A. Chasin. Positive selection acting on splicing motifs reflects compensatory evolution. Genome Res. 18(4): 533-543 (April 1, 2008).
A. M. Resch, L. Carmel, L. Marino-Ramirez, A. Y. Ogurtsov, S. A. Shabalina, I. B. Rogozin, and E. V. Koonin. Widespread Positive Selection in Synonymous Sites of Mammalian Genes. Mol. Biol. Evol. 24(8): 1821-1831 (August 1, 2007).
L.-W. Chang, R. Nagarajan, J. A. Magee, J. Milbrandt, and G. D. Stormo. A systematic model to predict transcriptional regulatory mechanisms based on overrepresentation of transcription factor binding profiles. Genome Res. 16(3): 405-413 (March 1, 2006).
J. L. Parmley, J. V. Chamary, and L. D. Hurst. Evidence for Purifying Selection Against Synonymous Mutations in Mammalian Exonic Splicing Enhancers. Mol. Biol. Evol. 23(2): 301-309 (February 1, 2006).
B. R. Morton, I. V. Bi, M. D. McMullen, and B. S. Gaut. Variation in Mutation Dynamics Across the Maize Genome as a Function of Regional and Flanking Base Composition. Genetics 172(1): 569-577 (January 1, 2006).
C. Nikolaou and Y. Almirantis. A study on the correlation of nucleotide skews and the positioning of the origin of replication: different modes of replication in bacterial species. Nucleic Acids Res. 33(21): 6816-6822 (November 30, 2005).
X. H.-F. Zhang, C. S. Leslie, and L. A. Chasin. Dichotomous splicing signals in exon flanks. Genome Res. 15(6): 768-779 (June 1, 2005).
K. Venkataraman, K. M. Brown, and G. M. Gilmartin. Analysis of a noncanonical poly(A) site reveals a tripartite mechanism for vertebrate poly(A) site recognition. Genes & Dev. 19(11): 1315-1327 (June 1, 2005).
M. Touchon, A. Arneodo, Y. d'Aubenton-Carafa, and C. Thermes. Transcription-coupled and splicing-coupled strand asymmetries in eukaryotic genomes. Nucleic Acids Res. 32(17): 4969-4978 (September 23, 2004).