Vol 13, Issue 1, 64-72, January 2003
LETTER
Distinguishing Regulatory DNA From Neutral Sites
Laura Elnitski(1,3), Ross C. Hardison(1), Jia Li(2), Shan Yang(1),
Diana Kolbe(1,3), Pallavi Eswara(3), Michael J. O'Connor(3),
Scott Schwartz(3), Webb Miller(3,4), and Francesca Chiaromonte(2,5,6)
Departments of (1) Biochemistry and Molecular Biology, (2) Statistics,
(3) Computer Science and Engineering, (4) Biology, and (5) Health
Evaluation Sciences, The Pennsylvania State University, University Park,
Pennsylvania 16802, USA
We explore several computational approaches to analyzing
interspecies genomic sequence alignments, aiming to distinguish
regulatory regions from neutrally evolving DNA. Human-mouse genomic
alignments were collected for three sets of human regions: (1)
experimentally defined gene regulatory regions, (2) well-characterized
exons (coding sequences, as a positive control), and (3) interspersed
repeats thought to have inserted before the human-mouse split (a good
model for neutrally evolving DNA). Models that potentially could
distinguish functional noncoding sequences from neutral DNA were
evaluated on these three data sets, as well as bulk genome alignments.
Our analyses show that discrimination based on frequencies of
individual nucleotide pairs or gaps (i.e., of possible alignment
columns) is only partially successful. In contrast, scoring procedures
that include the alignment context, based on frequencies of short runs
of alignment columns, dramatically improve separation between
regulatory and neutral features. Such scoring functions should aid in
the identification of putative regulatory regions throughout the human
genome.
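
The contrast drawn above, between scoring single alignment columns and
scoring short runs of columns, can be illustrated with a minimal sketch.
This is not the authors' scoring function: the function names, the add-one
pseudocount, the count of 24 column types (5 x 5 symbol pairs minus the
gap/gap pair), and the averaged log-odds form are all assumptions made for
illustration. Setting k = 1 gives the single-column case; k > 1 brings in
alignment context.

```python
import math
from collections import Counter


def columns(human, mouse):
    """Pair two equal-length aligned strings into alignment columns,
    e.g. ('A', 'G') for a mismatch or ('C', '-') for a gap."""
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have equal length")
    return list(zip(human.upper(), mouse.upper()))


def run_counts(alignments, k):
    """Count every run of k consecutive alignment columns in a training set.

    `alignments` is an iterable of (human, mouse) aligned string pairs.
    """
    counts = Counter()
    for human, mouse in alignments:
        cols = columns(human, mouse)
        for i in range(len(cols) - k + 1):
            counts[tuple(cols[i:i + k])] += 1
    return counts


def log_odds_score(human, mouse, reg_counts, neu_counts, k, pseudo=1.0):
    """Average log-odds of regulatory vs. neutral over all runs of k columns.

    Positive values mean the runs look more like the regulatory training
    set; negative values mean they look more like the neutral one.
    """
    cols = columns(human, mouse)
    n_runs = len(cols) - k + 1
    if n_runs <= 0:
        return 0.0
    reg_total = sum(reg_counts.values())
    neu_total = sum(neu_counts.values())
    # Assumed number of distinct runs: 24 column types per position.
    n_types = 24 ** k
    total = 0.0
    for i in range(n_runs):
        run = tuple(cols[i:i + k])
        # Pseudocounts keep unseen runs from producing log(0).
        p_reg = (reg_counts[run] + pseudo) / (reg_total + pseudo * n_types)
        p_neu = (neu_counts[run] + pseudo) / (neu_total + pseudo * n_types)
        total += math.log(p_reg / p_neu)
    return total / n_runs


# Toy training sets (made up, not data from the paper).
regulatory = [("ACGTACGT", "ACGTACGT")]
neutral = [("ACGTACGT", "TGCA-CGA")]
k = 2
reg = run_counts(regulatory, k)
neu = run_counts(neutral, k)
conserved_score = log_odds_score("ACGT", "ACGT", reg, neu, k)
diverged_score = log_odds_score("ACGT", "TGCA", reg, neu, k)
```

In this toy setting the conserved alignment scores above zero and the
diverged one below zero; a real application would train the counts on
large sets of regulatory and ancestral-repeat alignments.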
6 Corresponding author. E-MAIL chiaro{at}stat.psu.edu; FAX
(814) 863-7114.
Article and publication are at
http://www.genome.org/cgi/doi/10.1101/gr.817703.
