Genome Research

Published online before print October 19, 2006, 10.1101/gr.4537706
Genome Res. 16:1596-1604, 2006
©2006 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/06 $5.00
OPEN ACCESS ARTICLE

Methods

ESPERR: Learning strong and weak signals in genomic sequence alignments to identify functional elements

James Taylor1, Svitlana Tyekucheva, David C. King, Ross C. Hardison, Webb Miller, and Francesca Chiaromonte1

Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA

Genomic sequence signals—such as base composition, presence of particular motifs, or evolutionary constraint—have been used effectively to identify functional elements. However, approaches based only on specific signals known to correlate with function can be quite limiting. When training data are available, application of computational learning algorithms to multispecies alignments has the potential to capture broader and more informative sequence and evolutionary patterns that better characterize a class of elements. However, effective exploitation of patterns in multispecies alignments is impeded by the vast number of possible alignment columns and by a limited understanding of which particular strings of columns may characterize a given class. We have developed a computational method, called ESPERR (evolutionary and sequence pattern extraction through reduced representations), which uses training examples to learn encodings of multispecies alignments into reduced forms tailored for the prediction of chosen classes of functional elements. ESPERR produces a greatly improved Regulatory Potential score, which can discriminate regulatory regions from neutral sites with excellent accuracy (~94%). This score captures strong signals (GC content and conservation), as well as subtler signals (with small contributions from many different alignment patterns) that characterize the regulatory elements in our training set. ESPERR is also effective for predicting other classes of functional elements, as we show for DNaseI hypersensitive sites and highly conserved regions with developmental enhancer activity. Our software, training data, and genome-wide predictions are available from our Web site (http://www.bx.psu.edu/projects/esperr).
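The idea of encoding alignment columns into a reduced alphabet and then scoring the encoded sequence can be illustrated with a small sketch. The abstract does not say how ESPERR learns its encodings or computes its score, so everything below is an assumption made for illustration: the hand-written `toy_symbol` grouping stands in for a learned encoding, and a per-symbol log-odds average stands in for the actual Regulatory Potential model. All function names are hypothetical.

```python
import math
from collections import Counter

def toy_symbol(column):
    """Hypothetical hand-written grouping (NOT a learned ESPERR encoding):
    2 = column contains a gap, 1 = fully conserved, 0 = anything else."""
    if "-" in column:
        return 2
    return 1 if len(set(column)) == 1 else 0

def encode(alignment, symbol_of):
    """Map each column of a multispecies alignment to a reduced symbol.
    `alignment` is a list of equal-length strings, one per species."""
    return [symbol_of(col) for col in zip(*alignment)]

def symbol_log_odds(pos_seqs, neg_seqs, n_symbols, pseudocount=1.0):
    """Per-symbol log-odds of smoothed frequencies in positive vs.
    negative training sequences (an assumed, simplified scoring model)."""
    pos = Counter(s for seq in pos_seqs for s in seq)
    neg = Counter(s for seq in neg_seqs for s in seq)
    pos_total = sum(pos.values()) + pseudocount * n_symbols
    neg_total = sum(neg.values()) + pseudocount * n_symbols
    return {
        s: math.log((pos[s] + pseudocount) / pos_total)
        - math.log((neg[s] + pseudocount) / neg_total)
        for s in range(n_symbols)
    }

def score(encoded, log_odds):
    """Mean log-odds over an encoded window; values above zero lean
    toward the positive (e.g. regulatory) training class."""
    if not encoded:
        raise ValueError("cannot score an empty window")
    return sum(log_odds[s] for s in encoded) / len(encoded)

# Toy usage: three species, four alignment columns.
encoded = encode(["ACGT", "ACGA", "AC-T"], toy_symbol)
weights = symbol_log_odds([[1, 1, 1, 0]], [[0, 0, 2, 1]], n_symbols=3)
window_score = score(encoded, weights)
```

The point of the reduction is that the scoring model only has to estimate parameters for a handful of symbols rather than for every possible alignment column, which is the obstacle the abstract describes.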


1 Corresponding authors.

E-mail james{at}bx.psu.edu; fax (814) 863-6699.

E-mail chiaro{at}stat.psu.edu; fax (814) 863-6699.

[Supplemental material is available online at www.genome.org.]

Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.4537706




This article has been cited by other articles:


J. M. Taylor, K. Wicks, C. Vandiedonck, and J. C. Knight
Chromatin profiling across the human tumour necrosis factor gene locus reveals a complex, cell type-specific landscape with novel regulatory elements
Nucleic Acids Res., July 24, 2008; (2008) gkn444v1.

L. C. Dore, J. D. Amigo, C. O. dos Santos, Z. Zhang, X. Gai, J. W. Tobias, D. Yu, A. M. Klein, C. Dorman, W. Wu, et al.
A GATA-1-regulated microRNA locus essential for erythropoiesis
PNAS, March 4, 2008; 105(9): 3333 - 3338.

D. M. McGaughey, R. M. Vinton, J. Huynh, A. Al-Saif, M. A. Beer, and A. S. McCallion
Metrics of sequence constraint overlook regulatory sequences in an exhaustive analysis at phox2b
Genome Res., February 1, 2008; 18(2): 252 - 260.

W. Miller, K. Rosenbloom, R. C. Hardison, M. Hou, J. Taylor, B. Raney, R. Burhans, D. C. King, R. Baertsch, D. Blankenberg, et al.
28-Way vertebrate alignment and conservation track in the UCSC Genome Browser
Genome Res., December 1, 2007; 17(12): 1797 - 1808.

P. Kheradpour, A. Stark, S. Roy, and M. Kellis
Reliable prediction of regulator targets using 12 Drosophila genomes
Genome Res., December 1, 2007; 17(12): 1919 - 1931.

R. D. Cook, B. Li, and F. Chiaromonte
Dimension reduction in regression without matrix inversion
Biometrika, August 1, 2007; 94(3): 569 - 584.

D. C. King, J. Taylor, Y. Zhang, Y. Cheng, H. A. Lawson, J. Martin, ENCODE groups for Transcriptional Regulation and M, F. Chiaromonte, W. Miller, and R. C. Hardison
Finding cis-regulatory elements using comparative genomics: Some lessons from ENCODE data
Genome Res., June 1, 2007; 17(6): 775 - 786.



