Genome Res. 14:451-458, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00
Methods
Eukaryotic Regulatory Element Conservation Analysis and Identification Using Comparative Genomics
Yueyi Liu1,6,
X. Shirley Liu4,6,
Liping Wei5,
Russ B. Altman2 and
Serafim Batzoglou3,7
1 Stanford Medical Informatics, Stanford University, Stanford, California 94305, USA
2 Department of Genetics, Stanford University, Stanford, California 94305, USA
3 Department of Computer Science, Stanford University, Stanford, California 94305, USA
4 Department of Biostatistics, Harvard School of Public Health, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA
5 Nexus Genomics, Inc., Mountain View, California 94043, USA
Comparative genomics is a promising approach to the challenging problem of eukaryotic regulatory element identification, because functional noncoding sequences may be conserved across species as a result of evolutionary constraints. We systematically analyzed known human and Saccharomyces cerevisiae regulatory elements and found that human regulatory elements are more conserved between human and mouse than are background sequences. Although S. cerevisiae regulatory elements do not appear to be more conserved than background in a comparison of S. cerevisiae with Schizosaccharomyces pombe, they are more conserved when compared with multiple other yeast genomes (Saccharomyces paradoxus, Saccharomyces mikatae, and Saccharomyces bayanus). Based on these analyses, we developed a sequence-motif-finding algorithm called CompareProspector, which extends Gibbs sampling by biasing the search toward regions conserved across species. Using human-mouse comparison, CompareProspector identified known motifs for transcription factors Mef2, Myf, Srf, and Sp1 from a set of human-muscle-specific genes. It also discovered the NFAT motif from genes up-regulated by CD28 stimulation in T-cells, which implies the direct involvement of NFAT in mediating the CD28 stimulatory signal. Using Caenorhabditis elegans-Caenorhabditis briggsae comparison, CompareProspector found the PHA-4 motif and the UNC-86 motif. CompareProspector outperformed many other computational motif-finding programs, demonstrating the power of comparative genomics-based biased sampling in eukaryotic regulatory element identification.
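The abstract describes CompareProspector only at a high level: Gibbs sampling with the search biased toward conserved regions. The following Python sketch illustrates that general idea and is not the authors' implementation. Several details are assumptions made for illustration: conservation is scored as percent identity in a sliding window over a gap-free pairwise alignment, the bias is a hard threshold on that score, the background model is uniform, and all function names (`window_identity`, `build_pwm`, `site_weight`, `biased_gibbs`) are hypothetical.

```python
import random
from collections import Counter

BASES = "ACGT"  # assumption: sequences contain only these four letters


def window_identity(seq_a, seq_b, width):
    """Fraction of identical columns in each width-long window of a
    gap-free pairwise alignment (assumption: no gap characters)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    match = [int(a == b) for a, b in zip(seq_a, seq_b)]
    return [sum(match[i:i + width]) / width
            for i in range(len(seq_a) - width + 1)]


def build_pwm(sites, pseudocount=1.0):
    """Column-wise base frequencies of the current sites, with pseudocounts."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for j in range(len(sites[0])):
        counts = Counter(site[j] for site in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return pwm


def site_weight(segment, pwm, background=0.25):
    """Likelihood ratio of a segment under the motif model versus a
    uniform background (a simplifying assumption)."""
    weight = 1.0
    for j, base in enumerate(segment):
        weight *= pwm[j][base] / background
    return weight


def biased_gibbs(seqs, conservation, width, threshold=0.8,
                 iterations=100, seed=0):
    """Site sampler restricted to windows whose conservation score
    meets the threshold. Returns one start position per sequence."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    rng = random.Random(seed)
    # Candidate starts: only sufficiently conserved windows are considered.
    allowed = [[i for i, c in enumerate(cons) if c >= threshold]
               for cons in conservation]
    if any(not a for a in allowed):
        raise ValueError("a sequence has no window above the threshold")
    starts = [rng.choice(a) for a in allowed]
    for _ in range(iterations):
        for k, seq in enumerate(seqs):
            # Build the motif model from every sequence except the k-th.
            others = [s[p:p + width]
                      for n, (s, p) in enumerate(zip(seqs, starts)) if n != k]
            pwm = build_pwm(others)
            weights = [site_weight(seq[i:i + width], pwm) for i in allowed[k]]
            # Resample the k-th site in proportion to its likelihood ratio.
            starts[k] = rng.choices(allowed[k], weights=weights, k=1)[0]
    return starts


# Toy usage: two invented "human" sequences with their aligned "mouse" partners.
human = ["AAAATGACGTCACCCC", "GGTGACGTCATTTTTT"]
mouse = ["CCCCTGACGTCAAAAA", "TTTGACGTCAGGGGGG"]
cons = [window_identity(h, m, 8) for h, m in zip(human, mouse)]
starts = biased_gibbs(human, cons, width=8, threshold=1.0)
print(starts)
```

In this toy input each sequence has only one fully conserved 8-bp window, so with `threshold=1.0` the sampler has a single candidate per sequence. With a lower threshold it would choose among several conserved windows according to the motif model, which is where restricting the search space could help on real promoter sequences.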
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1327604.
6 These authors contributed equally to this work.
7 Corresponding author. E-MAIL serafim{at}cs.stanford.edu; FAX (650) 725-1449.
[Supplemental data are available at www.genome.org and at http://compareprospector.stanford.edu. The program CompareProspector is available at http://compareprospector.stanford.edu.]
