Published online before print
July 15, 2005, 10.1101/gr.3715005
Genome Res. 15:1034-1050, 2005
©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05 $5.00
Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes
Adam Siepel [1,6], Gill Bejerano [1], Jakob S. Pedersen [1], Angie S. Hinrichs [1], Minmei Hou [3], Kate Rosenbloom [1], Hiram Clawson [1], John Spieth [4], LaDeana W. Hillier [4], Stephen Richards [5], George M. Weinstock [5], Richard K. Wilson [4], Richard A. Gibbs [5], W. James Kent [1], Webb Miller [3], and David Haussler [1,2]
1 Center for Biomolecular Science and Engineering, University of California, Santa Cruz, Santa Cruz, California 95064, USA
2 Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, California 95064, USA
3 Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, Pennsylvania 16802, USA
4 Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri 63108, USA
5 Human Genome Sequencing Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
We have conducted a comprehensive search for conserved elements in vertebrate genomes, using genome-wide multiple alignments of five vertebrate species (human, mouse, rat, chicken, and Fugu rubripes). Parallel searches have been performed with multiple alignments of four insect species (three species of Drosophila and Anopheles gambiae), two species of Caenorhabditis, and seven species of Saccharomyces. Conserved elements were identified with a computer program called phastCons, which is based on a two-state phylogenetic hidden Markov model (phylo-HMM). PhastCons works by fitting a phylo-HMM to the data by maximum likelihood, subject to constraints designed to calibrate the model across species groups, and then predicting conserved elements based on this model. The predicted elements cover roughly 3%-8% of the human genome (depending on the details of the calibration procedure) and substantially higher fractions of the more compact Drosophila melanogaster (37%-53%), Caenorhabditis elegans (18%-37%), and Saccharomyces cerevisiae (47%-68%) genomes. From yeasts to vertebrates, in order of increasing genome size and general biological complexity, increasing fractions of conserved bases are found to lie outside of the exons of known protein-coding genes. In all groups, the most highly conserved elements (HCEs), by log-odds score, are hundreds or thousands of bases long. These elements share certain properties with ultraconserved elements, but they tend to be longer and less perfectly conserved, and they overlap genes of somewhat different functional categories. In vertebrates, HCEs are associated with the 3' UTRs of regulatory genes, stable gene deserts, and megabase-sized regions rich in moderately conserved noncoding sequences. Noncoding HCEs also show strong statistical evidence of an enrichment for RNA secondary structure.
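The idea of a two-state model that labels each alignment column as conserved or nonconserved can be illustrated with a small sketch. The following is not phastCons and does not use the PHAST package: it is a toy two-state HMM decoded with the Viterbi algorithm, in which the phylogenetic emission models are replaced by a crude stand-in (the probability that each non-reference base matches the reference base). All names and numeric parameters (`MATCH_PROB`, `STAY_PROB`, `viterbi`, `conserved_elements`) are assumptions chosen for illustration, and the parameters are fixed by hand rather than fitted by maximum likelihood.

```python
import math

# Toy stand-in for the phylogenetic emission models (assumption): the chance
# that each non-reference base matches the reference base, per state.
MATCH_PROB = {"conserved": 0.95, "nonconserved": 0.60}
# Hypothetical probabilities of staying in the same state between columns.
STAY_PROB = {"conserved": 0.90, "nonconserved": 0.95}
STATES = ("conserved", "nonconserved")


def log_emission(column, state):
    """Log-probability of one alignment column (reference base first)."""
    ref, others = column[0], column[1:]
    p = MATCH_PROB[state]
    return sum(math.log(p if base == ref else 1.0 - p) for base in others)


def log_transition(prev, cur):
    """Log-probability of moving from state prev to state cur."""
    stay = STAY_PROB[prev]
    return math.log(stay if prev == cur else 1.0 - stay)


def viterbi(columns):
    """Most likely state path for a list of alignment columns."""
    if not columns:
        return []
    # Uniform prior over the two states at the first column.
    score = {s: math.log(0.5) + log_emission(columns[0], s) for s in STATES}
    back = []
    for col in columns[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(
                STATES, key=lambda prev: score[prev] + log_transition(prev, cur)
            )
            new_score[cur] = (
                score[best_prev]
                + log_transition(best_prev, cur)
                + log_emission(col, cur)
            )
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def conserved_elements(path):
    """Half-open (start, end) intervals of consecutive conserved columns."""
    elements, start = [], None
    for i, state in enumerate(path):
        if state == "conserved" and start is None:
            start = i
        elif state != "conserved" and start is not None:
            elements.append((start, i))
            start = None
    if start is not None:
        elements.append((start, len(path)))
    return elements
```

Each column is passed as a string with the reference species first, for example `conserved_elements(viterbi(["ACGTA"] * 10 + ["AAAAA"] * 20 + ["ACGTA"] * 10))`. In the real method the per-column likelihoods would come from phylogenetic models over a tree, and the model parameters would be estimated from the data rather than set by hand.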
[Supplemental material is available online at www.genome.org. The multiple alignments, predicted conserved elements, and base-by-base conservation scores presented here can be downloaded from http://www.cse.ucsc.edu/~acs/conservation. Up-to-date versions of these data sets are displayed in the "Conservation" and "Most Conserved" tracks in the UCSC Genome Browser (http://genome.ucsc.edu). The phastCons program is part of a software package called PHAST (PHylogenetic Analysis with Space/Time models), which is available by request from acs{at}soe.ucsc.edu.]
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3715005. Article published online before print in July 2005.
6 Corresponding author. E-mail acs{at}soe.ucsc.edu; fax (831) 459-1809.
