Published online before print
December 8, 2004, 10.1101/gr.3015505 Genome Res. 15:137-145, 2005 ©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05 $5.00
Chicken Special/Letter
Evolution and functional classification of vertebrate gene deserts
1 Energy, Environment, Biology, and Institutional Computing, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
2 Genome Biology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
3 Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
4 Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
5 Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
6 Department of Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
Large tracts of the human genome, known as gene deserts, are devoid of protein-coding genes. Dichotomy in their level of conservation with chicken separates these regions into two distinct categories, stable and variable. The separation is not caused by differences in rates of neutral evolution but instead appears to be related to different biological functions of stable and variable gene deserts in the human genome. Gene Ontology categories of the adjacent genes are strongly biased toward transcriptional regulation and development for the stable gene deserts, and toward distinctively different functions for the variable gene deserts. Stable gene deserts resist chromosomal rearrangements and appear to harbor multiple distant regulatory elements physically linked to their neighboring genes, with the linearity of conservation invariant throughout vertebrate evolution.
One of the major challenges of genomics is to understand how the genome is organized and, especially, which sequences and factors contribute to the complex and precise regulation of gene expression. These include cis-regulatory sequences controlling gene expression, insulators or boundary elements defining physical domains, and sequences that anchor genomic regions to specific nuclear locations (Dorsett 1999
One of the unexplained architectural asymmetries observed in the human genome sequence is the uneven distribution of genes (Lander et al. 2001 To investigate this possibility, we focused on sequence comparisons with the chicken genome, an organism strategically positioned between rodents and fish in the vertebrate evolutionary tree. By analyzing genomic structure, conservation patterns, and evolutionary relationships, we were able to classify gene deserts into two functionally different groups and to provide new insights regarding the functions of these intervals in the human genome.
Identification of human gene deserts
The current human gene annotation (knownGenes mapped to the NCBI Build 34) (Karolchik et al. 2003 25% of the sequenced human genome. This is consistent with previous estimates of gene desert coverage (Venter et al. 2001
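The extraction of long intergenic intervals from a gene annotation can be illustrated with a minimal sketch. It assumes genes are given as half-open (start, end) coordinates on a single chromosome segment and uses the 640 kb length quoted elsewhere in the text as the desert threshold. The function names, the tuple layout, and the toy coordinates are hypothetical and are not taken from the authors' pipeline.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open, in base pairs

DESERT_MIN_BP = 640_000  # length threshold quoted in the text for gene deserts


def intergenic_gaps(genes: List[Interval]) -> List[Interval]:
    """Return the non-empty gaps between genes on one chromosome segment.

    Overlapping or nested genes are merged implicitly: a gap is reported
    only where no annotated gene covers the sequence.
    """
    gaps = []
    covered_end = None  # right edge of the gene territory seen so far
    for start, end in sorted(genes):
        if covered_end is not None and start > covered_end:
            gaps.append((covered_end, start))
        covered_end = end if covered_end is None else max(covered_end, end)
    return gaps


def gene_deserts(genes: List[Interval], min_bp: int = DESERT_MIN_BP) -> List[Interval]:
    """Keep only the intergenic gaps longer than min_bp."""
    return [(s, e) for s, e in intergenic_gaps(genes) if e - s > min_bp]


# Toy annotation (hypothetical coordinates): two overlapping genes,
# a 700 kb gap, then two genes separated by 50 kb.
toy_genes = [
    (100_000, 150_000),
    (140_000, 200_000),
    (900_000, 950_000),
    (1_000_000, 1_050_000),
]
deserts = gene_deserts(toy_genes)
print(deserts)
```

In this toy annotation only the 700 kb interval passes the threshold; the 50 kb interval is treated as a regular intergenic region.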
Gene deserts of these sizes are more frequent than would be expected if genes were placed randomly in the genome. A randomization study (see Methods) showed that, by chance alone, the probability of a gene desert reaching the observed maximal size of 5.1 Mb is below 10⁻⁴; the largest intergenic distance produced by randomizations was 2 Mb, a size exceeded by 76 of the observed gene deserts. The same study showed that the probability of obtaining 545 deserts larger than 640 kb by chance is likewise <10⁻⁴; the largest count of intergenic distances >640 kb produced by randomizations was only 75.
Compared with other genomic regions, gene deserts in general display a strikingly low G+C content, an elevated density of single nucleotide polymorphisms (SNPs), and a decrease in the fraction of sequence conserved between humans, chicken, and mouse (Table 1). The average repeat content of gene deserts is slightly higher than the genome average, but the fraction of DNA composed of repetitive sequences ranges from 30% to 90%. These features suggest that purifying selection may be relaxed in gene deserts, supporting the hypothesis that these regions represent segments of relatively low biological activity, enriched in pseudogenes, repeats, and other nonfunctional sequences. Contrary to this hypothesis, however, it has been shown that some human gene deserts harbor distant gene regulatory elements (REs) that are deeply conserved in vertebrate species (Nobrega et al. 2003
SINE-type repetitive elements are depleted in gene deserts
Although the relative density of repetitive elements in the gene deserts is comparable with the average distribution in the genome, the content of the various classes of repetitive elements is markedly different. The density of LINE elements is distinctly elevated and the density of SINE elements is decreased in gene deserts, when compared to averages for the human genome (Fig. 2). The opposite trend (relative LINE depletion accompanied by SINE enrichment) is observed for gene-rich (see Methods) and regular intergenic regions (Grover et al. 2003
Dichotomy in evolutionary preservation of gene deserts
The average density of evolutionarily conserved regions (ECRs; for a definition, see Methods) detected in human/mouse (h/m) and human/chicken (h/c) alignments is similar in gene deserts and regular intergenic regions (with only a slight increase in density within gene deserts). However, there is a wide variation in ECR density among different gene deserts, which cannot be entirely attributed to the variation in repeat density (Fig. 3A). The distribution of h/c ECR content (hereafter referred to simply as conservation) in these regions ranges from 0% to 12% and has an uneven shape, with many of the gene deserts having <2% of their sequence conserved (Fig. 3A). We used this arbitrary 2% h/c conservation cutoff to separate gene deserts into two categories, stable (172 regions; >2% conserved) and variable (373 regions; <2% conserved). This classification was initially used because empirically it highlights gene deserts that are well conserved throughout the time since the separation of mammalian and avian lineages; the usefulness of this estimated cutoff level was validated by later analyses (see below). Stable gene deserts have several critical properties indicating that they contain functional DNA elements. First, they include regions surrounding the DACH1, OTX2, and SOX2 genes, which have previously been shown to harbor long-distance transcriptional REs (Nobrega et al. 2003
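The 2% cutoff can be expressed as a small classification rule. The sketch below is illustrative only: the function name and the example base-pair counts are hypothetical, and because the text does not say how a desert with exactly 2% conservation is handled, assigning it to the variable class here is an assumption.

```python
H_C_CUTOFF = 0.02  # 2% human/chicken conservation cutoff from the text


def classify_desert(conserved_bp: int, desert_bp: int, cutoff: float = H_C_CUTOFF) -> str:
    """Label a gene desert by the fraction of its length covered by h/c ECRs.

    Deserts with a conserved fraction above the cutoff are "stable";
    all others are "variable" (a tie at exactly the cutoff is treated
    as variable, which is an assumption of this sketch).
    """
    if desert_bp <= 0:
        raise ValueError("desert length must be positive")
    fraction = conserved_bp / desert_bp
    return "stable" if fraction > cutoff else "variable"


# Hypothetical examples: a 1 Mb desert with 35 kb in h/c ECRs (3.5%)
# and an 800 kb desert with 4 kb in h/c ECRs (0.5%).
print(classify_desert(35_000, 1_000_000))
print(classify_desert(4_000, 800_000))
```

The first example falls in the stable class and the second in the variable class.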
To highlight the robustness of this partitioning of gene deserts, we also applied phylogenetic hidden Markov model (phastCons) annotation (Siepel and Haussler 2004a
Sequence conservation between species can result from purifying selection reflecting an active resistance to change or from a slower rate of neutral evolution in that region. Thus we investigated the estimated neutral substitution rates in gene deserts to ascertain whether they had a significantly slower neutral rate. The average substitutions per site in aligned ancestral repeats between human and mouse (tAR) has been used as a good estimate of the average substitutions per neutral site (Waterston et al. 2002
Inferences on the biological function of gene deserts
About 52% of all gene deserts are separated from another gene desert by at most 1 Mb and three genes. In particular, 33% (56 of 172) of stable gene deserts are paired in this manner with a stable partner, in what we call a conjoined stable gene desert. The genes interspersed between these conjoined stable gene deserts represent a unique class of loci that have evolved in a largely noncoding genomic environment from the times preceding the speciation event of mammals and birds. GO functional characterization of these genes indicates an enrichment in transcriptional gene regulatory functions and depletion in the response to stimulus category (P < 0.001). Other gene products function in skeletal development (BMP2), electron transport (COX7A3), muscle development (MEF2C), calcium ion binding (DGKB), apoptosis (FKSG2), and cell cycle (DBC1). Many of these genes are known or suspected to be involved in critical developmental steps or essential biochemical processes in vertebrates. The observed bias in genes in these interdesert regions indicates that noncoding elements regulating transcription of the transcription factors are kept under elevated levels of purifying selection throughout the evolution of vertebrates.
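The pairing criterion (at most 1 Mb and three genes between two deserts) can be sketched as follows. The sketch assumes that only neighboring deserts on the same chromosome are compared and that a gene counts as interspersed when it lies entirely between the two deserts; both choices, along with the function name and the toy coordinates, are assumptions rather than details given in the text.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open, in base pairs


def conjoined_pairs(
    deserts: List[Interval],
    genes: List[Interval],
    max_gap_bp: int = 1_000_000,
    max_genes: int = 3,
) -> List[Tuple[Interval, Interval]]:
    """Return neighboring desert pairs separated by a short, gene-poor gap."""
    ordered = sorted(deserts)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        gap_start, gap_end = left[1], right[0]
        if gap_end - gap_start > max_gap_bp:
            continue  # too far apart to be paired
        # Count genes lying entirely between the two deserts.
        between = sum(1 for s, e in genes if s >= gap_start and e <= gap_end)
        if between <= max_genes:
            pairs.append((left, right))
    return pairs


# Toy layout (hypothetical): two deserts flanking two genes within 200 kb,
# and a third desert more than 2 Mb away.
toy_deserts = [(0, 700_000), (900_000, 1_700_000), (4_000_000, 4_800_000)]
toy_genes = [(700_000, 800_000), (820_000, 900_000), (1_700_000, 1_800_000)]
print(conjoined_pairs(toy_deserts, toy_genes))
```

Only the first two deserts are reported as a conjoined pair; the third is too distant from its neighbor.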
Robustness of the dichotomy of gene deserts
Although it is not possible to reliably determine whether a conserved element has regulatory activity using current computational techniques, Kolbe et al. (2004
By studying the average length of h/m and h/c ECRs, we found that ECRs in gene deserts are longer than those found in regular intergenic regions; the average h/m ECR length in gene deserts is 265 bp, whereas that in regular intergenic regions is 224 bp. Human and chicken alignments reveal even longer ECRs in gene desert regions, with an average h/c ECR of 282 bp, but shorter ECRs (222 bp) in the regular intergenic intervals. This difference is even more evident when stable gene deserts are considered; h/m ECRs average 288 bp in these regions, and h/c ECRs span 304 bp on average. In conjunction with our recent observation that a substantial fraction of known functional h/m noncoding ECRs (ncECRs) are >350 bp (Ovcharenko et al. 2004a
UTR conservation is amplified next to gene deserts
This approximately fourfold difference between h/m ncECRs and h/m UTR-ECRs that are also conserved in chicken suggests that an increased selective pressure applies to UTRs and that functional elements lie within the conserved UTRs. For example, h/c conserved UTRs might preferably indicate genes with REs embedded in their untranslated regions, including potential enhancers or sequences involved in posttranscriptional regulatory mechanisms (Pesole et al. 2002
Stable gene deserts are linked to neighboring genes
Remarkably, only two of the 172 identified human stable gene deserts are interrupted by a synteny breakpoint in h/c alignments (see Methods); four other deserts could not be reliably mapped to the chicken genome due to uncertainties associated with the current chicken sequence assembly. The remaining 166 human stable gene deserts appear to be conserved as single intact segments in chicken. The regions of contiguous ECR conservation spanned >80% of the length for 95% of these 166 stable gene deserts. Given the high frequency of syntenic rearrangements detected by human-chicken sequence alignments overall (ICGSC 2004 Dramatic differences in the density of synteny breakpoints were also observed between stable gene deserts, gene-rich regions, and average intergenic regions (Fig. 5). Interestingly, the density of synteny breakpoints was very high in gene-rich regions relative to the genome average for both h/m and h/c comparisons. One explanation may be that in sharp contrast to stable gene deserts, gene-rich regions have possibly evolved as hot spots of chromosomal rearrangements both before and after the primate-rodent radiation. However, these data also might suggest that the genes embedded within gene-rich segments are not as likely to be functionally linked to distant REs as are loci found in stable gene deserts.
Because variable gene deserts align poorly to the chicken genome, we could not reliably ascertain the frequency of syntenic h/c breakpoints in them. Interestingly, the frequency of h/m breakpoints is only slightly higher in variable deserts than in stable deserts (0.014 versus 0.01 per Mb); both are roughly 10-fold lower than the rates in gene-rich regions (0.16) and average in the genome (0.09).
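The fold differences implied by these h/m breakpoint densities can be checked with a few lines of arithmetic. The rates are the per-Mb values quoted in the paragraph above; the dictionary keys and the function name are illustrative.

```python
# h/m synteny breakpoint densities (breakpoints per Mb) quoted in the text.
rates_per_mb = {
    "stable deserts": 0.01,
    "variable deserts": 0.014,
    "gene-rich regions": 0.16,
    "genome average": 0.09,
}


def fold_difference(reference: str, region: str, rates=rates_per_mb) -> float:
    """How many times higher the breakpoint density is in `reference` than in `region`."""
    return rates[reference] / rates[region]


for region in ("stable deserts", "variable deserts"):
    for reference in ("gene-rich regions", "genome average"):
        ratio = fold_difference(reference, region)
        print(f"{reference} vs {region}: {ratio:.1f}-fold")
```

The four ratios fall between about 6-fold and 16-fold, so "roughly 10-fold" summarizes their order of magnitude rather than a single value.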
Identification of gene deserts in the chicken and mouse genomes
An interesting and unique feature of the chicken genome is an abundance of microchromosomes varying in size from 1.0 to 20.6 Mb. One might expect these to be depleted of long gene deserts, given the small size of the microchromosomes and also the possibility that microchromosomes may have evolved through multiple rearrangement events, while stable gene deserts tend to maintain their structural integrity and lack chromosomal breaks. In contrast to this expectation, we did not observe a decrease in the density or size of stable gene deserts on microchromosomes (Figs. 6, 7); rather the density of stable gene deserts was actually slightly higher in microchromosomes than in other chromosomal categories (Fig. 7). Thus, similarly to the pattern seen for human gene deserts (Fig. 1), the distribution of stable gene deserts in the chicken genome is largely independent of chromosome size. Also, the level of coverage of microchromosomes by stable gene deserts suggests that stable gene deserts do not have an obvious bias against appearance of synteny breaks in the surrounding regions, such as those that define the ends of these unusually small avian chromosomes.
Gene deserts are large intergenic regions that collectively cover 25% of the human genome. We show that they have distinct evolutionary histories and sequence signatures that set them apart from the rest of the genome. In particular, different types of repetitive elements are not uniformly represented; human gene deserts are enriched in LINE elements, while regular intergenic regions have preferably accumulated SINE elements. These data are compatible with previous studies that have shown differences in repeat content in gene-rich and gene-poor domains (Medstrand et al. 2002 Comparative sequence analysis of the human gene deserts and orthologous chicken regions effectively separates gene deserts into two categories: stable and variable. Stable gene deserts display high levels of sequence similarity in human and chicken, while the variable deserts appear to be specific to the mammalian lineage. Stable gene deserts display lower repeat density and an amount of h/m sequence conservation comparable to that of the gene-rich regions of the human genome, suggesting that considerable degrees of purifying pressure are acting over these stable gene deserts. A third of the stable gene deserts are conjoined; i.e., they cluster in pairs surrounding a small number of genes. These conjoined deserts create long loci in the genome with minimum gene density, which are much more effectively preserved throughout the evolution of vertebrates than the rest of the genome. Perhaps not surprisingly, the majority of genes that are either flanked by stable gene deserts or are neighboring these highly conserved intervals are functionally related to core biochemical processes such as regulation of transcription, skeletal and muscle development, DNA binding, and regulation of metabolism.
The density of h/f ECRs is negligibly small across variable gene deserts and is simultaneously strongly elevated in stable gene deserts, suggesting a separation in the biological function and evolutionary importance for these two categories of gene deserts. Stable gene deserts are thus prime candidates for regions with key distant gene REs in the human genome. The function of variable gene deserts is more ambiguous. They possibly represent recently evolved regions that have not yet been fixed; alternatively they may lack important function and represent genomic "junkyards." This dichotomy potentially reconciles the apparent disparity in studies showing that while certain human gene deserts are rich in gene REs (Nobrega et al. 2003 In support of the idea that stable gene deserts are enriched in long-range regulators, we detected a threefold higher density of computationally predicted REs in stable gene deserts than in the variable gene desert regions. The syntenic stability of stable gene deserts also suggests that distinct types of evolutionary events have shaped gene deserts and gene-rich regions. While gene-rich regions accumulate synteny breakpoints twice as fast as the average intergenic regions, stable gene deserts are depleted of synteny breakpoints. Ninety-six percent of stable gene deserts are represented as a single syntenic block in the genomes of humans, mice, and chicken despite their large size. The almost absolute preservation of chromosomal integrity of stable deserts suggests that the regulation of genes flanking them differs from that in gene-rich regions. We hypothesize that genes flanking stable gene deserts are most likely to be associated with distant gene REs that cannot be separated from coding sequences by recombination events, while the regulation of the genes within gene-rich genomic regions typically takes place through promoters and/or intronic elements. 
Strong enrichment of the h/c UTR conservation of genes flanking gene deserts suggests that these genes might require evolutionary preservation of both transcriptional and posttranscriptional control. By using contiguous synteny relationships for the human genome with the genomes of mice and chicken, we were able to identify stable gene deserts in chicken and mice without requiring a reliable gene annotation for these two species. Human and mouse stable gene deserts are very similar in length, and the difference in length between specific human and chicken gene deserts agrees with the human genome expansion coefficient. The uniform expansion of individual stable gene deserts over the course of mammalian evolution implies that the function of distant REs is largely independent of the absolute distance between neighboring REs, or between the REs and the corresponding genes. However, vertebrate evolution has kept these components in a fixed relative order and at considerable distances from one another, suggesting that distant spacing of elements and their relative orders within the deserts and flanking genes is also important to function. Finally, the distribution of stable gene deserts in the chicken genome is not diminished in microchromosomes, suggesting that desert-associated chromosomal stability may disappear not far beyond the boundaries of the gene deserts and their adjacent genes. Although much remains to be explained about the function of gene deserts in general, these findings provide some potential new insights to distant regulatory activity. Our evolutionary analysis emphasizes the importance of stable gene deserts and suggests that they are likely to play a critical biological role in vertebrates.
Randomization study of the gene deserts' distribution in the human genome
If positions within known genes (exons or introns) are not counted, the human genome assembly from July 2003 consists of 51 segments that are bounded by a telomere, a centromere, or an assembly gap (unassembled region) of size >250 kb, totaling 1.75 Gb. Within those segments there are 18,134 intergenic regions; these contain a total of 286 gaps, each of under 250 kb. While some of the intergenic regions have size 0, many have considerable length; in particular, the largest of these regions measures 5.1 Mb, and 545 of the regions exceed 640 kb. To evaluate the likelihood that such a wealth of gene deserts could occur by chance, we computed empirical P-values as follows. We derived a "null" set of intergenic distances by randomly selecting positions (duplicates possible) from a set of 51 intervals having the sizes of the above-mentioned 51 genomic segments, avoiding positions corresponding to the 286 short gaps. Enough positions were selected to create 18,134 interposition distances, and we then determined (1) the maximum interposition distance and (2) the number of interposition distances >640 kb. This process was repeated 1000 times, generating 1000 maximum distances and 1000 counts of distances >640 kb. The largest interposition distance in all trials was 2,033,165 bp (so none of the 1000 maxima exceeded 2.1 Mb), and in none of the trials were there >75 interposition distances in excess of 640 kb. Thus, empirical P-values for both the observed maximum and the observed count are <10⁻⁴.
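The randomization can be sketched in a scaled-down form. This is a simplified illustration, not the procedure as run for the paper. It assumes that segment ends count as boundaries of interposition distances, so that n_distances minus the number of segments random positions yield exactly n_distances distances. It ignores the 286 short assembly gaps, and it uses toy segment sizes rather than the 51 real segments. All function names and parameters are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


def random_distances(segment_sizes: List[int], n_distances: int, rng: random.Random) -> List[int]:
    """Drop random positions on the segments and return all interposition distances."""
    n_positions = n_distances - len(segment_sizes)
    ends = list(accumulate(segment_sizes))  # cumulative segment ends
    per_segment: List[List[int]] = [[] for _ in segment_sizes]
    for _ in range(n_positions):
        x = rng.randrange(ends[-1])          # uniform over all segments; duplicates possible
        i = bisect_right(ends, x)            # index of the segment containing x
        seg_start = ends[i - 1] if i else 0
        per_segment[i].append(x - seg_start)
    distances = []
    for size, positions in zip(segment_sizes, per_segment):
        points = [0] + sorted(positions) + [size]
        distances.extend(b - a for a, b in zip(points, points[1:]))
    return distances


def run_trials(
    segment_sizes: List[int], n_distances: int, threshold_bp: int, n_trials: int, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return the per-trial maximum distance and count of distances above the threshold."""
    rng = random.Random(seed)
    maxima, counts = [], []
    for _ in range(n_trials):
        d = random_distances(segment_sizes, n_distances, rng)
        maxima.append(max(d))
        counts.append(sum(x > threshold_bp for x in d))
    return maxima, counts


def trials_reaching(observed: int, simulated: List[int]) -> int:
    """Number of trials whose statistic is at least as large as the observed value."""
    return sum(s >= observed for s in simulated)


# Toy run: 5 segments of 10 Mb, 1000 distances, 20 trials.
toy_segments = [10_000_000] * 5
maxima, counts = run_trials(toy_segments, n_distances=1000, threshold_bp=640_000, n_trials=20)
print(max(maxima), max(counts))
```

With the real inputs, the observed values (a 5.1 Mb maximum and 545 regions above 640 kb) would be passed to `trials_reaching` together with the simulated maxima and counts; the text reports that no trial reached either value.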
Identification of gene-rich regions
Identification of ECRs
Sixty-six thousand h/f ncECRs were identified as described (Ovcharenko et al. 2004a
PhastCons conservation
Predicting REs
Large blocks of conserved synteny
Using this approach, large-scale similarity of the human and mouse genomes was modeled with
We would like to thank the anonymous reviewers for their valuable comments, Francesca Chiaromonte for suggestions about the randomization study, and John Karro and Shan Yang for determining rates of neutral evolution. W.M. and R.H. were supported by NHGRI grant HG02238 and NIDDK grant DK065806; G.G.L. was supported by LLNL LDRD-04-ERD-052 grant; and I.O. was in part supported by DOE SCW0345 grant. The work was performed under the auspices of the United States Department of Energy by the University of California, Lawrence Livermore National Laboratory Contract No. W-7405-Eng-48.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3015505. Article published online before print in December 2004.
7 Corresponding author [Supplemental material is available online at www.genome.org.]
Bell, A.C., West, A.G., and Felsenfeld, G. 2001. Insulators and boundaries: Versatile regulatory elements in the eukaryotic. Science 291: 447-450.
Blanchette, M., Kent, W.J., Riemer, C., Elnitski, L., Smit, A.F., Roskin, K.M., Baertsch, R., Rosenbloom, K., Clawson, H., Green, E.D., et al. 2004. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 14: 708-715.
Carter, D., Chakalova, L., Osborne, C.S., Dai, Y.F., and Fraser, P. 2002. Long-range chromatin regulatory interactions in vivo. Nat. Genet. 32: 623-626.
Dehal, P., Predki, P., Olsen, A.S., Kobayashi, A., Folta, P., Lucas, S., Land, M., Terry, A., Ecale Zhou, C.L., Rash, S., et al. 2001. Human chromosome 19 and related regions in mouse: Conservative and lineage-specific evolution. Science 293: 104-111.
Dorsett, D. 1999. Distant liaisons: Long-range enhancer-promoter interactions in Drosophila. Curr. Opin. Genet. Dev. 9: 505-514.
Gibbs, R.A., Weinstock, G.M., Metzker, M.L., Muzny, D.M., Sodergren, E.J., Scherer, S., Scott, G., Steffen, D., Worley, K.C., Burch, P.E., et al. 2004. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428: 493-521.
Greally, J.M. 2002. Short interspersed transposable elements (SINEs) are excluded from imprinted regions in the human genome. Proc. Natl. Acad. Sci. 99: 327-332.
Grover, D., Majumder, P.P., Rao, C.B., Brahmachari, S.K., and Mukerji, M. 2003. Nonrandom distribution of Alu elements in genes of various functional categories: Insight from analysis of human chromosomes 21 and 22. Mol. Biol. Evol. 20: 1420-1424.
Grover, D., Mukerji, M., Bhatnagar, P., Kannan, K., and Brahmachari, S.K. 2004. Alu repeat analysis in the complete human genome: Trends and variations with respect to genomic composition. Bioinformatics 20: 813-817.
Hardison, R.C., Roskin, K.M., Yang, S., Diekhans, M., Kent, W.J., Weber, R., Elnitski, L., Li, J., O'Connor, M., Kolbe, D., et al. 2003. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13: 13-26.
Hasse, A. and Schulz, W.A. 1994. Enhancement of reporter gene de novo methylation by DNA fragments from the
International Chicken Genome Sequencing Consortium (ICGSC). 2004. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature (in press).
Karolchik, D., Baertsch, R., Diekhans, M., Furey, T.S., Hinrichs, A., Lu, Y.T., Roskin, K.M., Schwartz, M., Sugnet, C.W., Thomas, D.J., et al. 2003. The UCSC Genome Browser Database. Nucleic Acids Res. 31: 51-54.
Kimura-Yoshida, C., Kitajima, K., Oda-Ishii, I., Tian, E., Suzuki, M., Yamamoto, M., Suzuki, T., Kobayashi, M., Aizawa, S., and Matsuo, I. 2004. Characterization of the pufferfish Otx2 cis-regulators reveals evolutionarily conserved genetic mechanisms for vertebrate head specification. Development 131: 57-71.
Kolbe, D., Taylor, J., Elnitski, L., Eswara, P., Li, J., Miller, W., Hardison, R., and Chiaromonte, F. 2004. Regulatory potential scores from genome-wide three-way alignments of human, mouse, and rat. Genome Res. 14: 700-707.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.
Medstrand, P., van de Lagemaat, L.N., and Mager, D.L. 2002. Retroelement distributions in the human genome: Variations associated with age and proximity to genes. Genome Res. 12: 1483-1495.
Nelson, C.E., Hersh, B.M., and Carroll, S.B. 2004. The regulatory content of intergenic DNA shapes genome architecture. Genome Biol. 5: R25.
Nobrega, M.A., Zhu, Y., Plajzer-Frick, I., Afzal, V., and Rubin, E.M. 2004. Megabase deletions of gene deserts result in viable mice. Nature 431: 988-993.
Nobrega, M.A., Ovcharenko, I., Afzal, V., and Rubin, E.M. 2003. Scanning human gene deserts for long-range enhancers. Science 302: 413.
Ovcharenko, I., Nobrega, M.A., Loots, G.G., and Stubbs, L. 2004a. ECR Browser: A tool for visualizing and accessing data from comparisons of multiple vertebrate genomes. Nucleic Acids Res. 32: W280-W286.
Ovcharenko, I., Stubbs, L., and Loots, G.G. 2004b. Interpreting mammalian evolution using Fugu genome comparisons. Genomics 84: 890-895.
Pesole, G., Liuni, S., Grillo, G., Licciulli, F., Mignone, F., Gissi, C., and Saccone, C. 2002. UTRdb and UTRsite: Specialized databases of sequences and functional elements of 5' and 3' untranslated regions of eukaryotic mRNAs: Update 2002. Nucleic Acids Res. 30: 335-340.
Rinchik, E.M., Carpenter, D.A., and Selby, P.B. 1990. A strategy for fine-structure functional analysis of a 6- to 11-centimorgan region of mouse chromosome 7 by high-efficiency mutagenesis. Proc. Natl. Acad. Sci. 87: 896-900.
Rubin, C.M., VandeVoort, C.A., Teplitz, R.L., and Schmid, C.W. 1994. Alu repeated DNAs are differentially methylated in primate germ cells. Nucleic Acids Res. 22: 5121-5127.
Russell, L.B., Montgomery, C.S., and Raymer, G.D. 1982. Analysis of the albino-locus region of the mouse, IV: Characterization of 34 deficiencies. Genetics 100: 427-453.
Shannon, M., Hamilton, A.T., Gordon, L., Branscomb, E., and Stubbs, L. 2003. Differential expansion of zinc-finger transcription factor loci in homologous human and mouse gene clusters. Genome Res. 13: 1097-1110.
Siepel, A. and Haussler, D. 2004a. Combining phylogenetic and hidden Markov models in biosequence analysis. J. Comput. Biol. 11: 413-428.
Siepel, A. and Haussler, D. 2004b. Phylogenetic estimation of context-dependent substitution rates by maximum likelihood. Mol. Biol. Evol. 21: 468-488.
Uchikawa, M., Takemoto, T., Kamachi, Y., and Kondoh, H. 2004. Efficient identification of regulatory sequences in the chicken genome by a powerful combination of embryo electroporation and genome comparison. Mech. Dev. 121: 1145-1158.
Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., et al. 2001. The sequence of the human genome. Science 291: 1304-1351.
Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., An, P., et al. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562.
Yang, S., Smit, A.F., Schwartz, S., Chiaromonte, F., Roskin, K.M., Haussler, D., Miller, W., and Hardison, R.C. 2004. Patterns of insertions and their covariation with substitutions in the rat, mouse, and human genomes. Genome Res. 14: 517-527.
Yoder, J.A., Walsh, C.P., and Bestor, T.H. 1997. Cytosine methylation and the ecology of intragenomic parasites. Trends Genet. 13: 335-340.
http://ecrbrowser.dcode.org; ECR Browser. http://genome.ucsc.edu; UCSC Genome database. http://hgdownload.cse.ucsc.edu/goldenPath/hg16/regPotential/; data used to generate the regulatory potential tracks.
Received July 15, 2004; accepted in revised format October 4, 2004.