Genome Res. 17:898-909, 2007 ©2007 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/07 $5.00 OPEN ACCESS ARTICLE Methods Mapping of transcription factor binding regions in mammalian cells by ChIP: Comparison of array- and sequencing-based technologies1 Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut 06520-8103, USA; 2 Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520-8114, USA; 3 Genome Institute of Singapore, Singapore 138672; 4 Department of Genetics, Yale University School of Medicine, New Haven, Connecticut 06520-8005, USA; 5 Center for Nanotechnology, NASA Ames Research Center, Moffett Field, California 94035, USA
Recent progress in mapping transcription factor (TF) binding regions can largely be credited to chromatin immunoprecipitation (ChIP) technologies. We compared strategies for mapping TF binding regions in mammalian cells using two different ChIP schemes: ChIP with DNA microarray analysis (ChIP-chip) and ChIP with DNA sequencing (ChIP-PET). We first investigated parameters central to obtaining robust ChIP-chip data sets by analyzing STAT1 targets in the ENCODE regions of the human genome, and then compared ChIP-chip to ChIP-PET. We devised methods for scoring and comparing results among various tiling arrays and examined parameters such as DNA microarray format, oligonucleotide length, hybridization conditions, and the use of competitor Cot-1 DNA. The best performance was achieved with high-density oligonucleotide arrays, oligonucleotides 50 bases (b), the presence of competitor Cot-1 DNA and hybridizations conducted in microfluidics stations. When target identification was evaluated as a function of array number, 80%–86% of targets were identified with three or more arrays. Comparison of ChIP-chip with ChIP-PET revealed strong agreement for the highest ranked targets with less overlap for the low ranked targets. With advantages and disadvantages unique to each approach, we found that ChIP-chip and ChIP-PET are frequently complementary in their relative abilities to detect STAT1 targets for the lower ranked targets; each method detected validated targets that were missed by the other method. The most comprehensive list of STAT1 binding regions is obtained by merging results from ChIP-chip and ChIP-sequencing. Overall, this study provides information for robust identification, scoring, and validation of TF targets using ChIP-based technologies.
Identification of transcription factor binding sites is essential for understanding the regulatory circuits that control cellular processes such as cell division and differentiation as well as metabolic and physiological balance. Traditionally the pursuit of transcription factor targets has exposed only a few binding regions at a time. However, recent years have witnessed several new approaches for the global mapping of transcriptional regulatory regions. Such approaches include computational methods (Bailey and Elkan 1995 Although ChIP-based technologies have demonstrated widespread utility, many experimental parameters important for enhancing the performance of ChIP have not been adequately explored for mammalian cells. Moreover, a direct comparison of ChIP-chip and ChIP sequencing has not been performed. Such information is crucial for the large number of experiments that are performed on subsets of mammalian genomes and will become even more crucial as these experiments expand to cover entire genomes.
While many microarray parameters for ChIP-chip appear to translate well from previously established microarray protocols (see, for example, Hegde et al. 2000
We explored parameters for ChIP-chip using the sequence-specific transcription factor STAT1 (Signal Transducer and Activator of Transcription). STAT1 is a cytoplasmic protein that translocates to the nucleus when cells encounter interferons or other peptide signals (for review, see Boehm et al. 1997
STAT1 ChIP-chip studies have been conducted previously on a Chromosome 22 PCR product tiling array (Hartman et al. 2005
Our STAT1 mapping studies focus on the ENCODE regions, which represent 1% (30 Mb) of the human genome (The ENCODE Project Consortium 2004
Exploring ChIP-chip performance: Longer oligonucleotides yield better signals
In the first phase of these studies, we investigated ChIP-chip performance on oligonucleotide arrays synthesized by maskless photolithography (Nuwaysir et al. 2002 1 kb final DNA size. STAT1 and its associated DNA were immunoprecipitated using an anti-STAT1 antibody. Cross-links were reversed, and the success of each immunoprecipitation was examined by PCR analysis using primers to a known STAT1-binding region in the promoter of IRF1 (Interferon Regulatory Factor 1) (Hartman et al. 2005 Using this assay, we investigated the effects of varying a number of parameters on the performance of ChIP-chip. These parameters included the type of beads used in the immunoprecipitation step (magnetic or Sepharose), various labeling technologies, and array hybridization conditions. The final ChIP and microarray conditions selected are reported in Methods. No difference in immunoprecipitation efficiency was observed using magnetic as opposed to Sepharose beads. However, signal enrichment and array uniformity were significantly improved when the hybridization solution was continuously circulated over the array surface using microfluidic chambers; thus all arrays were subjected to this procedure. We also included unlabeled Cot-1 competitor DNA in all hybridizations except as noted below.
Arrays with oligonucleotides of different lengths (25–60 bases [b]) are currently used for ChIP-chip experiments (Cawley et al. 2004
Validation of targets from the 50-b oligonucleotide array data
Signal enrichment maps are suggestive of binding regions, but in order for array performance to be properly assessed, it is essential to validate the targets identified from the ChIP-chip experiments. Therefore, we devised a scheme to measure the sensitivity and specificity of the experiments. STAT1 targets were ranked according to their signal enrichments, and a subset of targets was sampled across the rankings and tested for enrichment in STAT1 ChIP DNA by ChIP-PCR analysis. A twofold or greater enrichment in each of at least two STAT1 biological replicate ChIP-PCR experiments was chosen as a threshold for enrichment. Target validation was plotted as a function of rank order for each ChIP-chip data set. As shown in Figure 2A, targets at the top of the rank list validated as true positives, and the frequency of target validation diminishes further down the rank list. Thus, most of the first 75 targets are expected to be bona fide targets, whereas most of the regions below 100 on the rank list are negative. Extrapolation of the confirmed positives as a function of rank order for the entire list suggests that there are 124 positives in the top 200 targets listed (Table 1). This figure is expected to be an overestimation because many targets lie immediately adjacent to one another and likely represent enrichments from a single common target region. If targets are combined into 10-kb regions, then the total number of STAT1 targets is 67 for the ChIP-chip data set using the arrays with 50-b end-to-end tiling.
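The extrapolation from sampled validations to an estimated number of positives can be sketched as follows. This is a minimal illustration, assuming the rank list is divided into bins and that the validation frequency observed in each bin's ChIP-PCR sample applies to the whole bin. The `RankBin` type, the function name, and all numbers in `example_bins` are hypothetical and are not taken from the paper's tables.

```python
from dataclasses import dataclass


@dataclass
class RankBin:
    """One slice of the ranked target list (hypothetical helper type)."""
    first_rank: int  # inclusive
    last_rank: int   # inclusive
    tested: int      # targets sampled from this bin for ChIP-PCR
    validated: int   # sampled targets that passed the enrichment threshold


def estimate_true_positives(bins):
    """Scale each bin's observed validation frequency to the bin size and sum."""
    total = 0.0
    for b in bins:
        if b.tested == 0:
            raise ValueError("each bin needs at least one tested target")
        bin_size = b.last_rank - b.first_rank + 1
        total += bin_size * b.validated / b.tested
    return total


# Illustrative numbers only (assumed, not the study's data).
example_bins = [
    RankBin(first_rank=1, last_rank=50, tested=10, validated=9),
    RankBin(first_rank=51, last_rank=100, tested=10, validated=6),
    RankBin(first_rank=101, last_rank=200, tested=10, validated=2),
]
estimated = estimate_true_positives(example_bins)
```

Because adjacent targets may report the same binding region, an estimate of this kind counts ranked entries rather than distinct regions, which is why the text treats the extrapolated figure as an overestimate.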
We also compared the accuracy of target detection for the 50-b ChIP-chip data set as a function of signal enrichment. As shown in Figure 3, there is a very sharp transition at a particular signal enrichment (0.25 on a log2 scale): above this threshold most targets validate as positives, whereas below it the fraction of validated positives decreases and the fraction of false positives increases.
We next compared the accuracy of STAT1 targets identified from the 50-base array ChIP-chip data set to those identified from the 36-b array ChIP-chip data set. We selected the highest 75 ranked targets from the 50-b array data, corresponding to a false-positive rate of 0.26, and cross-referenced these with the entire list of 39 targets identified from the 36-b array data. For the 36-b arrays, only the top-ranked 39 regions had positive signals at a statistically significant cutoff. We suspect this low number is due to diminished signal on the 36-b arrays. The targets of the 50- and 36-b arrays combined into 84 distinct target regions (see "Comparison of Target Lists" in Methods); 18 were common to both lists, and most of these (eight of the 11 tested by ChIP-PCR analysis) validated as bona fide targets. The 36-b oligonucleotide array failed to identify 68% (51/75) of the targets detected with the 50-b oligonucleotide array. Of these 51 targets, 27 were tested by ChIP-PCR analysis, and the majority of these (20/27) could be validated. In contrast, 15 targets were unique to the 36-b array. Seven of these 15 were tested by ChIP-PCR analysis, and none showed enrichment. If we restrict analysis of the 36-b array to the top 25 targets, thereby reducing its false-positive rate from 0.52 to 0.38, a similar trend is observed (Supplemental Table 2) and fewer targets specific to the 36-b array are identified, indicating a greater overlap of the top-ranked targets between the 50- and 36-b lists. In conclusion, based on chromosomal maps of signal enrichments (Fig. 1) and target validations, the 50-b arrays outperformed the shorter 36-b arrays under the conditions we used.
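The cross-referencing of two ranked target lists can be illustrated with a small interval-merging sketch. It assumes targets are given as `(chromosome, start, end)` tuples with inclusive coordinates and that any overlap is enough to merge two targets into one region; the actual rules used in the study are described in "Comparison of Target Lists" in Methods and may differ. The function name and the example coordinates are hypothetical.

```python
def compare_target_lists(list_a, list_b):
    """Merge overlapping targets from two lists and count shared/unique regions."""
    tagged = [(c, s, e, "A") for c, s, e in list_a]
    tagged += [(c, s, e, "B") for c, s, e in list_b]
    tagged.sort(key=lambda t: (t[0], t[1], t[2]))

    merged = []  # each entry: [chrom, start, end, set of contributing lists]
    for chrom, start, end, source in tagged:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the current region: extend it and record the source.
            merged[-1][2] = max(merged[-1][2], end)
            merged[-1][3].add(source)
        else:
            merged.append([chrom, start, end, {source}])

    return {
        "regions": len(merged),
        "common": sum(1 for m in merged if m[3] == {"A", "B"}),
        "only_a": sum(1 for m in merged if m[3] == {"A"}),
        "only_b": sum(1 for m in merged if m[3] == {"B"}),
    }


# Made-up coordinates for illustration.
targets_50b = [("chr5", 100, 200), ("chr5", 500, 600), ("chr7", 50, 80)]
targets_36b = [("chr5", 150, 250), ("chr7", 300, 400)]
summary = compare_target_lists(targets_50b, targets_36b)
```

The counts returned always satisfy `regions == common + only_a + only_b`, which matches the bookkeeping in the comparison above: 84 distinct regions made up of 18 common regions, 51 specific to the 50-b list, and 15 specific to the 36-b list.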
Comparison of oligonucleotide and PCR product arrays
The top 75 ranked targets from the 50-b oligonucleotide array data and the top 75 ranked targets from the PCR product array data were then merged to form a union of regions that could be used as the basis for comparing the two ChIP-chip data sets (Table 2A; see "Comparison of Target Lists" in Methods). Six targets overlapped between the 50-b oligonucleotide array and the PCR product array target lists (Table 2B). The different platforms were compared as a function of their rank order on the target lists. As shown in Figure 4, the positives at the very top of the rank order lists usually agree, and less concurrence is observed for targets with lower rankings. If we restrict analysis of the PCR product array data set to the top 33 targets, thereby reducing its false-positive rate from 0.64 to 0.40, a similar trend is observed (Supplemental Table 3).
To ascertain if targets from the PCR product arrays and the 50-b oligonucleotide arrays validate at similar rates, and to determine if the two platforms exhibit similar sensitivities and specificities, targets were selected and tested for validation across a wide range of rank orders using ChIP-PCR analysis. As shown in Figure 2B, the frequency of validated targets (i.e., the positive predictive value) from the PCR product array data was diminished relative to the 50-b oligonucleotide array data (Fig. 2A), indicating that the PCR product array data set contains more false positives. In addition, the sensitivity of the PCR product array format was lower. To investigate these differences in array performance, we examined regions that were specific to one of the target lists and that were tested for enrichment by ChIP-PCR (Table 2B). Seven targets that were identified by the PCR product array data set and validated by ChIP-PCR analysis were not present on the target list from the 50-b oligonucleotide array data set. Inspection of these regions revealed six of the seven targets contained a combination of repetitive elements and AT-rich sequences that likely resulted in low signal enrichments on the oligonucleotide arrays. In contrast, 21 targets identified from the oligonucleotide array data set and validated by ChIP-PCR analysis were not found using the PCR product arrays. Two of the 21 were adjacent to positive regions detected by the PCR product arrays, but we could not identify aspects of sequence composition that might cause the other 19 targets to escape detection in the ChIP-chip experiments performed with the PCR product arrays.
The presence of competitor Cot-1 DNA in the hybridization improves signal-to-noise
An inspection of 22 targets specific to the Cot-absent data set revealed that 13 targets had highly repetitive elements in their regions and eight targets had segmental duplications. When the same sliding window scoring method was applied to the Cot-absent and Cot-present data sets, a significant number of additional targets was found in the Cot-absent ranked target list (181 targets) relative to the Cot-present ranked target list (three targets) at the equivalent threshold of 3.5-fold enrichment. Importantly, validation of targets revealed a much higher accuracy for the STAT1-associated regions identified in the presence of Cot-1 DNA than in the absence of Cot-1 DNA. Targets specific to either the Cot-present or Cot-absent data sets were sampled from among the top 75 ranked targets identified (Table 3) and tested for enrichment by ChIP-PCR analysis. The experiment containing Cot-1 DNA detected 15 validated positive regions specific to that data set at a false-positive rate of 0.25, whereas the experiment lacking Cot-1 DNA detected only two validated positive regions specific to that data set at a false-positive rate of 0.83 (Table 3). Thus, more accurate results can be obtained through inclusion of Cot-1 DNA in ChIP-chip hybridizations.
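For readers unfamiliar with sliding window scoring, the general idea can be sketched as below. The study's actual scoring method is described in its Supplemental Methods and is not reproduced here; this sketch assumes probes ordered by genomic position, a fixed window measured in probes, the median log2 ratio as the window score, and the 3.5-fold threshold mentioned above converted to a log2 cutoff. The window size and function names are assumptions chosen for illustration.

```python
import math
from statistics import median


def sliding_window_scores(log2_ratios, window=5):
    """Median log2 ratio for every run of `window` consecutive probes."""
    if window < 1 or window > len(log2_ratios):
        raise ValueError("window must be between 1 and the number of probes")
    return [
        median(log2_ratios[i:i + window])
        for i in range(len(log2_ratios) - window + 1)
    ]


def enriched_windows(log2_ratios, window=5, fold_threshold=3.5):
    """Indices of windows whose median meets the fold-enrichment threshold."""
    cutoff = math.log2(fold_threshold)  # fold change expressed on a log2 scale
    scores = sliding_window_scores(log2_ratios, window)
    return [i for i, score in enumerate(scores) if score >= cutoff]


# Illustrative probe values (assumed): a broad peak and a single-probe spike.
broad_peak = [0.1, 0.0, 2.5, 2.2, 2.0, 1.9, 0.2, 0.1]
single_spike = [0.0, 0.0, 3.0, 0.0, 0.0]
peak_hits = enriched_windows(broad_peak, window=3)
spike_hits = enriched_windows(single_spike, window=3)
```

With a median-based window, a lone high probe does not produce a call, whereas several adjacent enriched probes do. Other summary statistics would behave differently, so the choice of median here is only one option.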
The value of adding more biological replicate experiments
Researchers typically perform multiple biological replicate experiments for microarray data sets, although how replicates improve the accuracy and reproducibility of target identification has not previously been analyzed systematically. We therefore examined the value of performing multiple experiments. The top 50, 100, and 200 targets were taken from six biological replicates hybridized with Cot-1 DNA to six arrays with 50 b every 38-b spacing (Supplemental Table 1). As noted above, the top 50 targets have the highest frequency of enrichment in ChIP-PCR validations, and those near the bottom of the list (e.g., ranked 150–200) have the lowest frequency of positive validation. The efficiency of target detection from among all targets identified in this Cot-present data set was determined using a single biological replicate on one array, and then progressively increasing the number of biological replicates, with each replicate hybridized to a separate array. As shown in Figure 6, 50%–70% of all targets from the six-array Cot-present data set can be identified even with a single array. As expected, a higher fraction of the targets are identified using the top 50 target list relative to the top 200 target list since the largest fraction of positive regions resides at the highest rankings as shown in the ChIP-PCR validation studies. The analysis of three independent biological replicates, which is typical for most published ChIP-chip experiments, identified most (80%–86%) of the final targets included in the six-array data set.
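The replicate analysis described above amounts to asking what fraction of the final target list is recovered when only the first k replicates are used. The sketch below assumes each replicate provides one score per candidate region and that replicates are combined by averaging scores before ranking; the paper does not state its combination rule in this section, so that choice, the function names, and the example scores are all assumptions.

```python
def top_targets(scores_by_replicate, n_top):
    """Indices of the n_top regions ranked by mean score across replicates."""
    n_regions = len(scores_by_replicate[0])
    n_reps = len(scores_by_replicate)
    means = [
        sum(rep[i] for rep in scores_by_replicate) / n_reps
        for i in range(n_regions)
    ]
    ranked = sorted(range(n_regions), key=lambda i: means[i], reverse=True)
    return set(ranked[:n_top])


def recovery_by_replicate_count(scores_by_replicate, n_top):
    """Fraction of the all-replicate top list recovered using 1..N replicates."""
    reference = top_targets(scores_by_replicate, n_top)
    fractions = []
    for k in range(1, len(scores_by_replicate) + 1):
        subset = top_targets(scores_by_replicate[:k], n_top)
        fractions.append(len(subset & reference) / n_top)
    return fractions


# Made-up scores: three replicates, five candidate regions.
example_scores = [
    [5.0, 1.0, 4.0, 0.5, 0.2],
    [4.0, 3.0, 1.0, 0.4, 0.3],
    [6.0, 3.5, 1.5, 0.2, 0.1],
]
fractions = recovery_by_replicate_count(example_scores, n_top=2)
```

By construction the last value is 1.0, since the full set of replicates defines the reference list. A fuller analysis would also average over different orderings or subsets of replicates rather than using only the first k.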
Comparison of ChIP-chip to ChIP-PET
In ChIP sequencing, a ChIP-enriched fragment is represented by either a single internal 20-base-pair (bp) tag sequence (ChIP-STAGE) or a 36-bp paired-end ditag (ChIP-PET, in which the ditag is constructed from 18-bp 5' and 3' signature sequences extracted from each end of the ChIP DNA fragment, thus demarcating the full length of the sonicated ChIP fragment). The binding sites are then deduced by the frequency with which tags are extracted from ChIP DNA fragments relative to the background expectation. The advantage of using paired-end ditags over single tags is that the PETs mark the start and end of each ChIP fragment. When PET fragments are mapped to the reference genome (e.g., the NCBIv35 [hg17] build of the human genome sequence), the identity of each individual ChIP fragment can be inferred by the PET mapping location, and binding sites can be accurately defined by the common regions within clusters of overlapping PETs. Furthermore, duplicate PET fragments arising from fragment amplification events during cloning can be easily distinguished and removed by treating multiple PETs that map to an identical location as a single fragment. In all, 725,877 PETs were sequenced from STAT1 ChIP DNA isolated from IFNG-induced cells. Sixty-six percent of the PETs map to unique locations in the genome and represent 327,838 distinct ChIP DNA fragments ranging from 0.1 to 6 kb. Of these unique paired-end ditags, only those PET fragments with 5'- and 3'-ends <6 kb apart were considered. The PET-defined ChIP fragments that overlapped with each other were grouped into clusters: clusters of two overlapping fragments are termed PET-2, clusters of three overlapping fragments PET-3, clusters of three or more overlapping fragments PET3+, and so on. The frequency of each cluster throughout the ENCODE regions is shown in Table 4.
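The duplicate removal and clustering steps described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' mapping pipeline: PETs are represented as `(chromosome, start, end)` tuples, duplicates are PETs with identical coordinates, fragments spanning 6 kb or more are discarded, and a fragment joins a cluster if it overlaps the cluster's current extent (so chains of overlapping fragments form one cluster). The function names and example coordinates are hypothetical.

```python
def collapse_duplicates(pets):
    """Treat PETs mapping to an identical location as a single fragment."""
    return sorted(set(pets))


def cluster_pets(pets, max_span=6000):
    """Group overlapping PET-defined fragments; `count` is the PET-n value."""
    kept = [p for p in collapse_duplicates(pets) if p[2] - p[1] < max_span]
    clusters = []
    for chrom, start, end in kept:  # already sorted by chromosome and start
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            last["end"] = max(last["end"], end)
            last["count"] += 1
        else:
            clusters.append(
                {"chrom": chrom, "start": start, "end": end, "count": 1}
            )
    return clusters


# Made-up fragments: one duplicate pair, one chain of three, one oversized PET.
example_pets = [
    ("chr5", 100, 900),
    ("chr5", 100, 900),     # amplification duplicate, collapsed
    ("chr5", 500, 1500),
    ("chr5", 1400, 2000),
    ("chr5", 5000, 5600),
    ("chr5", 7000, 14000),  # ends 7 kb apart, filtered out
]
clusters = cluster_pets(example_pets)
cluster_sizes = [c["count"] for c in clusters]
```

In this example the first three distinct fragments form a PET-3 cluster and the fourth remains a singleton. Requiring that all fragments in a cluster share a common region, as the text's definition of binding sites suggests, would be a stricter rule than the chaining used here.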
The ENCODE region with the most overlapping fragments lies upstream of IRF1 and is a PET-33 cluster (Fig. 7A). Monte Carlo simulation was performed to determine the frequency of clusters expected by random chance (Table 4; see Supplemental Methods). Based on the frequency of PET clusters generated at random, more than 46% of PET-3 clusters and more than 88% of PET4+ clusters are likely to represent bona fide binding targets.
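A Monte Carlo estimate of how many clusters arise by chance can be sketched as follows. The simulation actually used in the study is described in its Supplemental Methods; the version here is a generic null model that assumes fragments of one fixed length placed uniformly at random in a single region, which is a simplification. All function names, parameters, and default values are hypothetical.

```python
import random


def count_clusters_at_least(starts, frag_len, min_size):
    """Count clusters of overlapping fragments with at least min_size members."""
    n_clusters = 0
    size = 0
    cluster_end = None
    for s in sorted(starts):
        if cluster_end is not None and s <= cluster_end:
            size += 1
            cluster_end = max(cluster_end, s + frag_len)
        else:
            if size >= min_size:
                n_clusters += 1
            size = 1
            cluster_end = s + frag_len
    if size >= min_size:
        n_clusters += 1
    return n_clusters


def expected_random_clusters(n_frags, region_len, frag_len, min_size,
                             trials=200, seed=0):
    """Average number of qualifying clusters over random placements."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    total = 0
    for _ in range(trials):
        starts = [rng.randrange(region_len - frag_len + 1)
                  for _ in range(n_frags)]
        total += count_clusters_at_least(starts, frag_len, min_size)
    return total / trials


def fraction_likely_real(observed, expected_random):
    """Share of observed clusters not accounted for by the random expectation."""
    if observed <= 0:
        raise ValueError("observed cluster count must be positive")
    return max(0.0, 1.0 - expected_random / observed)
```

The 46% and 88% figures quoted above come from the authors' own simulation. This sketch only illustrates the general form of such an estimate, in which the random expectation for a cluster size is compared with the number of clusters of that size actually observed.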
Comparison of signal maps derived from ChIP-chip and ChIP-PET data reveals appreciable agreement between the two approaches (Fig. 7), and the concurrence is highest for those targets with the highest signal (Table 5). Since the ChIP-PET sequencing experiment inherently covers all of the ENCODE regions, we only considered those 75 PET3+ clusters whose sequence was represented on the 50 b every 50-b array tile path (Supplemental Table 1) for a comparison between the two platforms. Of these 75 PET3+ clusters, there were 11 PET5+ clusters (those with the highest enrichment), nine of which were also identified in the 50 b every 50-b array data set (Table 5). For the remaining 64 PET-3 and PET-4 clusters, only five overlap the target lists for the ChIP-chip data set, giving an overall concurrence of 14 targets (Table 6).
To further investigate the targets that were unique to either the ChIP-chip or ChIP-PET target lists, validation experiments were performed. Ten of the targets identified by ChIP-PET3+ cluster regions and missed in the 50 b every 50-b array data set were selected for ChIP-PCR validation and shown to be bona fide targets (Table 6). Repetitive DNA elements appeared to obstruct the identification of six of these 10 targets in the 50 b every 50-b ChIP-chip data set. These repetitive regions had the following characteristics:
The remaining four PET3+ targets not detected by the 50-b array were missed for no apparent reason. Investigation of the 15 confirmed targets that were detected in the 50 b every 50-b array ChIP-chip data set but that were not on the PET3+ list (Table 6B) revealed that seven resided near a ChIP-PET target but were on the shoulder relative to the site of maximal signal. Five of these seven targets corresponded to the IRF1 locus, which has one of the strongest signals in the genome (Fig. 7A). Thus these array targets correspond to a single common target region. Four of the remaining eight ChIP-chip targets from the 50 b every 50-b array data set intersected PET-2 clusters; we presume increased sequencing depth would have detected these STAT1-binding regions.
We also inspected those regions that did not show enrichment by ChIP-PCR analysis (11 negatives specific to the 50 b every 50-b array data set and five negatives specific to the ChIP-PET experiment) (Table 6B) to ascertain what sequence features might contribute to the identification of these targets as false positives. Of the 11 false positives from the 50 b every 50-b array ChIP-chip data set, six are either largely or entirely comprised of simple repeats, one additional target region occurs as a segmental duplication, another lies near a strong target in the IRF1 5'-non-coding region, and no unusual features that may be uniquely attributable to ChIP-chip performance could be established for the other three. All five ChIP-PETs that were not enriched in ChIP-PCR validation experiments (Table 6) were PET-3 clusters. As indicated by the Monte Carlo simulation (Table 4),
The combination of sequenced genomes and ChIP-based technologies has inspired progress for the comprehensive detection of transcription factor binding regions in vivo. While most efforts have focused on ChIP-chip strategies, ChIP sequencing is gaining popularity as a parallel method. In this study, we performed STAT1 chromatin immunoprecipitations from IFNG-stimulated cells and used the resulting ChIP DNA to map STAT1-binding regions by both microarray hybridizations and DNA sequencing. Based on the outcome of these studies, we determined that reliable ChIP-chip results can be obtained using maskless high-density arrays containing longer rather than shorter oligonucleotides and also by including Cot-1 DNA as a competitor to improve hybridization accuracy. In cross-referencing STAT1 targets obtained by ChIP-chip with those detected by ChIP-PET, we found regions that overlapped between ChIP-chip and ChIP-PET, as well as enriched regions specific to only one of these methods. Thus the sequencing of ChIP DNA fragments is shown to be a valuable and alternative strategy for target identification.
The ChIP-chip conditions applied here for STAT1 can be extended to other DNA-interacting proteins that are constitutively present in the nucleus. In these experiments, the hybridization reference samples are either total genomic DNA or ChIP DNA prepared using normal serum. Examples of other factors we have analyzed by ChIP-chip on 50-b maskless ENCODE tiling arrays include the chromatin remodeling proteins BAF155 and BAF170, as well as the transcription factor c-Jun; the binding profiles of all three of these proteins are part of the ENCODE meta-analyses, and their tracks are available in the UCSC Browser (The ENCODE Project Consortium 2004
For the maskless array platforms, longer oligonucleotides most likely improve performance through reduced cross-hybridization and potentially stronger signals. This, in turn, should lead to more accurate measurements and thus more accurate ratios of immunoprecipitated DNA relative to control DNA. Extending this logic, PCR product arrays have even longer DNA fragments as array elements and in theory should provide superior results to oligonucleotide arrays. This is not the case, probably for several reasons. First, multiple probes on high-density oligonucleotide arrays allow for several independent measurements across a region of interest. If any individual probe performs poorly (e.g., because of secondary structure, cross-hybridization, or AT-rich regions), then sampling over multiple probes using a sliding window approach (see Supplemental Methods) can still provide useful signals. Indeed, we have found that signals generated by one or a few oligonucleotides are not usually trustworthy. Second, repetitive sequences on PCR product arrays may reduce signal-to-noise ratios. Finally, a small fraction of PCR products (5%–10%) amplify from regions other than those intended (Rinn et al. 2003 Our validation strategy involved analyzing regions sampled across a range of targets ranked by signal enrichments. By extending the validation frequency as a function of rank, we can extrapolate and determine the sensitivity of the experiment at a particular threshold. It should be noted, however, that positives that are unable to be detected by a specific protocol cannot be assessed for sensitivity using this validation method. Nonetheless, this strategy is expected to provide the best approach available for determining these measurements. Our study reveals that ChIP-chip and ChIP-PET generally yield similar results, particularly for the strongest signals.
However, targets that are uniquely identified by one of these technologies are also captured, and many of these targets could be validated as positives by ChIP-PCR analysis. Targets exclusive to either ChIP-chip or ChIP-PET fall into several classes:
Our studies suggest that several design parameters can be modified to enhance the performances of ChIP-chip and ChIP-PET. For ChIP-chip, future generations of array design may incorporate the following improvements:
For ChIP-PET, slight modifications to the mapping algorithm should eliminate those few instances in which nearly identical ChIP fragments were double counted in determining the ChIP-PET cluster number (see example in Supplemental Fig. 2).
Another desirable feature of ChIP-PET is that it is inherently whole genome and can theoretically find all targets present in genomic sequences. Currently both ChIP-PET and whole-genome ChIP-chip are expensive because of the considerable cost of high-throughput sequencing and whole-genome oligonucleotide arrays. However, both of these technologies are expected to exhibit dramatic decreases in cost in the near future as new sequencing technologies become available (Margulies et al. 2005
STAT1 chromatin immunoprecipitations
STAT1 ChIP samples were prepared from IFNG-stimulated HeLa S3 cells, and ChIP DNA quality was verified as previously described (Hartman et al. 2005 1 kb in size. Clarified lysates were incubated overnight at 4°C with anti-STAT1 p91 (C-24) rabbit polyclonal antibody (Santa Cruz Biotechnology #sc-345). Protein–DNA complexes were precipitated with RIPA-equilibrated protein A agarose beads (Upstate #16-156), and immunoprecipitates were washed three times in 1x RIPA, once in 1x PBS, and then eluted from the beads by addition of 1% SDS, 1x TE (10 mM Tris-Cl at pH 7.6, 1 mM EDTA at pH 8), and incubation for 10 min at 65°C. Cross-links were reversed overnight at 65°C. All samples were purified by treatment first with 200 µg/mL RNase A (QIAGEN #19101) for 1 h at 37°C, then with 200 µg/mL Proteinase K (Ambion #2548) for 2 h at 45°C, followed by extraction with phenol:chloroform:isoamyl alcohol and precipitation at −70°C with 0.1 volume of 3 M sodium acetate, 2 volumes of 100% ethanol, and 1.5 µL of pellet paint coprecipitant (Novagen #69049-3). ChIP DNA prepared from 1 × 10^8 cells was resuspended in 50 µL of ultrapure water (GIBCO-Invitrogen #10977-015).
ChIP sample preparation and labeling
For PCR product arrays (gift of Bing Ren, UCSD) and maskless arrays with 50 b every 50-b spacing and 36 b every 36-b spacing (both oligo length arrays manufactured by NASA Ames Research Center), ChIP DNA from 1 × 10^8 cells was random primed with Klenow (enzyme and primers from BioPrime DNA Labeling System; Invitrogen #18094-011), and Aminoallyl-dUTP (Sigma #A0410) was incorporated. Next, Alexa Fluor dyes (Invitrogen #A32755; Alexa647 for ChIP DNA isolated from IFNG-stimulated cells and Alexa555 for ChIP DNA isolated from unstimulated cells) were coupled to the Aminoallyl-dUTP. Coupling reactions were terminated with hydroxylamine. Alexa555- and Alexa647-coupled ChIP DNA samples were combined and recovered using a CyScribe GFX Purification Kit (Amersham #27-9606-02) according to the manufacturer's protocol. The recovered probe was further purified by ethanol precipitation with 0.1 volume of 3 M sodium acetate (pH 5.2).
For maskless arrays (Nuwaysir et al. 2002
Microarray hybridizations
ChIP-PET experiment
STAT1 target validations
Comparison of target lists
The authors are grateful to Bing Ren (UCSD) for sharing the PCR product arrays. We thank Janine Mok and Mike Hudson for critical reading of the manuscript. Elsa Eysteinsdottir and Chloe Leplar of NimbleGen Systems of Iceland, LLC provided expert microarray support. This work was funded by NIH ENCODE grant HG003156.
6 These authors contributed equally to this work.
7 Present addresses: PDL BioPharma, Inc., 34801 Campus Drive, Fremont, CA 94555, USA;
8 Stockholm Bioinformatics Center, AlbaNova University Center, Stockholm University, SE-10691 Stockholm, Sweden.
E-mail michael.snyder{at}yale.edu; fax (203) 432-6161. [Supplemental material is available online at www.genome.org.] Article is online at http://www.genome.org/cgi/doi/10.1101/gr.5583007
Bailey, T.L. and Elkan, C. 1995. The value of prior knowledge in discovering motifs with MEME. Proc. Int. Conf. Intell. Syst. Mol. Biol. 3: 21–29.
Bertone, P., Trifonov, V., Rozowsky, J.S., Schubert, F., Emanuelsson, O., Karro, J., Kao, M.Y., Snyder, M., and Gerstein, M. 2006. Design optimization methods for genomic DNA tiling arrays. Genome Res. 16: 271–281.
Boehm, U., Klamp, T., Groot, M., and Howard, J.C. 1997. Cellular responses to interferon-
Boyer, L.A., Lee, T.I., Cole, M.F., Johnstone, S.E., Levine, S.S., Zucker, J.P., Guenther, M.G., Kumar, R.M., Murray, H.L., Jenner, R.G., et al. 2005. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122: 947–956.
Bromberg, J. and Chen, X. 2001. STAT proteins: Signal transducers and activators of transcription. Methods Enzymol. 333: 138–151.
Buck, M.J. and Lieb, J.D. 2004. ChIP-chip: Considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments. Genomics 83: 349–360.
Cawley, S., Bekiranov, S., Ng, H.H., Kapranov, P., Sekinger, E.A., Kampa, D., Piccolboni, A., Sementchenko, V., Cheng, J., Williams, A.J., et al. 2004. Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116: 499–509.
Chen, J. and Sadowski, I. 2005. Identification of the mismatch repair genes PMS2 and MLH1 as p53 target genes by using serial analysis of binding elements. Proc. Natl. Acad. Sci. 102: 4813–4818.
DeRisi, J., Penland, L., Brown, P.O., Bittner, M.L., Meltzer, P.S., Ray, M., Chen, Y., Su, Y.A., and Trent, J.M. 1996. Use of a cDNA microarray to analyse expression patterns in cancer. Nat. Genet. 14: 457–460.
The ENCODE Project Consortium. 2004. The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306: 636–640.
The ENCODE Project Consortium. 2007. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature (in press).
Euskirchen, G., Royce, T.E., Bertone, P., Martone, R., Rinn, J.L., Nelson, F.K., Sayward, F., Luscombe, N.M., Miller, P., Gerstein, M., et al. 2004. CREB binds to multiple loci on human chromosome 22. Mol. Cell. Biol. 24: 3804–3814.
Hartman, S.E., Bertone, P., Nath, A.K., Royce, T.E., Gerstein, M., Weissman, S., and Snyder, M. 2005. Global changes in STAT target selection and transcription regulation upon interferon treatments. Genes & Dev. 19: 2953–2968.
Hegde, P., Qi, R., Abernathy, K., Gay, C., Dharap, S., Gaspard, R., Earle-Hughes, J., Snesrud, E., Lee, N., and Quackenbush, J. 2000. A concise guide to cDNA microarray analysis. Biotechniques 29: 548–562.
Horak, C.E., Mahajan, M.C., Luscombe, N.M., Gerstein, M., Weissman, S.M., and Snyder, M. 2002. GATA-1 binding sites mapped in the
Hug, B.A., Ahmed, N., Robbins, J.A., and Lazar, M.A. 2004. A chromatin immunoprecipitation screen reveals protein kinase c
Impey, S., McCorkle, S.R., Cha-Molstad, H., Dwyer, J.M., Yochum, G.S., Boss, J.M., McWeeney, S., Dunn, J.J., Mandel, G., and Goodman, R.H. 2004. Defining the CREB regulon: A genome-wide analysis of transcription factor regulatory regions. Cell 119: 1041–1054.
Kim, J., Bhinge, A.A., Morgan, X.C., and Iyer, V.R. 2005a. Mapping DNA–protein interactions in large genomes by sequence tag analysis of genomic enrichment. Nat. Methods 2: 47–53.
Kim, T.H., Barrera, L.O., Zheng, M., Qu, C., Singer, M.A., Richmond, T.A., Wu, Y., Green, R.D., and Ren, B. 2005b. A high-resolution map of active promoters in the human genome. Nature 436: 876–880.
Lee, T.I., Jenner, R.G., Boyer, L.A., Guenther, M.G., Levine, S.S., Kumar, R.M., Chevalier, B., Johnstone, S.E., Cole, M.F., Isono, K.I., et al. 2006. Control of developmental regulators by Polycomb in human embryonic stem cells. Cell 125: 301–313.
Levy, D.E. and Darnell Jr., J.E. 2002. Stats: Transcriptional control and biological impact. Nat. Rev. Mol. Cell Biol. 3: 651–662.
Liu, X., Brutlag, D.L., and Liu, J.S. 2001. BioProspector: Discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. Pac. Symp. Biocomput. 127–138.