Vol 13, Issue 2, 308-312, February 2003
METHODS
Identification and Functional Analysis of Human Transcriptional Promoters
Nathan D. Trinklein1,
Shelley J. Force Aldred1,
Alok J. Saldanha and
Richard M. Myers2
Department of Genetics, Stanford University School of Medicine,
Stanford, California 94305-5120, USA
ABSTRACT
Genomic and full-length cDNA sequences provide opportunities for
understanding human gene structure and transcriptional regulatory
elements. The simplest regulatory elements to identify are promoters,
as their positions are dictated by the location of transcription start
sites. We aligned full-length cDNA clones from the Mammalian Gene
Collection to the human genome rough draft sequence to estimate the
start sites of more than 10,000 human transcripts. We selected genomic
sequence just upstream from the 5' end of these cDNA sequences and
designated these as putative promoters. We assayed the functions of 152
of these DNA fragments, chosen at random from the entire set, in a
luciferase-based transfection assay in four human cultured cell types.
Ninety-one percent of these DNA fragments showed significant
transcriptional activity in at least one of the cell lines, whereas
89% showed activity in at least two of the lines. We analyzed the
distributions of strengths of these promoter fragments in the different
cell types and identified likely alternative promoters in a large
fraction of the genes. These data indicate that this approach is an
effective method for predicting human promoters and provide the first
set of functional data collected in parallel for a large set of human
promoters.
[Supplemental material is available online at
www.genome.org and http://www-shgc.stanford.edu/myerslab/.]
Gene expression in eukaryotes is a highly coordinated
process involving regulation at many different
levels. The regulation of transcription initiation is an important, and
often the rate-limiting, step in this process. Although several types of
cis-acting DNA sequence elements contribute to this
regulation, the simplest elements to locate may be promoters, as they
are located just upstream from transcription start sites. Until
recently, most functional studies of promoters were conducted on a
gene-by-gene basis, yielding such data sets as the Eukaryotic Promoter
Database (http://www.epd.isb-sib.ch/), which contains 300 human
promoters. There have also been recent attempts to identify promoters
on a large scale with strictly computational methods (Davuluri et al.
2001; Ohler and Niemann 2001; Down and Hubbard 2002).
Due to the availability of the draft sequence of the human genome and
full-length cDNA libraries, an alternative strategy to identify
promoters would be to align full-length cDNA sequences to the human
genome sequence, predict transcription start sites, and identify the
sequences immediately upstream from these start sites. In one such
application of this approach, Suzuki and colleagues used an
oligo-capping approach to enrich for full-length cDNAs (Suzuki and
Sugano 2001). They mapped these cDNAs onto the genomic sequence to
predict transcription start sites (TSSs) and used these TSSs to infer
the location of potential promoter regions of 1031 human genes (Suzuki
et al. 2001). They compiled this information in the DataBase of Human
Transcriptional Start Sites (DBTSS;
http://elmo.ims.u-tokyo.ac.jp/dbtss/home.html), which currently
contains start sites for 8996 genes (Suzuki et al. 2002).
In this study, we used a similar approach to identify putative human
promoters, but also combined it with an experimental reporter gene
transfection assay to test whether the predicted sequences contain
promoter activity. We chose to use the Mammalian Gene Collection (MGC;
Strausberg et al. 1999) data set, which is another resource for
putative full-length human cDNA clones. Our data indicate that a very
high fraction of the full-length cDNAs in the MGC data set are directly
adjacent to or very near transcription promoters and that this assay
system, combined with simple computational analysis of genomic and cDNA
sequence information, provides a useful tool for analyzing and
annotating the human genome.
RESULTS AND DISCUSSION
The success of an approach that identifies putative transcriptional
promoters on the basis of their immediate proximity to the 5' ends of
cDNA clones clearly depends on high-quality cDNA sources that contain
full or almost full-length transcripts. Because we wanted to use the
new data set of full-length cDNAs generated by the Mammalian Gene
Collection (http://mgc.nci.nih.gov/index_html), we analyzed the MGC
sequences and compared them with the DBTSS data set. First, we compared
the lengths of every MGC clone with the corresponding clone of the same
gene from the DBTSS data set to see how much overlap and similarity in
length there were between them. We found that only 37.5% of the MGC
clones have a corresponding DBTSS clone. For the MGC clones that have a
corresponding DBTSS clone, 52% have a large transcription start site
discrepancy (>500 bp), and the MGC clone is longer on average by
1000 bp in this category. However, in the remaining group, in which
the clones have no discrepancy or a small TSS discrepancy (<500 bp),
the DBTSS clone is longer on average by only 18.8 bp (see Supplementary
Material). This analysis indicates that, for the 37.5% of cDNA clones
present in both data sets, DBTSS clones are nominally longer when the
discrepancy in TSS is small, but MGC clones are significantly longer
when the discrepancy in TSS is large. Furthermore, the large fraction
of genes represented by MGC clones that are not present in the DBTSS
indicates that the MGC is a rich source of full-length sequences for new
genes. This analysis suggests that these two sources are equally
effective for predicting TSSs.
To identify putative human transcriptional promoters, we aligned the
sequences of the 10,276 full-length MGC cDNA clones to the December,
2001 human draft sequence by using the BLAT algorithm
(http://genome.ucsc.edu/cgi-bin/hgBlat?command=start). We then selected
600 bp of genomic sequence located -550 to +50 relative to the 5' end
of each cDNA clone. If a cDNA is truly full length, we expect this
sequence to include the basal promoter and nearby upstream regulatory
elements. (These sequences and other supplementary material are
available at http://www-shgc.stanford.edu/myerslab/ and
www.genome.org.)
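The window selection described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used in this study: the function names, the 0-based half-open coordinate convention, and the choice to count the first cDNA base as the first of the 50 downstream bases are all assumptions.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_window(tss: int, strand: str,
                    upstream: int = 550, downstream: int = 50) -> Tuple[int, int]:
    """Return 0-based, half-open (start, end) genomic coordinates.

    `tss` is the 0-based genomic position of the first cDNA base
    (hypothetical convention; the first cDNA base counts as downstream).
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        # On the minus strand, "upstream" means higher genomic coordinates.
        start, end = tss - downstream + 1, tss + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(start, 0), end  # clip at the start of the sequence


def extract_promoter(chrom_seq: str, tss: int, strand: str,
                     upstream: int = 550, downstream: int = 50) -> str:
    """Slice the window and orient it in the direction of transcription."""
    start, end = promoter_window(tss, strand, upstream, downstream)
    fragment = chrom_seq[start:end]
    return reverse_complement(fragment) if strand == "-" else fragment
```

With the default arguments, both strands yield a 600-bp window, and minus-strand fragments are returned in the orientation of the transcript.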
To determine whether these DNA segments contain transcriptional
promoters, we selected 152 of these fragments at random from the list
of 10,276 and tested them for their ability to drive transcription of a
plasmid-based luciferase reporter gene in cultured cell transfection
assays. We amplified each promoter with PCR from the genome and
directionally cloned the amplified fragments into a plasmid vector
upstream from the luciferase gene. We transfected each
promoter/luciferase construct individually, along with a control
plasmid expressing the renilla gene, into four human cell lines: an
embryonic kidney cell line (293), a cervical carcinoma line (HeLa), a
hepatocyte line (HuH7), and a fibrosarcoma line (HT1080). We determined
the strengths of these putative promoters by measuring the luciferase
to renilla ratio, allowing us to control for transfection efficiency.
Analysis of duplicate experiments indicates that these data are highly
reproducible (see Methods). We normalized the values within each cell
type by the mean of the negative controls, so that the measured
strength of each promoter indicates its fold increase over negative
control activity. We defined a fragment as a promoter when its strength
value was greater than three standard deviation units above the mean of
the negative controls for each cell type independently.
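The normalization and threshold described above can be expressed as a short Python sketch for a single cell type. It is an illustration under stated assumptions, not the analysis code used in this study: the function names are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def firefly_to_renilla(luciferase: float, renilla: float) -> float:
    """Raw promoter strength: luciferase signal divided by renilla signal."""
    return luciferase / renilla


def call_promoters(ratios: Dict[str, float], neg_ratios: List[float],
                   n_sd: float = 3.0) -> Tuple[float, Dict[str, Tuple[float, bool]]]:
    """Normalize ratios to the negative-control mean and apply the threshold.

    Returns the threshold (in fold-over-background units) and, for each
    fragment, its normalized strength and whether it exceeds the threshold.
    """
    neg_mean = mean(neg_ratios)
    # Normalized negative controls have a mean of 1.0 by construction.
    neg_norm = [r / neg_mean for r in neg_ratios]
    # Assumption: sample standard deviation of the normalized negatives.
    threshold = mean(neg_norm) + n_sd * stdev(neg_norm)
    calls = {}
    for name, ratio in ratios.items():
        strength = ratio / neg_mean
        calls[name] = (strength, strength > threshold)
    return threshold, calls
```

Because the threshold is computed separately from each cell type's own negative controls, the function would be called once per cell type.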
With this conservative approach, we found that 91% of the 152 DNA
fragments drove the expression of the luciferase reporter gene in at
least one of the four cell types, and 89% drove luciferase expression
in at least two of the four cell types (Table
1). A total of 14 of the 152 upstream DNA
segments that we tested did not produce luciferase signals above the
threshold set by our negative controls in any of the four cultured cell
lines. These results could be due to a variety of reasons, including
the possibility that some MGC cDNA sequences are not full length (and
therefore the adjacent segments we chose from the genomic sequence are
within the transcribed parts of the genes). Negative results could also
come from true promoters if they are very weak, or promoters that are
inactive in the cell types we assayed. These results suggest strongly
that the vast majority of the sequenced full-length cDNA clones in the
MGC set have a 5' end very near or at a natural start site of
transcription for their corresponding gene, and that the sequences just
upstream from these 5' ends have promoter activity.
Because there are differences in TSS predictions between the MGC data
set and the data sets in RefSeq and DBTSS, we stratified the 152
experimental promoters on the basis of the TSS discrepancies (Table
2). In cases in which the MGC clone is the
same length as or longer than sequences in RefSeq or DBTSS, we
designate the TSS difference as zero. Of the 152 MGC clones, 94 were
shorter than the corresponding cDNA sequences in RefSeq or DBTSS. Of
those 94, 81 had predicted TSSs in the 5' UTR of the longest
transcript, and only 2 of these 81 failed to show promoter activity in
our assay. The remaining 13 of the 94 clones corresponded to a putative
TSS that interrupts the ORF predicted by the longest transcript.
Surprisingly, 6 of these 13 had significant promoter activity in our
assay. If these are true biological promoters, they would produce a
truncated or entirely different protein than that encoded by the
longest transcript.
Eighty-two percent of our 152 experimental promoter fragments are based
on TSSs that lie within 500 bp of the TSS predicted by the longest cDNA
available from all sources. Therefore, these fragments are likely to
contain some or all of the basal promoter elements just upstream of the
longest transcript. However, those sequences with TSS differences of
500 bp or more are important to study, as 20/28 of these (14% of all
the positives) have promoter activity in our assay. These fragments are
likely to function as alternative promoters (Ayoubi and Van de Ven
1996). For some of the cDNA clones in this group, including
dystrobrevin (DTNA) and ribosomal proteins S10 (RPS10) and L3 (RPL3),
shorter alternative
transcripts corresponding to the TSS we used have been identified
experimentally in reports by other groups or are supported by the
presence of many independently obtained transcripts in GenBank
(Sadoulet-Puccio et al. 1997; Kenmochi et al. 1998; Holzfeind et al.
1999). Interestingly, these alternative promoters are weaker, on
average, in our assay system.
Those fragments with less than a 500-bp TSS discrepancy may also
correspond to alternative promoters that regulate different
transcription start sites. For example, the human C4b-binding protein
β-chain gene (C4BPB) has two distinct mRNA species with TSSs
that differ by only 376 bp (Hillarp et al. 1993), one of which
corresponds to the TSS used in this study. In another example, the same
shorter transcripts that we used for ribosomal protein large P0
(RPLP0) are also annotated in DBTSS and RefSeq. Thus, although
it might seem prudent to disregard sequences from full-length cDNA
libraries that have longer clones in GenBank, our results indicate that
doing this would incorrectly remove a substantial number of alternative
promoters. These findings suggest that a significant number of human
genes use one or more alternative promoters. The use of alternative
splicing, along with alternative promoters, provides two mechanisms for
greatly increasing the diversity of transcripts.
We examined the strengths of all 138 positive promoters in each of the
4 cell types. Not surprisingly, these results indicate that there is a
wide range of apparent strengths among these promoters. Furthermore,
the distributions of the promoter strengths appear to be non-normal for
each cell type, and each cell type has a different distribution (Fig.
1). These results suggest that there are
multiple classes of promoters within this set. The multi-modality of
promoter strengths seen within a single cell type may be biologically
relevant, but we cannot exclude the possibility that this is an
artifact of our functional assay. The different classes may represent
promoters for which we captured only the basal promoter versus groups
for which we isolated the basal promoter and some other upstream
regulatory elements. However, we observed no obvious correlation
between promoter strengths in our functional assay and the
discrepancies in TSS predictions.

Figure 1. Distribution of promoter strengths. The distribution of the promoter
strengths for the 138 positive clones of the 152 we tested is shown at
top on a log scale in the 4 cell types tested. The number of
promoters that fall within each bin is shown on the Y-axis, and bin
boundaries are denoted on the X-axis. We calculated promoter strength
as a fold increase of luciferase activity over the negative controls in
a given cell type. The black bars indicate promoters that fall above
our threshold value for a functional promoter, and the white bars
indicate those below that threshold (see Methods). Bins that contain
both positive and negative promoters have boundaries that span the
threshold value.
The differences we observe in each individual promoter's strength in
the four cell types are less likely to be an experimental artifact,
considering that we used the same reporter construct in each cell line
and used an internal transfection control. To measure the variability
of the activity of a promoter independent of its strength, we
calculated the coefficient of variation (CV, the standard deviation
divided by the mean) for each promoter across the four cell lines.
The distribution of the CVs for the 138 fragments that showed promoter
activity had a possible minor mode in the tail containing the
highest CVs (Fig. 2).
Of the genes with promoters with the lowest CV, several are known to
exhibit widespread expression in many tissue types. Conversely, of
those promoters with the highest CV, several are known to be more
restricted in their distribution of expression. For example, the
ferritin (FTH1) transcript has been detected in 211 distinct
cell types (Unigene; http://www.ncbi.nlm.nih.gov/UniGene/), and its
promoter had a CV of 0.29 in our analysis. Additionally, mannosidase
(MAN2B1) expression has been measured in 84 distinct cell
types, and its promoter had a CV of 0.24 (Unigene). In contrast, the
kallikrein 5 (KLK5) transcript, which had a high CV (1.65), is
expressed primarily in breast, brain, and testes (Yousef and Diamandis
1999). Likewise, the synovial sarcoma (SSX4) transcript has
been detected only in the bone, foreskin, and a carcinoma cell line,
and its CV was 1.59 (Unigene). Therefore, although this reporter gene
transfection system is artificial, it appears not only to verify a
large fraction of human promoters, but also to maintain some aspects of
cell type-specific regulation.
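The CV calculation for a single promoter can be sketched as follows. This is a minimal illustration, assuming that each promoter has one normalized strength value per cell line and that the sample standard deviation is used; the function names are hypothetical and the study's own calculation may differ in these details.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence, Tuple


def coefficient_of_variation(strengths: Sequence[float]) -> float:
    """Standard deviation divided by the mean of one promoter's strengths."""
    m = mean(strengths)
    if m == 0:
        raise ValueError("mean strength is zero; CV is undefined")
    # Assumption: sample (n - 1) standard deviation.
    return stdev(strengths) / m


def rank_by_cv(strengths_by_promoter: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Sort promoters from least to most variable across cell lines."""
    cvs = {name: coefficient_of_variation(values)
           for name, values in strengths_by_promoter.items()}
    return sorted(cvs.items(), key=lambda item: item[1])
```

Dividing by the mean makes strong and weak promoters comparable, so promoters at the low end of the ranking are candidates for constitutive activity and those at the high end for cell type-specific activity.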

Figure 2. Distribution of the variance of promoter strength. We calculated the
coefficient of variation for each individual promoter in the four cell
types to estimate the variance of promoter strength. By measuring the
standard deviation relative to the mean, we compared the variation
between both strong and weak promoters. The promoters in the
left tail (low variance) and right tail (high
variance) of the distribution may be more likely to have constitutive
and cell type-specific activities, respectively.
Because this sampling indicates that at least 90% of the DNA fragments
are likely to be functional promoters, we analyzed the 600-bp sequence
of all 10,276 putative promoters that we derived from the MGC dataset.
The overall GC content of this large set of DNA fragments is 57%. We
found that 27% of these fragments contain a strict TATA-box sequence
(TATA[T/A][T/A]) and 65% include a less-strict TATA-box sequence
(TA[T/A][T/A][T/A][T/A]) (Table
3). Although the TATA-box is often located
25 to 30 bases upstream of the transcription start site, we chose to
look for the element in the entire 600 bp of each fragment because of
the uncertainty of the exact location of the transcription start site.
The percentage of sequences with a TATA-box in our dataset is smaller
than the percentage of human promoters containing a TATA-box in the
Eukaryotic Promoter Database (51% strict and 76% less strict),
indicating that TATA elements may not be as prevalent in human
promoters as suggested by the EPD. Interestingly, when we stratified
the promoters into three groups based on GC content, there were fewer
strict TATA elements than expected by chance in the 25%-45% GC
group, and there were more TATA elements than expected by chance in the
groups that are 45%-65% GC and 65%-85% GC (Table 3). We obtained
similar results when we analyzed the 152 experimental fragments,
suggesting that this smaller set is representative of the entire set.
The slight increase of promoters in the range of 45%-65% GC in our
experimental set is likely due to the bias of PCR primer design and
amplification efficiency, which are generally more successful on
sequences that are close to 50% GC (Table 3). We observe no correlation
between GC content and measured promoter strength, or between the
presence of the TATA element and promoter strength.
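The TATA-box search and GC classification described above map directly onto regular expressions. The sketch below is illustrative only: the function names are hypothetical, and it assumes that only the strand given (the sense strand of the fragment) is searched, which the text does not specify.

```python
import re
from typing import Optional

# Strict and less-strict TATA-box patterns as written in the text.
STRICT_TATA = re.compile(r"TATA[TA][TA]")
RELAXED_TATA = re.compile(r"TA[TA][TA][TA][TA]")

# GC classes used to stratify the fragments (fractions, lower bound inclusive).
GC_CLASSES = (("25%-45%", 0.25, 0.45),
              ("45%-65%", 0.45, 0.65),
              ("65%-85%", 0.65, 0.85))


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_motif(seq: str, pattern: "re.Pattern[str]") -> bool:
    """True if the pattern occurs anywhere in the fragment."""
    return pattern.search(seq.upper()) is not None


def gc_class(seq: str) -> Optional[str]:
    """Label of the GC class containing the fragment, or None if outside all."""
    gc = gc_content(seq)
    for label, low, high in GC_CLASSES:
        if low <= gc < high:
            return label
    return None
```

Every match to the strict pattern is also a match to the less-strict pattern, so the less-strict count is always at least as large as the strict count.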
The work described here provides a large collection of DNA fragments
that contain a significant fraction of the transcriptional promoters of
human genes. These data are a starting point for studying transcription
initiation of human genes on a global scale and provide information
that will be helpful in annotating the functional elements of the human
genome.
METHODS
Promoter Prediction and Negative Controls
We aligned full-length cDNA clones from the Mammalian Gene
Collection (MGC) to the human rough draft sequence by using the BLAT
algorithm (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start). We
collected the predicted promoter sequence -550 to +50 from the 5' end
of the cDNA. We also collected four sequences from the last exons of
four random brain-specific genes, and seven random nonrepetitive
intergenic sequences to serve as negative controls. The complete
predicted promoter dataset, the sequences of promoters that we tested
experimentally, and the sequences of negative controls are available as
supplementary material at http://www.shgc.stanford.edu/myerslab/ and
http://www.genome.org.
PCR Amplification and Digestion
We designed primers to amplify a 500-bp product from a set of 152
of the DNA fragments that we selected randomly from the set of 10,276
putative promoters. BglII and MluI restriction sites
were included at the 5' end of the forward and reverse primers,
respectively, to maintain the promoter orientation while cloning into
the luciferase test vector. We cleaned up the PCR reactions by using
the Qiaquick 96-well cleanup kit (QIAGEN), and digested the products to
generate sticky ends for cloning.
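Adding restriction sites to the 5' ends of a primer pair can be sketched as below. The recognition sequences for BglII (AGATCT) and MluI (ACGCGT) are standard; everything else is an assumption made for illustration, including the function names, the fixed 20-nt annealing length, and the two extra 5' bases added to help the enzymes cut near the end of the product. This is not the primer-design procedure used in this study.

```python
from typing import Tuple

BGLII_SITE = "AGATCT"  # BglII recognition sequence
MLUI_SITE = "ACGCGT"   # MluI recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primers(target: str, anneal_len: int = 20,
                   clamp: str = "GC") -> Tuple[str, str]:
    """Build forward and reverse primers with 5' restriction-site tails.

    `anneal_len` and `clamp` are hypothetical defaults chosen for the sketch.
    """
    target = target.upper()
    if len(target) < anneal_len:
        raise ValueError("target is shorter than the annealing length")
    # Forward primer anneals to the start of the target, BglII site at 5' end.
    forward = clamp + BGLII_SITE + target[:anneal_len]
    # Reverse primer is the reverse complement of the target's 3' end,
    # with the MluI site at its 5' end.
    reverse = clamp + MLUI_SITE + reverse_complement(target[-anneal_len:])
    return forward, reverse
```

Because the two primers carry different sites, the digested product can ligate into the vector in only one orientation.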
Cloning and Vector Preparation
We ligated each digested product upstream of the luciferase gene in
the pGL3-basic vector (Promega) and transformed each ligation into
Top10 chemically competent bacteria (Invitrogen). To make this as
high-throughput as possible, we performed the PCR reactions,
digestions, ligations, and transformations in 96-well format. After
plating each transformation reaction, we picked colonies, tested for
the proper insert, and then grew each clone as an overnight culture. We
then made minipreps (QIAGEN) of each culture with an individual column,
measured the concentration of DNA in each sample by UV absorption, and
diluted each to a concentration of 100 ng/µL.
Transfection and Luciferase Assay
To control for transfection efficiency, we cotransfected 100 ng of
each experimental luciferase plasmid with 8 ng of the
renilla-containing pRL-TK control plasmid (Promega) into HeLa, 293,
HuH7, and HT1080 human cells (ATCC) by using the FuGene6 Lipofectamine
Reagent (Roche). HeLa and HuH7 cells were at 80% confluence, and 293 and
HT1080 cells were at 50% confluence, at the time of transfection. After 24 h,
we prepared lysates from each transfection, and assayed luciferase and
renilla activity in a 96-well plate luminometer (Wallace) according to
the protocol in the Dual Luciferase Kit (Promega). Cells were grown in
96-well plates and the transfections and luciferase assays were done in
96-well format.
Data Analysis
Luciferase to renilla ratios are available on the Supplementary
Information page of Genome Research online. We determined the
promoter strength of each DNA fragment by calculating the ratio of
luciferase signal to renilla signal from each transfection to control
for well-to-well variation in transfection efficiency. We then divided
each promoter strength value by the mean of the negative controls in a
given cell type so that a normalized promoter strength is a measure of
its fold increase in activity over background. Finally, for each cell
type, we calculated the standard deviation of the negative control
values, and set our threshold for a positive signal as three standard
deviation units above the mean of the negatives within each cell type.
Therefore, we can be 99.7% confident that any values beyond this
threshold are positive promoter signals in that cell type.
Twelve representative fragments were tested in duplicate in each of
four cell lines to determine the degree of experimental variation in
the data. The average coefficient of variation was 0.105, indicating
that our results are highly reproducible. This suggests that most of
the differences observed in promoter strengths between cell lines are
not due to experimental variation.
Sequence Analysis
We calculated the GC content of each promoter, and then grouped
them into three classes: 25%-45%, 45%-65%, and 65%-85% GC. We
then determined the nucleotide frequency within each group and
calculated the probability of finding at least one randomly occurring
TATA element per promoter. Next, we searched our experimental and total
datasets to find the number of promoter fragments with at least one of
the TATA elements. We then used the χ² test to determine
whether the observed frequencies differed from the expected at a
significance cutoff of P < 0.05.
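One way to carry out this expected-frequency calculation is sketched below. It is an approximation under stated assumptions, not the procedure used in this study: bases are treated as independent draws from the group's nucleotide frequencies, overlapping start positions are treated as independent, only one strand is considered, and the function names are hypothetical. The χ² statistic is computed for a two-category table (fragments with and without a TATA element), which has one degree of freedom, so the critical value at P = 0.05 is 3.841.

```python
from typing import Dict, Sequence

# Each motif position lists the bases allowed at that position.
STRICT_TATA = ("T", "A", "T", "A", "TA", "TA")
RELAXED_TATA = ("T", "A", "TA", "TA", "TA", "TA")

CHI2_CRITICAL_DF1_P05 = 3.841  # chi-square critical value, 1 df, P = 0.05


def motif_probability(motif: Sequence[str], freqs: Dict[str, float]) -> float:
    """Probability that the motif matches at one fixed start position."""
    p = 1.0
    for allowed in motif:
        p *= sum(freqs[base] for base in allowed)
    return p


def prob_at_least_one(motif: Sequence[str], freqs: Dict[str, float],
                      seq_len: int = 600) -> float:
    """Approximate probability of one or more matches in a fragment.

    Assumption: start positions are treated as independent trials.
    """
    n_positions = seq_len - len(motif) + 1
    return 1.0 - (1.0 - motif_probability(motif, freqs)) ** n_positions


def chi_square_stat(observed_with: int, total: int, expected_prob: float) -> float:
    """Chi-square statistic for observed vs expected counts (two categories)."""
    exp_with = total * expected_prob
    exp_without = total - exp_with
    obs_without = total - observed_with
    return ((observed_with - exp_with) ** 2 / exp_with
            + (obs_without - exp_without) ** 2 / exp_without)
```

The nucleotide frequencies passed to these functions would be estimated separately within each GC class, so that the expected TATA frequency reflects the AT-richness of that class.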
WEB SITE REFERENCES
http://elmo.ims.u-tokyo.ac.jp/dbtss/home.html; DBTSS.
http://genome.ucsc.edu/cgi-bin/hgBlat?command=start; UCSC Genome
Bioinformatics Site, BLAT.
http://mgc.nci.nih.gov/index_html; Mammalian Gene Collection.
http://www.epd.isb-sib.ch/; Eukaryotic Promoter Database.
http://www.ncbi.nlm.nih.gov/UniGene/; Unigene.
http://www-shgc.stanford.edu/myerslab/; Supplementary Material.
Acknowledgements
We thank members of the Myers Laboratory for discussions and
support, and Jeremy Schmutz at the Stanford Human Genome Center for
helpful advice. This work was supported by the Stanford Genome Training
Program (Training Grant NIH 5 T32 HG00044 to N.D.T.), a Geraldine
Jackson Fuhrman Stanford Graduate Fellowship (to S.F.A.), and a
National Defense Science and Engineering Graduate Fellowship (to A.S.).
The publication costs of this article were defrayed in part by payment
of page charges. This article must therefore be hereby marked
"advertisement" in accordance with 18 USC section 1734 solely to
indicate this fact.
Footnotes
1 These two authors contributed equally to this work. 
2 Corresponding author. 
E-MAIL myers{at}shgc.stanford.edu; FAX (650) 725-9689.
Article and publication are at
http://www.genome.org/cgi/doi/10.1101/gr.794803.
REFERENCES
Ayoubi, T.A.Y. and Van de Ven, W.J.M. 1996. Regulation of gene expression by alternative promoters. FASEB J. 10: 453-460.
Davuluri, R.V., Grosse, I., and Zhang, M.Q. 2001. Computational identification of promoters and first exons in the human genome. Nat. Genet. 29: 412-417.
Down, T.A. and Hubbard, T.J. 2002. Computational detection and location of transcription start sites in mammalian genomic DNA. Genome Res. 12: 458-461.
Hillarp, A., Pardo-Manuel, F., Ruiz, R.R., Rodriguez de Cordoba, S., and Dahlback, B. 1993. The human C4b-binding protein β-chain gene. J. Biol. Chem. 268: 15017-15023.
Holzfeind, P.J., Ambrose, H.J., Newey, S.E., Nawrotzki, R.A., Blake, D.J., and Davies, K.E. 1999. Tissue-selective expression of α-dystrobrevin is determined by multiple promoters. J. Biol. Chem. 274: 6250-6258.
Kenmochi, N., Kawaguchi, T., Rozen, S., Davis, E., Goodman, N., Hudson, T.J., Tanaka, T., and Page, D.C. 1998. A map of 75 human ribosomal protein genes. Genome Res. 8: 509-523.
Ohler, U. and Niemann, H. 2001. Identification and analysis of eukaryotic promoters: Recent computational approaches. Trends Genet. 17: 56-60.
Sadoulet-Puccio, H.M., Feener, C.A., Schaid, D.J., Thibodeau, S.N., Michels, V.V., and Kunkel, L.M. 1997. The genomic organization of human dystrobrevin. Neurogenetics 1: 37-42.
Strausberg, R.L., Feingold, E.A., Klausner, R.D., and Collins, F.S. 1999. The mammalian gene collection. Science 286: 455-457.
Suzuki, Y. and Sugano, S. 2001. Construction of full-length-enriched cDNA libraries. The oligo-capping method. Methods Mol. Biol. 175: 143-153.
Suzuki, Y., Tsunoda, T., Sese, J., Taira, H., Mizushima-Sugano, J., Hata, H., Ota, T., Isogai, T., Tanaka, T., Nakamura, Y., et al. 2001. Identification and characterization of the potential promoter regions of 1031 kinds of human genes. Genome Res. 11: 677-684.
Suzuki, Y., Yamashita, R., Nakai, K., and Sugano, S. 2002. DBTSS: DataBase of human transcriptional start sites and full-length cDNAs. Nucleic Acids Res. 30: 328-331.
Yousef, G.M. and Diamandis, E.P. 1999. The new kallikrein-like gene, KLK-L2. Molecular characterization, mapping, tissue expression, and hormonal regulation. J. Biol. Chem. 274: 37511-37516.
Received September 10, 2002;
accepted in revised format December 3, 2002.
13:308-312 © 2003 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/03 $5.00
