Published online before print April 12, 2002, 10.1101/gr.225502.
Vol. 12, Issue 5, 832-839, May 2002
METHODS
ABSTRACT
Identifying transcriptional regulatory elements represents a significant challenge in annotating the genomes of higher vertebrates. We have developed a computational tool, rVISTA, for high-throughput discovery of cis-regulatory elements that combines clustering of predicted transcription factor binding sites (TFBSs) and the analysis of interspecies sequence conservation to maximize the identification of functional sites. To assess the ability of rVISTA to discover true positive TFBSs while minimizing the prediction of false positives, we analyzed the distribution of several TFBSs across 1 Mb of the well-annotated cytokine gene cluster (Hs5q31; Mm11). Because a large number of AP-1, NFAT, and GATA-3 sites have been experimentally identified in this interval, we focused our analysis on the distribution of all binding sites specific for these transcription factors. The exploitation of the orthologous human-mouse dataset resulted in the elimination of >95% of the ~58,000 binding sites predicted on analysis of the human sequence alone, whereas it identified 88% of the experimentally verified binding sites in this region.
INTRODUCTION
A major challenge of the postgenome-sequencing era is decoding the regulatory networks underlying gene expression. In eukaryotes, modulation of gene expression is achieved through the complex interaction of regulatory proteins (trans-factors) with specific DNA regions (cis-acting regulatory sequences). Intensive efforts over several decades have identified numerous regulatory proteins, transcription factors (TFs), whose sequence-specific DNA binding activity is central to transcriptional regulation. Traditionally, the DNA binding specificity of many TFs has been determined experimentally, primarily with in vitro techniques such as DNase I footprinting and electromobility shift assay (EMSA) (Rooney et al. 1995). Recently, alternative techniques such as expression DNA microarrays, in silico oligonucleotide binding, and phylogenetic footprinting have been adopted to identify DNA targets for TFs (Fickett and Wasserman 2000).
Unfortunately, although the binding sites of many TFs have been experimentally defined, most TFs bind to short (6-12 base pairs [bp]), degenerate sequence motifs that occur very frequently in the human genome. The binding specificities of these factors can be summarized as position weight matrices (PWMs) (Heinemeyer et al. 1998) that are compiled in various databases such as the TRANSFAC database (http://www.biobase.de) (Wingender et al. 2001). Pattern-recognition programs such as MATCH or MatInspector (Quandt et al. 1995) use these libraries of TF-PWMs to identify significant matches in DNA sequences. A major confounding factor in the use of PWMs to identify transcription factor binding sites (TFBSs) is that only a very small fraction of predicted binding sites are functionally significant. Accordingly, the use of PWMs has proved to be a poor resource for sequence-based discovery of biologically relevant regulatory elements (Fickett and Wasserman 2000).
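To make the PWM idea above concrete, the following is a minimal sketch of scanning a sequence with a log-odds matrix. It is not the MATCH or MatInspector scoring scheme: the count matrix, the uniform background, the min-max normalized score, and the function names `log_odds_matrix` and `scan` are all assumptions made for illustration.

```python
import math

# Hypothetical 4-position count matrix (one dict per motif position).
# The counts are invented for illustration and are NOT taken from TRANSFAC.
COUNTS = [
    {"A": 1, "C": 1, "G": 17, "T": 1},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 17, "C": 1, "G": 1, "T": 1},
]
BACKGROUND = 0.25  # assumed uniform base composition


def log_odds_matrix(counts):
    """Convert per-position counts into log2 odds against the background."""
    pwm = []
    for column in counts:
        total = sum(column.values())
        pwm.append({base: math.log2((n / total) / BACKGROUND)
                    for base, n in column.items()})
    return pwm


def scan(sequence, pwm, min_relative_score):
    """Return (start, matched word, relative score) for every window whose
    score, rescaled to the range 0..1, reaches min_relative_score."""
    best = sum(max(col.values()) for col in pwm)
    worst = sum(min(col.values()) for col in pwm)
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip ambiguous bases such as N
        score = sum(col[base] for col, base in zip(pwm, window))
        relative = (score - worst) / (best - worst)
        if relative >= min_relative_score:
            hits.append((start, window, round(relative, 3)))
    return hits


pwm = log_odds_matrix(COUNTS)
hits = scan("TTGATAAGATAC", pwm, 0.8)  # two GATA words, at offsets 2 and 7
```

Even in this 12-bp toy sequence a 4-bp motif matches twice, which illustrates why short, degenerate motifs produce so many predictions across a megabase of sequence.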
In complex organisms, gene expression results from the cooperative action of many different proteins exerting different effects in time and space. Multiple TFs are simultaneously required to cooperatively activate and modulate eukaryotic gene expression (Berman et al. 2002). One potential avenue for improving the discovery of functional regulatory elements is to identify multiple TFBSs that are specifically clustered together (Wagner 1999). This strategy has been successfully implemented in the analysis of regulatory regions involved in muscle (Wasserman and Fickett 1998) and liver-specific gene expression (Krivan and Wasserman 2001).
An additional powerful strategy that has been shown to counter the large numbers of false positives derived from the analysis of sequences from a single organism is the use of multispecies comparative sequence alignments or phylogenetic footprinting (Gumucio et al. 1996; Hardison et al. 1997b; Duret and Bucher 1997; Levy et al. 2001). Several recent studies have shown that noncoding regulatory sequences tend to be evolutionarily conserved and support the use of comparative genomics as an extremely effective tool for the discovery of biologically active gene regulatory elements (Hardison et al. 1997a; Oeltjen et al. 1997; Hardison et al. 2000; Loots et al. 2000; Wasserman et al. 2000). The computational algorithms developed to perform comparative sequence analysis are based either on local alignments (BLAST [Altschul et al. 1990]; PIPMaker [Schwartz et al. 2000]) or on global alignments (AVID; VISTA [Mayor et al. 2000]), both of which have proved very efficient in detecting regions of high DNA conservation.
To facilitate the efficient and accurate identification of regulatory sequences in large genomic intervals from complex organisms, we have developed a computational tool, Regulatory VISTA (rVISTA: http://pga.lbl.gov/rvista.html), that enriches for evolutionarily conserved TFBSs. rVISTA uses orthologous sequence analysis and clustering to overcome some of the limitations associated with TFBS predictions of sequences derived from a single organism. Here we introduce the rVISTA program and illustrate its ability to identify functional TFBSs as it dramatically reduces the total number of AP-1, NFAT, and GATA-3 sites predicted in a ~1-Mb genomic interval of the well-annotated cytokine gene cluster (Hs5q31; Mm11) (Frazer et al. 1997; Wenderfer et al. 2000).
RESULTS
Computational Design of the rVISTA Program
To take advantage of combining sequence motif recognition and multiple sequence alignment of orthologous regions in an unbiased manner, rVISTA analysis proceeds in four major steps: (1) identification of TFBS matches in the individual sequences, (2) identification of globally aligned noncoding TFBSs, (3) calculation of local conservation extending upstream and downstream from each orthologous TFBS, and (4) visualization of individual or clustered noncoding TFBSs (Fig. 1). The program uses available PWMs in the TRANSFAC database and independently locates all TFBS matches in each sequence with the MATCH program. A global alignment generated by the AVID program (http://bio.math.berkeley.edu/avid/) and the corresponding sequence annotations are used to identify aligned TFBS matches in noncoding genomic intervals.
An aligned TFBS represents a region in the global alignment that corresponds to identical TFBS matches in each orthologous sequence. Orthologous regions correspond to similar DNA sequences from different species that arose from a common ancestral gene during speciation and are likely to be involved in similar biological functions. Because the global alignment forces two closely related sequences to generate the best possible pairwise alignment by introducing gaps, an aligned TFBS site can be present in a region of poor DNA conservation that is below 80% ID. To identify TFBSs present in regions of high DNA conservation, the "hula hoop" component of the algorithm calculates DNA conservation for each aligned TFBS as percent identity (% ID) over a dynamically shifting window of 21 bp that centers on a nucleotide inside the TFBS with the maximum % ID. This process identifies TFBSs located at the edges of highly conserved sequences that would falsely fall below the established conservation criteria threshold if the DNA conservation was determined by a static DNA window perfectly centered on the TFBS alignment.
By use of the same principle, rVISTA calculates the maximum DNA conservation over larger DNA segments (up to 201 bp), facilitating the identification of sites present in larger, highly conserved regions. The rVISTA algorithm generates two types of outputs: (1) a static data table with detailed statistics for all aligned TFBSs and (2) a dynamic web-interactive module that allows the user to customize the data for unfiltered, aligned, or conserved TFBS sites and graphically visualize them as colored tick marks. Visualized conserved binding sites fit the criteria of ≥80% ID over a 21-bp region.
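The "hula hoop" calculation described above can be sketched as follows. This is an illustrative reading of the text, not the rVISTA source: the function names are hypothetical, gap columns are assumed to count as mismatches, and windows are assumed to be clipped at the ends of the alignment.

```python
def percent_identity(row_a, row_b, start, end):
    """Percent identity over alignment columns [start, end).
    Assumption: a column containing a gap counts as a mismatch."""
    columns = list(zip(row_a[start:end], row_b[start:end]))
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def hula_hoop(row_a, row_b, site_start, site_end, window=21):
    """Maximum % ID over windows centered on each column of the aligned site."""
    half = window // 2
    best = 0.0
    for center in range(site_start, site_end):
        lo = max(0, center - half)
        hi = min(len(row_a), center + half + 1)
        best = max(best, percent_identity(row_a, row_b, lo, hi))
    return best


def centered_window(row_a, row_b, site_start, site_end, window=21):
    """% ID for a single static window centered on the middle of the site."""
    half = window // 2
    center = (site_start + site_end - 1) // 2
    lo = max(0, center - half)
    hi = min(len(row_a), center + half + 1)
    return percent_identity(row_a, row_b, lo, hi)


# Toy alignment: 12 conserved columns, an 8-bp site, then 20 diverged columns,
# so the site sits at the right-hand edge of a conserved block.
row_a = "ACGTACGTACGT" + "TGACTCAG" + "A" * 20
row_b = "ACGTACGTACGT" + "TGACTCAG" + "C" * 20

shifting = hula_hoop(row_a, row_b, 12, 20)      # about 85.7, passes 80% ID
static = centered_window(row_a, row_b, 12, 20)  # about 71.4, fails 80% ID
```

In this toy case the shifting window keeps a site at the edge of a conserved block, whereas a static window centered on the site would discard it, which is the behavior the text attributes to the hula hoop component.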
Combinatorial Analysis of TFBSs with Multiclustering
Detailed molecular analyses addressing the architecture of complex regulatory regions in higher eukaryotes have established that the majority of transcriptional control elements such as enhancers and repressors represent a conglomerate of multiple TFBSs that act in concordance to directly modulate the expression patterns of the linked genes (Pilpel et al. 2001). In addition, it has been observed that regulatory elements involved in similar physiological functions, such as the enhancement of liver-specific genes (Krivan and Wasserman 2001), are associated with distinct patterns of coordinate TF binding. These regulatory regions are frequently present in clusters of two or more repeated sites for the same TF or in combinatorial clusters of two or more adjacent sites belonging to unique regulatory proteins that act together to modulate gene expression (Fickett and Wasserman 2000; Pilpel et al. 2001; Berman et al. 2002).
To analyze combinations of multiple TFBSs and identify TF binding patterns that control gene expression in novel sequences, rVISTA calculates the distance between all neighboring TFBSs and allows the user to perform customized clustering of individual or multiple unique transcription factors. One clustering module allows the user to selectively cluster two or more sites of the same TF present in regions of user-defined lengths (Fig. 2A,B), facilitating the identification of evolutionarily conserved elements that harbor multiple clusters of various unrelated TFBSs. A second clustering module allows the user to identify groups of multiple TFBSs present in DNA segments of user-specified length (Fig. 2C).
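The two clustering modules described above can be approximated with a short sketch. The function names, the use of site start coordinates, and the rule that a group qualifies when its first and last starts lie within `max_span` bp are assumptions for illustration, not the rVISTA implementation.

```python
def clustered_sites(positions, min_sites=2, max_span=60):
    """Same-TF clustering: keep every site that belongs to a run of
    min_sites consecutive sites whose starts lie within max_span bp."""
    ordered = sorted(positions)
    keep = set()
    for i in range(len(ordered) - min_sites + 1):
        group = ordered[i:i + min_sites]
        if group[-1] - group[0] <= max_span:
            keep.update(group)
    return sorted(keep)


def combinatorial_clusters(sites, required, max_span):
    """Multi-TF clustering: report each group of sites, anchored at one site,
    that spans at most max_span bp and contains every TF in `required`.
    `sites` is a list of (position, tf_name) tuples."""
    ordered = sorted(sites)
    found = []
    for i, (start, _) in enumerate(ordered):
        members = [s for s in ordered[i:] if s[0] - start <= max_span]
        if required <= {tf for _, tf in members}:
            found.append(members)
    return found


same_tf = clustered_sites([100, 130, 400, 900, 955, 1000])
# 400 is isolated; the other five sites each have a neighbor within 60 bp

mixed = combinatorial_clusters(
    [(10, "NFAT"), (25, "AP-1"), (300, "NFAT"), (800, "AP-1")],
    required={"NFAT", "AP-1"},
    max_span=30,
)
# only the NFAT/AP-1 pair at positions 10 and 25 qualifies
```

Both functions count sites inside a span rather than modeling the spacing between them, in line with the limitation acknowledged in the Discussion.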
Collection of Experimental Data and Validation of rVISTA
To evaluate the biological significance of TFBS data generated by the rVISTA algorithm, we analyzed ~1 Mb of a well-annotated cytokine gene cluster (Hs5q31; Mm11) (IL-3; IL-4; IL-5; IL-13; IRF-1; GM-CSF) (Frazer et al. 1997; Wenderfer et al. 2000) plus the intensively characterized interleukin 2 (IL-2) promoter region (Hs4q26; Mm3) (Rooney et al. 1995). Cytokines are of particular biomedical importance because they augment the growth and differentiation of T helper cell subsets and have been directly implicated as having a major role in determining susceptibilities to asthma phenotypes and inflammatory disorders (Lacy et al. 2000; O'Garra and Arai 2000). As such, much interest has focused on the regulatory mechanisms by which naive helper CD4+ T cells establish their cytokine repertoires, events that are predominantly regulated at the transcriptional level.
Because of the vast interest in understanding the regulation of cytokines, we focused our analysis on transcription factors known to transcriptionally activate these genes. One of the best known examples of cooperative binding is the NFAT:AP-1 TF complex that has been described for genes involved in various immune responses. NFAT and AP-1 synergistically form stable complexes with DNA sequences that contain composite elements of adjacent NFAT and AP-1 TFBSs to induce the expression of genes (Macian et al. 2001). We have compiled a representative collection of AP-1 and NFAT experimentally defined TFBSs (Table 1) from the published data on this ~1-Mb interval and used it to examine the ability of rVISTA to identify true TFBSs. By analyzing ~925 kb of noncoding human sequence independent of the mouse sequence, the MATCH program predicted 23,457 AP-1 and 14,900 NFAT sites with the PWMs available in the TRANSFAC database for these transcription factors (parameters: 0.75/0.8). A comparable number of sites were independently predicted for the orthologous mouse sequence. The large number of predicted AP-1 and NFAT sites for the human sequence included 17 of the 19 functional AP-1 sites and 19 of the 21 functional NFAT sites (Fig. 3A). The omitted AP-1 and NFAT functional sites failed to meet the TRANSFAC default parameters.
Subjecting the orthologous human and mouse sequences to rVISTA analysis reduced the total number of predicted AP-1 and NFAT sites by >95%, identifying 1114 conserved AP-1 and 734 conserved NFAT sites. rVISTA also identified 16 of the 19 AP-1 and 19 of the 21 functionally characterized NFAT sites. Whereas only 4.5% of the total NFAT and AP-1 predicted sites for the human sequence were conserved in the orthologous mouse sequence, in sharp contrast, 88% of the experimentally defined TFBSs were present in highly conserved DNA blocks. These data establish a strong correlation between the presence of TFBSs in regions of high DNA conservation and biological function (Table 2). However, only a small percentage of the total identified conserved sites correspond to functional sites that have been experimentally verified. Some of the other conserved TFBSs may also be functional but remain to be experimentally confirmed.
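As a quick check of the arithmetic behind these figures, the reduction and the recovery rate can be recomputed from the site counts quoted above. The variable names are illustrative only; the counts are those given in the text.

```python
# Site counts quoted in the text for the human sequence.
predicted = {"AP-1": 23457, "NFAT": 14900}       # MATCH predictions, human only
conserved = {"AP-1": 1114, "NFAT": 734}          # retained by rVISTA
functional_total = {"AP-1": 19, "NFAT": 21}      # experimentally defined sites
functional_found = {"AP-1": 16, "NFAT": 19}      # of those, identified by rVISTA

total_predicted = sum(predicted.values())
total_conserved = sum(conserved.values())

# Percentage of single-species predictions eliminated by the conservation filter.
reduction_pct = 100 * (1 - total_conserved / total_predicted)

# Percentage of experimentally defined sites that survive the filter.
recovery_pct = 100 * sum(functional_found.values()) / sum(functional_total.values())
```

With these counts the reduction exceeds 95%, and the recovery is 35 of 40 sites, or 87.5%, which rounds to the 88% reported.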
Cytokine Promoter Analysis to Assess rVISTA Predictions
In addition to AP-1 and NFAT, the GATA-3 TF has also been implicated in the transcriptional control of the large number of Th2-specific cytokines present in this interval (IL-4, IL-13, IL-5, GM-CSF, IL-3) (Ranganath and Murphy 2001). GATA-3's direct involvement in gene activation has been extensively shown for the IL-4 and IL-5 promoters (Zheng and Flavell 1997; Lee et al. 1998) and has been postulated for the activation (or repression) of all the cytokine genes present in this interval (Zheng and Flavell 1997). On the basis of GATA-3's predicted binding to upstream regions of cytokine genes, we hypothesized that there should be an increased distribution of GATA-3 sites across the six cytokine promoters compared with the promoters of the 16 non-Th1/Th2 expressing genes in this region. To test this hypothesis, we determined the GATA-3 site distribution for the 2-kb promoter region of all 22 annotated genes in this interval. Because of the highly degenerate nature of the GATA binding profile that is recognized by all members of the GATA family (Merika and Orkin 1993), TRANSFAC predicted an average of 50 GATA-3 sites per promoter that were evenly distributed across both cytokine and noncytokine gene promoters. In contrast, the rVISTA analysis dramatically reduced the total number of GATA-3 sites per promoter and, most importantly, resulted in an increased representation of GATA-3 sites in cytokine promoters (Fig. 4A). On average, rVISTA detected eight conserved GATA-3 sites per cytokine promoter while yielding only two conserved GATA-3 sites per noncytokine promoter. In addition, the experimentally characterized GATA-3 sites in both the IL-4 and IL-5 promoters (Zheng and Flavell 1997; Lee et al. 1998) were among the highly conserved sites identified by rVISTA (Fig. 3B).
Because functional GATA-3 sites are present in pairs (Table 1), we next analyzed the distribution of clustered GATA-3 sites (two or more sites present within 60-bp regions). By clustering the conserved GATA-3 sites, we observed a further enrichment of GATA-3 sites in the cytokine promoters. In each cytokine promoter there was an average of six clustered GATA-3 sites, whereas no such clustered sites were noted in the promoters of noncytokine genes. These clustered GATA-3 sites, although not yet experimentally verified, were exclusively found in the promoters of genes predicted to be GATA-3 responsive. rVISTA's ability to recognize what are likely true TFBSs in the promoters of cytokine genes supports the hypothesis that GATA-3 plays an important role in the regulation of all the cytokine genes present on human 5q31 (Fig. 4B).
DISCUSSION
Annotating the noncoding portion of the human genome remains among the greatest challenges of the post-sequencing era. Clues for identifying sequences involved in the complex regulatory networks of eukaryotic genes are provided by the presence of TFBS motifs, the clustering of such binding site motifs, and the conservation of these sites between species. rVISTA takes advantage of all these established strategies to enhance the detection of functional transcriptional regulatory sequences controlling gene expression through its ability to identify evolutionarily conserved and clustered TFBSs.
By performing an unbiased analysis of the distribution of NFAT and AP-1 binding sites across ~1 Mb of human/mouse orthologous region, we were able to show that although rVISTA eliminates more than 95% of the predicted TFBSs derived from the sequence analysis of a single organism, it still recognizes 88% of the biologically characterized AP-1 and NFAT sites in this region. The PWMs compiled from experimentally determined TFBSs available in the TRANSFAC database pose a major limitation in the rVISTA analysis, because the computational approach described relies on the available DNA binding profiles of known transcription factors (Table 1). Of the total 19 AP-1 and 21 NFAT experimentally described sites, 17 AP-1 and 19 NFAT sites had TRANSFAC values greater than 0.75/0.8, two AP-1 and one NFAT site had values of 0.7/0.7, and one NFAT site had a value of 0.6/0.7 (Table 1). Of the 36 experimentally defined AP-1 and NFAT sites recognized by the PWMs available in the TRANSFAC database (with the 0.75/0.8 parameters), only one aligned AP-1 site (71%) was below our established conservation threshold (≥80%) and failed to be identified by the rVISTA program. Our data indicate that the rVISTA program dramatically eliminates a large number of false-positive TFBSs while it enriches for functional TFBSs.
Although the identification of conserved TFBSs on a small genomic interval can be achieved by phylogenetic footprinting (Hardison et al. 1997; Krivan and Wasserman 2001), a great strength of the rVISTA algorithm is its ability to efficiently analyze large genomic intervals and potentially whole genomes. The clustering modules and the user-defined customization of visualized sites make this a further useful tool for the investigation of TFBSs. Through the use of a global alignment, rVISTA takes into account the linear structure of sequence conservation across a large DNA segment. By allowing small gaps and DNA shifts in the aligned TFBSs, we are maximizing the identification of functional TFBSs that have diverged slightly yet are present in highly conserved regions; ~25-35% of all aligned and ~15-20% of all conserved TFBSs identified have one gap in their alignment (data not shown). In addition, ~25% of the aligned and ~18% of the conserved NFAT and AP-1 sites have shifts (1-6 bp) across their alignments (data not shown). The presence of gaps in the alignments of experimentally characterized TFBSs further supports the use of a global alignment for rVISTA analysis and the need for loose parameters for the identification of aligned sites, as well as stringent percent identity criteria for detecting highly conserved TFBSs.
Properties related to protein-protein interaction and chromatin structure, as well as clusters of multiple unique sites that have been reshuffled in either the human or the mouse genome and have lost their positional linearity, are not addressed. Also, clustering does not take into account the spacing between sites but rather counts the number of adjacent sites of a given TF spanning DNA segments of specified length. Although TFBS clustering has been suggested for identifying regulatory sequences, no data to date have proved the effectiveness of this approach (Wagner 1997; Wagner 1999). Our clustering analysis results indicate that this approach has the potential to efficiently prioritize functionally relevant noncoding sequences. rVISTA represents the only publicly available program that allows the user to identify customized clusters of multiple TFBSs in large genomic intervals.
Our analysis of the AP-1 and NFAT TFBSs in the cytokine gene cluster illustrates the effectiveness of the rVISTA algorithm in eliminating many false positives while retaining the majority of experimentally verified sites. In our analysis of GATA-3 sites in the putative promoters of the 23 genes from human 5q31, we were able to prioritize, exclusively on the basis of sequence analysis, a limited number of GATA-3 sites with a high likelihood of being functional that can be used for further biological investigation. With the increasing availability of sequence data for multiple organisms, rVISTA's ability to use comparative data and clustering options in a user-friendly manner makes it particularly suited to assist investigators focused on biologically defined genomic intervals, as well as those interested in performing whole genome analyses to identify functional TFBSs and regulatory elements.
METHODS
rVISTA is implemented as a publicly available web-based tool (http://pga.lbl.gov/rvista.html) that requires a sequence alignment file and optional gene annotation files as user input. The rVISTA analysis tool consists of four major modules: (1) motif recognition, (2) identification of aligned TFBSs, (3) conservation analysis, and (4) visualization of TFBSs. The system units are implemented in C++, with an interactive Web user interface written in Perl. For the conservation analysis, rVISTA uses an alignment file in the AVID format obtained using the AVID program from the AVID (http://bio.math.berkeley.edu/avid/) or VISTA (http://www-gsd.lbl.gov/VISTA/) servers. The following methodological scheme was implemented as the core of the rVISTA tool. Initially, the user chooses a set of TFs and the PWM parameters to be used. Next, rVISTA extracts all TFBS coordinates independently in the two orthologous sequences before the analysis of the alignment. The locally installed TRANSFAC 5.2 library and the MATCH program from Biobase, Inc. (http://www.biobase.de) are used at this step.
Subsequently, the global alignment is scanned for pairs of neighboring human and mouse TFBSs that are aligned and match identically in both sequences. An aligned TFBS is allowed to have a maximum 6-bp shift (majority of TF matrices have core sequences of 4-6 bp) in the alignment of the TFBS core and a single gap present across the entire local alignment of the TFBS. The conservation analysis module contains one major unit, the hula hoop, which is designed to analyze the local DNA conservation of each aligned TFBS to eliminate aligned sites present in regions of weak DNA conservation. A fixed-size DNA window (21 bp) is being shifted through all the positions of an aligned TFBS, whereas the entire sequence spanning the TFBS is permanently enclosed by the shifting DNA window. The percent identity is calculated at every base pair across the aligned TFBS and extending 10 nucleotides upstream and downstream from it, similar to a hula hoop. The position with the highest percent identity is used to assign the conservation level of that particular TFBS. This process allows the identification of the maximum percent identity for the local alignment of a conserved TFBS. The program calculates % ID for each binding site with dynamically shifting windows of up to 200 bp. These data are provided in a table format and allow the identification of TFBSs present in large regions of high conservation.
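The aligned-TFBS rule stated at the start of this paragraph (a maximum 6-bp shift and a single gap) might be sketched as below. This is one possible reading, not the rVISTA code: the helper names are hypothetical, the shift is measured between the alignment columns of the two site starts rather than between the matrix cores, and "a single gap" is interpreted as at most one gap character across the columns spanned by the two sites.

```python
def columns_of(aligned_row):
    """Map each ungapped sequence index to its column in the alignment."""
    return [col for col, ch in enumerate(aligned_row) if ch != "-"]


def is_aligned_pair(row_a, row_b, start_a, start_b, length,
                    max_shift=6, max_gaps=1):
    """Decide whether two same-TF hits, given by ungapped start coordinates
    in each sequence, count as one aligned TFBS under the assumed rule."""
    cols_a = columns_of(row_a)
    cols_b = columns_of(row_b)
    first_a, last_a = cols_a[start_a], cols_a[start_a + length - 1]
    first_b, last_b = cols_b[start_b], cols_b[start_b + length - 1]
    shift = abs(first_a - first_b)
    lo, hi = min(first_a, first_b), max(last_a, last_b) + 1
    gaps = row_a[lo:hi].count("-") + row_b[lo:hi].count("-")
    return shift <= max_shift and gaps <= max_gaps


# Toy global alignment with one gap upstream of a shared TGACTCA word.
row_a = "ACGTGATAAGCT-TTGACTCAGG"
row_b = "ACGTGATAAGCTATTGACTCAGG"
start_a = row_a.replace("-", "").find("TGACTCA")  # ungapped coordinate in A
start_b = row_b.replace("-", "").find("TGACTCA")  # ungapped coordinate in B

same_site = is_aligned_pair(row_a, row_b, start_a, start_b, 7)
far_apart = is_aligned_pair(row_a, row_b, 4, 14, 4)  # 10 columns apart
```

The two hits have different coordinates in their own sequences (13 and 14) but fall in the same alignment columns, so they are paired; hits more than 6 columns apart are not.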
The visualization module is a web-based tool that post-processes the rVISTA output. One unit of the program eliminates redundancy. Overlapping TFBS matches (within 3 bp from each other) belonging to the same family of regulatory proteins are considered to be an identical match. A second unit of the program measures the distance between adjacent matches belonging to the same TF family and allows the user to selectively cluster TFBSs into groups of x sites over a length of y base pairs. The clustering parameters are user-defined and are assigned independently for every TF family. Any combination of unfiltered, aligned, or conserved TFBSs with customized clustering for the selected set of TFs is interactively visualized as a "tick-plot" track overlaid on the conservation VISTA-type track and the gene annotation track. All conserved binding sites displayed fit the criteria of ≥80% over a 21-bp alignment.
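The redundancy-elimination unit described above could look roughly like the following sketch. The function name and the choice to compare each hit against the last retained hit of the same family (rather than chaining through discarded hits) are assumptions, since the text only states the 3-bp rule.

```python
def collapse_redundant(hits, max_distance=3):
    """Collapse same-family matches whose starts lie within max_distance bp
    of the last retained match. `hits` is an iterable of (family, start)."""
    kept = []
    for family, start in sorted(hits):
        previous = kept[-1] if kept else None
        if previous and previous[0] == family and start - previous[1] <= max_distance:
            continue  # treated as the same match as the retained one
        kept.append((family, start))
    return kept


unique = collapse_redundant([
    ("GATA", 100), ("GATA", 102), ("AP-1", 101), ("GATA", 150), ("GATA", 103),
])
# GATA hits at 102 and 103 are folded into the hit at 100;
# the AP-1 hit is untouched because it belongs to a different family
```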
Evaluation of True TF Enrichment with GATA Sites in Promoters
To quantitatively measure the enrichment of GATA predictions for functional sites, we performed a statistical simulation for the expected number of conserved sites and compared it with the observed number of conserved sites (≥80% ID; 21 bp) present in promoter regions (2 kb upstream of the 5' UTR). Redundant GATA sites (defined to be overlapping sites) were excluded before the analysis. GATA site clustering was performed for two or more neighboring GATA sites present within a 60-bp region. The upper bound for the expected number of conserved GATA sites in a promoter under consideration, i, was calculated as follows:
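[Equation not preserved in this copy; the surviving tail of its definition reads "80%.)", presumably the ≥80% conservation criterion.]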
To obtain a more accurate value for the number of expected conserved and aligned GATA sites, we also considered the fact that a human GATA site present in a region of high DNA conservation (≥80%) will not always have an aligned match in the mouse sequence. Any given conserved GATA site could either have no corresponding aligned binding site in the second sequence, or the binding site alignment could exceed the gap and shift requirements. We approached this problem by introducing a scaling parameter, defined as the probability of a conserved site to be aligned and approximated to be a constant for all the promoter regions. The value of this parameter was estimated on the basis of the number of conserved sites that were also aligned in noncytokine promoters. We estimated the expected distribution of conserved GATA sites across the promoter sequences as follows:
[Equation not preserved in this copy.]
Similarly, we estimated the number of expected conserved, aligned, and clustered GATA sites. The length of the conserved and clustered segment of each promoter was obtained by checking all the possible paired coordinates in the conserved regions of all promoters and obtaining the ratio of sites closer than 60 bp but greater than 4 bp apart from each other. The scaling-parameter values for conserved sites and for conserved and clustered sites were found to be 0.23 and 0.19, respectively. The closeness of these two values indicates that the probability of a conserved site to be aligned is independent of the probability of the same site to be clustered (Fig. 4B).
WEB SITE REFERENCES
http://bio.math.berkeley.edu/avid/; AVID program.
http://pga.lbl.gov/rvista.html; rVISTA program.
http://www.biobase.de; TRANSFAC database.
http://www.biobase.de; MATCH program, Biobase, Inc.
ACKNOWLEDGMENTS
We are grateful to Moshe Malkin, Jody Schwartz, Alexander Fabrikant, and Michael Brudno for technical assistance. We thank the Rubin Laboratory for insightful comments on the manuscript. This work was supported by the Program for Genomic Applications (PGAs) funded by the National Heart, Lung, and Blood Institute (NHLBI/NIH); G.G. Loots was supported by the Department of Energy Alexander Hollaender Fellowship.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
4 Corresponding authors.
E-MAIL ggloots{at}lbl.gov; ildubchak{at}lbl.gov; FAX (510) 486-6746.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.225502. Article published online before print in April 2002.
REFERENCES
a web server for aligning two genomic DNA sequences. Genome Res. 10: 577-586.
Received November 27, 2001; accepted in revised form March 7, 2002.