Vol. 11, Issue 1, 112-123, January 2001
METHODS
Assessing Clusters and Motifs from Gene Expression Data
Lars M. Jakt,1 Liang Cao,2 Kathryn S.E. Cheah,1 and David K. Smith1,3
1 Department of Biochemistry, University of Hong Kong, Pok Fu Lam, Hong Kong;
2 Department of Microbiology, University of Hong Kong, Queen Mary Hospital, Pok Fu Lam, Hong Kong
Large-scale gene expression studies and genomic sequencing projects
are providing vast amounts of information that can be used to identify
or predict cellular regulatory processes. Genes can be clustered on the
basis of the similarity of their expression profiles or function, and
these clusters are likely to contain genes that are regulated by the
same transcription factors. Searches for cis-regulatory
elements can then be undertaken in the noncoding regions of the
clustered genes. However, it is necessary to assess the efficiency of
both the gene clustering and the postulated regulatory motifs, as there
are many difficulties associated with clustering and determining the
functional relevance of matches to sequence motifs. We have developed a
method to assess the potential functional significance of clusters and
motifs based on the probability of finding a certain number of matches
to a motif in all of the gene clusters. To avoid problems with
threshold scores for a match, the top matches to a motif are taken in
several sample sizes. Genes from a sample are then counted by the
cluster in which they appear. The probability of observing these counts
by chance is calculated using the hypergeometric distribution. Because
multiple sample sizes are used, both strong and weak matching motifs can
be detected and refined, and because all clusters are considered,
significant matches to motifs across cluster boundaries are observed. By applying
this method to many motifs and to a cluster set of yeast genes, we
detected a similarity between Swi Five Factor and forkhead proteins and
suggest that the currently unidentified Swi Five Factor is one of the
yeast forkhead proteins.
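
The counting and probability step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the input layout (a list of genes ranked by motif match score plus a gene-to-cluster mapping), and the use of an upper-tail probability P(X >= k) are all assumptions made for the example. The abstract states only that the hypergeometric distribution is used.

```python
from collections import Counter
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the cluster of interest.
    (Using the upper tail is an assumption of this sketch.)"""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def assess_motif(ranked_genes, cluster_of, sample_sizes):
    """For each sample size, count the top-ranked genes per cluster and
    compute the probability of seeing at least that count by chance.

    ranked_genes : gene ids sorted from best to worst motif match (hypothetical input)
    cluster_of   : dict mapping every gene id to its cluster label (hypothetical input)
    sample_sizes : how many top matches to take in each sample
    """
    N = len(ranked_genes)
    cluster_sizes = Counter(cluster_of[g] for g in ranked_genes)
    results = {}
    for n in sample_sizes:
        # Count genes in the top-n sample by the cluster in which they appear.
        counts = Counter(cluster_of[g] for g in ranked_genes[:n])
        for cluster, K in cluster_sizes.items():
            k = counts.get(cluster, 0)
            results[(n, cluster)] = (k, hypergeom_tail(k, N, K, n))
    return results


if __name__ == "__main__":
    # Toy data: 10 genes, the 4 best motif matches all fall in cluster "A".
    genes = [f"g{i}" for i in range(10)]
    clusters = {g: ("A" if i < 4 else "B") for i, g in enumerate(genes)}
    for key, (count, p) in sorted(assess_motif(genes, clusters, [2, 4, 8]).items()):
        print(key, count, p)
```

Taking several sample sizes rather than a single score threshold means that a cluster enriched only among the very best matches and a cluster enriched among a broader set of weaker matches can both appear with small probabilities. A real analysis would also need to account for the number of motifs, clusters, and sample sizes tested.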
3 Corresponding author.
11:112-123 ©2001 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/01 $5.00

This article has been cited by other articles:

G. N. Vemuri, E. Altman, D. P. Sangurdekar, A. B. Khodursky, and M. A. Eiteman. Overflow Metabolism in Escherichia coli during Steady-State Growth: Transcriptional Regulation and Effect of the Redox Ratio. Appl. Envir. Microbiol., May 1, 2006; 72(5): 3653-3661.

J. Kasturi and R. Acharya. Clustering of diverse genomic data using information fusion. Bioinformatics, February 15, 2005; 21(4): 423-429.

Y. L. Yap, M. P. Wong, X. W. Zhang, D. Hernandez, R. Gras, D. K. Smith, and A. Danchin. Conserved transcription factor binding sites of cancer markers derived from primary lung adenocarcinoma microarrays. Nucleic Acids Res., January 14, 2005; 33(1): 409-421.

M. Sarwal, M.-S. Chua, N. Kambham, S.-C. Hsieh, T. Satterwhite, M. Masek, and O. Salvatierra Jr. Molecular Heterogeneity in Acute Renal Allograft Rejection Identified by DNA Microarray Profiling. N. Engl. J. Med., July 10, 2003; 349(2): 125-138.

J. Zheng, J. Wu, and Z. Sun. An approach to identify over-represented cis-elements in related sequences. Nucleic Acids Res., April 1, 2003; 31(7): 1995-2005.

H. Zhang, Y. Ramanathan, P. Soteropoulos, M. L. Recce, and P. P. Tolias. EZ-Retrieve: a web-server for batch retrieval of coordinate-specified human DNA sequences and underscoring putative transcription factor-binding sites. Nucleic Acids Res., November 1, 2002; 30(21): e121.

F. D. Gibbons and F. P. Roth. Judging the Quality of Gene Expression-Based Clustering Methods Using Gene Annotation. Genome Res., October 1, 2002; 12(10): 1574-1581.

M. Ladanyi, W. C. Chan, T. J. Triche, and W. L. Gerald. Expression Profiling of Human Tumors: The End of Surgical Pathology? J. Mol. Diagn., August 1, 2001; 3(3): 92-97.