Genome Research


Vol. 11, Issue 1, 112-123, January 2001

METHODS
Assessing Clusters and Motifs from Gene Expression Data

Lars M. Jakt,1 Liang Cao,2 Kathryn S.E. Cheah,1 and David K. Smith1,3

1 Department of Biochemistry, University of Hong Kong, Pok Fu Lam, Hong Kong; 2 Department of Microbiology, University of Hong Kong, Queen Mary Hospital, Pok Fu Lam, Hong Kong

Large-scale gene expression studies and genomic sequencing projects are providing vast amounts of information that can be used to identify or predict cellular regulatory processes. Genes can be clustered on the basis of the similarity of their expression profiles or function and these clusters are likely to contain genes that are regulated by the same transcription factors. Searches for cis-regulatory elements can then be undertaken in the noncoding regions of the clustered genes. However, it is necessary to assess the efficiency of both the gene clustering and the postulated regulatory motifs, as there are many difficulties associated with clustering and determining the functional relevance of matches to sequence motifs. We have developed a method to assess the potential functional significance of clusters and motifs based on the probability of finding a certain number of matches to a motif in all of the gene clusters. To avoid problems with threshold scores for a match, the top matches to a motif are taken in several sample sizes. Genes from a sample are then counted by the cluster in which they appear. The probability of observing these counts by chance is calculated using the hypergeometric distribution. Because of the multiple sample sizes, strong and weak matching motifs can be detected and refined and significant matches to motifs across cluster boundaries are observed as all clusters are considered. By applying this method to many motifs and to a cluster set of yeast genes, we detected a similarity between Swi Five Factor and forkhead proteins and suggest that the currently unidentified Swi Five Factor is one of the yeast forkhead proteins.
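The counting step described in the abstract can be illustrated with a minimal Python sketch. It assumes a list of genes already ranked by their best match to a motif and a mapping from gene to cluster; the names `hypergeom_tail`, `cluster_enrichment`, `ranked_genes`, `cluster_of`, and `sample_sizes` are hypothetical and do not come from the article. The sketch reports the upper-tail hypergeometric probability of seeing at least the observed count in each cluster; whether the authors use a tail or a point probability, and how they combine results across sample sizes, is not stated in the abstract, so this should be read as one plausible reading rather than the published procedure.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the cluster of interest."""
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def cluster_enrichment(ranked_genes, cluster_of, sample_sizes):
    """For each sample size n, take the top n genes (assumed ranked by motif
    match, best first), count them per cluster, and compute the chance
    probability of a count at least that large.

    Returns {(n, cluster): (observed_count, probability)}.
    """
    N = len(ranked_genes)

    # Size of every cluster over the whole gene set (assumption: every
    # ranked gene has exactly one cluster assignment).
    cluster_sizes = {}
    for gene in ranked_genes:
        c = cluster_of[gene]
        cluster_sizes[c] = cluster_sizes.get(c, 0) + 1

    results = {}
    for n in sample_sizes:
        counts = {}
        for gene in ranked_genes[:n]:
            c = cluster_of[gene]
            counts[c] = counts.get(c, 0) + 1
        # All clusters are scored, including those with no top matches.
        for c, K in cluster_sizes.items():
            k = counts.get(c, 0)
            results[(n, c)] = (k, hypergeom_tail(N, K, n, k))
    return results


# Toy data: ten hypothetical genes, the five best matches all in cluster "A".
toy_genes = [f"g{i}" for i in range(10)]
toy_clusters = {g: ("A" if i < 5 else "B") for i, g in enumerate(toy_genes)}
toy_results = cluster_enrichment(toy_genes, toy_clusters, sample_sizes=[3, 5])
```

Taking the top matches at several sample sizes, rather than applying a score threshold, lets both strongly and weakly matching motifs be examined, and scoring every cluster for each sample lets enrichment that crosses cluster boundaries show up.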


3 Corresponding author.


11:112-123 ©2001 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/01 $5.00
