Published online before print
March 13, 2006, 10.1101/gr.4473506 Genome Res. 16:520-526, 2006 ©2006 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/06 $5.00
Letter
Disentangling information flow in the Ras-cAMP signaling network
1 Institute for Systems Biology, Seattle, Washington 98103, USA; 2 Fraunhofer Institute for Interfacial Engineering and Biotechnology IGB, 70569 Stuttgart, Germany; 3 Whitehead Institute, Cambridge, Massachusetts 02142, USA
The perturbation of signal-transduction molecules elicits genomic-expression effects that are typically neither restricted to a small set of genes nor uniform. Instead there are broad, varied, and complex changes in expression across the genome. These observations suggest that signal transduction is not mediated by isolated pathways of information flow to distinct groups of genes in the genome. Rather, multiple entangled paths of information flow influence overlapping sets of genes. Using the Ras-cAMP pathway in Saccharomyces cerevisiae as a model system, we perturbed key pathway elements and collected genomic-expression data. Singular value decomposition was applied to separate the genome-wide transcriptional response into weighted expression components exhibited by overlapping groups of genes. Molecular interaction data were integrated to connect gene groups to perturbed signaling elements. The resulting series of linked subnetworks maps multiple putative pathways of information flow through a dense signaling network, and provides a set of testable hypotheses for complex gene-expression effects across the genome.
Biochemical and genetic techniques have led to a picture of intracellular signaling as a sequential cascade or pathway consisting of a limited set of signaling proteins linked by a small number of biochemical interactions. In this "sparse-network" view, signals are propagated via mostly isolated linear sequences of molecular interactions. The collection of high-throughput molecular-interaction data now allows these signaling elements to be mapped in a large dense interaction network. This "dense-network" view suggests that signaling paths are not isolated, but rather form an entangled web of numerous possible signaling avenues. The reconciliation of the sparse-network signaling concept and dense biological networks is a central problem in systems biology (Ideker et al. 2001
The perturbation of signal-transduction molecules can have distinct regulatory effects of differing magnitudes on overlapping sets of genes. For many genes, if not most, the expression pattern revealed by genomic expression analysis is a composite of overlapping regulatory influences. In other words, the measured expression of a gene in a condition often reflects a summation of separate influences that are prevalent in the genome. Detecting and isolating distinct overlapping expression effects of varying magnitude requires appropriate data-analysis methods. Such methods should be able to (1) decompose the expression pattern of each gene in each condition; (2) detect major expression components as well as minor but biologically informative components that may be difficult to discern; (3) identify overlapping clusters of genes sharing an expression component. Clustering algorithms in common use, for example, hierarchical clustering (Eisen et al. 1998
Distinct influences of varying magnitude within genes and among overlapping gene sets can be discerned using singular value decomposition (SVD) (Weaver et al. 1999
Here, we propose that a signaling regulatory influence isolated by SVD is delivered by one or a few strands, which we denote an "expression-component subnetwork," of the dense interwoven signaling network. As a model, we consider the Ras-cAMP signaling pathway in the budding yeast Saccharomyces cerevisiae. This pathway is implicated in pseudohyphal growth (Gimeno and Fink 1992
Experimental design
Key elements of the Ras-cAMP network were perturbed in nine genomic-expression profiling experiments (Methods). The first four experimental conditions were designed to directly control the concentration of cAMP. A yeast strain with defective synthesis and defective degradation of cAMP was constructed (Methods). Cellular synthesis of cAMP was prevented by disruption of the major (RAS2) and minor (RAS1) activators of adenylate cyclase. Cellular degradation of cAMP was prevented by disruption of the cAMP phosphodiesterase gene, PDE2. Exogenous cAMP was infused into the cells by adding it at various concentrations (0, 0.5, 1, and 2 mM) to the growth medium. This experimental design has been shown to regulate cAMP-pathway activity (Rupp et al. 1999
The other five experiments used strains altered in their ability to transmit a signal through the cAMP pathway. Two of the strains contained a genetic modification of the Ras2 protein sequence. The RAS2V19 dominant-active allele locks Ras2 protein in the active, GTP-bound state (Toda et al. 1985
Singular value decomposition analysis
The eigenconditions are plotted in Figure 1. The eigengene matrix is too large for informative display. By inspecting the columns of the raster plot, one can discern the expression component represented in each mode. Comparisons among rows reveal similarities in expression components among conditions. The modes are ordered by their singular values (weights) from highest (Mode 1) to lowest (Mode 9). There is a clear ordering of modes in that their singular values vary widely in magnitude. However, the SVD entropy of the data set is high, 0.76 (Methods), indicating multiple substantial genomic-expression components rather than dominance by one or a few modes. By perturbing key elements in a major pathway, we have apparently affected a diversity of signaling mechanisms and biological processes.
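The data set entropy quoted above can be illustrated with a minimal sketch. It assumes the normalized Shannon entropy of the squared singular values, as defined by Alter et al. (2000); the Methods may define it differently. The function name `svd_entropy` and both singular-value spectra are hypothetical and are not the values measured in this study.

```python
import math

def svd_entropy(singular_values):
    """Normalized Shannon entropy of a singular-value spectrum.

    Returns a value near 0 when one mode dominates and 1 when all modes
    carry equal weight. Assumed definition (Alter et al. 2000); requires
    at least two modes.
    """
    squares = [s * s for s in singular_values]
    total = sum(squares)
    # Fraction of the overall expression captured by each mode.
    fractions = [sq / total for sq in squares]
    n_modes = len(fractions)
    return -sum(p * math.log(p) for p in fractions if p > 0) / math.log(n_modes)

# Hypothetical nine-mode spectra, for illustration only.
dominated = [10.0, 1.0, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1]
spread = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, 1.8, 1.5, 1.2]

print("one dominant mode:", svd_entropy(dominated))
print("many substantial modes:", svd_entropy(spread))
```

Under this definition, a spectrum with many substantial modes scores much closer to 1 than a spectrum dominated by its first mode, which is the sense in which a high entropy indicates a diversity of expression components.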
Though all genes and all conditions contribute to each SVD mode, some contributions are significant and others are negligible. To determine which genes and conditions are the most significant contributors in each SVD mode, we extracted those with eigengene and eigencondition matrix entries more than one standard deviation above or below the mean of all modes (similar to Wall et al. 2001). Joint membership of any gene in more than one SVD mode is possible. We found intermodal overlaps of 5%–15% (Supplemental Table 3). Statistics on mode memberships of genes are shown in Supplemental Table 4. More than half (52%) of the genes are grouped into more than one mode, and many genes (17%) appear in four or more modes. The joint membership of a gene in more than one mode indicates that the expression pattern of the gene is a weighted composite of the modes of which it is a member. The modes shown in Figure 1 define the orthogonal set of expression components from which the expression pattern of any gene can be composed. This is illustrated in Figure 2 for three genes of increasing expression complexity. Here, "complexity" refers to the number of modes exhibited by the gene. Note that the composite (measured) expression pattern of each gene in each condition is a summation of the contributions of the expression components. Simple expression patterns, such as that of the ILV6 amino acid biosynthesis gene (Fig. 2A), can be accounted for by a combination of one or two expression components. Genes that have many substantial expression components, like the ADH5 alcohol dehydrogenase gene (Fig. 2B) and the MSN4 transcriptional activator gene (Fig. 2C), have a relatively unique composite expression pattern that can be described as a combination of many expression components prevalent across the genome.
These examples, and the prevalence of expression-pattern complexity indicated in Supplemental Table 4, demonstrate an essential and advantageous feature of SVD analysis, i.e., the expression data set, and the expression of every gene in each condition, is decomposed into a series of components that are entirely determined by the data itself. In contrast, methods that cluster expression patterns without decomposition are not designed to isolate these overlapping regulatory influences of varying magnitude (Supplemental text; Supplemental Fig. 1), though this is exactly what is sought from genomic-expression data in signaling perturbation studies.
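The decomposition and membership rule described in this section can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the expression matrix is random placeholder data, the eigengene matrix is taken to be the genes-by-modes matrix `U` from NumPy's thin SVD, and the helpers `mode_members` and `gene_components` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder expression matrix: 200 genes (rows) x 9 conditions (columns).
X = rng.normal(size=(200, 9))

# Thin SVD: X = U @ diag(s) @ Vt, with modes ordered by decreasing singular value.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

def mode_members(M, n_sd=1.0):
    """Boolean masks of entries more than n_sd standard deviations
    above / below the mean taken over all modes."""
    mean, sd = M.mean(), M.std()
    return M > mean + n_sd * sd, M < mean - n_sd * sd

# Positive and negative gene sets per mode (genes x modes masks).
up, down = mode_members(U)
# Number of modes each gene belongs to (joint membership is allowed).
modes_per_gene = (up | down).sum(axis=1)

def gene_components(g):
    """Contribution of each mode to gene g across conditions
    (rows = modes, columns = conditions)."""
    return (U[g, :] * s)[:, None] * Vt

# Summing the per-mode contributions recovers the measured pattern of the gene.
components = gene_components(0)
print(np.allclose(components.sum(axis=0), X[0]))
print("genes in more than one mode:", int((modes_per_gene > 1).sum()))
```

Because the modes are orthogonal, the per-mode rows returned by `gene_components` are the weighted expression components whose sum is the composite pattern, which is the relationship that Figure 2 illustrates for individual genes.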
Functional associations of SVD gene sets
To assess the functional relevance of each SVD gene set, we analyzed member genes for overrepresentation of genes with the same Gene Ontology annotations (Table 1; Methods). For a majority of modes there are biological processes, molecular functions, and cellular components associated with the gene sets. Generally, these associations are strongest (i.e., least likely due to chance) for the modes with high singular values. For some gene sets the lack of annotations is likely to be a consequence of expression effects that cut across functional classes, a lack of annotation due to an unknown common function, or nonbiological effects such as systematic error, noise contamination, and data normalization. Nonetheless, most modes, including modes with low singular values, show significant annotations. For example, the highly significant annotation for Mode 7 and the moderately significant annotation of Mode 9 suggest that these modes carry functional information.
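Overrepresentation of an annotation within a gene set is commonly scored with a one-sided hypergeometric test. The sketch below assumes that test; the Methods may specify a different statistic or a multiple-testing correction. The function name `hypergeom_pvalue` and all counts are hypothetical.

```python
from math import comb

def hypergeom_pvalue(n_genome, n_annotated, n_set, n_overlap):
    """Probability of observing at least n_overlap annotated genes when
    n_set genes are drawn without replacement from a genome of n_genome
    genes, n_annotated of which carry the annotation."""
    upper = min(n_annotated, n_set)
    total = comb(n_genome, n_set)
    tail = sum(
        comb(n_annotated, k) * comb(n_genome - n_annotated, n_set - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 6000 genes, 100 with the annotation, gene set of 300.
# About 5 annotated genes are expected in the set by chance.
print(hypergeom_pvalue(6000, 100, 300, 5))   # overlap close to expectation
print(hypergeom_pvalue(6000, 100, 300, 20))  # strongly enriched overlap
```

Exact integer arithmetic with `math.comb` avoids floating-point overflow for genome-sized counts; only the final ratio is converted to a float.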
Transcription-factor associations with SVD gene sets
The apparent coregulation of the genes in each SVD gene set suggests cobinding of the genes by specific transcription factors. Correspondence of DNA-binding patterns and SVD gene sets would further support the biological significance of SVD modes. For each SVD gene set, we assessed the member genes for a statistical overrepresentation of targets of each of 137 transcription factors (Methods). Between one and 11 transcription factors were found for 12 of the 18 gene sets (Table 1; Supplemental Table 5; Supplemental Fig. 2). Similar to Gene Ontology annotations previously discussed, transcription factors were more likely to be found for the modes with high singular values. Note, though, that some modes with either low weights or a lack of group annotation show significant enrichment of transcription-factor binding. This lends credence to their transcriptional coregulation. Some transcription factors (e.g., Gcn4) show enrichment for target genes in more than one gene set. A possible explanation is that the same target genes are members of more than one gene set. However, there is generally low overlap among targets of each transcription factor in different gene sets (Supplemental Table 6). This observation suggests that some individual transcription factors have separate roles in different gene-expression modes.
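The check on whether one transcription factor's targets are shared between gene sets can be sketched with plain set operations. The overlap measure used here (shared targets divided by the smaller target group) is an assumption, since the definition behind Supplemental Table 6 is not given in this text; the function name, gene-set labels, and gene identifiers are all hypothetical.

```python
from itertools import combinations

def tf_target_overlap(gene_sets, tf_targets):
    """For one transcription factor, return the fraction of shared targets
    between each pair of gene sets (intersection / smaller target group).
    Pairs in which either gene set contains no targets are skipped."""
    targets_in_set = {name: genes & tf_targets for name, genes in gene_sets.items()}
    overlaps = {}
    for a, b in combinations(sorted(targets_in_set), 2):
        smaller = min(len(targets_in_set[a]), len(targets_in_set[b]))
        if smaller:
            shared = targets_in_set[a] & targets_in_set[b]
            overlaps[(a, b)] = len(shared) / smaller
    return overlaps

# Hypothetical gene sets and targets of a single transcription factor.
gene_sets = {
    "mode1_up": {"g1", "g2", "g3"},
    "mode3_down": {"g3", "g4"},
    "mode7_up": {"g5"},
}
tf_targets = {"g1", "g3", "g4", "g9"}

print(tf_target_overlap(gene_sets, tf_targets))
```

A low fraction for a factor that is enriched in both gene sets would be consistent with the factor acting on different targets in different modes, rather than the enrichment arising from the same genes counted twice.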
Expression-component subnetworks
Multiple transcription factors were found for most SVD gene sets. These factors bind not only to target genes in the gene sets; they also have protein-DNA interactions among themselves. Such binding can form transcriptional regulatory loops (e.g., autoregulation, multicomponent loops, feed-forward loops) and regulatory chains and hierarchies (Lee et al. 2003
The final step in assembling expression-component subnetworks was connecting transcription factors to the causal perturbations. Public databases were queried (Reiss et al. 2005
SVD can isolate large and subtle overlapping effects resulting from signaling perturbations. The experiments in the present study were designed to extract information by comparing the genomic responses elicited by strategic perturbations of Ras-cAMP signaling. Analysis of the experiments by SVD permits (1) a comparison of the effects of increasing cAMP levels; (2) a comparison of the effects of different IRA mutant alleles; and (3) a comparison of strains carrying dominant-active and dominant-negative alleles of the major GTPase gene, RAS2. By isolating expression changes due to strategic perturbations of key cAMP-pathway elements, the findings directly address the questions that motivated the experimental design of our genomic-expression analysis of pathway genetics. Because further perturbations of correctly inferred subnetwork elements would induce predictable changes in the expression component mediated by those elements, the results suggest further experimentation to test whether regulatory influences are received through the proposed expression-component subnetworks.
Expression responses to cAMP levels
Expression-pattern decomposition and subnetwork mapping achieve a level of network detail greater than previous expression analyses of the Ras-cAMP pathway. For example, Wang and collaborators (Wang et al. 2004
A novel function for the Ira1 protein?
Differential expression for RAS2 point mutants
Signaling through dense molecular networks
Strains and growth conditions
Standard strain construction methods and growth medium formulations were used (Guthrie and Fink 1991
For genomic-expression analysis of the response to varying cAMP concentration, strain SR959 was grown in Synthetic Complete (SC) medium, 2% glucose, with 1 mM cAMP to OD600 = 1. The culture was split and diluted to OD600 = 0.3 in fresh SC medium with either 0, 0.5, 1.0, or 2 mM cAMP. These cultures were grown to OD600 = 1.0 and harvested by centrifugation. Before and after each experiment, strain SR959 was checked for suppressor mutations by plating on YPD (rich medium) as in Rupp et al. (Rupp et al. 1999
Genomic expression data collection and analysis
Functional analysis of SVD-derived gene sets
Associating transcription factors with SVD gene sets
Construction of expression-component subnetworks
We thank Hui Ge, Susanne Prinz, David Reiss, James Taylor, and Vesteinn Thorsson for their contributions. T. Galitski is a recipient of a Burroughs Wellcome Fund Career Award in the Biomedical Sciences. G.F.R. was funded by NIH grant GM035010.
4 Corresponding author.
E-mail gcarter{at}systemsbiology.org; fax (206) 732-1299. [Supplemental material is available online at www.genome.org. Genomic-expression data have been deposited in the Gene Expression Omnibus database under accession no. GSE2927.] Article published online ahead of print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.4473506
Alter O., Brown P.O., Botstein D. 2000. Singular value decomposition for genome-wide expression data processing and modeling. Proc. Natl. Acad. Sci. 97: 10101–10106.
Alter O., Brown P.O., Botstein D. 2003. Generalized singular value decomposition for comparative analysis of genome-scale expression data sets of two different organisms. Proc. Natl. Acad. Sci. 100: 3351–3356.
Anderson A., Hudson M., Chen W., Zhu T. 2003. Identification of nutrient partitioning genes participating in rice grain filling by singular value decomposition (SVD) of genome expression data. BMC Genomics 4: 26.
Cheng Y. and Church G.M. 2000. Biclustering of expression data. Proc. Int. Conf. Intell. Syst. Mol. Biol. 8: 93–103.
D'Souza C.A. and Heitman J. 2001. Conserved cAMP signaling cascades regulate fungal development and virulence. FEMS Microbiol. Rev. 25: 349–364.
Eisen M.B., Spellman P.T., Brown P.O., Botstein D. 1998. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. 95: 14863–14868.
Gasch A.P. and Eisen M.B. 2002. Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering. Genome Biol. 3: research0059.
Ghosh D. 2002. Singular value decomposition regression models for classification of tumors from microarray experiments. Pac. Symp. Biocomput. 18–29.
Gimeno C.J. and Fink G.R. 1992. The logic of cell division in the life cycle of yeast. Science 257: 626.
Guthrie C. and Fink G.R. 1991. Guide to yeast genetics and molecular biology, Vol. 194. Academic Press, New York.
Halme A., Bumgarner S., Styles C., Fink G.R. 2004. Genetic and epigenetic regulation of the FLO gene family generates cell-surface variation in yeast. Cell 116: 405–415.
Harbison C.T., Gordon D.B., Lee T.I., Rinaldi N.J., Macisaac K.D., Danford T.W., Hannett N.M., Tagne J.B., Reynolds D.B., Yoo J., et al. 2004. Transcriptional regulatory code of a eukaryotic genome. Nature 431: 99–104.
Holter N.S., Mitra M., Maritan A., Cieplak M., Banavar J.R., Fedoroff N.V. 2000. Fundamental patterns underlying gene expression profiles: Simplicity from complexity. Proc. Natl. Acad. Sci. 97: 8409–8414.
Horn D. and Axel I. 2002. Novel clustering algorithm for microarray expression data in a truncated SVD space. Bioinformatics 19: 1110–1115.
Ideker T., Galitski T., Hood L. 2001. A new approach to decoding life: Systems biology. Annu. Rev. Genomics Hum. Genet. 2: 343–372.
Jones D.L., Petty J., Hoyle D.C., Hayes A., Ragni E., Popolo L., Oliver S.G., Stateva L.I. 2003. Transcriptome profiling of a Saccharomyces cerevisiae mutant with a constitutively activated Ras/cAMP pathway. Physiol. Genomics 16: 107–118.
Kellis M., Patterson N., Endrizzi M., Birren B., Lander E.S. 2003. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423: 241–254.
Lee T.I., Rinaldi N.J., Robert F., Odom D.T., Bar-Joseph Z., Gerber G.K., Hannett N.M., Harbison C.T., Thompson C.M., Simon I., et al. 2003. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298: 799–804.
Liu L., Hawkins D.M., Ghosh S., Young S.S. 2003. Robust singular value decomposition analysis of microarray data. Proc. Natl. Acad. Sci. 100: 13167–13172.
Marcotte E.M. 2001. The path not taken. Nat. Biotechnol. 19: 626–627.
Reiss D.J., Avila-Campillo I., Thorsson V., Schwikowski B., Galitski T. 2005. Tools enabling the elucidation of molecular pathways active in human disease: Application to Hepatitis C Virus infection. BMC Bioinformatics 6: 154.
Robertson L.S. and Fink G.R. 1998. The three yeast A kinases have specific signaling functions in pseudohyphal growth. Proc. Natl. Acad. Sci. 95: 13783–13787.
Rupp S., Summers E., Lo H.J., Madhani H., Fink G. 1999. MAP kinase and cAMP filamentation signaling pathways converge on the unusually large promoter of the yeast FLO11 gene. EMBO J. 18: 1257–1269.
Shannon P., Markiel A., Ozier O., Baliga N.S., Wang J.T., Ramage D., Amin N., Schwikowski B., Ideker T. 2003. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 13: 2498–2504.
Stanhill A., Schick N., Engelberg D. 1999. The yeast ras/cyclic AMP pathway induces invasive growth by suppressing the cellular stress response. Mol. Cell. Biol. 19: 7529–7538.
Steffen M., Petti A., Aach J., D'haeseleer P., Church G. 2002. Automated modelling of signal transduction networks. BMC Bioinformatics 3: 34.
Tamayo P., Slonim D., Mesirov J., Zhu Q., Kitareewan S., Dmitrovsky E., Lander E.S., Golub T.R. 1999. Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation. Proc. Natl. Acad. Sci. 96: 2907–2912.
Tanaka K., Nakafuku M., Satoh T., Marshall M.S., Gibbs J.B., Matsumoto K., Kaziro Y., Toh-e A. 1990. S. cerevisiae genes IRA1 and IRA2 encode proteins that may be functionally equivalent to mammalian ras GTPase activating protein. Cell 60: 803–807.
Thevelein J.M. 1992. The RAS-adenylate cyclase pathway and cell cycle control in Saccharomyces cerevisiae. Antonie Van Leeuwenhoek 62: 109–130.
Toda T., Uno I., Ishikawa T., Powers S., Kataoka T., Broek D., Cameron S., Broach J., Matsumoto K., Wigler M. 1985. In yeast, RAS proteins are controlling elements of adenylate cyclase. Cell 40: 27–36.
Wall M.E., Dyck P.A., Brettin T.S. 2001. SVDMAN--singular value decomposition analysis of microarray data. Bioinformatics 17: 566–568.
Wang Y., Pierce M., Schneper L., Guldal C.G., Zhang X., Tavazoie S., Broach J.R. 2004. Ras and Gpa2 mediate one branch of a redundant glucose signaling pathway in yeast. PLoS Biol. 2: e128.
Weaver D.C., Workman C.T., Stormo G.D. 1999. Modeling regulatory networks with weight matrices. Pac. Symp. Biocomput. 112–123.
Wodicka L., Dong H., Mittmann M., Ho M.H., Lockhart D.J. 1997. Genome-wide expression monitoring in Saccharomyces cerevisiae. Nat. Biotechnol. 15: 1359–1367.
Yeung M.K.S., Tegner J., Collins J.J. 2002. Reverse engineering gene networks using singular value decomposition and robust regression. Proc. Natl. Acad. Sci. 99: 6163–6168.
Zeitlinger J., Simon I., Harbison C.T., Hannett N.M., Volkert T.L., Fink G.R., Young R.A. 2003. Program-specific distribution of a transcription factor dependent on partner transcription factor and MAPK signaling. Cell 113: 395–404.
Received July 21, 2005; accepted in revised format January 17, 2006.