Published online before print
August 9, 2006, 10.1101/gr.5076506 Genome Res. 16:1149-1158, 2006 ©2006 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/06 $5.00 OPEN ACCESS ARTICLE
Methods
STAC: A method for testing the significance of DNA copy number aberrations across multiple array-CGH experiments
1 Division of Oncology, Children's Hospital of Philadelphia and Department of Pediatrics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA; 2 Penn Center for Bioinformatics (PCBI), University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; 3 Department of Genetics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA; 4 Abramson Family Cancer Research Institute, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA
Regions of gain and loss of genomic DNA occur in many cancers and can drive the genesis and progression of disease. These copy number aberrations (CNAs) can be detected at high resolution by using microarray-based techniques. However, robust statistical approaches are needed to identify nonrandom gains and losses across multiple experiments/samples. We have developed a method called Significance Testing for Aberrant Copy number (STAC) to address this need. STAC utilizes two complementary statistics in combination with a novel search strategy. The significance of both statistics is assessed, and P-values are assigned to each location on the genome by using a multiple testing corrected permutation approach. We validate our method by using two published cancer data sets. STAC identifies genomic alterations known to be of clinical and biological significance and provides statistical support for 85% of previously reported regions. Moreover, STAC identifies numerous additional regions of significant gain/loss in these data that warrant further investigation. The P-values provided by STAC can be used to prioritize regions for follow-up study in an unbiased fashion. We conclude that STAC is a powerful tool for identifying nonrandom genomic amplifications and deletions across multiple experiments. A Java version of STAC is freely available for download at http://cbil.upenn.edu/STAC.
The accurate and unbiased identification of nonrandom subchromosomal gains and losses is important for diseases such as cancer and will likely play an increasingly important role in understanding inherited, germline copy number variation as well. Genomic copy number aberrations (CNAs) that are recurrent across individuals with a particular cancer often harbor critical disease genes whose expression level has been altered due to structural changes or abnormal gene dosage. An example is given by amplification of the MYCN oncogene in neuroblastoma that results in significant overexpression and independently predicts for high-risk and poor outcome (for review, see Maris and Matthay 1999
Genomic copy number can be estimated for a single sample on a genome-wide scale at a high resolution using recently developed microarray-based techniques (Pinkel et al. 1998 To date, however, little attention has been given to the need for multi-experiment methods to identify regions of consistent aberration across samples. Given that we are often interested in the recurrent regions of aberration, such multi-experiment methods are needed to complement the existing work on breakpoint detection in single experiments. Researchers routinely rely on simple frequency thresholds (i.e., selecting aberrations that occur in a specified percentage of the samples) for prioritizing regions for follow-up studies. This is followed by a tedious manual review of the data to define region boundaries and identify candidate genes that may be targeted by the genomic aberration. This process is time-consuming and easily prone to investigator bias. Moreover, this approach lacks the power to detect aberrant regions shared only within a subset of samples (e.g., a cancer sub-type). Multi-experiment statistical methods would provide a mechanism for accurately localizing and prioritizing recurrent aberrations in an unbiased manner thereby allowing for more focused follow-up efforts. Here we present a new algorithm (STAC) developed to address this need. STAC identifies regions of gain or loss that occur across an entire sample set or within a subset of samples more often than would be expected under a reasonable null model. The algorithm provides a rigorous mechanism for localizing regions of significance and has been engineered to accommodate data from any array platform (e.g., BAC, SNP, oligo-based) and to handle input from any one of the previously mentioned single-experiment methods after minor data transformation. STAC includes a search of the sample space and is sensitive to concordance even if coming from only a subset of the data. 
We demonstrate the utility of our method by applying it to two publicly available cancer data sets for which CNAs have been published and in several cases validated experimentally. We then show how STAC can uncover additional regions of interest in these data sets, many containing known cancer-related genes. Finally, we successfully use STAC results to identify subtypes of neuroblastoma characterized by novel aberration patterns.
Here we present STAC, a method for testing the statistical significance of DNA CNAs across multiple experiments. We first describe the data and notations. This is followed by a description of the null model, permutation approach, and selection of statistics. A heuristic method for searching the sample space is presented next, and we conclude with a detailed application to two publicly available cancer CNA data sets. STAC is available for download in a standalone format (STAC-Station) or a parallelized grid-based version (STAC-Grid) at http://www.cbil.upenn.edu/STAC.
Data and notations
STAC analysis currently focuses on gain and loss as separate cases since these are generally regarded as providing distinct mechanisms for disease. We will use the term "aberration" as the generic term for both, but the type of aberration (gain or loss, considered separately) is fixed throughout this discussion. Formatted input data consist of an aberration call for each of N experiments and M fixed-width spans, which we call genomic "locations." The sequence of M locations constitutes the stretch of genome under study and should exclude centromeres since alterations cannot be reliably detected in these regions. For cancer data we generally recommend that analysis be performed at the level of a chromosome arm, given that the observed background rate of aberration often varies considerably between arms (for examples, see aberration frequency plots in Mosse et al. 2005 We represent aberration with a 1 and no aberration with a 0. Therefore, for each stretch of genome considered and for each aberration type (gain or loss), the data can be put in an array of 0s and 1s where rows represent experiments and columns represent locations. We refer to a single row of this array as a "profile." A set of consecutive 1s in a row is called an (aberrant) "interval" for that profile. Therefore, each profile consists of a set of intervals and their locations. Figure 1 shows a graphical display of chromosome 11 loss data from a set of breast cancers consisting of N = 37 profiles and M = 77 locations. These example data are utilized below as we develop the methodology and two complementary statistics designed to test the significance of recurrent intervals of aberration.
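The data layout described above can be illustrated with a short Python sketch. It is only an illustration: the function name `find_intervals`, the 0-based location indices, and the toy array are assumptions introduced here and are not part of the STAC distribution.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive, 0-based location indices


def find_intervals(profile: List[int]) -> List[Interval]:
    """Return the maximal runs of consecutive 1s in a 0/1 profile."""
    intervals = []
    start = None
    for m, call in enumerate(profile):
        if call == 1 and start is None:
            start = m                          # a new aberrant interval opens
        elif call != 1 and start is not None:
            intervals.append((start, m - 1))   # the interval closed at m - 1
            start = None
    if start is not None:                      # interval runs to the last location
        intervals.append((start, len(profile) - 1))
    return intervals


# Toy data: N = 3 experiments (rows) by M = 8 locations (columns).
data = [
    [0, 1, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
for profile in data:
    print(find_intervals(profile))
```

Each row is a profile, and the returned list holds that profile's intervals and their locations.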
Null model and permutation approach
To calculate significance across samples, we need a statistical test that is sensitive to recurrent intervals of aberration at a given location in a selected sample set. A highly analogous problem arose in the analysis of direct identity-by-descent data, and a solution was given in Grant et al. (1999). An estimate of the null distribution is obtained via permutations where a permutation consists of a random rearrangement of the intervals of each profile (without replacement). In this way we preserve much of the nature of the data within samples while perturbing any concordance across samples. For example, if a profile with M locations had only one interval of length l, then there would be M − l + 1 permutations of this profile, each equally likely.
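One possible way to rearrange the intervals of a single profile is sketched below in Python. The text does not spell out the exact sampling scheme, so several choices here are assumptions: the interval order is shuffled, at least one unaltered location is kept between neighbouring intervals so that they never merge, and the remaining unaltered locations are spread uniformly over the gaps. The name `permute_profile` is hypothetical.

```python
import random
from typing import List


def permute_profile(profile: List[int], rng: random.Random) -> List[int]:
    """Randomly rearrange the aberrant intervals of one 0/1 profile."""
    # Collect the lengths of the maximal runs of 1s.
    lengths = []
    run = 0
    for call in profile:
        if call == 1:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    k = len(lengths)
    if k == 0:
        return list(profile)
    rng.shuffle(lengths)
    # Zeros left over after reserving one separator between neighbouring intervals.
    free = len(profile) - sum(lengths) - (k - 1)
    # Split the free zeros uniformly into k + 1 gaps (stars and bars).
    bars = sorted(rng.sample(range(free + k), k))
    gaps = []
    prev = -1
    for b in bars:
        gaps.append(b - prev - 1)
        prev = b
    gaps.append(free + k - 1 - prev)
    # Lay out gap, interval, gap, interval, ..., final gap.
    out = []
    for i, length in enumerate(lengths):
        out.extend([0] * (gaps[i] + (1 if i > 0 else 0)))
        out.extend([1] * length)
    out.extend([0] * gaps[k])
    return out


rng = random.Random(0)
print(permute_profile([0, 1, 1, 0, 0, 1, 0, 0], rng))
```

For a profile with a single interval of length l this scheme chooses among the M − l + 1 possible placements with equal probability, which matches the example in the text.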
The frequency statistic
Since we are comparing F(m_0) to the distribution of the maximum aberration frequency over all m, the resulting P-value P_F(m_0) is a multiple testing corrected confidence measure (for the M tests) for rejection of the null model. Since our statistic is an indicator of behavior at location m_0, we prioritize the locations by the P_F(m_0). If location m_0 is significant to level We define the "confidence" at location m_0 as 1 − P_F(m_0). Figure 2A shows the data from Figure 1 with the frequency confidences overlaid as gray bars. Four intervals of significant loss are identified and suggest putative locations for cancer-related genes. One can also see from these data that the frequency is not significant at the three leftmost locations (marked by *); however, there is a consistent aberration within a subset of nine samples. Given that cancers are often heterogeneous and copy number profiles can be used to discover and distinguish subtypes, it is imperative that one be able to identify this type of alteration in addition to those that are significantly frequent across the entire sample set.
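The comparison against the maximum frequency can be sketched as follows. This is a minimal illustration rather than the STAC implementation: F(m) is assumed to be the count of profiles aberrant at location m, the P-value is assumed to be the fraction of permuted data sets whose maximum frequency is at least the observed value, and the permuted arrays are assumed to have been generated beforehand. The function names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]  # rows are profiles, columns are locations


def frequency(data: Matrix) -> List[int]:
    """F(m): number of profiles with an aberration at each location m."""
    return [sum(column) for column in zip(*data)]


def frequency_pvalues(observed: Matrix, permuted: List[Matrix]) -> List[float]:
    """Max-statistic permutation P-value for every location of `observed`."""
    f_obs = frequency(observed)
    # One null value per permuted data set: the maximum frequency over all m.
    null_max = [max(frequency(p)) for p in permuted]
    return [sum(1 for x in null_max if x >= f) / len(null_max) for f in f_obs]


observed = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]
permuted = [
    [[0, 1, 1, 0], [0, 0, 0, 1], [1, 1, 0, 0]],
    [[0, 0, 1, 1], [0, 0, 1, 0], [0, 1, 1, 0]],
    [[1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1]],
    [[0, 1, 1, 0], [1, 0, 0, 0], [0, 0, 1, 1]],
]
p_values = frequency_pvalues(observed, permuted)
confidences = [1 - p for p in p_values]
print(p_values)  # [0.25, 1.0, 1.0, 1.0]
```

Because every location is compared with the same null distribution of maxima, the resulting values are already adjusted for the M locations tested. In practice many more than four permutations would be needed for a usable estimate.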
The footprint and the normalized footprint
To overcome the shortcoming of the global frequency statistic outlined at the end of the previous section, we develop a refined version of the "footprint" statistic and subset search methodology originally introduced by Grant et al. (1999).
We define a "stack," S, as a set of intervals that contains at most one interval per profile and where there is at least one location common to every interval in the set. Note that in Grant et al. (1999)
The requirement that stacks be anchored mitigates the need for normalization of the footprint as introduced in Grant et al. (1999)
Footprint-based P-values
For any stack S, we call m an anchor point of the stack if m is contained in every interval. We denote the set of all anchor points of a stack S by S*. By the definition of a stack given in the previous section, S*
R provides a uniform P-value-based score that makes all locations comparable, regardless of the nature of the stacks over them. We cannot use the score as a meaningful P-value, however, since it is not multiple testing corrected for taking a minimum over all subset sizes. Therefore we perform a second permutation calculation on the R(m) themselves in order to assess true significance. Since R is a score for each location, much as the frequency is, we assess the significance of R in exactly the same way as we did with the frequency. This provides us with a footprint-based P-value at each location. It is important to note that a location may derive significance from either a subset of samples or the entire sample set given that we are evaluating stacks of all possible sizes (i.e., containing any number of samples).
Searching the sample space
The approach is heuristic and searches the sample space in a greedy and incremental manner from 2 to N (i.e., the maximum possible stack size). For B, a fixed positive integer, it starts by finding the best B anchored stacks involving two intervals; "best" meaning with smallest normalized footprint. The algorithm then extends those B stacks in all possible ways to anchored stacks involving three intervals and finds the best nonredundant B of those. Those B stacks of three intervals are in turn extended to all anchored stacks of four intervals, and the best B nonredundant of those are determined. This process continues incrementally up to the largest possible stack. The minimum normalized footprint found at each step is recorded. These are the NF(S) values used above in the distributions D_n. The removal of redundancy is a necessary step, particularly for large data sets. Because the number of substacks of a stack grows exponentially with the size of the stack, if redundancy is not removed, then the best B stacks considered for extension at a level could consist entirely of stacks anchored at the same location(s). Extending only these stacks to the next level could result in false negatives elsewhere on the chromosome arm. Note that the removal of redundancy does not bias our P-values because the same search strategy is applied to both the permuted and unpermuted data.
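The level-by-level search can be sketched in Python as a beam search of width B. Several parts of this sketch are assumptions, because the corresponding definitions are not reproduced in this text: the footprint is taken to be the number of locations covered by the stack, the raw footprint stands in for the normalized footprint, and redundancy is handled by the crude stand-in rule of keeping one stack per anchor set. All names are hypothetical and this is not the STAC code.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int, int]  # (profile index, start, end), end inclusive
Stack = Tuple[Interval, ...]     # kept sorted so equal stacks compare equal


def anchors(stack: Stack) -> Tuple[int, int]:
    """Locations common to all intervals, as (first, last); empty if first > last."""
    return max(iv[1] for iv in stack), min(iv[2] for iv in stack)


def footprint(stack: Stack) -> int:
    """Locations covered by an anchored stack (assumed footprint definition)."""
    return max(iv[2] for iv in stack) - min(iv[1] for iv in stack) + 1


def extend(stack: Stack, intervals: List[Interval]) -> List[Stack]:
    """All anchored stacks made by adding one interval from an unused profile."""
    used = {iv[0] for iv in stack}
    out = []
    for iv in intervals:
        if iv[0] in used:
            continue                       # at most one interval per profile
        cand = tuple(sorted(stack + (iv,)))
        first, last = anchors(cand)
        if first <= last:                  # keep only stacks with an anchor point
            out.append(cand)
    return out


def beam_search(intervals: List[Interval], B: int) -> Dict[int, int]:
    """Return the smallest footprint found for each stack size 2, 3, ..."""
    level: List[Stack] = [(iv,) for iv in intervals]
    best: Dict[int, int] = {}
    size = 1
    while level:
        size += 1
        candidates = {s for stack in level for s in extend(stack, intervals)}
        # Stand-in redundancy filter (assumption): one stack per anchor set,
        # keeping the candidate with the smallest footprint.
        by_anchor: Dict[Tuple[int, int], Stack] = {}
        for s in sorted(candidates, key=lambda s: (footprint(s), s)):
            by_anchor.setdefault(anchors(s), s)
        # Keep only the best B stacks for extension at the next level.
        level = sorted(by_anchor.values(), key=lambda s: (footprint(s), s))[:B]
        if level:
            best[size] = footprint(level[0])
    return best


toy = [(0, 1, 4), (1, 2, 3), (2, 3, 6), (2, 0, 0)]
print(beam_search(toy, B=10))  # {2: 4, 3: 6}
```

Running the same function on permuted data yields the per-size null values, which is why a heuristic search does not bias the comparison as long as the identical procedure is used on both.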
Optimization of this process can be achieved through the review and testing of the "search parameter" B. The higher B is set, the more likely the search is to find the global minimum, but the longer it will take to run. The appropriate setting of B will depend on the particular data set being analyzed, and STAC provides output that can help guide this decision. For example, one can output the number of stacks considered for extension at each level. From this, one can determine at which level of extension the heuristic will begin to take effect. In practice, we have found that setting B = 10,000 is more than sufficient for most data sets consisting of Figure 2B shows the results for the example data using footprint-based confidences alone. Notice how the locations at the left that were not significantly frequent across the entire data set are now found to be significant using the normalized footprint and subset search. In addition, another stack has been revealed (marked by *) that was less apparent and that we might have missed by eye. These locations may be relevant to a distinct subgroup of the samples. In practice, we find the frequency and footprint statistics complement one another. We therefore report the results for both statistics. Deriving meaningful and effective conclusions requires the careful consideration of both statistics since the inherent statistical meaning (and therefore biological implication) of each is different.
Application to two publicly available cancer CNA data sets
STAC detects regions of known biological and clinical relevance
We first sought to investigate whether STAC could identify known clinically and biologically relevant genomic aberrations. Specifically, we expect to identify amplification at 2p24 containing MYCN, loss at 1p36, loss at 11q14–25, and gain of 17q material in the neuroblastoma data, as these aberrations have been shown to be clinically and/or biologically relevant (for review, see Maris and Matthay 1999 Figure 4 shows the results for the four chromosome arms studied. STAC successfully finds locations of significance for each relevant chromosomal region. Amplification at 2p24 including the MYCN oncogene is readily identified (P_fp = 0.0003, P_fr = 0.0001), as is the region of loss at 1p36 (P_fp = 0.0014, P_fr = 0.0028), Analysis of chromosome arms 11q and 17q reveals the complementary nature of the statistics employed by STAC. Given the frequency of 17q gain, it seems clear that this aberration plays an important role in neuroblastoma. However, the problem of localizing a region (or regions) that may harbor putative oncogenes is far more difficult given the large intervals of gain seen in most samples. STAC identifies three relatively small regions of significant gain at 17q24.1 (P_fp = 0.0102), 17q24.2 (P_fp = 0.0075), and 17q25 (P_fp = 0.0348) based on the footprint. Similarly, while a small region of loss is detected by both statistics at 11q23.3 (P_fp = 0.0193, P_fr = 0.0181), identification of the regions at 11q14.3–11q21 (P_fp = 0.0005) and 11q25 (P_fp = 0.0023) requires the increased sensitivity of the footprint statistic. All regions identified by STAC are within currently accepted significant regions of overlap (SROs) in neuroblastoma. Moreover, these data potentially narrow the regions and provide a mechanism for prioritizing follow-up efforts.
STAC provides statistical support for previously reported CNAs
As a further validation step, we compare regions identified previously by Mosse et al. (2005) 0.05 for either statistic as significant.
STAC provides statistical support for the majority of the regions reported in Mosse et al. (2005)
STAC results for each of the 25 regions reported by Naylor et al. (2005)
Examination of the few discordant regions revealed two reasons for discrepancy. The most common explanation is the presence of frequent and long aberrant intervals, such as seen in the neuroblastoma data for 17q gains. Here, 90% of the cell lines exhibit large gains, rendering the localization of regions nearly impossible without statistical methods. The discordant region (54.5–57.7 Mb) was gained in 70% of the samples, yet our STAC frequency statistic tells us that this occurs 95% of the time in randomly permuted data. Manual review of such data to define regions is easily subject to investigator bias. The second explanation for discrepancy is simply that the region fell just below our P-value cutoff for significance. For example, loss on 3p reported in Mosse et al. (2005)
STAC identifies additional regions of significant gain/loss
STAC finds a total of 94 regions of significant gain and 79 regions of significant loss in the neuroblastoma data (Supplemental Table 3). The gains encompass a total of 332 Mb of genomic sequence with an average region size of 3.53 Mb. Significant loss covers 305 Mb of genomic sequence with an average region size of 3.86 Mb. Of note, 77% (72 of 94) of the gain regions and 86% (69 of 80) of the loss regions went undetected by the traditional frequency threshold approach.
Supplemental Table 4 provides a complete listing of all regions identified in the sporadic primary breast tumor data set. In summary, STAC analysis identifies 149 distinct regions of significant gain covering 525 Mb of genomic sequence. The average region of gain spans 3.43 Mb, and 94 of the regions found by STAC were not identified by a simple frequency thresholding approach. Our analysis identifies 124 distinct regions of loss covering 383 Mb of genomic sequence. The average region of loss spans 3.10 Mb.
Biological relevance of additional regions found by STAC
We performed two-way agglomerative hierarchical clustering on the significant STAC regions to facilitate the biological interpretation of our neuroblastoma STAC results (Fig. 5). Two clusters of samples characterized by distinct patterns of gain and loss are observed (Fig. 5A). Regions of known biological and clinical relevance are shown in Figure 5B, A through D. Sample clustering is not driven by gain of the MYCN oncogene at 2p24 (A) or 17q gain (B), both very frequent events in this data set. Samples with 1p36 loss (D) are clearly separated from those with 11q loss (C); it is well established that these genomic aberrations are negatively correlated and associated with poor prognosis in neuroblastoma (for review, see Maris and Matthay 1999
Sample cluster 1 is characterized by regions of loss, whereas cluster 2 exhibits frequent gain at these same locations (Fig. 5A, location cluster labeled E). Two thirds (145/217) of the locations contained in E were not identified by Mosse et al. (2005)
Cancer genomes are often riddled with CNAs, rendering the identification of relevant regions across multiple samples extremely difficult. Researchers traditionally rely on a simple frequency cutoff (e.g., "deleted in 30% of samples") followed by a laborious manual review to define region boundaries. While this approach may identify some relevant locations, it is tedious and time-consuming and lacks statistical control over false positives and false negatives. In particular, it assumes a constant null model across the genome, and therefore is too liberal in some cases and too conservative in others. We propose a sensitive statistical method for assessing the significance of recurrent genomic CNAs. STAC readily identifies regions of known biological and clinical relevance and reveals new recurrent aberrations that warrant further investigation. The method is sensitive to tight alignment of aberrant intervals and is capable of finding consistent regions of aberration within subsets of samples/experiments. These features are essential for localizing cancer genes and understanding cancer subtypes and progression. As with any computational analysis of large-scale data, STAC results should be reviewed to assess potential biological relevance. Not all significantly concordant aberrations may be relevant to the problem at hand. For example, STAC may identify copy number polymorphisms (CNPs) in addition to the recurrent CNAs that are the focus of cancer studies. Also, it is possible that some of the regions found by STAC represent artifacts of array fabrication/hybridization, the binning into fixed-width locations, or inaccuracies in the input data that our significance calculations are based on. However, unsupervised clustering of STAC results from neuroblastoma cell lines suggests that many of the additional regions have biological significance. 
Evidence for this is provided by their correlation with genomic abnormalities known to be associated with high-risk and poor outcome. Additional studies in a large panel of tumor specimens are underway to confirm this.
Few others have attempted to address the multi-experiment problem computationally (Aguirre et al. 2004 It is important to note that all multi-experiment approaches currently require one to first define the gains and losses within each individual sample. The best approach for doing this has yet to be determined and depends on the particular array and experimental design. We have found that the use of ratio thresholds for calling gain and loss often leads to false negatives (missing regions of aberration in individual samples) and can also lead to false positives, depending on experimental design. Concordant bias such as that which may be introduced by severe sample processing should be accounted for. For example, if the probe distributions are significantly variable, one can hybridize a battery of normal controls (processed identically to the test samples) in order to use a standard deviation criterion instead of a global ratio cutoff. It is often preferable to use one of the model-based methods to make gain/loss calls for each sample; however, this can result in a decrease in resolution since they tend not to call a region as aberrant unless it is supported by several array elements. In general, given that concordant bias has been minimized as described above, the single slide calls should be made fairly liberally, so as to avoid false negatives, since the false positives in individual samples will be randomly scattered across the genome and STAC will not assign significance to these additional aberrations. In short, if it is just noise in the array, it does not result in STAC false positives. We envision at least two extensions to STAC in the near future. We first plan to enhance the power of STAC by incorporating the degree of gain and loss at each interval, especially high-level amplification and homozygous deletion. 
This can be accomplished by modifying our statistics to account for weighted intervals, where the weight of an interval is reflective of its degree of gain/loss. This is an intuitive extension to our method given that researchers routinely give greater consideration to more extreme alterations. The second planned extension is the assessment of significant co-occurring aberrations across multiple experiments. Such shared aberrations can be indicative of distinct disease progression pathways and as such are of obvious interest.
Lastly, we note that our algorithm is applicable to genomic research beyond cancer and the study of other diseases. Recent studies utilizing genomic copy number data from normal populations have noted the extent of genomic CNPs in the human genome and that CNPs are enriched near regions of segmental duplication (Bailey et al. 2002
Validation genomic CNA data
Array-CGH data from 42 neuroblastoma cell lines (Mosse et al. 2005
Data preprocessing
Unsupervised class discovery
We thank Mitchell Guttman for important bug reports and Warren J. Ewens for his guidance and useful discussions. This work was supported in part by NIH/NHGRI Training Grant in Computational Genomics 2-T32-HG000046-07 (S.J.D.), K25-HG-0052 (G.R.G.), the Abramson Family Cancer Research Institute (B.L.W., J.M.M.), and a seed grant provided by the Penn Genomics Institute (PGI) of the University of Pennsylvania.
5 Present address: Translational Medicine and Genetics, GlaxoSmithKline, King of Prussia, PA 19406.
E-mail diskin{at}email.chop.edu; fax (215) 590-3770. [Supplemental material is available online at www.genome.org.] Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5076506. Freely available online through the Genome Research Open Access option.
Aguirre A.J., Brennan C., Baily G., Sinha R., Feng B., Leo C., Zhang Y., Zhang J., Gans J.D., Bardessy N. et al. 2004. High-resolution characterization of the pancreatic adenocarcinoma genome. Proc. Natl. Acad. Sci. 101: 9067–9072.
Bailey J.A., Gu Z., Clark R.A., Reinert K., Samonte R.V., Schwartz S., Adams M.D., Meyers E.W., Li P.W., Eichler E.E. 2002. Recent segmental duplications in the human genome. Science 297: 1003–1007.
Barrett M.T., Scheffer A., Ben-Dor A., Sampas N., Lipson D., Kincaid R., Tsang P., Curry B., Baird K., Meltzer P.S. et al. 2004. Comparative genomic hybridization using oligonucleotide microarrays and total genomic DNA. Proc. Natl. Acad. Sci. 101: 17765–17770.
Brodeur G.M. 2003. Neuroblastoma: Biological insights into a clinical enigma. Nat. Rev. Cancer 3: 203–216.
Brodeur G.M. and Maris J.M. 2002. In Principles and practice of pediatric oncology, 4th ed. (eds. Pizzo P.A. and Pollack D.G.), pp. 895–938. Lippincott Williams & Wilkins, Philadelphia, PA.
Eisen M.B., Spellman P.T., Brown P.O., Botstein D. 1998. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. 95: 14863–14868.
Grant G., Manduchi E., Cheung V., Ewens W. 1999. Significance testing for direct identity-by-descent mapping. Ann. Hum. Genet. 63: 441–454.
Greshock J., Naylor T., Margolin A., Diskin S., Cleaver S.H., Futreal P.A., deJong P.J., Zhao S., Liebman M., Weber B.L. 2004. 1-Mb resolution array-based comparative genomic hybridization using a BAC clone set optimized for cancer gene analysis. Genome Res. 14: 179–187.
Hinds D.A., Kloek A.P., Jen M., Chen X., Frazer K.A. 2006. Common deletions and SNPs are in linkage disequilibrium in the human genome. Nat. Genet. 38: 82–85.
Ishkanian A.S., Malloff C.A., Watson S.K., DeLeeuw R.J., Chi B., Coe B.P., Snijders A., Albertson D.G., Pinkel D., Marra M.A. et al. 2004. A tiling resolution DNA microarray with complete coverage of the human genome. Nat. Genet. 36: 299–303.
Lai W.R., Johnson M.D., Kucherlapati R., Park P.J. 2005. Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data. Bioinformatics 19: 3763–3770.
Lipson D., Aumann Y., Ben-Dor A., Linial N., Yakhini Z. 2005. Efficient calculation of interval scores for DNA copy number data analysis. In Proceedings of RECOMB 05. Springer-Verlag, Cambridge, MA.
Lynch T.J., Bell D.W., Sordella R., Gurubhagavatula S., Okimoto R.A., Brannigan B.W., Haris P.L., Haserlat S.M., Supko J.G., Haluska F.G. et al. 2004. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 350: 2129–2139.
Maris J.M. and Matthay K.K. 1999. Molecular biology of neuroblastoma. J. Clin. Oncol. 17: 2264–2279.
McCarroll S.A., Hadnott T.N., Perry G.H., Sabeti P.C., Zody M.C., Barrett J.C., Dallaire S., Gabriel S.B., Lee C., Daly M.J. et al. 2006. Common deletion polymorphisms in the human genome. Nat. Genet. 38: 86–92.
Mosse Y.P., Greshock J., Margolin A., Naylor T., Cole K., Khazi D., Hii G., Winter C., Shahzad S., Asziz M.U. et al. 2005. High-resolution detection and mapping of genomic DNA alterations in neuroblastoma. Genes Chromosomes Cancer 43: 390–403.
Naylor T.L., Greshock J., Wang Y., Colligon T., Yu Q.C., Clemmer V., Zaks T.Z., Weber B.L. 2005. High resolution genomic analysis of sporadic breast cancer using array-based comparative genomic hybridization. Breast Cancer Res. 6: R1186–R1198.
Paez J.G., Janne P.A., Lee J.C., Tracy S., Greulich H., Gabriel S., Herman P., Kaye F.J., Lindeman N., Boggon T.J. et al. 2004. EGFR mutations in lung cancer: Correlation with clinical response to gefitinib therapy. Science 304: 1487–1500.
Pinkel D., Segraves R., Sudar D., Clark S., Poole I., Kowbel D., Collins C., Kuo W., Chen C., Zhai Y. et al. 1998. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat. Genet. 20: 207–211.
Rouveirol C., Stransky N., Hupe P., La Rosa P.L., Viara E., Barillot E., Radvanyi F. 2006. Computation of recurrent minimal genomic alterations from array-CGH data. Bioinformatics 22: 849–856.
Sharp A.J., Locke D.P., McGrath S.D., Cheng Z., Bailey J.A., Vallente R.U., Pertz L.M., Clark R.A., Schwartz S., Segraves R. et al. 2005. Segmental duplications and copy-number variation in the human genome. Am. J. Hum. Genet. 77: 78–88.
Snijders A.M., Nowak N., Segraves R., Blackwood S., Brown N., Conroy J., Hamilton G., Hindle A.K., Huey B., Kimura K. et al. 2001. Assembly of microarrays for genome-wide measurement of DNA copy number. Nat. Genet. 29: 263–264.
Willenbrock H. and Fridlyand J. 2005. A comparison study: Applying segmentation to array CGH data for downstream analyses. Bioinformatics 15: 4084–4091.
Winston J.S., Ramanaryanan J., Levine E. 2004. HER-2/neu evaluation in breast cancer are we there yet? Am. J. Clin. Pathol. 121: S33–S49.