Genome Res. 15:269-275, 2005
©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05 $5.00
Methods
Highly multiplexed molecular inversion probe genotyping: Over 10,000 targeted SNPs genotyped in a single tube assay
1 ParAllele BioScience, Inc., South San Francisco, California 94080, USA
2 Baylor College of Medicine, Human Genome Sequencing Center, Houston, Texas 77030, USA
3 Stanford Genome Technology Center, Stanford University, California 94305, USA
Large-scale genetic studies are highly dependent on efficient and scalable multiplex SNP assays. In this study, we report the development of Molecular Inversion Probe technology with four-color, single array detection, applied to large-scale genotyping of up to 12,000 SNPs per reaction. While generating 38,429 SNP assays using this technology in a population of 30 trios from the Centre d'Etude Polymorphisme Humain family panel as part of the International HapMap project, we established SNP conversion rates of 90% with concordance rates >99.6% and completeness levels >98% for assays multiplexed up to 12,000plex levels. Furthermore, these individual metrics can be "traded off" and, by sacrificing a small fraction of the conversion rate, the accuracy can be increased to very high levels. No loss of performance is seen when scaling from 6,000plex to 12,000plex assays, strongly validating the ability of the technology to suppress cross-reactivity at high multiplex levels. The results of this study demonstrate the suitability of this technology for comprehensive association studies that use targeted SNPs in indirect linkage disequilibrium studies or that directly screen for causative mutations.
Complex human diseases are known to have a significant genetic component. Despite some important successes (Altshuler et al. 2000
The remaining requirement to fully enable large-scale genetic association studies is the development of truly cost-effective and scalable SNP genotyping technologies. These methods must allow hundreds of thousands of markers to be efficiently and accurately scored in thousands of patients. The first generation of SNP genotyping technologies was based on a single amplification reaction for each locus and was not appropriate for these large-scale, whole-genome studies. Recent advances have assayed thousands of random SNPs using ultra-high-density wafer hybridizations (Matsuzaki et al. 2004
Here, we describe an advanced Molecular Inversion Probe (MIP) genotyping technology (Hardenbol et al. 2003
MIP technology
The basic concept of MIP technology has been described previously (Hardenbol et al. 2003
Multiplex tag detection
We have used a molecular tagging (or barcoding) strategy (Shoemaker et al. 1996
The performance of the 20,000 new tags was first compared to a publicly available set of Next, the experiment was repeated using the set of 20,000 21mers that we designed. Two arrays were designed using these sequences; one with 6,000 tags (TrueTag 5k array) and another with 12,000 tags (TrueTag 10k array). Performing the same experiment described above on a TrueTag 10k array using a probe pool containing 6072 probes led to 18 nonactive features with signals >10% of the average active signal; a total of 0.3% of all features. Thus, the new tag set exhibits a three-fold higher specificity than previous sets. This added specificity has allowed us to obtain high accuracy genotyping data at 12,000plex (see below).
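The cross-hybridization check described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name `count_crosstalk`, the list-based inputs, and the choice to report the fraction relative to nonactive features are all assumptions made for the example.

```python
def count_crosstalk(signals, active, threshold=0.10):
    """Count nonactive features whose signal exceeds a fraction of the mean active signal.

    signals   -- one intensity per array feature (hypothetical input format)
    active    -- one boolean per feature: True if a probe in the pool carries that tag
    threshold -- fraction of the average active signal used as the cutoff (10% in the text)
    """
    active_vals = [s for s, a in zip(signals, active) if a]
    nonactive_vals = [s for s, a in zip(signals, active) if not a]
    cutoff = threshold * (sum(active_vals) / len(active_vals))
    noisy = sum(1 for s in nonactive_vals if s > cutoff)
    # Assumption: the fraction is reported relative to the nonactive features only.
    return noisy, noisy / len(nonactive_vals)


# Toy array: three active tags and three nonactive tags.
signals = [1000, 1200, 800, 50, 150, 20]
active = [True, True, True, False, False, False]
noisy_count, noisy_fraction = count_crosstalk(signals, active)
```

In the toy data the mean active signal is 1000, so the cutoff is 100 and only the nonactive feature with signal 150 is counted.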
Cluster analysis
The overall behavior of the algorithm is controlled by parameters that are used to reject unreliable data at two different levels. The first level is to reject a given data point (sample) within the context of a given marker. Here, the main criterion is the relative probability of the data point belonging to its primary cluster versus belonging to its secondary cluster. Data points where this ratio is too low are considered ambiguous and hence not called. In addition, each data point is required to have a minimum value for signal (sum of allele signals) with respect to chip noise and also with respect to nonallele signal (sum of nonallele signals). The second level of discrimination is provided when the algorithm rejects a marker as a whole because the marker is deemed unreliable. The criteria here are based on the relative dispersion of the clusters; e.g., markers with loosely defined clusters are rejected. Since we do not use trio discordance (non-Mendelian inheritance) as a measure of marker reliability, we are able to use this measure as an unbiased estimate of the algorithm's overall accuracy. It is important to note that all of the parameters that control the algorithm are fixed for the entire data set of x thousand markers times x hundred samples that is being fit at one time. Hence trio discordance is a reliable measure of the accuracy of the entire data set.
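The two levels of rejection described above can be illustrated with a short sketch. It is not the clustering algorithm used in the study: the class `DataPoint`, the functions `accept_point` and `call_marker`, the single `cluster_spread` summary, and every threshold value are hypothetical and chosen only to show the structure of the filter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataPoint:
    p_primary: float         # probability of belonging to the best-fitting cluster
    p_secondary: float       # probability of belonging to the next-best cluster
    allele_signal: float     # sum of the two allele signals
    nonallele_signal: float  # sum of the two nonallele signals


def accept_point(pt: DataPoint, chip_noise: float,
                 min_prob_ratio: float = 20.0,     # hypothetical threshold
                 min_snr: float = 5.0,             # hypothetical threshold
                 min_allele_ratio: float = 3.0) -> bool:  # hypothetical threshold
    """Level 1: reject a single data point within one marker."""
    if pt.p_primary < min_prob_ratio * pt.p_secondary:
        return False  # ambiguous between primary and secondary cluster
    if pt.allele_signal < min_snr * chip_noise:
        return False  # too weak relative to chip noise
    if pt.allele_signal < min_allele_ratio * pt.nonallele_signal:
        return False  # too weak relative to nonallele signal
    return True


def call_marker(points: List[DataPoint], chip_noise: float,
                cluster_spread: float,
                max_spread: float = 0.2) -> Optional[List[bool]]:
    """Level 2: reject the whole marker if its clusters are too dispersed.

    Returns per-sample accept flags, or None when the marker is rejected.
    """
    if cluster_spread > max_spread:
        return None
    return [accept_point(p, chip_noise) for p in points]


good = DataPoint(0.99, 0.001, 5000.0, 200.0)
ambiguous = DataPoint(0.60, 0.40, 5000.0, 200.0)
weak = DataPoint(0.99, 0.001, 30.0, 5.0)
flags = call_marker([good, ambiguous, weak], chip_noise=10.0, cluster_spread=0.1)
```

Because the thresholds are passed once and applied to every marker and sample, the sketch mirrors the point made in the text that the parameters are fixed across the whole data set rather than tuned per marker.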
Multicolor detection
A more streamlined analysis is now enabled by labeling each of the four allele specific reactions with a spectrally distinct fluorophore (Fig. 2). These reactions can then be pooled together and hybridized to a single tag array. Fluorescent images are collected using four different filters to collect emission from each single fluorescent species such that the intensities from each of the allelic reactions can be measured from a single chip feature. In addition to significant chip cost savings, this method also has the advantage of rendering the genotype calls resistant to possible feature-to-feature variation in the arrays. A further benefit is that signal ratios between alleles, which are critical to accurate genotyping, are rendered insensitive to feature saturation that can otherwise lead to loss of linearity in response.
During this study, we tested both two-color and four-color assays and showed that the four-color method was superior. For all probe batches, a two-color protocol was implemented using two arrays per individual. For batch 4, a four-color protocol was also implemented using a CCD imager. Comparison of the data demonstrates that the four-color assay generates more accurate and more complete data (Table 1). Repeatability between two- and four-color data is 99.8%.
Performance metrics for genotyping
The ideal genotyping technology would identify any chosen base in the genome in all DNA samples with perfect accuracy. A series of performance metrics are now commonly applied to measure how close real data are to this ideal. "Conversion rate" is a measure of the SNPs in the genome that can be assayed and is a function of both the quality of the SNPs chosen and the technology used to score them. In addition, some SNPs can be rejected in silico, prior to probe synthesis. "Call rate" for a given marker is the percentage of DNA samples in a study whose genotype is successfully measured. For a set of converted SNPs, the percentage of total genotypes returned across all markers is defined as the "completeness" of the study. Finally, "accuracy" is the percentage of these genotypes that are correct. Ultimately, the quality of the genotype data is measured by the power that it provides in the elucidation of genetic associations. A high quality genotyping technology will have a power to find associations that closely approximates the power of the idealized technology.
We have evaluated the full set of performance metrics using this new MIP assay and produced 38,429 working probes targeting SNPs on Chromosome 12 in six batches for the HapMap project. Batches 1-3 were 6,000-8,000plex pools designed using the Tag3 tag set (Affymetrix). Batch 4 was a 6,000plex and the first set developed using the new tag set (TrueTag 5k), while Batch 6 was the first batch to be designed for over 10,000 SNPs using the expanded tag set (TrueTag 10k). As a result, these two representative probe batches were chosen to analyze the performance of the technology in detail. Each batch was genotyped on 95 samples consisting of 30 trios from the CEU collection and five repeated samples to measure repeatability. Batch 4 was repeated using the four-color detection system as described above. Table 1 summarizes the performance metrics of these two batches.
Conversion rates were high regardless of multiplex level or detection method. Here conversion rate is defined to be the fraction of probes per batch that yielded strong signals with discernable clusters to indicate the different genotypes. As mentioned previously, in silico design rates are relevant in assessing overall SNP yield. More than SNPs for which only one of two alleles can be detected, termed "allelic drop-outs," are an intrinsically rare property of MIP since it uses a single probe per SNP. The allelic drop-out rate was measured to be only 0.4% of assays by comparison to publicly available data generated by other methods. The call rate of a given probe is defined to be the number of genotypes that were unambiguously clustered divided by the total number of genotypes attempted across all individuals. Completeness is the average call rate of converted probes. Completeness is high and is again unaffected by multiplex level or detection method (Table 1). Two measures of accuracy are possible for the data generated. The first is the data gathered from repeated samples. Such data allow random errors to be measured when the same marker gives nonconcordant data in repeated assays on the same individual. By this metric, the MIP probes performed with high accuracy. Repeatability ranges from 99.5%-99.9% across batches (Table 1). The second measure of accuracy is trio concordance. The use of mother-father-child trios in this project was designed to allow the accuracy to be monitored by looking at the Mendelian inheritance patterns across markers (see Methods section for detailed description). This test is able to capture some forms of systematic error that cannot be estimated by repeatability. Again, the accuracy rates based on trio concordance indicate a high level of data accuracy (>99.6%). 
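The call rate, completeness, and trio concordance defined above can be written out as a short sketch. The data layout is an assumption made for illustration (genotypes as two-letter strings, `None` for a no-call), the function names are hypothetical, and the Mendelian check is the textbook rule for an autosomal biallelic marker rather than the exact procedure from the Methods section.

```python
def call_rate(genotypes):
    """Fraction of attempted genotypes that were called (None marks a no-call)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def completeness(calls_by_marker):
    """Average call rate over converted markers (dict: marker id -> genotype list)."""
    rates = [call_rate(g) for g in calls_by_marker.values()]
    return sum(rates) / len(rates)


def mendel_consistent(child, mother, father):
    """True if the child carries one allele found in the mother and the other in the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def trio_concordance(trios):
    """Fraction of fully called (child, mother, father) trios that obey Mendelian inheritance."""
    scored = [t for t in trios if None not in t]
    return sum(mendel_consistent(*t) for t in scored) / len(scored)


calls = {
    "snp_a": ["AA", "AG", None, "GG"],   # call rate 0.75
    "snp_b": ["CC", "CT", "TT", "CT"],   # call rate 1.0
}
trios = [
    ("AG", "AA", "GG"),   # consistent
    ("GG", "AA", "AG"),   # inconsistent: mother cannot transmit G
    ("AA", None, "AG"),   # skipped: mother not called
]
study_completeness = completeness(calls)
concordance = trio_concordance(trios)
```

Note that a trio check only detects errors that produce an impossible inheritance pattern, so concordance computed this way is an upper bound on accuracy rather than a direct measurement of it.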
Because the criteria for assay conversion did not take into account the trio concordance rate or the repeatability rate, these measurements are good predictors of the overall accuracy in the data set. It is important to note that all of these parameters, including conversion rate, completeness, and concordance, are related through parameters that can be changed in the clustering algorithm used to assign genotypes. A permissive algorithm can result in high levels of completeness and conversion at the expense of accuracy. In general, these changes are seen at the margin between the bulk of the probes that are very complete with high accuracy and the probes that fail to call. This is shown in Figure 3. Batch 6 is shown ordered along the x-axis such that the probe with the highest call rate is plotted at the origin while the probe with the lowest call rate is at the right. The call rate for each probe across the full sample set is then shown for two different choices of cluster parameters: A stringent set, which accepts only calls very clearly in good clusters, and a more permissive set, which accepts data at the periphery of clusters. What can be seen is that the amount of missing data can be decreased at the cost of making a small number of increased errors.
The optimal choice for this trade-off depends on the required use of the data. If single marker association is being used for common SNPs, it may be appropriate to choose fairly permissive clustering parameters because the gain in "usable" data quantity is more advantageous than the small number of errors that are being made. When rare markers (Kang et al. 2004
Table 1 shows the effect of emphasizing conversion rate, completeness, or accuracy in choosing cluster parameters. Conversion rates of >90% are possible with accuracy rates of The impact of these compromises is best exemplified in the context of a model genotyping experiment. Figure 4 shows the effect of inaccurate or incomplete data from an association study for which the causative alleles are of varying frequency as shown on the x-axis. Given a genetic disease model (genetic relative risk GRR = 2 in a multiplicative model of disease) the number of patients and controls required to achieve an 80% power is plotted assuming single marker allelic tests are performed. As can be seen, common marker associations are relatively insensitive to missing data and error rates as high as 1%. On the other hand, if rare SNP markers are under study, accuracy is very important. Making an error in 1% of the data would double the population size required to find a 1% frequency marker in this model. Similarly, missing data are also more damaging when looking for signals in less frequent markers. Overall, the performance of all the converted markers in this study is sufficient to handle the full scope of association study applications with minimal loss of power relative to the ideal technology.
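The qualitative behavior described above can be reproduced with a standard two-proportion sample-size approximation. This is a sketch under stated assumptions, not the calculation behind Figure 4: the significance level (two-sided 0.05), the use of the population allele frequency for controls, the symmetric allele-flip error model in `observed_freq`, and the treatment of missing data as a simple loss of sample are all choices made for the example. Only the genetic model (multiplicative, GRR = 2), the 80% power target, and the allelic test come from the text.

```python
from statistics import NormalDist


def case_allele_freq(p, grr):
    """Risk-allele frequency in cases under a multiplicative model with allele frequency p."""
    return grr * p / (1 - p + grr * p)


def observed_freq(p, error):
    """Assumed error model: each allele call is flipped with probability `error`."""
    return p * (1 - error) + (1 - p) * error


def subjects_per_group(p, grr=2.0, error=0.0, missing=0.0, alpha=0.05, power=0.80):
    """Approximate cases (= controls) needed for a single-marker allelic test."""
    p_ctrl = observed_freq(p, error)  # assumption: controls have the population frequency
    p_case = observed_freq(case_allele_freq(p, grr), error)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p_ctrl + p_case) / 2
    numerator = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_b * (p_ctrl * (1 - p_ctrl) + p_case * (1 - p_case)) ** 0.5) ** 2
    alleles = numerator / (p_case - p_ctrl) ** 2
    # Two alleles per subject; missing genotypes shrink the effective sample size.
    return alleles / 2 / (1 - missing)


# Relative cost of a 1% error rate for a rare (1%) and a common (30%) risk allele.
rare_ratio = subjects_per_group(0.01, error=0.01) / subjects_per_group(0.01)
common_ratio = subjects_per_group(0.30, error=0.01) / subjects_per_group(0.30)
```

Under these assumptions the required sample size grows substantially for the rare allele and only by a few percent for the common one, which matches the direction of the effect described in the text. The exact inflation factor depends on the assumed error model and significance level.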
This report of an advanced Molecular Inversion Probe technology now allows highly efficient, accurate, and low-cost genotyping of targeted SNPs at levels >10,000plex. The unique features described in this study enable this technology to be scaled from 1,000plex levels to >10,000plex levels without any loss of performance in terms of accuracy, completeness, or conversion rate. At the same time, the development of a four-color scanning solution for the tag arrays has resulted in decreased processing times and chip costs while insulating genotype calls from feature-to-feature chip noise. This technology can be applied to complex genetic analysis in two basic ways. First, comprehensive LD mapping can be performed using a whole-genome HapMap tagging approach; second, comprehensive direct detection of potentially functional SNPs in coding and conserved regions is possible. As shown above, maintaining a high level of accuracy and completeness is critical when analyzing rare SNPs or haplotypes. The importance of being able to target specific SNPs, and to achieve a high conversion rate, is clear in the case of direct detection of causative SNPs. SNPs in functional regions tend to be rarer, leaving most without an alternative, common surrogate in high LD. A SNP that fails to convert therefore cannot be assessed at all, so the loss of power to detect such genetic effects is directly proportional to the fraction of SNPs that fail to convert. Hybridization-based methods whose conversion rates are significantly <50% will be unable to achieve high power to assess these SNPs. The conversion rates demonstrated in this study indicate that >85% conversion of SNPs in unique regions of the genome is possible. Furthermore, a second manufacturing and design pass has been shown to recover approximately half of the unconverted probes that failed due to failed oligo synthesis, secondary structure in the probe-tag complex, etc. (P. Hardenbol, unpubl.).
With two rounds of synthesis, it should be possible to achieve conversion rates >90% and to very comprehensively analyze these SNPs. The effect of high assay conversion rates on linkage disequilibrium mapping studies such as the HapMap is important but more subtle. The very nature of an LD mapping method presupposes a degree of redundancy in the choice of markers as some SNPs will be in high LD with each other. However, the ability to convert SNPs at a high rate does confer several advantages in large-scale HapMap approaches. First, if one cannot predict ahead of time which SNPs will fail due to a particular sequence context, a low conversion rate will increase the expense and effort required to build a map as multiple redundant SNPs will need to be attempted to find scorable tagging SNPs. This effort is onerous if several population specific maps are to be constructed. Secondly, the number of surrogates that exist for a given SNP will depend on the density of markers that one is choosing from. The first phase of the HapMap project is expected to produce an informative SNP every 5 kb. At this density, there will be a large number of SNPs that are not in LD with any others. Failure to convert these SNPs will lead to loss of power. Calculating the impact of these gaps on an association study is difficult. Larger haplotypes for which there are multiple surrogates will cover more of the genome and thus add more power than a singleton SNP in a region of high recombination, so the loss of power will not be directly proportional to the SNPs missed as in the case of direct detection. Nevertheless, it is clear that the task of building a comprehensive HapMap product is greatly simplified by a technology that can retain a high rate of assay conversion while achieving high levels of multiplexing. Further increases in the levels of multiplexing for this assay are likely. 
Tag set development is proceeding and early tests on a set of 40,000 tags indicate that a similar amount of cross-hybridization noise is achievable (data not shown). We believe that a set of 100,000 tags should be a realistic goal in light of recent advances in array technology. The MIP assay itself has shown no evidence of producing nonspecific probe inversion as we have moved to higher levels of multiplexing. Signal-to-noise levels have remained constant between 6,000plex and 12,000plex assays. It should be noted that the total mass of genomic DNA in the reaction is 40 times the probe amounts used in a 12,000plex reaction. As a result, the probe-probe interactions that increase with multiplexing are still at a level far below the probe-genome interactions. It seems reasonable to assume that another order of magnitude in multiplexing should be achievable. Indeed, preliminary results using 24,000plex MIP reactions have shown highly accurate results (data not shown). The final challenge is in maintaining sufficient detection signal as one splits a single amplification reaction over increasingly large numbers of amplicons. Several means are available to address this issue, including concentration of larger PCR reactions, brighter fluorescent labels, more sensitive scanners, increased hybridization times, and increased oligo density in arrays. Taking all of these considerations into account, we believe that this technology can be scaled to the level of 100,000plex in the near future.
Assay design
Batches of SNPs on Chromosome 12 were chosen according to criteria that emphasized even spacing. Once a batch of SNPs has been selected, a homology sequence is selected for each SNP based on Tm optimization; this sequence is on average 40.4 bases long, centered over the SNP, and complementary to the genome. The sequence is BLASTed against the genome to determine whether it is unique, where unique is defined as an exact match to only one position. If the exact sequence appears more than once, the probe is not synthesized. This is the only filter used. Next, a tag sequence is added that is unique among the batch of assays and complementary to a feature on the detection chip system. No consideration is given to the degree of complementarity of sequences within the batch. Probes can target overlapping sequences since the genomic DNA is not saturated with hybridized probe. All probe batches were manufactured by ParAllele BioScience using its proprietary MIP probe synthesis procedures and are commercially available (MegAllele kit, ParAllele BioScience). This is a pooled procedure that results in a pool of up to 12,000 probes that are tested using pooled quality control procedures before being sent to Baylor College of Medicine.
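The uniqueness filter and tag assignment described above can be sketched on toy data. This is an illustration only: the study used BLAST against the full genome, whereas the hypothetical `count_exact_matches` below does a plain substring search on one strand of a short string, and `design_batch` and its input format are invented for the example.

```python
def count_exact_matches(genome, seq):
    """Count exact (possibly overlapping) occurrences of seq in genome, one strand only."""
    count = 0
    start = genome.find(seq)
    while start != -1:
        count += 1
        start = genome.find(seq, start + 1)
    return count


def design_batch(genome, candidates, tags):
    """Drop non-unique homology sequences, then give each surviving probe its own tag.

    candidates -- dict: SNP id -> homology sequence (hypothetical input format)
    tags       -- list of tag sequences, each used at most once per batch
    """
    tag_iter = iter(tags)
    designed = {}
    for snp_id, seq in candidates.items():
        if count_exact_matches(genome, seq) > 1:
            continue  # exact sequence appears more than once: probe is not synthesized
        tag = next(tag_iter, None)
        if tag is None:
            raise ValueError("not enough tags for this batch")
        designed[snp_id] = (seq, tag)
    return designed


toy_genome = "ACGTACGTTTGACCA"
candidates = {"snp1": "ACGT", "snp2": "TTGACC"}
batch = design_batch(toy_genome, candidates, ["TAG1", "TAG2"])
```

Here `snp1` is dropped because its homology sequence occurs twice in the toy genome, and `snp2` receives the first available tag. No check of probe-to-probe complementarity is made, in keeping with the statement that uniqueness is the only filter.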
Genotyping reactions
Two-color assay execution
Four-color assay execution
Data
This work was supported by NIH grant 1U54HG02755. We wish to acknowledge the ongoing intellectual contributions of Professor Ulf Landegren and his lab to the development of the MIP technology.
4 Corresponding author. E-mail tom{at}p-gene.com; fax (650) 228-7405. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.3185605.
Altshuler, D., Hirschhorn, J.N., Klannemark, M., Lindgren, C.M., Vohl, M.C., Nemesh, J., Lane, C.R., Schaffner, S.F., Bolk, S., Brewer, C., et al. 2000. The common PPAR
Cupples, L.A., Yang, Q., Demissie, S., Copenhafer, D., and Levy, D. 2003. Description of the Framingham Heart Study data for Genetic Analysis Workshop 13. BMC Genet. 4 Suppl 1: S2.
Geschwind, D.H., Sowinski, J., Lord, C., Iversen, P., Shestack, J., Jones, P., Ducat, L., and Spence, S.J. 2001. The autism genetic resource exchange: A resource for the study of autism and related neuropsychiatric conditions. Am. J. Hum. Genet. 69: 463-466.
Grossman, P.D., Bloch, W., Brinson, E., Chang, C.C., Eggerding, F.A., Fung, S., Iovannisci, D.M., Woo, S., Winn-Deen, E.S., and Iovannisci, D.A. 1994. High-density multiplex detection of nucleic acid sequences: Oligonucleotide ligation assay and sequence-coded separation. Nucleic Acids Res. 22: 4527-4534.
Hardenbol, P., Baner, J., Jain, M., Nilsson, M., Namsaraev, E.A., Karlin-Neumann, G.A., Fakhrai-Rad, H., Ronaghi, M., Willis, T.D., Landegren, U., et al. 2003. Multiplexed genotyping with sequence-tagged molecular inversion probes. Nat. Biotechnol. 21: 673-678.
Hugot, J.P., Chamaillard, M., Zouali, H., Lesage, S., Cezard, J.P., Belaiche, J., Almer, S., Tysk, C., O'Morain, C.A., Gassull, M., et al. 2001. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn's disease. Nature 411: 599-603.
The International HapMap Consortium. 2003. The International HapMap Project. Nature 426: 789-796.
Kang, S.J., Gordon, D., and Finch, S.J. 2004. What SNP genotyping errors are most costly for genetic association studies? Genet. Epidemiol. 26: 132-141.
Kirk, K.M. and Cardon, L.R. 2002. The impact of genotyping error on haplotype reconstruction and frequency estimation. Eur. J. Hum. Genet. 10: 616-622.
Kruglyak, L. 1999. Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat. Genet. 22: 139-144.
Matsuzaki, H., Loi, H., Dong, S., Tsai, Y.Y., Fang, J., Law, J., Di, X., Liu, W.M., Yang, G., Liu, G., et al. 2004. Parallel genotyping of over 10,000 SNPs using a one-primer assay on a high-density oligonucleotide array. Genome Res. 14: 414-425.
Oliphant, A., Barker, D.L., Stuelpnagel, J.R., and Chee, M.S. 2002. BeadArray technology: Enabling an accurate, cost-effective approach to high-throughput genotyping. Biotechniques Suppl: 56-58, 60-61.
Risch, N.J. 2000. Searching for genetic determinants in the new millennium. Nature 405: 847-856.
Samiotaki, M., Kwiatkowski, M., Parik, J., and Landegren, U. 1994. Dual-color detection of DNA sequence variants by ligase-mediated analysis. Genomics 20: 238-242.
Shmulewitz, D., Auerbach, S.B., Lehner, T., Blundell, M.L., Winick, J.D., Youngman, L.D., Skilling, V., Heath, S.C., Ott, J., Stoffel, M., et al. 2001. Epidemiology and factor analysis of obesity, type II diabetes, hypertension, and dyslipidemia (syndrome X) on the Island of Kosrae, Federated States of Micronesia. Hum. Hered. 51: 8-19.
Shoemaker, D.D., Lashkari, D.A., Morris, D., Mittmann, M., and Davis, R.W. 1996. Quantitative phenotypic analysis of yeast deletion mutants using a highly parallel molecular bar-coding strategy. Nat. Genet. 14: 450-456.
www.hapmap.org; Chromosome 12 data, HapMap Project Web site.
Received August 24, 2004; accepted in revised format October 14, 2004.