METHODS
DNA Analysis by Fluorescence Quenching Detection
¹Cardiovascular Research Institute and the ²Department of Dermatology, University of California, San Francisco, California 94143, USA
The analysis of human genetic variations such as single nucleotide polymorphisms (SNPs) has important applications in genome-wide association studies of complex genetic traits. We have developed an SNP genotyping method based on the primer extension assay with fluorescence quenching as the detection method. The template-directed dye-terminator incorporation with fluorescence quenching detection (FQ-TDI) assay is based on the observation that the intensity of fluorescent dye R110- and R6G-labeled acycloterminators is universally quenched once they are incorporated onto a DNA oligonucleotide primer. By comparing the rate of fluorescence quenching of the two allelic dyes in real time, we have extended this method to allele frequency estimation of SNPs in pooled DNA samples. The kinetic FQ-TDI assay is highly accurate and reproducible both in genotyping and in allele frequency estimation. Allele frequencies estimated by the kinetic FQ-TDI assay correlated well with known allele frequencies, with an r² value of 0.993. Applying this strategy to large-scale studies will greatly reduce the time and cost of genotyping hundreds or thousands of SNP markers in affected and control populations.
Single nucleotide polymorphisms (SNPs) are the most abundant sequence variation found in nature. When two human genomes are compared, single base-pair variations are found at approximately 1200-nucleotide intervals. Because of their abundance and low mutation rate, SNPs are the markers of choice in association studies to identify the genetic risk factors in common diseases (Risch and Merikangas 1996).
To reduce the time and cost associated with genotyping every individual in a study, it has been proposed that investigators work with pooled DNA samples constructed by mixing equal amounts of DNA from groups of individuals. The rationale is that, by definition, most of the SNPs in the genome are not associated with a disease phenotype, and so a cost-effective screening method that can identify significant divergence in allele frequency between case and control populations will not only shorten the time it takes to conduct an association study but also reduce the cost of the study substantially. A number of methods have been developed to estimate allele frequencies in pooled samples (Kwok et al. 1994).
We have developed a genotyping assay based on allele-specific primer extension, termed the template-directed dye-terminator incorporation (TDI) assay with fluorescence quenching (FQ) detection. The FQ-TDI assay takes advantage of the fact that the fluorescence intensity of certain dyes is significantly quenched when they are attached to oligonucleotides. By monitoring changes in fluorescence intensity of dye-terminators in an allele-specific primer extension reaction, one can determine the genotype of a DNA sample. If one monitors the fluorescence intensity change in real time, the relative abundance of the alleles in a pooled sample can be determined accurately. We report here that the simple, extremely low-cost FQ-TDI assay yielded highly accurate genotypes (220 SNPs) and allele frequency estimates (six SNPs).
SNP Genotyping by the FQ-TDI Assay
The fluorescence of some conjugated fluorescent dyes is sensitive to the local environment of the oligonucleotides. The fluorescence quenching due to dye-nucleotide interactions has been reported for many different fluorescent dyes (Torimura et al. 2001).
The quenching pattern of the R6G-acycloterminator is shifted somewhat lower than that of the R110-acycloterminator, with 88% of the 258 SNP primers tested quenching the R6G fluorescence by more than 20%. Tamra- and Texas Red-acycloterminators were quenched even less, with only 65% of the 667 SNP primers tested quenching the Tamra-acycloterminator by at least 20% and none of the 50 SNP primers tested quenching the Texas Red-acycloterminator by more than 20%. Because at least 88% of SNP primers show significant quenching effects (>20%) on R110- and R6G-acycloterminators, one can monitor the changes in fluorescence intensities during the primer extension step to determine which terminator(s) are incorporated and infer the SNP genotype. This can be done in real time with a fluorescence spectrophotometer connected to a thermal cycler or at the end-point with a fluorescence plate reader. Figure 2 shows the real-time fluorescence intensity profiles of four representative samples tested for SNP marker rs154162 during linear amplification in the primer extension step of the TDI assay. The fluorescence readings correspond to the emission maxima for R110-G (525 nm) and R6G-A (535 nm) acycloterminators, normalized by multicomponent analysis. In Figure 2A, the G/G homozygous sample incorporates R110-G (but not R6G-A) and shows a progressive drop in R110 fluorescence (filled circles) but no change in R6G fluorescence (open circles) as thermal cycling proceeds. The homozygous A/A sample in Figure 2B incorporates R6G-A (but not R110-G) and shows a drop in R6G fluorescence but no change in R110 fluorescence. The heterozygous G/A sample in Figure 2C incorporates both R110-G and R6G-A and shows a drop in both R110 and R6G fluorescence. In contrast, the fluorescence intensity profile of a negative control sample (Fig. 2D) shows no change for either dye, because neither is incorporated.
The rate of change of the R110 and R6G fluorescence intensities reflects the amounts of the alleles present, with the initial slope of change for the heterozygote approximately half of that for the homozygote. By inspecting the changes in fluorescence intensity for R110 and R6G, one can assign the allelic status of each test sample with high confidence. We obtained 1920 genotypes across 258 SNPs with a conversion rate of 85% (220 SNPs were successfully genotyped, and >90% of the samples yielded high confidence genotype calls). Among the successfully genotyped SNPs, the average high confidence call rate was 93%, and the concordance between genotypes called by the FQ-TDI and FP-TDI assays was 99.8%.
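The genotype assignment described above (a drop in R110 only, R6G only, both, or neither) can be sketched as a simple decision rule. This is a minimal illustration for a marker such as rs154162, where R110 labels the G allele and R6G the A allele. The function name and the 20% cutoff are assumptions made for the example (the cutoff borrows the ">20%" figure used above for significant quenching); they are not the calling criteria used in the study.

```python
def call_genotype(r110_drop, r6g_drop, threshold=0.20):
    """Assign a genotype from fractional drops in normalized fluorescence.

    r110_drop, r6g_drop: fraction of the starting intensity lost by each dye
    (0.0 = no change, 1.0 = fully quenched).
    threshold: hypothetical cutoff for calling a dye "incorporated".
    """
    r110_incorporated = r110_drop >= threshold  # G allele present
    r6g_incorporated = r6g_drop >= threshold    # A allele present

    if r110_incorporated and r6g_incorporated:
        return "G/A"
    if r110_incorporated:
        return "G/G"
    if r6g_incorporated:
        return "A/A"
    return "no call"  # neither dye changed, as in a negative control


# Illustrative values only, not measurements from the study.
print(call_genotype(0.45, 0.02))
print(call_genotype(0.30, 0.28))
```

A production caller would more likely cluster samples on both dye signals rather than apply a fixed cutoff, since the size of the drop varies from primer to primer.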
Allele Frequency Determination in Pooled DNA Samples
The FQ-TDI assay can be extended to determine the allele frequency of pooled DNA samples. Because fixed amounts of dye terminators are used in the primer extension reaction, the larger the number of target DNA molecules containing a particular allele that is present, the faster the dye terminators are used up. The initial rate of change in fluorescence intensity (the steepness of the slope) of a dye-terminator is therefore a reflection of the amount of DNA molecule containing the allele corresponding to the dye-terminator incorporated. Assuming that the DNA polymerase incorporates the two dye-terminators with equal efficiency, the rate of change in fluorescence intensity of both dyes will be the same when the allele frequency is 50% or when the sample is from a heterozygote. To measure the allele frequencies of an SNP in a pooled DNA sample, the incorporation rates of the two dye-terminators are compared kinetically, cycle by cycle. These incorporation rates are then used in determining the relative amounts of each allele present in the pooled samples, using a heterozygote as the reference. Figure 3 illustrates a typical reaction performed to determine allele frequency for pooled DNA samples. The top panel is a normalized real-time quenching curve for a heterozygous sample of rs922365 (see Methods section for normalization procedure). The linear regressions of the first eight cycles are shown in the inset, and the ratio of the two slopes is calculated as 0.93 (R6G/R110), which indicates that the AcycloPol enzyme incorporates R110-G slightly faster than R6G-A. The allele frequencies are obtained by comparing the ratios of the slopes for the pooled sample in the bottom panel of Figure 3 and the ratio of the slopes for the heterozygous sample (see Methods).
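The comparison of slope ratios can be sketched as follows. The sketch assumes that each dye's initial slope is proportional to the frequency of its allele times a dye-specific constant, so that dividing the pooled-sample slope ratio by the heterozygote slope ratio cancels the dye-specific constants. Function names and the synthetic readings are hypothetical, and the normalization procedure from the Methods is not reproduced here.

```python
def initial_slope(readings, n_cycles=8):
    """Least-squares slope of intensity versus cycle number over the first n_cycles."""
    ys = list(readings[:n_cycles])
    n = len(ys)
    xs = list(range(1, n + 1))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def allele_frequency(pool_dye1, pool_dye2, het_dye1, het_dye2, n_cycles=8):
    """Estimate the frequency of the allele reported by dye 1 in a pooled sample.

    Assumes slope_i is proportional to (allele frequency_i) * (dye constant_i);
    the heterozygote (frequency 0.5 for both alleles) supplies the ratio of
    the dye constants.
    """
    pool_ratio = initial_slope(pool_dye1, n_cycles) / initial_slope(pool_dye2, n_cycles)
    het_ratio = initial_slope(het_dye1, n_cycles) / initial_slope(het_dye2, n_cycles)
    odds = pool_ratio / het_ratio  # equals p / (1 - p) under the stated assumption
    return odds / (1.0 + odds)


# Synthetic, perfectly linear curves for illustration (not data from the study).
cycles = range(1, 11)
het_dye1 = [1.0 - 0.0500 * c for c in cycles]
het_dye2 = [1.0 - 0.0465 * c for c in cycles]   # R6G/R110 slope ratio of 0.93
pool_dye1 = [1.0 - 0.0800 * c for c in cycles]  # allele 1 at 80%
pool_dye2 = [1.0 - 0.0186 * c for c in cycles]  # allele 2 at 20%

estimate = allele_frequency(pool_dye1, pool_dye2, het_dye1, het_dye2)
```

Restricting the regression to the first eight cycles follows the description above; the cutoff is exposed as a parameter because the linear range may differ under other reaction conditions.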
To demonstrate the validity of the allele frequency estimation approach, we performed two sets of experiments. First, we constructed DNA mixtures containing varying amounts of the two alleles for the SNP marker rs922365 using DNA from two individuals homozygous for the two alleles. Mixtures with allele frequencies in 5% steps were constructed from 5% to 95%. The calculated allele frequencies were strongly correlated with predetermined allele frequencies (Table 1). The calculations were based on the assumption that the relative terminator incorporation efficiency was 0.93 and the total quenching ratio was 0.4/0.55 for the two acycloterminators. The data show that this method can clearly distinguish between pools with allele frequencies that differ by less than 5% just by visual inspection. When plotted against the known allele frequencies of the mixtures, the calculated initial slope ratios fit well with a hyperbolic curve represented by the equation y = 1.02/(1 + 0.66x) (Fig. 4). This curve agrees well with the predicted equation of y = 1.0/(1 + 0.67x) used for allele frequency estimation. The results of this experiment confirm that the allele frequencies can be estimated accurately, with only slightly higher standard error at the ends of the allele frequency spectrum (the largest standard deviation is 2.8% for the 95% allele frequency mixture).
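The coefficient in the predicted curve appears to match the product of the two stated assumptions, 0.93 × (0.4/0.55) ≈ 0.676, which the text reports as 0.67. The sketch below checks that arithmetic and evaluates both curves side by side. It assumes x is the initial slope ratio and y the allele frequency of one allele, a reading that the text does not spell out; the function and constant names are hypothetical.

```python
RELATIVE_INCORPORATION = 0.93    # R6G/R110 slope ratio in the heterozygote
TOTAL_QUENCHING_RATIO = 0.4 / 0.55

# Product of the two stated assumptions; about 0.676.
K_PREDICTED = RELATIVE_INCORPORATION * TOTAL_QUENCHING_RATIO


def hyperbola(x, a=1.0, b=K_PREDICTED):
    """Evaluate y = a / (1 + b * x)."""
    return a / (1.0 + b * x)


def predicted_vs_fitted(x):
    """Return (predicted, fitted) values using the coefficients quoted in the text."""
    return hyperbola(x, 1.0, 0.67), hyperbola(x, 1.02, 0.66)


# Largest gap between the two curves over an illustrative range of slope ratios.
xs = [i / 10 for i in range(0, 101)]
max_gap = max(abs(p - f) for p, f in (predicted_vs_fitted(x) for x in xs))
```

Over this range the two curves stay within a few percentage points of each other, which is the sense in which the fitted curve agrees with the predicted one.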
The second set of experiments was done to compare the estimated allele frequencies in 21 SNP-pooled sample combinations obtained by the kinetic FQ-TDI assay against the allele frequencies obtained by genotyping the individuals in these pools using the FP-TDI assay (Chen et al. 1999).
Fluorescence quenching is a well-known phenomenon and is part of the detection method in several SNP genotyping assays, including molecular beacons, the 5'-nuclease (TaqMan) assay, and the Invader assay. Instead of using a special quencher moiety, our approach takes advantage of the inherent quenching properties of DNA. DNA does not quench all dyes effectively, however, and some DNA sequences do not quench very well. Of the fluorescent dyes tested, we have found that R110 and R6G, two commonly used dyes in nucleic acid labeling, are strongly quenched by DNA. We have also observed that >70% of the quenching is due to the primary structure of DNA and that effective quenching is always seen when there is at least one guanosine within 10 bases of the dye-terminator after it is incorporated (data not shown). Furthermore, the strongest quenching is observed when two consecutive guanosines are found immediately adjacent to the incorporated dye-terminator. Our results confirm the important role guanosine plays as a quencher of fluorescence. As the base with the highest electron donating property, guanosine promotes the formation of charge transfer complexes between the fluorophores and nucleosides (Seidel et al. 1996).
A feature of the FQ-TDI assay is that the dye-terminators are used up during the course of the reaction and quenching is complete when this occurs. Therefore, the initial incorporation rate of dye-terminators is the best indicator of the amount of DNA template present in the pooled sample. We only use the fluorescence readings during the first eight cycles of the primer extension reaction in calculating the initial slope of fluorescence intensity change because, under the standard conditions used, dye-terminator incorporation during the first eight cycles is linear for all the assays studied. Moreover, mis-incorporation of dye-terminators is minimal during the first eight cycles.
The data from the mixing experiments show that the FQ-TDI assay works well over the entire allele frequency spectrum, although slightly larger variations of the allele frequency are seen at both ends of the spectrum. This is largely due to the fact that we used equal amounts of R110- and R6G-acycloterminators for the primer extension reaction. For the allele with >90% allele frequency, the limited terminators will be used up too quickly to be sensitive to small allele frequency differences. The inaccuracy of estimating the slope of a flat line also contributes to the problem for the allele at <10% allele frequency. Instead of using equal amounts of two terminators, a 2:1 ratio of terminators (allele with high allele frequency vs. allele with low allele frequency) can be used to change the slopes and therefore improve the sensitivity for pools with extreme allele frequencies (data not shown).
In summary, we have developed an allele frequency estimation strategy that is both accurate and cost-effective. Without any dye-labeled probes used in the assay, the development cost is minimal. Furthermore, designs for the kinetic FQ-TDI assay using universal assay conditions have been made for >1.6 million publicly available SNPs. This low-cost assay is as accurate as any other currently available allele frequency estimation method. It is a viable approach to screen populations for small allele frequency differences in genome-wide case-control studies, thereby reducing the time and cost associated with these large-scale studies.
DNA Samples and Pool Construction
DNA samples of 96 anonymous individuals were obtained from the Coriell Institute for Medical Research. The pooled DNA samples included 32 individuals each from the African-American, Asian-American, and European-American panels. The concentration of individual DNA samples was determined by using both the absorbance at 260 nm and a DNA-specific fluorescence dye, PicoGreen (Molecular Probes). The population pool samples were constructed by adding equal amounts (300 ng) of DNA from each of the 32 individuals to yield a pooled sample containing a final DNA concentration of 10 ng/µL. The standard allele frequency samples were constructed by mixing two homozygous DNAs in various ratios.
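The pool arithmetic implied by these figures can be worked through in a few lines. The sketch uses only the numbers stated above (32 individuals, 300 ng each, 10 ng/µL final concentration); the variable names are illustrative, and the allele-frequency step size assumes an autosomal SNP with two allele copies per individual.

```python
N_INDIVIDUALS = 32
DNA_PER_INDIVIDUAL_NG = 300
FINAL_CONCENTRATION_NG_PER_UL = 10

# Total DNA in one population pool and the final volume needed to reach 10 ng/µL.
total_dna_ng = N_INDIVIDUALS * DNA_PER_INDIVIDUAL_NG
final_volume_ul = total_dna_ng / FINAL_CONCENTRATION_NG_PER_UL

# Each diploid individual contributes two allele copies at an autosomal SNP,
# so one allele copy changes the pool allele frequency by 1 / (2 * N).
allele_copies = 2 * N_INDIVIDUALS
frequency_step = 1 / allele_copies

print(total_dna_ng, final_volume_ul, frequency_step)
```

The step size of 1/64 (about 1.6%) is the smallest allele frequency difference that can exist between two pools of this size, which gives a sense of the resolution the assay needs.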
SNP Markers
Assay Reagents
PCR Amplification, and Degradation of Excess PCR Primers and dNTPs
Primer Extension Reaction
Theory of FQ Detection
Initially, all dye-terminators are unincorporated and the intensity is I_f, because f_f = 1 and f_b = 0. When the reaction is driven to completion, all dye-terminators are incorporated into primer and the observed intensity becomes I_b, with f_f = 0 and f_b = 1. The fractions of free and bound dye-terminators change in opposite directions during the primer extension reaction, and the rate of this change depends on the amounts of the DNA template in the reaction. In general, the rate of dye-terminator incorporation is determined by two factors: the amount of starting materials (PCR-amplified DNA fragments, SNP primers, and dye-terminators), and the incorporation efficiency of the dye-terminator by the DNA polymerase. Assuming that there are X copies of PCR-amplified fragments, then there are pX copies with allele 1 and (1 − p)X copies with allele 2 (where p is the allele frequency of allele 1). Further assume that the two dye-terminators are present in equal amounts (Y molecules each). Because the SNP primers and the dye-terminators are in vast excess over the PCR products, dye-terminator incorporation is linear during initial cycles. As the dye-terminators and SNP primer are being consumed, dye-terminator incorporation becomes nonlinear, and the decrease of intensity observed reaches a plateau when dye terminators are all incorporated into primers.
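The two limiting cases above (intensity I_f when all dye is free, I_b when all dye is bound) are consistent with a simple two-state model in which the observed intensity is the fraction-weighted sum of the free and bound contributions. The sketch below assumes that linear model and f_f + f_b = 1; the function names and the numerical intensities are hypothetical.

```python
def observed_intensity(frac_bound, i_free, i_bound):
    """Intensity of a mixture of free and primer-bound dye-terminator.

    Assumes a linear two-state model: I = f_f * I_f + f_b * I_b, with f_f + f_b = 1.
    """
    frac_free = 1.0 - frac_bound
    return frac_free * i_free + frac_bound * i_bound


def fraction_bound(intensity, i_free, i_bound):
    """Invert the model: fraction of dye-terminator incorporated so far."""
    return (i_free - intensity) / (i_free - i_bound)


# Illustrative values: normalized free intensity 1.0, fully quenched intensity 0.45.
I_FREE, I_BOUND = 1.0, 0.45
start = observed_intensity(0.0, I_FREE, I_BOUND)    # all dye free
finish = observed_intensity(1.0, I_FREE, I_BOUND)   # all dye incorporated
midway = fraction_bound(observed_intensity(0.3, I_FREE, I_BOUND), I_FREE, I_BOUND)
```

Under this model the drop in intensity, I_f − I, is directly proportional to the fraction of dye-terminator incorporated, which is why a plot of that drop against cycle number tracks incorporation.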
During the first cycle of the primer extension reaction, each PCR-amplified fragment has one SNP primer hybridized to it. If every hybridized SNP primer gets extended, then pX and (1 − p)X dye-terminators are incorporated onto annealed SNP primers for the two dye-terminators, respectively. But in reality, neither the hybridization nor the primer extension reaction is 100% efficient. Dissimilar incorporation efficiencies of the dye-terminators will result in uneven incorporation of the two dye-terminators (Haff and Smirnov 1997).
Although taking just one intensity reading during the early phase of the primer extension reaction will yield a reasonable allele frequency estimate, the fact that this approach takes only one measurement during the presumed linear phase of the reaction makes it highly susceptible to random errors. Instead of taking the ratio of equations (2) and (3) directly, however, one can monitor the intensity at the end of every cycle and plot (I_f − I) against cycle n for each of the two dye-terminators. Two straight lines are obtained for the initial linear incorporation stage, and the slopes of the lines will be p(X/Y)V_1 for allele 1 and (1 − p)(X/Y)V_2 for allele 2. Because X, Y, V_1, and V_2 remain constant for the same reaction, the ratio of the two slopes becomes equation (4):
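Taking the ratio of the two slopes stated above, the shared factor X/Y cancels. The block below is a sketch of the resulting relation under two assumptions: allele 1 is taken as the numerator, and V_1 and V_2 are the only dye-specific factors in the slopes. It is derived from the slope expressions in the text rather than copied from the published equation.

```latex
R \;=\; \frac{\text{slope}_1}{\text{slope}_2}
  \;=\; \frac{p\,(X/Y)\,V_1}{(1-p)\,(X/Y)\,V_2}
  \;=\; \frac{p}{1-p}\cdot\frac{V_1}{V_2}
\qquad\Longrightarrow\qquad
p \;=\; \frac{R}{R + V_1/V_2}
```

For a heterozygote, p = 0.5 and the slope ratio reduces to V_1/V_2, so a heterozygous sample run under the same conditions supplies the constant needed to solve for p in a pooled sample.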
Data Analysis
http://lifesciences.perkinelmer.com/products/snp.asp; Perkin-Elmer Web site.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
³Corresponding author. E-MAIL kwok{at}cvrimail.ucsf.edu; FAX (415) 476-2283. Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.987803.
Received November 12, 2002; accepted in revised form March 4, 2003. 13:932-939 © 2003 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/03 $5.00