Genome Res. 14:296-300, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00
Methods
Digital Detection of Genetic Mutations Using SPC-Sequencing
1 Laboratory of DNA Sequencing and Chemical Biology, Columbia Genome Center, Columbia University College of Physicians and Surgeons, New York, New York 10032, USA
2 Department of Chemical Engineering, Columbia University, New York, New York 10027, USA
Deletion or insertion mutations lead to a frameshift that causes misalignment between wild-type and mutated allele sequences, making it difficult to identify such mutations unambiguously by using electrophoresis-based DNA sequencing. We have previously established the feasibility of an accurate DNA sequencing method using solid-phase capturable (SPC) dideoxynucleotides and MALDI-TOF mass spectrometry on synthetic templates, an approach we refer to as SPC-sequencing. Here, we report the application of SPC-sequencing in characterizing frameshift mutations by using the detection of the BRCA1 gene mutations 185delAG and 5382insC as examples. In this method, Sanger DNA sequencing fragments are generated in one tube by using biotinylated dideoxynucleotides. The sequencing fragments carrying a biotin moiety at the 3' end are captured on a streptavidin-coated solid phase to eliminate excess primer, primer dimers, and false stops. Only correctly terminated DNA fragments are captured, subsequently released, and analyzed by mass spectrometry to obtain digital DNA sequencing data. This method produces distinct doublet mass peaks at each point in the mass spectrum beyond the mutation site, facilitating the accurate characterization of the mutation. We have compared SPC-sequencing with electrophoresis-based sequencing in characterizing the above BRCA1 mutations, demonstrating the significant advantage offered by SPC-sequencing for the accurate identification of frameshift mutations.
With the completion of the Human Genome Project, many recent advances in genome technology have been focused on the analysis and characterization of genetic variations. The study of genetic mutations has applications in many areas of biological and medical research, including pharmacogenomics (Roses 2000
Common genetic mutations may be broadly classified as either point mutations, such as substitutions, or frameshift mutations caused by insertions or deletions. Single base extension (SBE) has been widely used to detect point mutations. This technique, which extends a primer by only one base using a single dideoxynucleotide terminator, has been very effective in characterizing single nucleotide polymorphisms (SNPs; Chen et al. 1997
Direct DNA sequencing, in theory, is the most accurate technique for mutation detection because it can successfully identify and characterize most sequence variations. In this category, Sanger DNA sequencing using capillary array electrophoresis with laser-induced fluorescence detection (Smith et al. 1986
Recently, matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has been used as an analytical tool for DNA sequencing (Roskey et al. 1996
The principal advantage of SPC-sequencing using biotinylated dideoxynucleotide terminators and MALDI-TOF MS for mutation detection is the highly accurate identification of different bases coexisting at a single position in the DNA, which is a direct consequence of the presence of substitution or frameshift mutations. MALDI-TOF MS analysis displays the sequence data in the form of distinct digital mass peaks. This facilitates a more accurate characterization of frameshift mutations compared with that of other sequencing methods that measure fluorescent or radioactive signals emitted from the labeled DNA. A schematic representation of the SPC-sequencing method for the digital detection of DNA sequences is shown in Figure 1. A wild-type DNA molecule has both copies bearing identical bases at every position in the sequence. Therefore, a mass spectrum resulting from a sequencing reaction from such a template will have a single peak at each position that will yield the identity of the base at that position (Fig. 1, left). However, in the case of a DNA molecule bearing a deletion or insertion in one of its copies, the two alleles will have a relative frameshift between them following the mutation site. Therefore, each position in the DNA sequence after the mutation site may be occupied by two different bases. With SPC-sequencing, every position before the mutation site will show single peaks, as observed in the wild-type data, whereas every position beyond the mutation site will have two distinct peaks corresponding to the masses of the two nucleotides present at that position (Fig. 1, right). By calculating the differences in masses between subsequent peaks, the sequences of the two alleles can be read simultaneously and the mutation site can be unambiguously identified.
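The singlet and doublet pattern described above can be illustrated with a minimal Python sketch. It is a simplified model, not the authors' analysis software: it assumes each extension step adds the standard average residue mass of an unmodified nucleotide and ignores the biotin-linker mass of the terminators, whose values are not given here. The function names (`fragment_masses`, `peak_pattern`, `first_doublet`) and the 0.5 Da tolerance are hypothetical choices made for illustration.

```python
# Average masses (Da) of unmodified DNA nucleotide residues.
# Assumption: the biotin-linker mass of the SPC terminators is ignored.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def fragment_masses(read):
    """Cumulative mass added to the primer after each extension step."""
    masses, total = [], 0.0
    for base in read:
        total += RESIDUE_MASS[base]
        masses.append(round(total, 2))
    return masses


def peak_pattern(allele_1, allele_2, tol=0.5):
    """One list of peak masses per position: 1 entry = singlet, 2 = doublet."""
    pattern = []
    for m1, m2 in zip(fragment_masses(allele_1), fragment_masses(allele_2)):
        if abs(m1 - m2) <= tol:
            pattern.append([m1])  # both alleles give the same fragment mass
        else:
            pattern.append(sorted([m1, m2]))  # alleles are out of register
    return pattern


def first_doublet(pattern):
    """1-based position of the first doublet, or None if all peaks are single."""
    for position, peaks in enumerate(pattern, start=1):
        if len(peaks) == 2:
            return position
    return None


# Two identical copies give singlets only; a frameshift in one copy gives
# doublets from the mutation site onward.
wild_type_pattern = peak_pattern("TCTAAGA", "TCTAAGA")
heterozygous_pattern = peak_pattern("TCTAAGA", "TAAGATT")
print(first_doublet(wild_type_pattern), first_doublet(heterozygous_pattern))
```

In this model the position of the first doublet marks the mutation site. Two fragments of equal length but different base composition differ in mass, which is why the doublets persist downstream of the frameshift.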
We used the SPC-sequencing method to characterize the 185delAG and 5382insC loci in DNA from donors that were known to have the two characteristic BRCA1 mutations, and we compared the results to those obtained by using electrophoresis-based DNA sequencing. The capillary electrophoresis sequencing results for the 185delAG locus are shown in Figure 2A. It can be seen that the sequencing data are clean and well-resolved up to the "T" at position 217, after which the sequencing data become noisy and the sequences assigned by the basecaller are no longer accurate. This phenomenon can be explained by the fact that the existence of a frameshift mutation in one of the two DNA copies in that region causes misalignment between the sequences of the two alleles. Therefore, beyond the mutation site, there are two different sequences superimposed on each other, causing the fluorescence signals to be unresolvable. It might be possible to confirm the presence of a mutation by recognizing this distinct pattern in the sequencing data for a mutant in combination with sequencing data from the reverse direction, but an accurate characterization of the mutation from these data might still prove difficult and tedious. Figure 2B shows the SPC-sequencing results for the same region around the 185delAG mutation site. The first position in the spectrum is occupied by a single large peak corresponding to a "T" in both alleles. This is followed by a doublet peak at the second position in the spectrum, which identifies this position as the mutation site. The subtraction of the mass of the previous "T" peak from the masses of these two peaks establishes their identities as C and A. Similarly, the identities of the two peaks at the next position are confirmed as T and A, and in this manner, the identities of the nucleotides at all subsequent positions in the mass spectrum can be established. Consequently, the two sequences identified from the spectrum are 3'-TCTAAGA... -5' and 3'-TAAGATT... -5'.
The wild-type sequence in this region is known to be 3'-... TCTAAGATTT... -5'. Thus, after comparing the two parallel sequences obtained in the mass spectrum to the known wild-type sequence, it can be unequivocally confirmed that there is a deletion of a C and a T after the first "T" in one of the alleles. These data therefore confirm that the SPC-sequencing method identifies deletions accurately.
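The peak-by-peak subtraction used to read the two 185delAG alleles can be sketched in Python as follows. This is an illustrative model only: it assumes each peak differs from its predecessor on the same allele by one unmodified residue mass, whereas real SPC fragments also carry a biotinylated terminator whose masses are not listed here. The peak list is synthetic, generated from the two reads by the hypothetical helper `synthetic_peaks`, and is not measured data. The pairing step is greedy, so a spectrum in which both pairings are chemically plausible could be misassigned.

```python
# Average masses (Da) of unmodified DNA nucleotide residues.
# Assumption: the biotin-linker mass of the SPC terminators is ignored.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def base_for(delta, tol=0.5):
    """Base whose residue mass lies within tol of delta, else None."""
    for base, mass in RESIDUE_MASS.items():
        if abs(delta - mass) <= tol:
            return base
    return None


def read_two_alleles(peaks_by_position, tol=0.5):
    """Read two allele sequences from a list of singlet/doublet peak masses."""
    seq_a = seq_b = ""
    mass_a = mass_b = 0.0  # masses are relative to the primer
    for peaks in peaks_by_position:
        if len(peaks) == 1:
            pairings = [(peaks[0], peaks[0])]
        else:
            pairings = [(peaks[0], peaks[1]), (peaks[1], peaks[0])]
        for peak_a, peak_b in pairings:
            # Subtract the previous peak of each allele to identify the base.
            base_a = base_for(peak_a - mass_a, tol)
            base_b = base_for(peak_b - mass_b, tol)
            if base_a and base_b:
                seq_a += base_a
                seq_b += base_b
                mass_a, mass_b = peak_a, peak_b
                break
        else:
            raise ValueError(f"no consistent base assignment for peaks {peaks}")
    return seq_a, seq_b


def synthetic_peaks(read_1, read_2, tol=0.5):
    """Hypothetical helper: build a model peak list from two known reads."""
    peaks, m1, m2 = [], 0.0, 0.0
    for b1, b2 in zip(read_1, read_2):
        m1 += RESIDUE_MASS[b1]
        m2 += RESIDUE_MASS[b2]
        peaks.append([m1] if abs(m1 - m2) <= tol else sorted([m1, m2]))
    return peaks


# Round trip on the two reads reported for the 185delAG locus.
print(read_two_alleles(synthetic_peaks("TCTAAGA", "TAAGATT")))
```

Because the smallest difference between two residue masses in this table is about 9 Da (A versus T), a 0.5 Da tolerance leaves no ambiguity in `base_for`; a real instrument would need a tolerance matched to its own mass accuracy.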
The data in Figure 3 demonstrate the efficacy of the SPC-sequencing method in characterizing insertion mutations. The capillary electrophoresis sequencing results for the 5382insC locus are shown in Figure 3A. As in Figure 2A, the data are unintelligible beyond a particular position ("C" at 118) in the fluorescence electropherogram. This phenomenon is again explained by the presence of a frameshift mutation (insertion of a "C"), which causes the sequences of the two alleles to go out of phase with each other, rendering the sequence unreadable beyond the mutation site. Figure 3B shows the SPC-sequencing results for the same region around the 5382insC mutation site. After an analysis similar to that described above for the 185delAG locus, the sequences of the two alleles in this region were identified as 5'-... CCAGGA... -3' and 5'-... CCCAGG... -3'. The wild-type sequence is known to be 5'-... CCAGGA... -3'. Thus, comparing the two sequences to the known wild-type sequence, it is straightforward to conclude that there is an insertion of a "C" in one of the two alleles.
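The comparison of each allele read against the known wild-type sequence, applied to both the deletion and the insertion example, can be sketched as a short Python routine. This is a hypothetical illustration of the reasoning, not software described in this work. The function name `classify_frameshift` and the `max_indel` limit are assumptions, and very short reads can produce chance matches.

```python
def classify_frameshift(wild_type, allele, max_indel=5):
    """Compare one allele read with the wild-type read.

    Returns (kind, offset, bases), where kind is "none", "deletion",
    "insertion", or "unresolved"; offset is the number of leading bases
    shared with the wild type; bases are the deleted or inserted bases.
    """
    shortest = min(len(wild_type), len(allele))
    offset = 0
    while offset < shortest and wild_type[offset] == allele[offset]:
        offset += 1
    if offset == shortest:
        return ("none", offset, "")  # allele matches the wild type throughout

    for k in range(1, max_indel + 1):
        # Deletion of k bases: skipping k wild-type bases restores the match.
        wt_rest, al_rest = wild_type[offset + k:], allele[offset:]
        n = min(len(wt_rest), len(al_rest))
        if n > 0 and wt_rest[:n] == al_rest[:n]:
            return ("deletion", offset, wild_type[offset:offset + k])
        # Insertion of k bases: skipping k allele bases restores the match.
        wt_rest, al_rest = wild_type[offset:], allele[offset + k:]
        n = min(len(wt_rest), len(al_rest))
        if n > 0 and wt_rest[:n] == al_rest[:n]:
            return ("insertion", offset, allele[offset:offset + k])
    return ("unresolved", offset, "")


# 185delAG region, written in the 3'-to-5' order used in the text.
print(classify_frameshift("TCTAAGATTT", "TAAGATT"))
# 5382insC region, written 5'-to-3'.
print(classify_frameshift("CCAGGA", "CCCAGG"))
```

For the 185delAG reads the routine reports a two-base deletion ("CT") after the first shared base, and for the 5382insC reads a one-base insertion ("C"). An insertion inside a run of identical bases cannot be placed uniquely, so the reported offset is simply the first position at which the reads diverge.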
These results show that SPC-sequencing using biotinylated dideoxynucleotides and MALDI-TOF MS is highly accurate in characterizing frameshift mutations. The data acquisition is very rapid, and the sequencing results are clear and easily interpreted. The electrophoresis-based sequencing method, which uses fluorescence detection, often experiences masking of the first few bases by the high-intensity fluorescence signals from the labeled primers or terminators. Thus, the sequencing primers must be designed a few bases away from the mutation site of interest in order to obtain coherent sequencing data for that region. However, the SPC-sequencing method does not face this limitation, because the data are obtained in the form of a mass spectrum with no false signals. Consequently, the sequencing primers can be designed very close to the mutation site, and thus, very few bases need to be sequenced to successfully characterize the mutation. This makes the SPC-sequencing method desirable for mutation analysis, despite the read length being currently limited to <100 bp. With electrophoresis-based fluorescent DNA sequencing methods, the identification of deletions and insertions requires that the sequence data be manually analyzed. In this regard, a significant advantage offered by SPC-sequencing is the potential automation of data analysis facilitated by the digital data output. When using conventional ddNTPs, the smallest mass difference between any two ddNTPs is 9 Da (between A and T), which is difficult to resolve by MALDI-TOF MS (Fei et al. 1998
The Ashkenazi BRCA1 mutations described here are but one example of a multitude of frameshift mutations that exist genomewide and can have a significant contribution to the development of diseases. For example, mutations in the p53 gene play a major role in the development of many cancers (Steele et al. 1998
Primers for PCR and DNA sequencing were obtained from Midland, Inc. The PCR primers were designed by using the Primer3 oligonucleotide design software (Whitehead Institute) to produce DNA amplicons ranging in size from 300-500 bp. The same software was used to design DNA sequencing primers, about one to two base pairs away from the mutation site on the genomic DNA. Jumpstart Red Accutaq LA DNA polymerase, 3-hydroxy-picolinic acid, and ammonium citrate were obtained from Sigma-Aldrich. Thermo Sequenase DNA polymerase and all four deoxyribonucleoside triphosphates (dNTPs) were obtained from Amersham Biosciences. Biotinylated dideoxyadenosine-5'-triphosphate (biotin-11-ddATP), biotinylated dideoxycytidine-5'-triphosphate (biotin-11-ddCTP), and biotinylated dideoxyguanosine-5'-triphosphate (biotin-11-ddGTP) were obtained from Perkin Elmer. Biotinylated dideoxyuridine-5'-triphosphate (biotin-16-ddUTP) was obtained from Enzo Life Sciences. Streptavidin-coated magnetic beads were obtained from Seradyn, Inc.
PCR Amplification
DNA Sequencing Using SPC Biotinylated Dideoxynucleotides and MALDI-TOF MS
DNA Sequencing Using a Fluorescent Capillary-Array DNA Sequencer
This work was supported by the Packard Fellowship for Science and Engineering (J.J.) and the Columbia University Genomics Initiative. The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1344104.
3 Corresponding author.
Andersen, P.S., Jespersgaard, C., Vuust, J., Christiansen, M., and Larsen, L.A. 2003. High-throughput single strand conformation polymorphism mutation detection by automated capillary array electrophoresis: Validation of the method. Human Mutat. 21: 116-122.
Bowling, J.M., Bruner, K.L., Cmarik, J.L., and Tibbetts, C. 1991. Neighboring nucleotide interactions during DNA sequencing gel electrophoresis. Nucleic Acids Res. 19: 3089-3097.
Chen, X., Zehnbauer, B., Gnirke, A., and Kwok, P.-Y. 1997. Fluorescence energy transfer detection as a homogenous DNA diagnostic method. Proc. Natl. Acad. Sci. 94: 10756-10761.
Edwards, J.R., Itagaki, Y., and Ju, J. 2001. DNA sequencing using biotinylated dideoxynucleotides and mass spectrometry. Nucleic Acids Res. 29: e104.
Fakhrai-Rad, H., Pourmand, N., and Ronaghi, M. 2002. Pyrosequencing: An accurate detection platform for single nucleotide polymorphisms. Human Mutat. 19: 479-485.
Fei, Z., Ono, T., and Smith, L.M. 1998. MALDI-TOF mass spectrometric typing of single nucleotide polymorphisms with mass-tagged ddNTPs. Nucleic Acids Res. 26: 2827-2828.
Friedman, L.S., Ostermeyer, E.A., Szabo, C.I., Dowd, P., Lynch, E.D., Rowell, S.E., and King, M.-C. 1994. Confirmation of BRCA1 by analysis of germline mutations linked to breast and ovarian cancer in ten families. Nat. Genet. 8: 399-404.
Fu, D.J., Tang, K., Braun, A., Reuter, D., Darnhofer-Demar, B., Little, D.P., O'Donnell, M.J., Cantor, C.R., and Koster, H. 1998. Sequencing exons 5 to 8 of the p53 gene by MALDI-TOF mass spectrometry. Nat. Biotechnol. 16: 381-384.
Huang, X.C., Quesada, M.A., and Mathies, R.A. 1992. DNA sequencing using capillary array electrophoresis. Anal. Chem. 64: 2149-2154.
Ju, J., Ruan, C., Fuller, C.W., Glazer, A.N., and Mathies, R.A. 1995. Fluorescence energy transfer dye-labeled primers for DNA sequencing and analysis. Proc. Natl. Acad. Sci. 92: 4347-4351.
Kaetzke, A. and Eschrich, K. 2002. Simultaneous determination of different DNA sequences by mass spectrometric evaluation of Sanger sequencing reactions. Nucleic Acids Res. 30: e117.
Kim, S., Edwards, J.R., Deng, L., Chung, W., and Ju, J. 2002. Solid phase capturable dideoxynucleotides for multiplex genotyping using mass spectrometry. Nucleic Acids Res. 30: e85.
Kirpekar, F., Nordhoff, E., Larsen, L.K., Kristiansen, K., Roepstorff, P., and Hillenkamp, F. 1998. DNA sequence analysis by MALDI mass spectrometry. Nucleic Acids Res. 26: 2554-2559.
Kourkine, I.V., Hestekin, C.N., and Barron, A.E. 2002. Technical challenges in applying capillary electrophoresis-single strand conformation polymorphism for routine genetic analysis. Electrophoresis 23: 1375-1385.
Kyriacou, C.P. 2002. Single gene mutations in Drosophila: What can they tell us about the evolution of sexual behaviour? Genetica 116: 197-203.
Li, J., Butler, J.M., Tan, Y., Lin, H., Royer, S., Ohler, L., Shaler, T.A., Hunter, J.M., Pollart, D.J., Monforte, J.A., et al. 1999. Single nucleotide polymorphism determination using primer extension and time-of-flight mass spectrometry. Electrophoresis 20: 1258-1265.
Pirmohamed, M. and Park, B.K. 2001. Genetic susceptibility to adverse drug reactions. Trends Pharmacol. Sci. 22: 298-305.
Prober, J., Trainor, G., Dam, R., Hobbs, F., Robertson, C., Zagursky, R., Cocuzza, A., Jensen, M., and Baumeister, K. 1987. A system for rapid DNA sequencing with fluorescent chain-terminating dideoxynucleotides. Science 238: 336-341.
Reynolds, R., Sensabaugh, G., and Blake, E. 1991. Analysis of genetic markers in forensic DNA samples using the polymerase chain reaction. Anal. Chem. 63: 2-15.
Ronaghi, M., Uhlén, M., and Nyrén, P. 1998. A sequencing method based on real-time pyrophosphate. Science 281: 363-365.
Roses, A.D. 2000. Pharmacogenetics and the practice of medicine. Nature 405: 857-865.
Roskey, M.T., Juhasz, P., Smirnov, I.P., Takach, E.J., Martin, S.A., and Haff, L.A. 1996. DNA sequencing by delayed extraction-matrix-assisted laser desorption/ionization time of flight mass spectrometry. Proc. Natl. Acad. Sci. 93: 4724-4729.
Ross, P., Hall, L., Smirnov, I., and Haff, L. 1998. High level multiplex genotyping by MALDI-TOF mass spectrometry. Nat. Biotechnol. 16: 1347-1351.
Smith, L.M., Sanders, J.Z., Kaiser, R.J., Hughes, P., Dodd, C., Connell, C.R., Heiner, C., Kent, S.B.H., and Hood, L.E. 1986. Fluorescence detection in automated DNA sequencing analysis. Nature 321: 674-679.
Steele, R.J.C., Thompson, A.M., Hall, P.A., and Lane, D.P. 1998. The p53 tumour suppressor gene. Br. J. Surg. 85: 1460-1467.
Stickney, H.L., Schmutz, J., Woods, I.G., Holtzer, C.C., Dickson, M.C., Kelly, P.D., Myers, R.M., and Talbot, W.S. 2002. Rapid mapping of zebrafish mutations with SNPs and oligonucleotide microarrays. Genome Res. 12: 1929-1934.
Taranenko, N.I., Allman, S.L., Golovlev, V.V., Taranenko, N.V., Isola, N.R., and Chen, C.H. 1998. Sequencing DNA using mass spectrometry for ladder detection. Nucleic Acids Res. 26: 2488-2490.
Tchernitchko, D., Lamoril, J., Puy, H., Robreau, A.M., Bogard, C., Rosipal, R., Gouya, L., Deybach, J.C., and Nordmann, Y. 1999. Evaluation of mutation screening by heteroduplex analysis in acute intermittent porphyria: Comparison with denaturing gradient gel electrophoresis. Clinica Chimica Acta 279: 133-143.
Received March 18, 2003; accepted in revised form November 24, 2003.