Published online before print
September 16, 2005, 10.1101/gr.4221805 Genome Res. 15:1447-1450, 2005 ©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05 $5.00
Methods
Simple, robust methods for high-throughput nanoliter-scale DNA sequencing
1 Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, Canada V5Z 4S6; 2 Department of Physics and Astronomy, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z1
We have developed high-throughput DNA sequencing methods that generate high quality data from reactions as small as 400 nL, providing an approximate order of magnitude reduction in reagent use relative to standard protocols. Sequencing of clones from plasmid, fosmid, and BAC libraries yielded read lengths (PHRED20 bases) of 765 ± 172 (n = 10,272), 621 ± 201 (n = 1824), and 647 ± 189 (n = 568), respectively. Implementation of these procedures at high-throughput genome centers could have a substantial impact on the amount of data that can be generated per unit cost.
Between February 2004 and February 2005 the NCBI Trace Archive (http://www.ncbi.nlm.nih.gov/Traces
Here we describe how testing and implementation of (1) a novel nanoliter-scale reaction vessel configuration, (2) submicroliter positive-pressure microsolenoid-based liquid-dispensing robotics, and (3) optimized sequencing reaction chemistry, thermocycling conditions, and capillary electrophoresis injection parameters have allowed us to generate high-throughput sequence data of quality equal to or greater than that of standard methods, with a substantial reduction in reagent consumption. All materials we describe are commercially available, so with appropriate attention to detail the process can be readily implemented at other facilities.
Validation tests were performed on 16 different libraries constructed using a variety of common low-, medium-, and high-copy number vectors (Table 1).
For plasmid libraries, mean PHRED20 read lengths in excess of 750 bp were achieved (Fig. 1A). For library 10790, which is a mock library in which every well contains the same human cDNA clone, the average read length approached 900 bp, and the longest read was 972 bp (Supplemental Fig. 1). These read lengths were comparable to those achieved with our standard DNA sequence production pipeline that uses 540 nL Big Dye Terminator Premix V3.1 (Applied Biosystems) (Table 1), and comparable to those typically submitted to the NCBI trace archive by other high-throughput centers. For all templates that were sequenced in this study, excluding the 4% failed cultures, the average sequencing success rate (samples with a PHRED20 read length of at least 100 bp) was 97%. Although all data in the present study are single-end reads, when applied to paired-end sequencing this method would be expected to give a successful paired-end rate of 0.97 x 0.97, or 94%. After sequencing these libraries using 31.25 nL of Big Dye terminators per well and obtaining adequate read length and signal strength (Table 1), we investigated the absolute lower limit for Big Dye terminator consumption. Ninety-six identical clones from library 10790 were sequenced using either 15.63 nL, 7.81 nL, or 3.13 nL of Big Dye terminators in a 400-nL total reaction that contained 20 nL of 15X reaction buffer; 1 nL, 0.5 nL, or 0.2 nL, respectively, of 100 µM -21 M13 forward primer; and an appropriate volume of Ultrapure water (Invitrogen). Mean PHRED20 read lengths obtained were 871 ± 97, 684 ± 92, and 0 bases, with maximum PHRED20 read lengths of 887, 703, and 0 bases, respectively (Fig. 1B), suggesting that the limiting volume of dye terminators in this system lies between 3.13 and 7.81 nL.
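The success criterion and the paired-end estimate above can be sketched in a few lines of Python. This is only an illustration: it assumes that a PHRED20 read length is the count of bases with a quality value of at least 20, and the function names are hypothetical rather than part of the PHRED package.

```python
def phred20_read_length(qualities, threshold=20):
    """Count bases whose quality value is at least `threshold`.

    Assumption: "PHRED20 read length" means the number of bases
    with quality >= 20 in a single read.
    """
    return sum(1 for q in qualities if q >= threshold)


def is_successful(qualities, min_length=100):
    """Apply the success criterion from the text: >= 100 PHRED20 bases."""
    return phred20_read_length(qualities) >= min_length


def paired_end_success_rate(single_end_rate):
    """Expected paired-end success if both reads succeed independently."""
    return single_end_rate * single_end_rate


if __name__ == "__main__":
    # 0.97 x 0.97 = 0.9409, reported in the text as 94%.
    print(f"{paired_end_success_rate(0.97):.4f}")
```

The paired-end figure relies on the two reads of a clone succeeding or failing independently, which is the same simplification the text makes.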
We expect that given the simplicity of the platform we describe here, and its foundation firmly within the time-tested Sanger sequencing paradigm, it will be easily implemented by any center engaged in moderate- to high-throughput DNA sequencing. The chemistry is robust for all vector types (plasmids, fosmids, BACs) typically utilized for high-throughput sequencing. While reagent, equipment, and labor costs will vary among high-throughput sequencing platforms at different institutions, implementation of these methods at our center reduces the cost of sequencing reactions by
A flow diagram summarizing the basic process is presented in Supplemental Figure 2. Template DNA from multiple libraries constructed with different vector types (Table 1) was prepared as previously described (Yang et al. 2005
After careful consideration (see Supplemental material) we determined that the most practical method for delivering DNA template to a submicroliter reaction was to transfer a relatively large volume of dilute template using standard laboratory robotics, desiccate it, and then resuspend it in an appropriately small volume (200-400 nL) of sequencing reaction master mix. Using a Biomek FX (Beckman-Coulter), plasmid DNA was diluted 10-fold in Ultrapure water (Invitrogen), and 2 µL ( Subsequent to template transfer, 400 nL of sequencing reaction mix containing 31.25 nL Big Dye Terminator Premix V3.1 (Applied Biosystems), 40 nL of a custom formulation of reaction buffer (25X Reaction Buffer [2 M Trizma Base, 50 mM MgCl2·6H2O] combined with an equal volume of 5X Big Dye Terminator Reaction Buffer V3.1 [Applied Biosystems]), 2 nL of 100 µM primer (Invitrogen), and 326.75 nL Ultrapure Water (Invitrogen) was added to each well using the Aurora Discoveries Flying Reagent Dispenser. This instrument was developed for high-throughput screening in the pharmaceutical industry and has not previously been applied to high-throughput sequencing (Supplemental material). Plates were sealed with SPRI Plug Low Volume lids (Agencourt Bioscience) to reduce residual air space in the wells and thereby reduce sample loss by evaporation. The principal limiting factor for volume reduction of cycle sequencing has been loss of fluid from the sample due to elevated temperatures and rapid temperature changes (please refer to Supplemental Online Material for theoretical consideration of factors contributing to fluid loss and measurements of fluid loss using various types of plate seals). Agencourt SPRI lids recently became commercially available. Although the SPRI lids add cost to the process, they may be reused for up to 10 thermal cycling reactions, and their cost is offset by the savings achieved through the reduced volume of sequencing reaction mix.
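As a quick check on the recipe above, the per-well component volumes can be summed and scaled to a batch. The following is a minimal sketch: the component volumes come from the text, while the plate size (384 wells) and the 10% dispensing overage are hypothetical values chosen only for illustration.

```python
# Per-well volumes in nL, as listed in the text.
RECIPE_NL = {
    "big_dye_terminator_premix_v3.1": 31.25,
    "custom_reaction_buffer": 40.0,
    "primer_100uM": 2.0,
    "ultrapure_water": 326.75,
}


def total_volume_nl(recipe):
    """Sum of all component volumes for one well (nL)."""
    return sum(recipe.values())


def final_primer_concentration_um(recipe, stock_um=100.0):
    """Primer concentration in the assembled reaction (µM)."""
    return stock_um * recipe["primer_100uM"] / total_volume_nl(recipe)


def batch_volumes_ul(recipe, wells, overage=0.10):
    """Scale the per-well recipe to `wells` reactions, in µL.

    `overage` is a hypothetical allowance for dead volume;
    it is not a figure taken from the paper.
    """
    factor = wells * (1.0 + overage) / 1000.0  # nL -> µL
    return {name: vol * factor for name, vol in recipe.items()}


if __name__ == "__main__":
    print(total_volume_nl(RECIPE_NL))                # 400.0 nL per well
    print(final_primer_concentration_um(RECIPE_NL))  # 0.5 µM
    for name, vol in batch_volumes_ul(RECIPE_NL, wells=384).items():
        print(f"{name}: {vol:.1f} µL")
```

The four components sum to the stated 400 nL, and 2 nL of 100 µM primer in that volume corresponds to a final primer concentration of 0.5 µM.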
Sealed plates were thermocycled using Tetrad peltier thermal cyclers (MJ Research). Thermal cycling conditions for the reaction format presented here were 50 x (96°C, 10 sec; 43°C, 5 sec; 60°C, 240 sec) with ramping rates of 1°C/sec. The optimal number of cycles was not evaluated, and fewer than 50 cycles may be adequate. Unincorporated nucleotides were removed from the sequence reactions by ethanol/EDTA precipitation as described in Yang et al. (2005
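The thermal cycling program above implies a run time of roughly five hours, which can be estimated as follows. This is a rough sketch under stated assumptions: ramps are taken as linear at exactly 1°C/sec, the return ramp from 60°C to 96°C is counted in every cycle (including the last), and initial heating and any final hold are ignored.

```python
# (temperature in °C, hold time in seconds) for each step of one cycle.
STEPS = [(96.0, 10), (43.0, 5), (60.0, 240)]
CYCLES = 50
RAMP_RATE_C_PER_SEC = 1.0


def cycle_seconds(steps, ramp_rate):
    """Hold time plus ramp time for one full cycle.

    Assumes linear ramps between consecutive steps and a ramp
    from the last step back to the first.
    """
    hold = sum(seconds for _, seconds in steps)
    temps = [temp for temp, _ in steps]
    ramp_degrees = sum(
        abs(temps[(i + 1) % len(temps)] - temps[i]) for i in range(len(temps))
    )
    return hold + ramp_degrees / ramp_rate


def program_hours(steps, cycles, ramp_rate):
    """Approximate total program duration in hours."""
    return cycles * cycle_seconds(steps, ramp_rate) / 3600.0


if __name__ == "__main__":
    print(cycle_seconds(STEPS, RAMP_RATE_C_PER_SEC))  # 361.0 s per cycle
    print(f"{program_hours(STEPS, CYCLES, RAMP_RATE_C_PER_SEC):.2f} h")
```

Under these assumptions each cycle takes 255 sec of holds plus 106 sec of ramping, so the 240-sec extension step dominates the run time, and reducing the number of cycles would shorten the program proportionally.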
Purified sequence reaction products were resuspended overnight at 4°C (to maximize recovery of purified reaction products) in 10 µL Ultrapure Water (Invitrogen) and sequenced on one of seven 3730xl DNA Analyzers (Applied Biosystems) using 50-cm capillaries and POP-7 polymer (Applied Biosystems). Because the sequence reaction products from our nanoliter-scale reactions were resuspended in the same volume as our standard 4-µL reactions, the labeled DNA was at lower concentration. As such, it was necessary to optimize the electrokinetic injection parameters such that a sufficient amount of labeled reaction products was injected in each capillary electrophoresis run. For the standard Applied Biosystems 3730xl run module, it was empirically determined that doubling the injection time (to 30 sec) but keeping the injection voltage, run time, and run voltage the same (1.5 kV, 5640 sec, and 8.5 kV, respectively) loaded sufficient material into the capillaries of the sequencers for generation of equivalent read lengths to our standard 4-µL reactions. The PHRED software package (v 0.020425.C) (Ewing and Green 1998
We thank Jim Kronstad, Pieter J. de Jong, Marian Sadar, Robert Brunham, and Vancouver, B.C., Canada and IMAGE Consortium for DNA libraries used in this study. We thank members of the DNA Sequencing and Informatics Groups at Canada's Michael Smith Genome Sciences Centre for their technical assistance. In particular, we thank Ranibar Guin and Joseph Ray Santos for help with data processing. M.A.M. and R.A.H. are Michael Smith Foundation for Health Research scholars.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.4221805. Article published online before print in September 2005.
3 Corresponding author. [Supplemental material is available online at www.genome.org.]
Braslavsky, I., Hebert, B., Kartalov, E., and Quake, S.R. 2003. Sequence information can be obtained from single DNA molecules. Proc. Natl. Acad. Sci. 100: 3960-3964.
Brenner, S., Johnson, M., Bridgham, J., Golda, G., Lloyd, D.H., Johnson, D., Luo, S., McCurdy, S., Foy, M., Ewan, M., et al. 2000. Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat. Biotechnol. 18: 630-634.
Butterfield, Y.S., Marra, M.A., Asano, J.K., Chan, S.Y., Guin, R., Krzywinski, M.I., Lee, S.S., MacDonald, K.W., Mathewson, C.A., Olson, T.E., et al. 2002. An efficient strategy for large-scale high-throughput transposon-mediated sequencing of cDNA clones. Nucleic Acids Res. 30: 2460-2468.
Dovichi, N.J. 1997. DNA sequencing by capillary electrophoresis. Electrophoresis 18: 2393-2399.
Ewing, B. and Green, P. 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8: 186-194.
Levene, M.J., Korlach, J., Turner, S.W., Foquet, M., Craighead, H.G., and Webb, W.W. 2003. Zero-mode waveguides for single-molecule analysis at high concentrations. Science 299: 682-686.
Mitra, R.D., Shendure, J., Olejnik, J., Edyta Krzymanska, O., and Church, G.M. 2003. Fluorescent in situ sequencing on polymerase colonies. Anal. Biochem. 320: 55-65.
Sanger, F., Nicklen, S., and Coulson, A.R. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. 74: 5463-5467.
Schein, J., Kucaba, T., Sekhon, M., Smailus, D., Waterston, R., and Marra, M. 2004. High-throughput BAC fingerprinting. Methods Mol. Biol. 255: 143-156.
Smith, L.M., Fung, S., Hunkapiller, M.W., Hunkapiller, T.J., and Hood, L.E. 1985. The synthesis of oligonucleotides containing an aliphatic amino group at the 5' terminus: Synthesis of fluorescent DNA primers for use in DNA sequence analysis. Nucleic Acids Res. 13: 2399-2412.
Smith, L.M., Sanders, J.Z., Kaiser, R.J., Hughes, P., Dodd, C., Connell, C.R., Heiner, C., Kent, S.B., and Hood, L.E. 1986. Fluorescence detection in automated DNA sequence analysis. Nature 321: 674-679.
Yang, G.S., Stott, J.M., Smailus, D.E., Barber, S.A., Balasundaram, M., Marra, M.A., and Holt, R.A. 2005. High-throughput sequencing: A failure mode analysis. BMC Genomics 6: 2.
http://www.ncbi.nlm.nih.gov/Traces; National Center for Biotechnology Information (NCBI) Trace Archive
Received June 1, 2005; accepted in revised format August 1, 2005.