Vol. 12, Issue 7, 1100-1105, July 2002
METHODS
ABSTRACT
Positional cloning of mutations in model genetic systems is a powerful method for the identification of targets of medical and agricultural importance. To facilitate the high-throughput mapping of mutations in Caenorhabditis elegans, we have identified a further 9602 putative new single nucleotide polymorphisms (SNPs) between two C. elegans strains, Bristol N2 and the Hawaiian mapping strain CB4856, by sequencing inserts from a CB4856 genomic DNA library and using an informatics pipeline to compare sequences with the canonical N2 genomic sequence. When combined with data from other laboratories, our marker set of 17,189 SNPs provides even coverage of the complete worm genome. To date, we have confirmed >1099 evenly spaced SNPs (one every 91 ± 56 kb) across the six chromosomes and validated the utility of our SNP marker set and new fluorescence polarization-based genotyping methods for systematic and high-throughput identification of genes in C. elegans by cloning several proprietary genes. We illustrate our approach by recombination mapping and confirmation of the mutation in the cloned gene, dpy-18.
[The sequence data described in this paper have been submitted to the NCBI dbSNP data library under accession nos. 4388625-4389689 and GenBank dbSTS under accession nos. G73810-G74874. The following individuals and institutions kindly provided reagents, samples, or unpublished information as indicated in the paper: The C. elegans Sequencing Consortium and The Caenorhabditis Genetics Center.]
INTRODUCTION
Forward genetic screens in model organisms remain a crucial tool for uncovering new biological information (Matthews and Kopczynski 2001; Sternberg 2001). These approaches require extensive recombination mapping of a mutation to discover the identity of a gene. Traditional methods in model systems have typically relied on the use of visible phenotypic markers for linkage mapping of mutations. However, single nucleotide polymorphism (SNP) markers are currently favored because of their relative abundance and because they can eliminate confounding interaction with the mutant phenotype, although in some cases outcrossing introduces genetic modifiers.
To date, the only strategy for SNP-based cloning in the nematode Caenorhabditis elegans (C. elegans Sequencing Consortium 1998) is the snip-SNP approach (Wicks et al. 2001). Here we present an alternative tripartite approach for rapid SNP-based mapping in the worm. We first established a set of finely spaced genome-spanning SNP markers and then combined this resource with a tiered mapping strategy that progressively narrows the region containing the gene of interest. Finally, we used a high-throughput SNP assay that allowed reliable and rapid genotyping with low marker development costs. This strategy afforded rapid gene cloning in C. elegans and can be tailored for use in other model organisms with a sequenced genome.
RESULTS
Our initial goal was to create a set of reliable SNP markers at a density of one marker every 100 kb across the C. elegans genome. To achieve this density, a large set of predicted SNPs scattered throughout the C. elegans genome was required. We chose to identify polymorphisms between the Hawaiian CB4856 strain of C. elegans and the commonly used Bristol N2 laboratory strain, because the CB4856 strain is known to have the most even distribution of SNPs across the chromosomes (Koch et al. 2000; Wicks et al. 2001). A small insert (1.7 kb ± 0.5 kb) library was constructed from CB4856 genomic DNA. Double-end sequencing of cloned inserts from this library produced 16,941 high-quality sequencing reads, which represented 7.3% sequence coverage of the genome. An informatics software pipeline (Vysotskaia et al. 2001) was then used to predict likely polymorphisms between CB4856 sequencing reads and the canonical N2 genomic sequence (WS version 48). The pipeline identified a total of 10,711 predicted polymorphisms; 9602 of these were distinct from those previously reported. These new polymorphisms include 6902 substitutions (SNPs), 1885 deletions (one or more bases removed), and 815 insertions (one or more bases added). We estimate the overall rate of polymorphism between the two strains to be one substitution/insertion/deletion per 840 bases. Transitions accounted for 57% (3906 SNPs) of substitution SNPs, whereas 43% (2996 SNPs) were transversions. These observations agreed with previous findings from a smaller dataset (Wicks et al. 2001). Validated SNP data are available from NCBI dbSNP (accession nos. 4388625-4389689) and GenBank dbSTS (accession nos. G73810-G74874).
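The substitution classes reported above can be illustrated with a short sketch. It is not the informatics pipeline of Vysotskaia et al. (2001); the function names `classify_substitution` and `percent` are hypothetical, and the only data used are the counts quoted in the text.

```python
# Minimal sketch: classify a single-base substitution and re-derive the
# percentages quoted in the text. Function names are illustrative only.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine) for one substituted base."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# Counts as reported in the text.
TRANSITIONS, TRANSVERSIONS = 3906, 2996
SUBSTITUTIONS, DELETIONS, INSERTIONS = 6902, 1885, 815

if __name__ == "__main__":
    print(classify_substitution("A", "G"))   # purine to purine
    print(classify_substitution("C", "A"))   # pyrimidine to purine
    print(round(percent(TRANSITIONS, SUBSTITUTIONS)))
    print(SUBSTITUTIONS + DELETIONS + INSERTIONS)
```

The reported counts are internally consistent: transitions and transversions sum to the 6902 substitutions, and the three polymorphism classes sum to the 9602 new polymorphisms.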
We combined the publicly available C. elegans CB4856 SNP information (http://genome.wustl.edu/gsc/C_elegans/SNP/index.html) with our data to obtain a total of 17,189 predicted polymorphisms throughout the C. elegans genome and then systematically chose one substitution SNP approximately every 100 kb. Oligonucleotide primers flanking these predicted SNPs were designed for PCR amplification. The presence of a SNP was confirmed by sequencing the PCR product and/or by a fluorescence polarization-template directed incorporation (FP-TDI) SNP genotyping assay (see below). The latter method proved to be faster than sequencing and had a comparable failure rate of ~10%, which includes mispredictions, primer failure, and assay failure. To date, a set of 1099 markers has been confirmed and formatted for a genotyping assay; 427 (39%) of these 1099 confirmed SNPs were derived from our 9602 putative new SNPs. Our substitution SNP marker set has an average spacing of 91 kb ± 56 kb across the genome (Fig. 1A). On average, the most telomeric SNPs are located ~72.5 kb from the telomeres of each chromosome, ranging from 16.6 kb (left end of chromosome V) to 178.3 kb (right end of chromosome III).
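The marker selection and spacing summary described above can be sketched as follows. The text does not state the exact selection rule, so the rule used here (one candidate per 100-kb bin, taking the SNP closest to the bin center) is an assumption, as are the function names and the toy positions.

```python
# Minimal sketch, assuming a simple binning rule that the paper does not
# specify: keep one candidate SNP per 100-kb bin, preferring the candidate
# nearest the bin center, then summarize spacing as mean and SD.
from statistics import mean, stdev


def pick_markers(positions, spacing=100_000):
    """Return one position per `spacing`-sized bin, sorted along the chromosome."""
    best = {}
    for pos in positions:
        bin_index = pos // spacing
        center = bin_index * spacing + spacing // 2
        # Keep the first candidate seen unless a strictly closer one appears.
        if bin_index not in best or abs(pos - center) < abs(best[bin_index] - center):
            best[bin_index] = pos
    return [best[b] for b in sorted(best)]


def spacing_stats(markers):
    """Mean and sample SD of the gaps between consecutive markers (needs >= 3 markers)."""
    gaps = [b - a for a, b in zip(markers, markers[1:])]
    return mean(gaps), stdev(gaps)


if __name__ == "__main__":
    # Hypothetical candidate positions (bp) on one chromosome.
    candidates = [10_000, 40_000, 60_000, 150_000, 290_000, 310_000]
    chosen = pick_markers(candidates)
    avg, sd = spacing_stats(chosen)
    print(chosen, avg, round(sd))
```

Empty bins simply yield no marker, which is one way a nominal 100-kb grid can produce an uneven realized spacing such as the 91 kb ± 56 kb reported here.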
By use of the comprehensive SNP marker set, we implemented a mapping strategy that uses iterative phases to progressively refine a genomic region of interest (ROI) containing a mutant gene. In C. elegans forward genetic screens, mapping often begins by genotyping phenotypically mutant F2 offspring from a cross of a homozygous mutant animal (N2 background) to a wild-type animal (CB4856 background) (see Methods). High-throughput genotyping of 30-60 offspring, using a preselected marker set of 30 SNPs (Table 1
Our mapping strategy required high-throughput genotyping for maximum speed and efficiency. Of the several available methods for SNP analysis (Kwok 2001), we selected the FP-TDI assay for its consistent and interpretable results, low assay-setup cost, and automated detection. We performed the FP-TDI assay (Chen et al. 1999) using commercial reagents (AcycloPrime-FP SNP Detection Kit, Perkin Elmer Life Sciences, Inc.) in 384-well format with a standard liquid handling robot (Tecan Genesis, Tecan). Figure 3 shows examples of data obtained using this assay. The separation of dyes that is achieved using control N2 and CB4856 genomic DNA is illustrated in Figure 3A, and Figure 3B shows the consistent clarity of base-calling data obtained from six randomly chosen chromosome II SNPs on F2 recombinants (crude worm lysates).
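The allele-calling logic implied by a two-dye FP readout can be sketched as below. This is an illustration only: the function name, the use of no-template baselines, and the 40-mP shift threshold are all assumptions and are not taken from the paper or from the kit documentation.

```python
# Minimal sketch, assuming each well yields one polarization value (mP) per
# dye and that a dye's polarization rises when its terminator is incorporated.
# The baseline values and the min_shift threshold are hypothetical.
def call_genotype(mp_n2_dye, mp_cb_dye, baseline_n2, baseline_cb, min_shift=40.0):
    """Call a genotype from two dye polarization values.

    mp_n2_dye / mp_cb_dye: polarization of the dye reporting the N2 / CB4856 base.
    baseline_n2 / baseline_cb: polarization of the same dyes in a no-template control.
    """
    n2_incorporated = (mp_n2_dye - baseline_n2) >= min_shift
    cb_incorporated = (mp_cb_dye - baseline_cb) >= min_shift
    if n2_incorporated and cb_incorporated:
        return "N2/CB4856"
    if n2_incorporated:
        return "N2/N2"
    if cb_incorporated:
        return "CB4856/CB4856"
    return "no call"


if __name__ == "__main__":
    # Hypothetical readings for three wells, with baselines of 50 and 55 mP.
    for well in [(150, 60), (145, 160), (52, 58)]:
        print(well, call_genotype(*well, baseline_n2=50, baseline_cb=55))
```

In practice, thresholds would be set per marker from the N2 and CB4856 control clusters rather than fixed in advance.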
This strategy has been applied extensively for identification of novel genes. To date, we have mapped more than 50 loci and cloned >30 genes from several forward genetic screens. We can routinely identify a gene of interest within a 2- to 4-mo time frame. To illustrate this process, Figure 3C shows results from Tiers 1 and 2 analysis of just 35 DNA samples in the mapping of the cloned gene dpy-18 (Hill et al. 2000) to a 2.0-Mb region of chromosome III. This simple analysis narrows the ROI to only 2% of the genome. Figure 3D shows the 97-kb interval bounded by the two validated FP-TDI SNP markers flanking dpy-18. This panel shows the resolution attainable with the set of markers currently available. In addition, information from the RNAi screen of chromosome III (Gonczy et al. 2000) indicates the sequence Y47D3B.10 as being a good candidate gene for dpy-18. We sequenced this e364 allele and confirmed the published mutation in the third exon (Hill et al. 2000), which introduces a premature stop codon into the coding sequence.
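The way recombinant genotypes narrow a region of interest can be sketched in a few lines. The sketch assumes a recessive mutation in the N2 background, phenotypically mutant F2 animals, and complete genotype calls; the function name `narrow_roi`, the single-letter genotype codes, and the toy data are hypothetical.

```python
# Minimal sketch: homozygous mutant F2 animals must be N2/N2 at the mutation,
# so any marker where at least one animal carries a CB4856 allele is excluded.
# The ROI is the longest run of non-excluded markers, bounded by the nearest
# excluded markers on each side.
def narrow_roi(marker_positions, genotypes):
    """Return (left_bound, right_bound) in bp, or None if no marker is all N2/N2.

    marker_positions: marker coordinates sorted along one chromosome.
    genotypes: one list per animal, with 'N' (N2/N2), 'H' (heterozygous),
               or 'C' (CB4856/CB4856) for each marker, in the same order.
    A bound of None means the ROI extends to the end of the marker set.
    """
    n = len(marker_positions)
    consistent = [all(animal[i] == "N" for animal in genotypes) for i in range(n)]

    best, start = None, None
    for i, ok in enumerate(consistent + [False]):  # sentinel closes a trailing run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if best is None or (i - start) > (best[1] - best[0]):
                best = (start, i)  # half-open run of marker indices
            start = None
    if best is None:
        return None

    first, end = best
    left = marker_positions[first - 1] if first > 0 else None
    right = marker_positions[end] if end < n else None
    return left, right


if __name__ == "__main__":
    positions = [1_000_000, 2_000_000, 3_000_000, 4_000_000, 5_000_000]
    animals = [
        ["H", "N", "N", "N", "N"],  # recombination to the left of marker 2
        ["N", "N", "N", "H", "C"],  # recombination to the right of marker 3
        ["N", "N", "N", "N", "N"],  # uninformative in this interval
    ]
    print(narrow_roi(positions, animals))
```

Genotyping further markers inside the returned interval in the same recombinant animals repeats the procedure at higher resolution, which is the sense in which the tiers are iterative.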
DISCUSSION
We have presented a tripartite, comprehensive strategy for systematic and high-throughput gene identification in C. elegans. This strategy required the development of finely spaced, genome-wide SNP markers and combined an iterative mapping approach with the high-throughput FP-TDI SNP marker assay. We optimized the FP-TDI assay for automated reaction setup and nucleotide analog detection. The FP-TDI assay is highly reliable and allows greater flexibility in selecting which SNPs are assayed, as well as how many samples are genotyped. Our strategy effectively speeds mutation detection and gene cloning in C. elegans, especially when combined with tools for candidate gene analysis such as cosmid rescue and RNAi. Many aspects of our approach are transportable to other model systems and could allow for rapid and systematic gene identification in these systems.
METHODS
Library Construction and Sequencing
Random, genome-wide DNA sequences from the Hawaiian C. elegans strain CB4856 were obtained by constructing a small insert genomic library for shotgun sequencing. Library construction was described previously (Vysotskaia et al. 2001). Double-end sequencing of clones was performed on ABI 3700 (Perkin Elmer) DNA sequencers.
SNP Prediction
The CB4856 sequence traces were aligned against Bristol N2 genomic sequence (C. elegans Sequencing Consortium 1998) using a custom script that takes into account the quality of the neighboring sequence as well as that of the potential polymorphic base (Vysotskaia et al. 2001). Polymorphism information can be found in NCBI dbSNP (accession nos. 4388625-4389689) and GenBank dbSTS (accession nos. G73810-G74874).
SNP Confirmation
We modified primer3 (http://www-genome.wi.mit.edu/genome_software/other/primer3.html) and designed primers for PCR amplicons ranging between 150 and 300 bases that contain the selected putative SNPs. An initial set of ~100 of the predicted polymorphisms between the Bristol N2 and CB4856 strains were confirmed by sequencing the PCR amplicon from each strain. Sequencing was performed using standard protocols, and products were resolved using capillary electrophoresis on ABI 3700 (Perkin Elmer) instruments.
All 1099 SNPs were also confirmed by FP-TDI (Chen et al. 1999). We used the SNP kit (AcycloPrime-FP SNP Detection Kit, Perkin Elmer Life Sciences, Inc.) and modified the volumes for compatibility with 384-well PCR. Reactions were set up on the Tecan Genesis 150 robot (Tecan). Briefly, a 200- to 300-bp region of the genome containing the SNP was amplified using standard PCR (6 µL reaction volume). Excess primers and dNTPs were removed by addition of a 6-µL cocktail of shrimp alkaline phosphatase (Roche) and E. coli Exonuclease I (USB) (12 µL final reaction volume). The single base extension reaction (6 µL reaction volume added to above) was performed using the SNP kit components (acyclo dideoxynucleotide triphosphate [ddNTP] terminators are used instead of fluorescently labeled ddNTPs) and a 30-mer oligonucleotide. Addition of the oligonucleotide, complementary to the sequence on one DNA strand immediately 5' of the polymorphic base, allows incorporation of one of the two acyclo terminators in the kit depending on the sequences within the amplified PCR product. Allelic discrimination occurs through measuring the change in fluorescence polarization of the dyes associated with the incorporated nucleotide.
SNP markers, the sequence of all required primers, and standard assay conditions for FP-TDI have been deposited in GenBank and dbSNP and are also available on the Exelixis web site (http://www.exelixis.com/discovery/elegans).
Crossing Strategy
Mapping a recessive mutation created in the Bristol N2 background commences by crossing homozygous mutant (N2-background) animals with wild-type CB4856 animals. The F1 progeny are then segregated away from other progeny and allowed to self-fertilize. The resulting F2 animals are then picked onto 6-cm Petri plates and phenotyped for the mutation. Alternatively, after picking of the F2 animals onto plates, the F2 can be allowed to self-fertilize and lay a brood of F3 animals, and these are then phenotyped. Only F2 animals that are homozygous for the recessive mutation (or homozygous without it) are potentially informative and are genotyped.
To map a dominant mutation, essentially the same procedure is followed except the F1 animals are backcrossed to CB4856 animals, and F2 showing the mapping phenotype (mutant/CB4856) are singled from the resulting outcross progeny.
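For the recessive cross described above, the expected marker genotypes among homozygous mutant F2 animals follow from the recombination fraction between marker and mutation. The sketch below assumes the same recombination fraction r in both parental germlines and independent segregation of the two chromosomes each F2 animal inherits; the function names are hypothetical.

```python
# Minimal sketch, assuming a recessive mutation in the N2 background and a
# single recombination fraction r (0 <= r <= 0.5) between marker and mutation.
# Each of the two mutation-bearing chromosomes in a homozygous mutant F2
# carries the CB4856 marker allele with probability r.
def expected_f2_genotypes(r):
    """Expected genotype fractions at a marker among homozygous mutant F2 animals."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    return {
        "N2/N2": (1 - r) ** 2,
        "N2/CB4856": 2 * r * (1 - r),
        "CB4856/CB4856": r ** 2,
    }


def estimate_r(counts):
    """Estimate r as the fraction of CB4856 alleles among all scored alleles."""
    cb_alleles = counts.get("N2/CB4856", 0) + 2 * counts.get("CB4856/CB4856", 0)
    total_alleles = 2 * sum(counts.values())
    if total_alleles == 0:
        raise ValueError("no genotypes scored")
    return cb_alleles / total_alleles


if __name__ == "__main__":
    print(expected_f2_genotypes(0.5))   # unlinked marker: 1:2:1 ratio
    print(expected_f2_genotypes(0.05))  # tightly linked marker: mostly N2/N2
    # Hypothetical counts from 35 mutant F2 animals at one marker.
    print(estimate_r({"N2/N2": 30, "N2/CB4856": 5, "CB4856/CB4856": 0}))
```

An unlinked marker gives the Mendelian 1:2:1 ratio, so a strong excess of N2/N2 calls at a marker is the signature of linkage to the mutation.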
DNA Sample Preparation
DNA samples for PCR were prepared as described previously (Williams et al. 1992). This procedure usually yields DNA at a concentration of 100 ng/µL. A portion of the population not used for the DNA lysate can be saved for reconfirmation of a phenotype.
WEB SITE REFERENCES
http://genome.wustl.edu; Washington University, School of Medicine, Genome Sequencing Center.
http://www.exelixis.com; Exelixis, Inc. home page.
http://www-genome.wi.mit.edu; Whitehead Institute Center for Genome Research.
ACKNOWLEDGMENTS
We thank Candace Swimmer, Exelixis Sequencing-Core, and all our Genomics and Genetics colleagues, especially Mike Ellis, Jon Margolis, Ross Francis, Scott Ogg, Casey Kopczynski, and Geoff Duyk for their valuable comments and suggestions throughout this study. We also thank the C. elegans Sequencing Consortium for providing N2 genomic sequences and the Caenorhabditis Genetics Center for the dpy-18 (e364) mutant.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
1 Corresponding author.
E-MAIL cancilla{at}exelixis.com; FAX (650) 837-7220.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.208902.
REFERENCES
Received February 25, 2002; accepted in revised form April 11, 2002.