Vol 13, Issue 5, 845-855, May 2003
LETTER
Haplotype Structure, LD Blocks, and Uneven Recombination Within the LRP5 Gene
Rebecca C.J. Twells¹,⁴, Charles A. Mein¹, Michael S. Phillips², J. Fred Hess²,
Riitta Veijola¹, Matthew Gilbey¹, Matthew Bright¹, Michael Metzker²,
Benedicte A. Lie³, Amanda Kingsnorth¹, Edward Gregory¹, Yusuke Nakagawa¹,
Hywel Snook¹, William Y.S. Wang¹, Jennifer Masters¹, Gillian Johnson¹,
Iain Eaves¹, Joanna M.M. Howson¹, David Clayton¹, Heather J. Cordell¹,
Sarah Nutland¹, Helen Rance¹, Philippa Carr¹, and John A. Todd¹
¹JDRF/WT Diabetes and Inflammation Laboratory, Cambridge
Institute for Medical Research, University of Cambridge, Cambridge CB2
2XY, UK; ²Merck Research Laboratories, Department of Human
Genetics, West Point, Pennsylvania 19486, USA; ³Institute of
Immunology, Rikshospitalet University Hospital, Oslo, Norway
ABSTRACT
Patterns of linkage disequilibrium (LD) in the human genome are
beginning to be characterized, with a paucity of haplotype diversity in
"LD blocks," interspersed by apparent "hot spots" of
recombination. Previously, we cloned and physically characterized the
low-density lipoprotein-receptor-related protein 5
(LRP5) gene. Here, we have extensively analysed both
LRP5 and its flanking three genes, spanning 269 kb, for single
nucleotide polymorphisms (SNPs), and we present a comprehensive SNP map
comprising 95 polymorphisms. Analysis revealed high levels of
recombination across LRP5, including a hot-spot region from
intron 1 to intron 7 of LRP5, where there are 109
recombinants/Mb (4882 meioses), in contrast to flanking regions of 14.6
recombinants/Mb. This region of high recombination could be delineated
into three to four hot spots, one within a 601-bp interval. For
LRP5, three haplotype blocks were identified, flanked by the
hot spots. Each LD block comprised over 80% common haplotypes,
concurring with a previous study of 14 genes that showed that common
haplotypes account for at least 80% of all haplotypes. The
identification of hot spots in between these LD blocks provides
additional evidence that LD blocks are separated by areas of higher
recombination.
[Supplementary material: primers are
available from our Web site:
http://www-gene.cimr.cam.ac.uk/todd/human_data.shtml.]
Recent studies suggest that the genome is
composed of regions of strong intermarker linkage disequilibrium (LD),
or haplotype blocks, interspersed with presumed recombination hot spots.
Daly et al. (2001) showed that a 500-kb region of Chromosome 5q31 was
spanned by 11 haplotype blocks, ranging from 3 to 100 kb in length.
Tiret et al. (2002) studied the LD of 50 candidate genes for
cardiovascular disease and observed gene-specific patterns of LD,
concluding that all the sequence variation of a gene should be surveyed
before attempting association studies with disease. Maniatis et al.
(2002) observed blocks of LD on Chromosome 3q21. Patil et al. (2001)
identified haplotype blocks on Chromosome 21 for which >80% of
chromosomes were represented by a few common haplotypes. Gabriel et al.
(2002) surveyed haplotype patterns in 51 autosomal regions, and
observed LD blocks from <1 to 173 kb in European samples. Jeffreys et
al. (2001) consolidated previous characterization of recombination
patterns in the major histocompatibility complex (MHC) by analysis
using sperm typing of a 216-kb subregion. They confirmed and detected
hot spots of recombination in clusters: Within each cluster the
hot-spot spacing was 1–7 kb, and between clusters it was 60–90 kb,
with the rate of recombination varying between hot spots from 0.4 to 140
cM/Mb (Jeffreys et al. 2001). The LD patterns across the hot spots
indicated that breaks in LD between LD blocks are due to chromosome
regions of relatively high recombination.
Johnson et al. (2001) analyzed 14 genes, showing that the common
haplotypes (each >5% frequency) accounted for 80% of the haplotypes
that were observed in 400 chromosomes. Nickerson et al. (1998) screened
a 9.7-kb region within the lipoprotein lipase (LPL)
gene, encompassing exons 4–9, and concluded that this gene has a high
haplotype diversity. However, Templeton et al. (2000a,b) showed that
the haplotype diversity was not exceptional if a recombination hot spot
in a 1.9-kb region close to exon 6 was taken into account.
Here we have determined the SNP profile of the low-density
lipoprotein-receptor-related protein 5 (LRP5) gene (Hey et
al. 1998 ), which maps to the putative insulin-dependent diabetes
mellitus 4 (IDDM4) region on Chromosome 11q13 (Nakagawa et
al. 1998 ). Studies of the LRP5 region in type 1 diabetes
(T1D), using two microsatellite markers, D11S1917 and
H0570polyA, indicated that the region may be associated with
disease (Nakagawa et al. 1998 ). LRP5 comprises 23 exons
spanning 160 kb of genomic sequence and encodes a 1615-amino-acid
protein (Hey et al. 1998; Twells et al. 2001). LRP5, along with its
homolog LRP6 (Brown et al. 1998), has recently been shown to mediate
Wnt signaling (Pinson et al. 2000; Tamai et al. 2000; Wehrli et al.
2000). The LRP5 intracellular domain binds to axin, resulting in T-cell
factor/lymphocyte enhancer factor (TCF/LEF) activation via β-catenin
release (Mao et al. 2001; Nusse 2001). A mutation in exon 3 of the
LRP5 gene has been shown to be responsible for the high-bone-mass
trait (Little et al. 2002), and numerous mutations have been identified
in the autosomal recessive disorder osteoporosis–pseudoglioma syndrome
(Gong et al. 2001). The LRP5 gene is therefore also a prime
candidate for osteoporosis (Gong et al. 2001; Little et al. 2002).
Previously, to investigate the possibility of association of the
D11S987–D11S1917 region with T1D, we created a clone
contig of 400 kb encompassing these two loci, sequenced the clones, and
identified four genes: LRP5, C11orf23,
C11orf24, and CGI-85 (Hey et al. 1998 ; Twells et al.
2001 ).
In the present study, we have identified 95 SNPs in the LRP5
gene and flanking regions spanning 269 kb. We chose 46 SNPs and typed
them in 91 families (364 individuals) to assess the allele frequencies,
LD between the markers, and haplotypes of this region. We also typed 32
microsatellites and 12 SNPs in 989 families to assess recombination
across the entire IDDM4-linked region. This, in conjunction
with the LD data, identified three to four hot spots of recombination:
one in intron 1 of the LRP5 gene, one within intron 1 and/or intron
3–intron 5, and one within the region from intron 5 to intron 7 of
LRP5, supporting the notion that homologous recombination
events are unevenly distributed along chromosomes.
RESULTS
SNP Identification
The identification of SNPs in the LRP5 region was achieved
by examining sequence overlaps, direct sequencing, and dHPLC scanning.
Sequence overlaps were compared as described in Methods, identifying a
total of 38 SNPs (Table
1). The main focus of
SNP identification was within the coding region and exon/intron
boundaries of the LRP5 candidate gene, as well as putative
regulatory elements based on mouse ortholog sequence we previously
obtained (Twells et al. 2001 ). A total of 18 individuals were directly
sequenced across the exons and exon/intron boundaries of LRP5,
excluding exon 1, which had not been identified at this stage. In
these, 11 new SNPs were identified, of which five are coding. Of these,
two produce a predicted amino acid change: LRP5
g.95385G→A (Exon 9 V667M) and LRP5
30g.29264C→T (Exon 18 A1349V; Table 1). The 65-kb 5' region
of the LRP5 gene from D11S1917 to LRP5 exon
3 was shotgun-sequenced in two individuals, and an additional 31 SNPs
were identified (Table 1).
Table 1. SNPs Identified From the C11orf24–LRP5–C11orf23
Region, Showing the Method of Identification, Method of Genotyping,
and Allele Frequency in 192 UK Individuals
The two sequencing efforts above covered 18 and 2 individuals,
respectively. To increase the power to detect SNPs, we
rescreened the LRP5 gene in an additional 24 individuals by
dHPLC and sequenced the heteroduplex samples. The rescreened region
comprised the 23,744 bp encompassing D11S1917,
H0570POLYA, and the 5' region and exon 1 of LRP5;
thus it may contain regulatory elements of this gene (Fig.
1). In addition, all the exons of LRP5 were screened (exon 1 within the 24.3-kb region), excluding
exons 4, 14, 21, and 23, which failed to amplify. Ten new SNPs and an
indel were identified from the dHPLC screening of the 23.7-kb
region. Four new SNPs were identified in intronic regions of
LRP5, close to exons 8 and 17 and two near exon 20. These are
listed in Table 1. The total number of SNPs identified by all methods
is 95.
The dHPLC in 24 individuals identified every SNP identified by the
previous methods (18 SNPs) apart from LRP5 g.6088T→C,
which had been identified from the shotgun sequencing of the long-range
PCR products. This SNP was not pursued to confirm whether it was a true
variant or a sequencing artifact. The SNP harvesting performed by dHPLC
compared with direct sequencing was, therefore, very sensitive. The
allele frequencies were ascertained in 182 UK parental chromosomes. The
average allele frequency of the total number of SNPs identified by
dHPLC in 24 individuals is 0.17. The average allele frequency in the 5'
region of the LRP5 gene (14 SNPs) is 0.17, in the introns (13
SNPs) 0.18, and in the three exonic SNPs 0.12. Overall, the SNPs
identified by dHPLC average 1/911 nt in the 5' region of LRP5.
Of the coding region, the number of SNPs identified was an average of
1/804 nt. Three SNPs and a 3-bp deletion were identified from cDNA
clone overlaps in the two flanking genes C11orf23 and
C11orf24. These SNPs are all common, with an average allele
frequency of 0.33. The C11orf24 SNP is coding,
aa150 Ala→Thr; one C11orf23 SNP is in the 5'-UTR,
and the other two are in the 3'-UTR. The majority of SNPs are
substitutions, with 6 indels.
For the LRP5 region, the nucleotide diversity for
the dHPLC-scanned samples is
(2.7 × 10⁻⁴) ± (9 × 10⁻⁵).
For the noncoding regions (24,601 bp), it is
(2.6 × 10⁻⁴) ± (8.8 × 10⁻⁵).
The coding region of LRP5 (4251 bp) has a nucleotide diversity of
(2.8 × 10⁻⁴) ± (1.4 × 10⁻⁴).
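As a rough illustration of how a per-nucleotide diversity estimate can be derived from a SNP scan, the sketch below computes Watterson's estimator, S / (a_n × L), where S is the number of segregating sites, L the number of nucleotides scanned, and a_n the sum of 1/i for i from 1 to n − 1 over n chromosomes. This is only one common estimator; the text does not state which estimator was used, so the choice of estimator, the function name, and the input values are assumptions for illustration and are not expected to reproduce the values reported above.

```python
def watterson_theta_per_site(segregating_sites: int,
                             n_chromosomes: int,
                             sequence_length: int) -> float:
    """Watterson's estimator per nucleotide: S / (a_n * L)."""
    if n_chromosomes < 2:
        raise ValueError("at least two chromosomes are required")
    if sequence_length <= 0:
        raise ValueError("sequence length must be positive")
    # a_n = 1 + 1/2 + ... + 1/(n - 1)
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    return segregating_sites / (a_n * sequence_length)


# Hypothetical inputs (not taken from Table 1): 10 SNPs found in
# 10,000 nt scanned in 24 diploid individuals (48 chromosomes).
theta = watterson_theta_per_site(segregating_sites=10,
                                 n_chromosomes=48,
                                 sequence_length=10_000)
print(f"theta per site = {theta:.2e}")
```

Because a_n grows only slowly with sample size, the estimate is driven mainly by the number of SNPs found per nucleotide scanned.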
Linkage Disequilibrium
To analyze the LD in this region, 46 SNPs from the 95 identified
were selected that span the region and typed in 91 UK multiplex
families. The allele frequencies were ascertained as described in the
previous section, and the SNPs with <0.03 frequency were omitted from
further analysis. The remaining 42 SNPs were analyzed for LD using the
measure |D′| with the parental genotypes (364
chromosomes). Figure 2A shows the pairwise
|D′| values. We used the SNPs with minor allele frequency
>0.20 to aid identification of "LD blocks" (Daly et al. 2001), as
these are older SNPs, more likely to identify the block structure (Fig.
2B; Jeffreys et al. 2001). We used the criterion of an LD block having
intermarker LD of |D′| > 0.8 on average, which we have
observed previously in studies of the CTLA4 gene and the
INS region (H. Ueda, B. Barratt, J.A. Todd, unpubl.). Those
studies also noted that the interblock LD is on average
|D′| < 0.3 with SNPs (H. Ueda, B. Barratt, J.A. Todd,
unpubl.). A block of strong LD can be observed for the markers
C11ORF24 c.598G→A–LRP5 g.5257T→G, a distance of
37 kb, called LD block 1. LD breaks down in the region LRP5
g.5257T→G–LRP5 g.28149C→T, in intron 1 of the
LRP5 gene. A second block of very strong LD
(|D′| > 0.9) is observed, LRP5
g.28149C→T–LRP5 g.45704G→A, spanning 17.6 kb of intron
1 to intron 3 of the LRP5 gene. The 3' region of the
LRP5 gene could not be characterized so well owing to the
paucity of common SNPs identified within this larger genomic distance.
LD appears to be lower from LRP5 g.45704G→A–LRP5
30g.24930C→T, from intron 3 of LRP5 to exon 17, with
an average |D′| of 0.72. A third LD block was identified
from LRP5 30g.24930C→T to C11ORF23
c.3680G→A, encompassing from exon 17 of LRP5 to the
3'-UTR of the C11orf23 gene, 110 kb.
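The block criterion described above (average intermarker LD above 0.8 within a block) can be sketched as follows. The marker names and LD values are hypothetical, and the helper functions are illustrative assumptions; they are not part of the software used in this study.

```python
from itertools import combinations


def mean_pairwise_ld(ld_values, markers):
    """Average pairwise LD over all unordered pairs of the given markers.

    ld_values maps frozenset({marker_a, marker_b}) to the pairwise LD value.
    """
    pairs = list(combinations(markers, 2))
    if not pairs:
        raise ValueError("at least two markers are required")
    return sum(ld_values[frozenset(pair)] for pair in pairs) / len(pairs)


def is_ld_block(ld_values, markers, threshold=0.8):
    """Apply the 'average intermarker LD > threshold' criterion."""
    return mean_pairwise_ld(ld_values, markers) > threshold


# Hypothetical pairwise values for three markers.
toy_ld = {
    frozenset(("snpA", "snpB")): 0.95,
    frozenset(("snpA", "snpC")): 0.90,
    frozenset(("snpB", "snpC")): 0.85,
}
print(is_ld_block(toy_ld, ["snpA", "snpB", "snpC"]))
```

Averaging over all pairs, rather than only adjacent markers, means that a single weak pair lowers the block average without automatically splitting the block.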

Figure 2. Plot of pairwise |D′| values against distance (in base
pairs) for 42 SNPs in 364 UK chromosomes. |D′| values of
0.8 or more are shaded black. (A) Pairwise |D′|
values for all SNPs in the 300-kb region. (B) Pairwise
|D′| values for the 27 SNPs with allele frequencies
>0.20. The arrows show the location of the recombination hot spots.
Recombination
In the UK, USA, and Sardinian data sets, the sex-averaged map from
FCER1B to D11S916 is 23.7 cM, whereas the physical
map, from ENSEMBL 1.1.3 is 18.1 Mb. Therefore, in this 18.1-Mb region
overall there is on average 1.3 cM/Mb.
Obligate recombinants were scored across the entire 18.1-Mb region.
In the 4882 meioses, this analysis detected 265 obligate recombinants,
an average of 14.6 recombinants/Mb. In the 400-kb contig surrounding
LRP5, the microsatellite and SNP markers spanned 300 kb. In
this interval there were 15 obligate recombinants, an average of 50
recombinants/Mb (Fig. 3). Of these, 14 were
localized between 255CA6 and 14LCA5, and one was
localized between TAA and 18018CA. Thus, between
255CA5 and 14LCA5, there were 72.5
recombinants/Mb, and between 14LCA5 and 18018CA,
there were up to 5 recombinants/Mb. Between 255CA5 and
14LCA5, nine recombinants could be localized precisely to
within the interval LRP5 g.3103C→G–14LCA5, with a
further four recombinants overlapping this interval (Fig. 3). These
five recombinants were localized between
EO8644419G→A–LRP5 g.17646G→T,
D11S1296–14LCA5, TAA–18018CA, and
D11S1296–TAA, respectively. Weighting these four for distance
between the flanking markers (assuming an equal possible distribution
by physical distance), plus the recombinants within the LRP5
g.3103C→G–14LCA5 interval, gives 9.1 recombinants between
LRP5 g.3103C→G and 14LCA5, a distance of 83.6 kb, resulting
in 109 recombinants/Mb. Examination of this interval in more detail
shows three subregions to which the recombinants can be mapped
precisely: TAA–LRP5 g.7374G→A (601 bp), LRP5
g.17646G→T–D11S1337 (35.2 kb), and D11S1337–14LCA5
(33.8 kb). Again weighting recombinants for distance (assuming an even
distribution of recombination) gives a value of 2178 recombinants/Mb
for TAA–LRP5 g.7374G→A, 98 recombinants/Mb for
LRP5 g.17646G→T–D11S1337, and 66 recombinants/Mb for
D11S1337–14LCA5. Therefore, there appear to be several
recombination hot spots within the region
LRP5 g.3103C→G–14LCA5, which is from intron 1 to between
exons 7 and 8 of LRP5. This concurs with the LD data (Fig. 3),
showing that there is a recombination hot spot in intron 1 of
LRP5 that accounts for the decrease in LD between markers
between LD blocks 1 and 2. Within the second hot-spot region (LRP5
g.17646G→T–D11S1337), there is an LD block, LRP5
g.28149C→T–LRP5 g.45704G→A; thus, it is possible the
hot spot maps to either LRP5 g.17646G→T–LRP5
g.28149C→T or LRP5 g.45704G→A–D11S1337, or
both. The third hot spot maps to the region between LD blocks 2 and 3
(D11S1337–14LCA5).
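The rates quoted in this section follow from dividing recombinant counts (or genetic distance) by physical distance. The sketch below reproduces that arithmetic from the counts and distances given in the text. The function names are illustrative assumptions, and the distance-weighting helper only expresses the uniform-distribution assumption in general form, using hypothetical interval sizes rather than the actual marker positions.

```python
def recombinants_per_mb(n_recombinants: float, interval_bp: float) -> float:
    """Recombinants per megabase for an interval given in base pairs."""
    return n_recombinants * 1_000_000 / interval_bp


def cm_per_mb(genetic_cm: float, physical_mb: float) -> float:
    """Average genetic distance (cM) per megabase of physical distance."""
    return genetic_cm / physical_mb


def weighted_share(subinterval_bp: float, flanking_interval_bp: float) -> float:
    """Fraction of one recombinant assigned to a sub-interval, assuming
    crossovers are spread evenly by physical distance between the two
    markers that flank the recombinant."""
    return subinterval_bp / flanking_interval_bp


print(round(cm_per_mb(23.7, 18.1), 1))                  # 1.3 cM/Mb overall
print(round(recombinants_per_mb(265, 18_100_000), 1))   # 14.6 per Mb
print(round(recombinants_per_mb(15, 300_000), 1))       # 50.0 per Mb
print(round(recombinants_per_mb(9.1, 83_600), 1))       # 108.9 per Mb

# Hypothetical example: a recombinant mapped to an 80-kb interval, of which
# 20 kb lies inside the region of interest, contributes 0.25 recombinants.
print(weighted_share(20_000, 80_000))
```

The last rate, 108.9, corresponds to the 109 recombinants/Mb reported for the 83.6-kb interval.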

Figure 3. Map of the LRP5 region showing the obligate recombinants
observed in 4882 meioses and putative hot spots. (Solid arrow) Putative
hot spot; (dashed arrow) either or both regions may contain a hot spot
of recombination.
Haplotypes
Having identified three hot-spot regions flanking the LD blocks, we
investigated the haplotypes in each LD block. The haplotypes were
ascertained from 364 UK parental chromosomes using the program SNPHAP.
The first LD block, C11ORF24 c.598G→A–LRP5
g.5257T→G, is a 37-kb region encompassing exon 4 of
C11orf24 to intron 1 of LRP5. This has four
haplotypes with frequency >3%, comprising 84% of the haplotypes
(Fig. 4). The second LD block, LRP5
g.28149C→T–LRP5 g.45704G→A, spanning from intron 1 to
intron 3 of LRP5, has 96% common haplotypes. The third LD
block, from LRP5 30g.24930C→T to C11ORF23 c.3680G→A,
encompassing from exon 17 of LRP5 to the 3'-UTR of the
C11orf23 gene, 110 kb, has 84% of common haplotypes (Fig.
4). It is likely that this LD block extends from intron 7 of the
LRP5 gene through to the 3' region of C11orf23, based
on the low recombination rate observed between 14LCA5 (in intron 7 of
LRP5) and 18018CA (within C11orf23). However, the low
number of SNPs with frequency >0.2 in the region precludes
confirmation of the extent of this LD block.
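The "percentage of common haplotypes" reported for each block is the summed frequency of all haplotypes above a frequency threshold. A minimal sketch of that tally is given below. It assumes phased haplotypes are already available as strings, whereas SNPHAP estimates haplotype frequencies from unphased genotypes by an EM algorithm; the haplotype strings and counts are hypothetical.

```python
from collections import Counter


def common_haplotype_coverage(haplotypes, min_freq=0.03):
    """Return (common, coverage): the haplotypes whose frequency exceeds
    min_freq, with their frequencies, and the summed frequency they cover."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    common = {hap: n / total for hap, n in counts.items() if n / total > min_freq}
    return common, sum(common.values())


# Hypothetical sample of 100 phased chromosomes over two SNPs.
sample = ["GT"] * 50 + ["AC"] * 30 + ["GC"] * 18 + ["AT"] * 2
common, coverage = common_haplotype_coverage(sample)
print(sorted(common), round(coverage, 2))
```

In this toy sample the rare "AT" haplotype (2%) falls below the 3% threshold, so three common haplotypes account for 98% of chromosomes.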
DISCUSSION
SNPs were characterized across the exons and 5' region of the
LRP5 gene in several stages. The comparison of clone overlaps
has been used to identify millions of nontargeted SNPs from various
regions (Dawson et al. 2001 ; Sachidanandam et al. 2001 ; Venter et al.
2001 ). In the LRP5 region, the clone overlaps only identified
common SNPs, and one rare SNP, with an average minor allele frequency
of 0.28. Eberle and Kruglyak (2000) showed that a greater number of
rare SNPs is expected even if the population has followed a constant
population model. The dHPLC of 24 individuals had >99.3% power to
detect SNPs with an allele frequency of 0.1 (Zwick et al. 2000 ), and
identified all SNPs except one that had been identified by other
methods, in the region scanned by both. The approach we used,
identifying SNPs in a cohort of individuals prior to genotyping the
SNPs in a larger family-based study, will clearly miss rare variants,
as observed by Nickerson et al. (2000) in a two-stage SNP detection and
genotyping of the APOE gene, and by Glatt et al. (2001) in the
sequencing of 450 samples for SLC6A4 and SLC18A2. It
will be important to identify rare SNPs for any candidate gene, as has
been demonstrated by the identification of rare variants at the Crohn
disease locus, NOD2 (Hugot et al. 2001 ; Ogura et al. 2001 ).
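The detection power quoted above for 24 individuals can be approximated with a simple sampling argument: a variant of frequency p is missed only if none of the 2n chromosomes sampled carries it. The sketch below assumes independent chromosomes and perfect assay sensitivity. Whether this matches the calculation of Zwick et al. (2000) in every detail is an assumption, and the function name is illustrative.

```python
def detection_power(allele_freq: float, n_individuals: int) -> float:
    """Probability that at least one of 2n sampled chromosomes carries an
    allele of the given population frequency."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    n_chromosomes = 2 * n_individuals
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


print(round(detection_power(0.1, 24), 4))   # 0.9936
print(round(detection_power(0.01, 24), 4))  # much lower for a rare variant
```

Under these assumptions the power for a 10% allele in 24 individuals is about 99.4%, in line with the >99.3% figure, while power falls off sharply for rare variants, which is why a small discovery panel misses them.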
In the LRP5 region, the overall nucleotide diversity was
within the range of other genes studied (Cargill et al. 1999 ;
Halushka et al. 1999 ; Thorstenson et al. 2001 ). The nucleotide
diversity in the coding regions of LRP5
([2.8 × 10⁻⁴] ± [1.4 × 10⁻⁴])
was slightly lower than that observed in some other studies, such as
Cargill et al. (1999). Cargill et al. (1999) examined 106 genes,
and the coding and noncoding diversity was similar
([5.0 × 10⁻⁴] ± [2.4 × 10⁻⁴] and
[5.2 × 10⁻⁴] ± [2.5 × 10⁻⁴],
respectively). Similarly, Halushka et al. (1999), who scanned 75
genes for SNPs, reported an average nucleotide diversity of
(8.0 × 10⁻⁴) ± (1.9 × 10⁻⁴) and
(8.5 × 10⁻⁴) ± (2.0 × 10⁻⁴) for the
coding and noncoding regions, respectively. However, in contrast, the
ATM gene had a lower nucleotide diversity in its coding
regions, (0.71 × 10⁻⁴) ± (0.61 × 10⁻⁴),
than found in the above studies or for the LRP5 gene, and the
ATM coding nucleotide diversity was a 7.5-fold decrease
compared with the noncoding regions (Thorstenson et al. 2001). This
was also the case for the LPL gene, with
(21 × 10⁻⁴) ± (10 × 10⁻⁴) and
(5 × 10⁻⁴) ± (5 × 10⁻⁴) in the
noncoding versus coding regions, respectively (Nickerson et al. 1998).
Zwick et al. (2000) proposed that as the two large studies of Cargill
and Halushka scanned proportionally fewer noncoding regions than the
ATM and LPL genes, it is possible that had more
noncoding regions been analyzed, the nucleotide diversity ratio would
have increased. This was the case with 12 out of 15 of the autosomal
genes scanned by Thorstenson et al. (2001) . For LRP5, we
screened 25,623 nt of noncoding DNA (including the regions flanking
exons), which is larger than any of the above studies of individual
genes, and yet the nucleotide diversity was similar to the coding
regions of the LRP5 gene.
The magnitude of LD depends on a number of factors, including
recombination, gene conversion, mutation, genetic drift, and selection,
that give rise to the underlying haplotypes in the population studied.
These empirical data show that, as expected, LD measured by |D′| cannot be
predicted directly from physical distance. The three conserved blocks of
LD were separated by potential hot spots of recombination, ascertained
by recombination rate across the region. There was a 25-fold increase
in recombination rate from intron 1 to intron 7 of LRP5,
compared with the 3' region of LRP5, and a 10-fold increase
compared with the entire 18.1-Mb IDDM4 interval. This region
of high recombination had three separate recombination-rich regions,
ranging from 66 to 2178 recombinants/Mb. The first hot spot was within
a 601-bp region of LRP5 intron 1. The second contained a 17-kb
LD block from intron 1 to intron 3 of the LRP5 gene; thus, the
hot spot either flanked this LD block in LRP5 intron 1 or
LRP5 intron 3–intron 5 or comprised two hot spots, one in
each region. The last hot spot was located within the region from
intron 5 to intron 7 of LRP5. At the MHC, Jeffreys et al.
(2001) noted that the recombination hot spots occur within discrete
clusters, each individual hot spot 1–1.9 kb in width, with 60–90 kb
of separation between clusters. At LRP5, the first hot spot in
intron 1 was very finely mapped to a width of 601 bp, similar to the
MHC hot spots observed by Jeffreys et al. (2001) . The second hot spot
region has a hot spot either in a 10.5-kb region, 20 kb distal to the
first hot spot, proximal to LD block 2, and/or is located in a 7.1-kb
region distal to LD block 2, 38.3 kb distal to hot spot 1. The last hot
spot is within a 33.8-kb region, 45 kb from hot spot 1, and is bounded
proximally by the same SNP that limits hot spot region 2 distally, so
may be part of a cluster with hot spot region 2. The LD is higher for
both these inter-LD block regions, an average of
|D′| = 0.65 and 0.72, respectively, compared with that
observed at INS and CTLA4, which had inter-LD-block
|D′| values of <0.3 (H. Ueda, B. Barratt, J.A. Todd,
unpubl.).
LRP5 is similar to LPL in that both genes have hot
spots of recombination within the gene (Templeton et al. 2000b ). At
LRP5, because of the multiple LD blocks, more than five
haplotype tag SNPs (htSNPs; Johnson et al. 2001 ) will need to be tested
to evaluate the association of LRP5 common haplotypes with T1D
and other diseases (Nakagawa et al. 1998 ; Gong et al. 2001 ; Little et
al. 2002 ). This is in contrast to the 14 genes previously studied by
Johnson et al. (2001) , in which the common haplotypes comprised 80% of
all the haplotypes for each gene region, as each gene comprised one
LD block. Consequently, our data indicate that a much larger number of
genes will need to be characterized before a representative picture of
gene-based haplotype diversity can be formed.
METHODS
Subjects
All the families were of white European origin with two parents and
at least one diabetic child. When unaffected offspring were available,
these were also included. The families comprise: 401 UK multiplex, 236
US multiplex, 80 Yorkshire UK simplex, 32 Southwest UK simplex, 50
Bristol UK simplex, 176 Sardinian simplex, and 14 Sardinian multiplex
(Nakagawa et al. 1998 ). Informed consent was obtained from all
subjects.
Microsatellite Identification and Genotyping
Microsatellite polymorphisms were identified by microsatellite
rescue (Merriman et al. 1997 ) from a cosmid, PAC, and BAC clone contig
spanning a 400-kb interval around H0570POLYA (Twells et al.
2001). The microsatellites 255ca5, 255ca6,
and 255ca3 were isolated from the clone RPCI-255M19;
E0864CA from the cosmid E0864; 14LCA5 and 14LCA1 from CITB-14L15; and 18018 from clone RPCI-18O18.
TAA was identified from the shotgun sequence of cosmid clone
B07185 (accession number AC024124), and is 23 kb distal to
H0570POLYA. All the primers used in this study are available
from our Web site:
http://www-gene.cimr.cam.ac.uk/todd/human_data.shtml.
The above microsatellite markers were genotyped in the whole data set
as well as: FCERIB, D11S1765, UT5017,
D11S426, D11S480, D11S1883,
D11S4205, D11S457, D11S1783,
D11S913, D11S1258, PPP1CA,
D11S4155, D11S987, D11S1296,
D11S1917, H0570POLYA, D11S4087,
D11S1337, D11S4178, D11S970,
D11S971, D11S1314, and D11S916, using
fluorescently labeled primers as described elsewhere (Reed et al.
1994 ).
SNP Identification
SNPs were identified in several stages. SNPs were identified from
sequence overlaps. Sequence overlaps were compared between CITB-14L15
Contig 31 (accession no. AF283320; Twells et al. 2001 ) and the cosmids
E0864, B07185, and F08180 (accession nos. AC024125, AC024124, and
AC024126; Twells et al. 2001). These overlaps were 117,737 nt in total,
encompassing Contig 31 2706–4180 nt, 4349–24,136 nt, and
24,413–120,664 nt. CITB-67M5 (accession no. AC024123; Twells et al.
2001) overlapped 79,701 nt with Contig 31 but has the same haplotype;
therefore, no new SNPs were identified.
The second approach to SNP harvesting was direct sequencing of the
exons and flanking exon/intron boundaries of the LRP5 gene.
This was achieved by designing specific primers ranging from
500–800 bp flanking the regions of interest. LRP5 exon 1
was not screened, as it had not been identified at that stage. DNA
samples from 18 individuals were analyzed. These consisted of 10 UK
individuals and eight Sardinians. The UK samples were selected on the
basis of their D11S1917–H0570POLYA haplotypes according to
their TDT results in our previous study (Nakagawa et al. 1998 ). The UK
samples are shown in Table 2. The Sardinian
samples also comprise a selection of different haplotypes (Table 2).
Table 2. Haplotype Content of the Individuals Screened for Polymorphisms, and
Frequency of Haplotypes in the General Population Estimated as
AFBAC Frequencies
Forward and reverse primers were tailed with sequences corresponding to
the M13 Universal primer (5'-TGTAAAACGACGGCCAGT-3') and a modified
reverse primer (5'-GCTATGACCATGATTACGCC-3'), respectively. The reaction
volume was 50 µL with Perkin-Elmer 10x reaction buffer, 200 mM
dNTPs, 0.5 µL Taq Gold (Perkin-Elmer Corp.), 50 ng of genomic DNA,
and 20 pmole/mL of forward and reverse primers. The thermocycler
conditions were 95°C for 12 min, then 35 cycles of 95°C for 30 sec,
57°C for 30 sec, and 68°C for 2 min, followed by 72°C for 6 min
and 4°C stop. The PCR products were then purified for sequencing
using QiaQuick strips or QiaQuick 96-well plates on the QIAGEN robot
(QIAGEN). Direct BODIPY sequencing was performed according to Metzker
et al. (1996) .
The third method was a shotgun-sequencing strategy of two individuals
who were homozygous for two haplotypes at D11S1917–H0570POLYA, 32
and 23. The genomic region between D11S1917 and
LRP5 exon 6 (~70 kb) was amplified with PCR products of 5
kb using the Expand Long Template PCR System (Boehringer Mannheim) with
modifications (Barnes 1994 ). The PCR reaction was performed in a
total reaction volume of 50 µL, containing 100 ng of genomic DNA
template, 200 pmole/µL of forward and reverse primers, 500 µM
dNTPs, 1x Buffer 3, and 0.5 µL of the enzyme mix. The thermocycling
conditions were 35 cycles of 92°C for 30 sec, 60°C for 2 min, and
68°C for 10 min, followed by a 7-min extension at 72°C and 4°C
finish using a Perkin Elmer 9600 DNA Thermal Cycler. PCR products were
generated for each patient and pooled. The products were sheared by
nebulization. Shotgun libraries were made and sequenced as described in
Hey et al. (1998) . The SNPs were identified visually with the aid of
Consed (Gordon et al. 1998). Of the 19 long-range PCR primers designed,
four failed to amplify: 31-1, 31-6 (which includes LRP5 exon
1), 31-18, and 31-19 (Fig. 1). The total amount of PCR product was
therefore 61.538 kb and ranged from CITB-14L15 Contig 31 8102–23,656
nt and 27,817–73,801 nt (LRP5 exon 3).
The fourth strategy was to scan 24 individuals by the dHPLC WAVE
machine (Transgenomic Inc.). The 24 individuals were selected for their
haplotypes at the D11S1917–LRP5 g.-5677C→T–H0570POLYA
loci (Table 2). The region analyzed was a rescreening of the 25 kb
encompassing D11S1917, H0570POLYA, and the 5' region
and exon 1 of LRP5. The region was repeat-masked and 49 primer
sets were designed to the nonrepetitive regions, encompassing 24,269 bp
(Contig 31 7524–31,793). Four repetitive regions were excluded,
totaling 5398 nt, including an LTR of 3245 nt, which was 6 kb upstream
of LRP5 exon 1 (6190 to 9437). Therefore, a total region
of 18,871 nt was screened. Primers were designed for PCR of 500 bp,
and amplified in each of the 24 individuals according to Transgenomic
Application Note 101 (Transgenomic Inc.). The PCR products were run
through the WAVE machine according to Transgenomic Application Note
101. Samples with heteroduplex or an alternative homoduplex pattern
were then directly sequenced using the PCR primers as described above.
SNP Genotyping
The Invader assay (Third Wave Technologies, Inc.) was carried out
as described in Mein et al. (2000) . RFLP and cRFLP analyses were also
carried out as described in Mein et al. (2000). The ARMS assay was
carried out as follows with a total reaction volume of 13 µL, 40 ng
of DNA, 2 mM MgCl2, 50 ng of each primer, 0.2 U of Bioline
Taq polymerase, 360 µM dNTPs, 20 ng of control primers
(TGCCAAGTGGAGCACCCAA, GCATCTTGCTCTGTGCAGAT; M. Bunce and K. Welsh,
pers. comm.). The thermocycling conditions are 96°C for 1 min; 5
cycles of 96°C for 35 sec, 70°C for 45 sec, 72°C for 35 sec; 21
cycles of 96°C for 25 sec, 65°C for 50 sec, 72°C for 40 sec; 4
cycles of 96°C for 35 sec, 55°C for 60 sec, and 72°C for 90 sec;
25°C for 5 min. The products are visualized after electrophoresis on
a 1% agarose gel, 0.5x TBE.
Recombination, Haplotypes, and LD
The genetic multipoint map was calculated with ASPEX (D. Hinds and
N. Risch; ftp://lahmed.stanford.edu/pub/aspex). The obligate
recombinants were observed using the SHOWHAPLO program available from
Frank Dudbridge (f.dudbridge{at}hgmp.mrc.ac.uk). Haplotypes were
estimated from the parents of 91 families with an EM-based algorithm
implemented in the program SNPHAP, available from
http://www-gene.cimr.cam.ac.uk/clayton/software/.
The parental genotypes of 91 multiplex families were used to calculate
Lewontin's D′ (Lewontin 1995) using PWLD
(http://www-gene.cimr.cam.ac.uk/clayton/software/) with the stata
package (http://www.stata.com). |D′| was calculated as
|D/Dmax| and ranges from 0 to 1. SNPs
with a frequency <0.03 (within the 95% CI for a frequency of 5%)
were not included for further analysis, as in this strategy of testing
SNPs for disease association, there will not be enough power with the
number of families (or case/controls) for the likely genetic effects
(OR < 3; Johnson et al. 2001). The LD of the region was then
examined using |D′| with the markers of allele frequency
>0.2. The region was divided into apparent blocks of LD according to
Figure 2B, as there was discontinuous LD across the gene region. The
haplotypes from each block were generated with SNPs >0.03 frequency,
using the program SNPHAP (v0.2-;
http://www-gene.cimr.cam.ac.uk/clayton/software/).
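The normalized LD measure described above, the absolute value of D divided by its maximum possible value given the allele frequencies, can be sketched for a single pair of biallelic markers as follows. This is a minimal illustration that assumes the two-locus haplotype frequency is already known; it is not the PWLD implementation, which works from parental genotypes, and the function and parameter names are hypothetical.

```python
def lewontin_d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Absolute normalized disequilibrium for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and
          allele B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("undefined when either locus is monomorphic")
    return abs(d) / d_max


# Hypothetical frequencies: allele A at 0.3, allele B at 0.2.
print(round(lewontin_d_prime(p_ab=0.20, p_a=0.3, p_b=0.2), 2))  # complete LD
print(round(lewontin_d_prime(p_ab=0.13, p_a=0.3, p_b=0.2), 2))  # intermediate
```

The measure reaches 1 whenever at least one of the four possible haplotypes is absent, which is why it stays high between markers of very different allele frequency that have not been separated by recombination.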
WEB SITE REFERENCES
ftp://lahmed.stanford.edu/pub/aspex; the ASPEX program (D. Hinds and
N. Risch).
http://www-gene.cimr.cam.ac.uk/clayton/software/; SNPHAP and PWLD (D.
Clayton).
http://www-gene.cimr.cam.ac.uk/todd/human_data.shtml; primers used in
this work.
http://www.stata.com; the stata package.
Acknowledgements
We thank the Wellcome Trust, the Juvenile Diabetes Research
Foundation, and Diabetes UK for their support. This work was also
supported by a grant from Merck Research Laboratories. R.C.J.T. held a
Diabetes UK R.D. Lawrence Fellowship. We thank the anonymous reviewers
for their helpful comments.
The publication costs of this article were defrayed in part by payment
of page charges. This article must therefore be hereby marked
"advertisement" in accordance with 18 USC section 1734 solely to
indicate this fact.
Footnotes
4 Corresponding author. 
E-MAIL rebecca.twells{at}cimr.cam.ac.uk; FAX (44) 1223 762 102.
Article and publication are at
http://www.genome.org/cgi/doi/10.1101/gr.563703.
REFERENCES
Barnes, W. 1994. PCR amplification of up to 35-kb DNA with high fidelity and high yield from bacteriophage templates. Proc. Natl. Acad. Sci. 91: 2216-2220.
Brown, S., Twells, R., Hey, P.J., Cox, R.D., Levy, E.R., Soderman, A.R., Metzker, M.L., Caskey, C.T., Todd, J.A., and Hess, J.F. 1998. Isolation and characterization of LRP6, a novel member of the low density lipoprotein receptor gene family. Biochem. Biophys. Res. Commun. 248: 879-888.
Cargill, M., Altshuler, D., Ireland, J., Sklar, P., Ardlie, K., Patil, N., Shaw, N., Lane, C.R., Lim, E.P., Kalyanaraman, N., et al. 1999. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22: 231-238.
Daly, M.J., Rioux, J.D., Schaffner, S.F., Hudson, T.J., and Lander, E.S. 2001. High-resolution haplotype structure in the human genome. Nat. Genet. 29: 229-232.
Dawson, E., Chen, Y., Hunt, S., Smink, L.J., Hunt, A., Rice, K., Livingston, S., Bumpstead, S., Bruskiewich, R., Sham, P., et al. 2001. A SNP resource for human Chromosome 22: Extracting dense clusters of SNPs from the genomic sequence. Genome Res. 11: 170-178.
Eberle, M.A. and Kruglyak, L. 2000. An analysis of strategies for discovery of single-nucleotide polymorphisms. Genet. Epidemiol. 19: S29-S35.
Gabriel, S.B., Schaffner, S.F., Nguyen, H., Moore, J.M., Roy, J., Blumenstiel, B., Higgins, J., DeFelice, M., Lochner, A., Faggart, M., et al. 2002. The structure of haplotype blocks in the human genome. Science 296: 2225-2229.
Glatt, C.E., DeYoung, J.A., Delgado, S., Service, S.K., Giacomini, K.M., Edwards, R.H., Risch, N., and Freimer, N.B. 2001. Screening a large reference sample to identify very low frequency sequence variants: Comparisons between two genes. Nat. Genet. 27: 435-438.
Gong, Y., Slee, R.B., Fukai, N., Rawadi, G., Roman-Roman, S., Reginato, A.M., Wang, H., Cundy, T., Glorieux, F.H., Lev, D., et al. 2001. LDL receptor-related protein 5 (LRP5) affects bone accrual and development. Cell 107: 513-523.
Gordon, D., Abajian, C., and Green, P. 1998. Consed: A graphical tool for sequence finishing. Genome Res. 8: 195-202.
Halushka, M.K., Fan, J., Bentley, K., Hsie, L., Shen, N., Weder, A., Cooper, R., Lipshutz, R., and Chakravarti, A. 1999. Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat. Genet. 22: 239-247.
Hey, P.J., Twells, R.C., Phillips, M.S., Nakagawa, Y., Brown, S.D., Kawaguchi, Y., Cox, R., Xie, G., Dugan, V., Hammond, H., et al. 1998. Cloning of a novel member of the low-density lipoprotein receptor family. Gene 216: 103-111.
Hugot, J.P., Chamaillard, M., Zouali, H., Lesage, S., Cezard, J.P., Belaiche, J., Almer, S., Tysk, C., O'Morain, C.A., Gassull, M., et al. 2001. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn's disease. Nature 411: 599-603.
Jeffreys, A.J., Kauppi, L., and Neumann, R. 2001. Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat. Genet. 29: 217-222.
Johnson, G.C., Esposito, L., Barratt, B.J., Smith, A.N., Heward, J., Di Genova, G., Ueda, H., Cordell, H.J., Eaves, I.A., Dudbridge, F., et al. 2001. Haplotype tagging for the identification of common disease genes. Nat. Genet. 29: 233-237.
Lewontin, R.C. 1995. The detection of linkage disequilibrium in molecular sequence data. Genetics 140: 377-388.
Little, R., Carulli, J., Del Mastro, R., Dupuis, J., Osborne, M., Folz, C., Manning, S.P., Swain, P.M., Zhao, S.C., Eustace, B., et al. 2002. A mutation in the LDL receptor-related protein 5 gene results in the autosomal dominant high-bone-mass trait. Am. J. Hum. Genet. 70: 11-19.
Maniatis, N., Collins, A., Xu, C.-F., McCarthy, L.C., Hewett, D.R., Tapper, W., Ennis, S., Ke, X., and Morton, N.E. 2002. The first linkage disequilibrium (LD) maps: Delineation of hot and cold blocks by diplotype analysis. Proc. Natl. Acad. Sci. 99: 2228-2233.
Mao, J., Wang, J., Liu, B., Pan, W., Farr, G.H., Flynn, C., Yuan, H., Takada, S., Kimelman, D., Li, L., et al. 2001. Low-density lipoprotein receptor-related protein-5 binds to axin and regulates the canonical Wnt signaling pathway. Mol. Cell 7: 801-809.
Mein, C.A., Barratt, B., Dunn, M.G., Siegmund, T., Smith, A.N., Esposito, L., Nutland, S., Stevens, H.E., Wilson, A.J., Phillips, M.S., et al. 2000. Evaluation of single nucleotide polymorphism typing with invader on PCR amplicons and its automation. Genome Res. 10: 330-343.
Merriman, T., Twells, R., Merriman, M., Eaves, I., Cox, R., Cucca, F., McKinney, P., Shield, J., Baum, D., Bosi, E., et al. 1997. Evidence by allelic association-dependent methods for a type 1 diabetes polygene (IDDM6) on Chromosome 18q21. Hum. Mol. Genet. 6: 1003-1010.
Metzker, M.L., Lu, J., and Gibbs, R.A. 1996. Electrophoretically uniform fluorescent dyes for automated DNA sequencing. Science 271: 1420-1422.
Nakagawa, Y., Kawaguchi, Y., Twells, R.C., Muxworthy, C., Hunter, K.M., Wilson, A., Merriman, M.E., Cox, R.D., Merriman, T., Cucca, F., et al. 1998. Fine mapping of the diabetes-susceptibility locus, IDDM4, on Chromosome 11q13. Am. J. Hum. Genet. 63: 547-556.
Nickerson, D.A., Taylor, S., Weiss, K.M., Clark, A.G., Hutchinson, R.G., Stengard, J., Salomaa, V., Vartiainen, E., Boerwinkle, E., and Sing, C.F. 1998. DNA sequence diversity in a 97-kb region of the human lipoprotein lipase gene. Nat. Genet. 19: 233-240.
Nickerson, D.A., Taylor, S., Fullerton, S.M., Weiss, K.M., Clark, A.G., Stengard, J.H., Salomaa, V., Boerwinkle, E., and Sing, C.F. 2000. Sequence diversity and large-scale typing of SNPs in the human apolipoprotein E gene. Genome Res. 10: 1532-1545.
Nusse, R. 2001. Making head or tail of Dickkopf. Nature 411: 255-256.
Ogura, Y., Bonen, D.K., Inohara, N., Nicolae, D.L., Chen, F.F., Ramos, R., Britton, H., Moran, T., Karaliuskas, R., Duerr, R.H., et al. 2001. A frameshift mutation in NOD2 associated with susceptibility to Crohn's disease. Nature 411: 603-606.
Patil, N., Berno, A., Hinds, D., Barrett, W.A., Doshi, J.M., Hacker, C.R., Kautzer, C.R., Lee, D.H., Marjoribanks, C., McDonough, D.P., et al. 2001. Blocks of limited haplotype diversity revealed by high-resolution scanning of human Chromosome 21. Science 294: 1719-1723.
Pinson, K.I., Brennan, J., Monkley, S., Avery, B.J., and Skarnes, W.C. 2000. An LDL-receptor-related protein mediates Wnt signalling in mice. Nature 407: 535-538.
Reed, P.W., Davies, J., Copeman, J.B., Bennett, S.T., Palmer, S.M., Pritchard, L.E., Gough, S.C., Kawaguchi, Y., Cordell, H.J., Balfour, K.M., et al. 1994. Chromosome-specific microsatellite sets for fluorescence-based, semi-automated genome mapping. Nat. Genet. 7: 390-395.
Sachidanandam, R., Weissman, D., Schmidt, S.C., Kakol, J.M., Stein, L.D., Marth, G., Sherry, S., Mullikin, J.C., Mortimore, B.J., Willey, D.L., et al. 2001. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409: 928-933.
Tamai, K., Semenov, M., Kato, Y., Spokony, R., Liu, C., Katsuyama, Y., Hess, F., Saint-Jeannet, J.P., and He, X. 2000. LDL-receptor-related proteins in Wnt signal transduction. Nature 407: 530-535.
Templeton, A.R., Weiss, K., Nickerson, D.A., Boerwinkle, E., and Sing, C.F. 2000a. Cladistic structure within the human lipoprotein lipase gene and its implications for phenotypic association studies. Genetics 156: 1259-1275.
Templeton, A.R., Clark, A.G., Weiss, K.M., Nickerson, D.A., Boerwinkle, E., and Sing, C.F. 2000b. Recombinational and mutational hotspots within the human lipoprotein lipase gene. Am. J. Hum. Genet. 66: 69-83.
Thorstenson, Y.R., Shen, P., Tusher, V.G., Wayne, T.L., Davis, R.W., Chu, G., and Oefner, P.J. 2001. Global analysis of ATM polymorphism reveals significant functional constraint. Am. J. Hum. Genet. 69: 396-412.
Tiret, L., Poirier, O., Nicaud, V., Barbaux, S., Herrmann, S.M., Perret, C., Raoux, S., Francomme, C., Lebard, G., Tregouet, D., et al. 2002. Heterogeneity of linkage disequilibrium in human genes has implications for association studies of common diseases. Hum. Mol. Genet. 11: 419-429.
Twells, R.C., Metzker, M., Brown, S.D., Cox, R., Garey, C., Hammond, H., Hey, P.J., Levy, E., Nakagawa, Y., Phillips, M.S., et al. 2001. The sequence and gene characterization of a 400-kb candidate region for IDDM4 on Chromosome 11q13. Genomics 72: 231-242.
Venter, J.C., Adams, M., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., et al. 2001. The sequence of the human genome. Science 291: 1304-1351.
Wehrli, M., Dougan, S., Caldwell, K., O'Keefe, L., Schwartz, S., Vaizel-Ohayon, D., Schejter, E., Tomlinson, A., and DiNardo, S. 2000. arrow encodes an LDL-receptor-related protein essential for wingless signalling. Nature 407: 527-530.
Zwick, M., Cutler, D.J., and Chakravarti, A. 2000. Patterns of genetic variation in Mendelian and complex traits. Annu. Rev. Genom. Hum. Genet. 1: 387-407.
Received June 28, 2002;
accepted in revised format February 18, 2003.
13:845-855 © 2003 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/03 $5.00
