Published online before print
December 20, 2005, 10.1101/gr.4345506 Genome Res. 16:231-239, 2006 ©2006 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/06 $5.00
Letter
The prion protein gene in humans revisited: Lessons from a worldwide resequencing study
Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Spain
Ample evidence has accumulated showing that different coding variants of the PRNP gene confer differential susceptibility to prion diseases. Here we evaluate the patterns of nucleotide variation in PRNP exon 2, which includes all the protein-coding sequence, by resequencing a worldwide sample of 174 humans for 2378 bp. In line with previous studies, we found two main haplotypes differentiated by a nonsynonymous substitution in codon 129. Our analyses reveal the worldwide pattern of variation at the PRNP gene to be inconsistent with neutral expectations, indicating instead an excess of low-frequency variants, a footprint of the action of either positive or purifying selection. A comparison of neutrality test statistics for PRNP with other human genes indicates that the signal of positive selection on PRNP is stronger than expected from a possible confounding genome-wide background signal of population expansion. Two main conclusions arise from our analysis. First, the existence of an ancient, stable, balanced polymorphism that has been claimed in a previous study and related to cannibalism can be rejected and is shown to be due to ascertainment bias. Second, our results are consistent with a complex history of selection including mainly positive selection, even if short local periods of balancing selection (Kuru-like episodes), or even a weak purifying selection model, are also consistent with our data.
Transmissible spongiform encephalopathies (TSEs), or prion diseases, are a group of rare, subacute, and fatal neurodegenerative disorders characterized by accumulation of the abnormal isoform of a host-encoded membrane protein. Human TSEs can be sporadic, acquired, or genetic, and include, among others, Creutzfeldt-Jakob disease (CJD), Kuru (a disease confined to a population in Papua New Guinea), and variant CJD (vCJD), a concept coined to designate cases potentially caused by the human consumption of cattle suffering from bovine spongiform encephalopathy (BSE).
The human prion protein (PrP) is a product of a single gene located on the short arm of Chromosome 20 (Prusiner 1991
In the case of codon 219, it has been suggested that the Lys allele acts as a protective factor against sporadic CJD (Shibuya et al. 1998
The common methionine/valine (Met/Val) polymorphism at codon 129 is generally considered to be the most important in genetic susceptibility to prion diseases. Up to 90% of sporadic CJD (sCJD) cases have occurred in individuals who are homozygous for either version of codon 129, Met-Met or Val-Val (Palmer et al. 1991
Mead et al. (2003
Variant CJD in humans is believed to be acquired from cattle infected with BSE; thus, similar selective forces could have been acting in other species. Seabury et al. (2004
The possibility that cannibalism has been a major and global factor in human history, as proposed by Mead et al. (2003
Codons 129 and 219
Typing the worldwide CEPH-HGDP panel (2128 chromosomes) for codons 129 and 219 confirmed and expanded the results obtained in a smaller previous study (Soldevila et al. 2003
Codon 219 is found to be monomorphic in Europe, Africa, and the Americas, with the 219Lys allele observed only in Asian and Pacific regions, and very rare in the Middle East/North African population (Table 1). Two homozygotes for the 219Lys allele have been detected, both from Central/South Asian regions, and have been confirmed by sequencing. Hardy-Weinberg equilibrium holds for all populations.
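The Hardy-Weinberg check mentioned above can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. This is a minimal illustration only: the function name and the genotype counts are hypothetical, and the chi-square approximation is a stand-in that may differ from the test actually used in the study.

```python
import math


def hardy_weinberg_chi2(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test of Hardy-Weinberg proportions at a biallelic locus.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic loci give chi2 = 0).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts (e.g. Glu/Glu, Glu/Lys, Lys/Lys at codon 219).
    print(hardy_weinberg_chi2(90, 9, 1))
```

For small samples or rare alleles, such as 219Lys in most regions, an exact test is usually preferable to the chi-square approximation.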
PRNP nucleotide sequence variation
A total of 12 of the 22 variants are singletons (nine SNPs and three indels), all of which have been confirmed by two independent PCR amplifications and by sequencing both strands. The chromatograms have been examined in detail (see Methods). Among the 18 SNPs, transitions (12) are more frequent than transversions (six). Nine of the polymorphisms are found in the coding sequence: two deletions in the octapeptide repeat region and seven SNPs. Moreover, 13 polymorphisms, some of them previously unreported, are detected in the 3'-UTR of the PRNP exon 2 (Table 2), including two insertions (26372-26373 [TA] and 27495 [T]).
Two of the variants (at 142Ser and 232Arg) deserve special attention, as they have been related to disease. The nonconservative amino acid change Gly142Ser might be involved in neurological diseases, as it was first reported in a North African man with multiple sclerosis and in a Malian woman with viral meningoencephalitis (J.L. Laplanche, unpubl.; see the Official Mad Cow Disease Web site, http://www.mad-cow.org). In our sample, we found this substitution in four heterozygotes from Sub-Saharan Africa, with a total frequency of 6%. It was also reported by Mead et al. (2003
The diversity and genealogy of PRNP haplotypes
Figure 1 shows a median-joining network describing the most plausible mutational relationships between the 28 haplotypes. The phylogenetic structure around haplotypes S1 and S2 is clearly star-like, with S1 having a slightly higher proportion of haplotypes derived by a single mutational step. Assuming that the phase of haplotypes is correctly resolved, reticulations in the network represent either homoplasy (recurrent mutation events) or recombination events. There are only four single reticulations, involving positions 129-219, 219-26372, 129-26908, and 26893-117. All are transitions, and positions 117, 129, 219, and 26893 are in CpGs, making them more likely to be recurrent mutation events, although crossing-over or gene conversion events cannot be ruled out. It is interesting to note that the susceptibility polymorphisms at codons 129 and 219 are involved in three out of four cases, indicating that one or both of them may have a recurrent origin.
Testing for the impact of natural selection
A positive value of D points to the possible impact of balancing selection or population subdivision (under simple models of population structure, discussed below) on the patterns of diversity, while a negative value can indicate either recent positive selection, a population expansion, or purifying selection on slightly deleterious alleles (Tajima 1989
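To make the sign of D concrete, the following sketch computes Tajima's D from three summary quantities, following the constants defined in Tajima (1989): the number of sampled chromosomes n, the number of segregating sites S, and the mean number of pairwise differences pi. The function names are illustrative and the inputs in the usage lines are hypothetical, not values from this study.

```python
import math


def tajimas_d(n: int, seg_sites: int, pi: float) -> float:
    """Tajima's D from sample size n, segregating sites S, and pairwise diversity pi."""
    if n < 4:
        raise ValueError("the variance term is zero for n < 4")
    if seg_sites < 1:
        raise ValueError("at least one segregating site is required")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    # Numerator: pi minus Watterson's estimate S / a1.
    return (pi - seg_sites / a1) / math.sqrt(variance)


def pairwise_diversity(n: int, allele_counts: list[int]) -> float:
    """Mean pairwise differences from per-site counts of one allele among n chromosomes."""
    return sum(2.0 * k * (n - k) for k in allele_counts) / (n * (n - 1))


if __name__ == "__main__":
    n = 20
    singletons = [1] * 10      # excess of rare variants
    intermediate = [10] * 10   # excess of intermediate-frequency variants
    print(tajimas_d(n, 10, pairwise_diversity(n, singletons)))
    print(tajimas_d(n, 10, pairwise_diversity(n, intermediate)))
```

A sample dominated by singletons gives a negative D, while one dominated by intermediate-frequency variants gives a positive D, which is the contrast the paragraph above relies on. Significance would still need to be assessed separately, for example by coalescent simulation.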
Significantly negative values have also been obtained for Fu and Li's D* and F* statistics when considering SNPs alone (D* = -3.65, P < 0.02; F* = -3.54, P < 0.02) (Table 3), and an even more significant deviation from neutral expectations is obtained when length polymorphisms are included (D* = -3.77, P < 0.002; F* = -3.69, P < 0.001). Fu's Fs test is negative for the combined worldwide sample (-47.28), with highly significant P-values using coalescent simulations with and without recombination.
The chimpanzee sequence has been used to infer the ancestral states of human polymorphisms (which were, with just one exception, also the most frequent in humans), allowing us to compute Fay and Wu's H statistic. Its value is -1.37 (P > 0.05) (Table 3), which implies that derived alleles are not found at higher than expected frequencies, as would be the case under positive selection. However, this test would not detect old sweep events (Przeworski 2002).
The McDonald and Kreitman (MK) test of neutrality has been applied to the combined worldwide sample using a range of different primate species as outgroups against which to compare human polymorphism. While comparisons of humans with the closest species are nonsignificant, those of humans with gibbon and siamang yielded significant results (P = 0.007 in both cases).
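The MK test compares the ratio of nonsynonymous to synonymous changes within a species (polymorphism) with the same ratio between species (fixed differences) in a 2x2 table. A minimal sketch, assuming a two-sided Fisher's exact test is used to assess the table (the study may have used a different test), is given below. The function name and the counts are illustrative and are not taken from the PRNP data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    For an MK test, one possible layout is:
      row 1: polymorphic nonsynonymous (a), polymorphic synonymous (b)
      row 2: fixed nonsynonymous (c),       fixed synonymous (d)
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at least as unlikely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Illustrative counts only.
    print(fisher_exact_two_sided(2, 42, 7, 17))
```

A significant result only says that the two ratios differ. The direction of the difference (excess of fixed nonsynonymous changes versus excess of nonsynonymous polymorphism) has to be read from the table itself.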
The pairwise distribution of mutational differences
Time to the most recent common ancestor (TMRCA) and mutation ages
The polymorphism at codon 129 is the oldest in the tree, with an age estimated at
Geographic variation of diversity and selection statistics
In most cases, the neutrality tests for individual geographic regions are nonsignificant (Table 3), an expected result because of the lack of power caused by small sample size. It is interesting to note that under simple models of population substructure, Tajima's D is expected to be positive; nonetheless, it has been shown that with real data for humans, pooling more populations decreases its value (Ptak and Przeworski 2002
To distinguish between demographic and selective forces in the African and European continental groups, we compared our results with values of Tajima's D obtained for 245 genes (SeattleSNPs database; September 2005) in these two groups. Using this data set as an empirical distribution, both Tajima's D-values are nonsignificant (Africa, D = -1.34; P < 0.065; Europe, D = -1.20; P < 0.073), but fall clearly in the left tail of the distribution. In this case, substructure would not affect the values (Ptak and Przeworski 2002
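The comparison against an empirical distribution can be sketched as the fraction of reference genes whose statistic is at least as negative as the one observed for the gene of interest. The function name and the reference values below are hypothetical placeholders, not the SeattleSNPs data.

```python
def empirical_lower_tail_p(observed: float, reference: list[float]) -> float:
    """Fraction of reference values that are less than or equal to the observed one."""
    if not reference:
        raise ValueError("the reference distribution must not be empty")
    return sum(1 for value in reference if value <= observed) / len(reference)


if __name__ == "__main__":
    # Placeholder Tajima's D values for other genes in the same population sample.
    reference_d = [-1.8, -1.5, -1.1, -0.9, -0.7, -0.4, -0.2, 0.0, 0.3, 0.9]
    print(empirical_lower_tail_p(-1.34, reference_d))
```

Because the reference genes were sampled from the same populations, demographic effects such as expansion are shared between them and the tested gene, so an extreme rank points to a locus-specific cause rather than to population history.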
A different approach to detect a deficit of polymorphism due to selection that is not biased by the history of human populations is comparing, for the same ethnic group, the values of
Geographic heterogeneity
The magnitude of the FST-values found has been empirically compared to two distributions obtained with large numbers of genes: (1) Akey et al. (2002
Ascertainment bias
The effects of ascertainment bias on three parameters (Tajima's D and Fu and Li's D* and F*) are shown in Figure 4, where these parameters are given for an increasing number of polymorphisms added in descending frequency order. The first point thus includes only the most frequent polymorphism (codon 129), and the remaining ones are added until all six detected in the present study (specified in the upper part of the figure) are included; the last value is statistically significant for the three statistics. The Tajima's D-value that would correspond to Mead's data is 0.36; this corresponds to only three polymorphisms and does not include the information provided by low-frequency polymorphisms, which would strongly decrease the three parameters (see Fig. 4).
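The procedure described above, recomputing a statistic as polymorphisms are added in descending frequency order, can be sketched for Tajima's D as follows. The function names and the allele counts are hypothetical and chosen only to show the trend; they are not the frequencies observed in this study.

```python
import math


def tajimas_d_from_counts(n: int, minor_counts: list[int]) -> float:
    """Tajima's D from minor-allele counts at each segregating site (n chromosomes)."""
    s = len(minor_counts)
    if n < 4 or s < 1:
        raise ValueError("need n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    c1 = (n + 1) / (3.0 * (n - 1)) - 1.0 / a1
    c2 = (2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
          - (n + 2) / (a1 * n) + a2 / (a1 * a1))
    pi = sum(2.0 * k * (n - k) for k in minor_counts) / (n * (n - 1))
    variance = (c1 / a1) * s + (c2 / (a1 * a1 + a2)) * s * (s - 1)
    return (pi - s / a1) / math.sqrt(variance)


def d_by_discovery_depth(n: int, minor_counts: list[int]) -> list[float]:
    """D recomputed as sites are added from most to least frequent."""
    ordered = sorted(minor_counts, reverse=True)
    return [tajimas_d_from_counts(n, ordered[:k]) for k in range(1, len(ordered) + 1)]


if __name__ == "__main__":
    # Hypothetical minor-allele counts among 348 chromosomes.
    for k, d in enumerate(d_by_discovery_depth(348, [120, 15, 4, 2, 1, 1]), start=1):
        print(k, round(d, 2))
```

With only the common variants included, D is positive, as if intermediate-frequency alleles were in excess. It turns negative once the rare variants that a genotyping-only design would miss are added, which is the pattern of bias the paragraph describes.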
In an effort to better understand the pattern and the worldwide geographic structure of variation at the PRNP locus and the extent to which it has been shaped by natural selection, we have analyzed the sequence variation in a worldwide sample of 348 chromosomes for exon 2 of this gene. A resequencing approach has been followed, with careful manual inspection of the sequence traces and cloning of PCR products when needed. Most of the variants have been found at very low frequency, indicating a dearth of variation in the region. Variation at the known codons involved in susceptibility to prion diseases (129 and 219) shows a geographic stratification, even if heterogeneity (measured as FST) is not especially high compared to genome-wide values. Two other variants had been related to disease, namely, codons 142 and 232, but their presence in control populations makes them unlikely to be causative of prion diseases.
One of the major points of this study is to unravel the selective forces acting on the PRNP gene. No evidence has been found beyond the expected purifying selection in a phylogenetic perspective. In a study by Krakauer et al. (1998
Nonetheless, the forces acting on primate phylogeny may be different from those acting on humans (and, as we discuss below, they may vary among human groups). In an influential paper, Mead et al. (2003
As different local selection pressures could have influenced variation at this gene, analyses have been performed for individual geographical regions. In general, the pattern of variation is very similar to that observed for the combined worldwide sample, and only the Americas show some distinctive features, which are difficult to interpret. Human groups are not particularly differentiated; in general, FST values are intermediate, and even the values for codon 129, the most ancient SNP, are not extreme when compared to a general FST distribution (Kidd et al. 2004
Ascertainment bias has been shown to have influenced the results in previous analyses of the PRNP gene (Mead et al. 2003
Two main conclusions arise from our analysis. First, we can reject the existence of an ancient, stable, balanced polymorphism of the kind that skews the frequency spectrum to an excess of intermediate frequency variants described by Mead et al. (2003
Samples
The HGDP-CEPH Human Genome Diversity Cell Line panel contains a total of 1064 samples from a broad range of different world populations (Cann et al. 2002
SNP genotyping
Sequencing
PCR products with sequencing problems because of heterozygous insertions or deletions (indels) have been cloned with the pMOSBlue blunt ended cloning kit (Amersham Biosciences) following the manufacturer's instructions.
Data analysis
Neutrality tests and diversity statistics
TMRCA and age of mutations
Substitution rate
Genetic structure statistics
Most of the data have been produced in DeCode Genetics (Iceland) thanks to K. Stefánsson and A. Helgason. S. Sigurdadóttir (DeCode) helped in producing the data, and A. Helgason (DeCode) provided fruitful comments to the manuscript. Anna Di Rienzo made many useful comments on the analysis. The raw data for the FST comparisons have been kindly provided by Mark Shriver (Pennsylvania State University) and Kenneth K. Kidd (Yale University) and the divergence data for the SeattleSNP database by Deborah Nickerson (University of Washington). We also thank Oscar Lao for statistical help. The chimpanzee sample was obtained from the Barcelona Zoo under the agreement with Pompeu Fabra University. This work is supported by DGICYT (BMC2001-0772 and BOS2003-08070) and DURSI (PhD scholarship 2001FI 00632 to M.S. and grant 2001 SGR 00285).
Article published online ahead of print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.4345506.
1 Present address: Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
2 Corresponding author.
Akey, J.M., Zhang, G., Zhang, K., Jin, L., and Shriver, M.D. 2002. Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 12: 1805-1814.
Akey, J.M., Eberle, M.A., Rieder, M.J., Carlson, C.S., Shriver, M.D., Nickerson, D.A., and Kruglyak, L. 2004. Population history and natural selection shape patterns of genetic variation in 132 genes. PLoS Biol. 2: e286.
Andrews, N.J., Farrington, C.P., Ward, H.J., Cousens, S.N., Smith, P.G., Molesworth, A.M., Knight, R.S., Ironside, J.W., and Will, R.G. 2003. Deaths from variant Creutzfeldt-Jakob disease in the UK. Lancet 361: 751-752.
Bamshad, M. and Wooding, S.P. 2003. Signatures of natural selection in the human genome. Nat. Rev. Genet. 4: 99-111.
Bandelt, H.J., Forster, P., Sykes, B.C., and Richards, M.B. 1995. Mitochondrial portraits of human populations using median networks. Genetics 141: 743-753.
Bandelt, H.J., Forster, P., and Rohl, A. 1999. Median-joining networks for inferring intraspecific phylogenies. Mol. Biol. Evol. 16: 37-48.
Cann, H.M., de Toma, C., Cazes, L., Legrand, M.F., Morel, V., Piouffre, L., Bodmer, J., Bodmer, W.F., Bonne-Tamir, B., Cambon-Thomsen, A., et al. 2002. A human genome diversity cell line panel. Science 296: 261-262.
Cervenakova, L., Goldfarb, L.G., Garruto, R., Lee, H.S., Gajdusek, D.C., and Brown, P. 1998. Phenotype-genotype studies in kuru: Implications for new variant Creutzfeldt-Jakob disease. Proc. Natl. Acad. Sci. 95: 13239-13241.
Chimpanzee Sequencing and Analysis Consortium. 2005. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437: 69-87.
Collinge, J., Sidle, K.C., Meads, J., Ironside, J., and Hill, A.F. 1996. Molecular analysis of prion strain variation and the aetiology of 'new variant' CJD. Nature 383: 685-690.
Excoffier, L., Laval, G., and Schneider, S. 2005. Arlequin version 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 1: 47-50.
Gordo, I., Navarro, A., and Charlesworth, B. 2002. Muller's ratchet and the pattern of variation at a neutral locus. Genetics 161: 835-848.
Harris, E.E. and Hey, J. 2001. Human populations show reduced DNA sequence variation at the factor IX locus. Curr. Biol. 11: 774-778.
Hedrick, P.W. 2003. A heterozygote advantage. Science 302: 57.
Hitoshi, S., Nagura, H., Yamanouchi, H., and Kitamoto, T. 1993. Double mutations at codon 180 and codon 232 of the PRNP gene in an apparently sporadic case of Creutzfeldt-Jakob disease. J. Neurol. Sci. 120: 208-212.
Hoque, M.Z., Kitamoto, T., Furukawa, H., Muramoto, T., and Tateishi, J. 1996. Mutation in the prion protein gene at codon 232 in Japanese patients with Creutzfeldt-Jakob disease: A clinicopathological, immunohistochemical and transmission study. Acta Neuropathol. (Berl) 92: 441-446.
Kidd, K.K., Pakstis, A.J., Speed, W.C., and Kidd, J.R. 2004. Understanding human DNA sequence variation. J. Hered. 95: 406-420.
Kitamoto, T., Ohta, M., Doh-ura, K., Hitoshi, S., Terao, Y., and Tateishi, J. 1993. Novel missense variants of prion protein in Creutzfeldt-Jakob disease or Gerstmann-Straussler syndrome. Biochem. Biophys. Res. Commun. 191: 709-714.
Kong, A., Gudbjartsson, D.F., Sainz, J., Jonsdottir, G.M., Gudjonsson, S.A., Richardsson, B., Sigurdardottir, S., Barnard, J., Hallbeck, B., Masson, G., et al. 2002. A high-resolution recombination map of the human genome. Nat. Genet. 31: 241-247.
Krakauer, D.C., Zanotto, P.M., and Pagel, M. 1998. Prion's progress: Patterns and rates of molecular evolution in relation to spongiform disease. J. Mol. Evol. 47: 133-145.
Kreitman, M. and Di Rienzo, A. 2004. Balancing claims for balancing selection. Trends Genet. 20: 300-304.
Laplanche, J.L., Hachimi, K.H., Durieux, I., Thuillet, P., Defebvre, L., Delasnerie-Laupretre, N., Peoc'h, K., Foncin, J.F., and Destee, A. 1999. Prominent psychiatric features and early onset in an inherited prion disease with a new insertional mutation in the prion protein gene. Brain 122 (Pt 12): 2375-2386.
Lee, I.Y., Westaway, D., Smit, A.F., Wang, K., Seto, J., Chen, L., Acharya, C., Ankener, M., Baskin, D., Cooper, C., et al. 1998. Complete genomic sequence and analysis of the prion protein gene region from three mammalian species. Genome Res. 8: 1022-1037.
Livingston, R.J., von Niederhausern, A., Jegga, A.G., Crawford, D.C., Carlson, C.S., Rieder, M.J., Gowrisankar, S., Aronow, B.J., Weiss, R.B., and Nickerson, D.A. 2004. Pattern of sequence variation across 213 environmental response genes. Genome Res. 14: 1821-1831.
Martinez-Arias, R., Calafell, F., Mateu, E., Comas, D., Andres, A., and Bertranpetit, J. 2001. Sequence variability of a human pseudogene. Genome Res. 11: 1071-1085.
Mead, S., Mahal, S.P., Beck, J., Campbell, T., Farrall, M., Fisher, E., and Collinge, J. 2001. Sporadic - but not variant - Creutzfeldt-Jakob disease is associated with polymorphisms upstream of PRNP exon 1. Am. J. Hum. Genet. 69: 1225-1235.
Mead, S., Stumpf, M.P., Whitfield, J., Beck, J.A., Poulter, M., Campbell, T., Uphill, J.B., Goldstein, D., Alpers, M., Fisher, E.M., et al. 2003. Balancing selection at the prion protein gene consistent with prehistoric kurulike epidemics. Science 300: 640-643.
Palmer, M.S., Dryden, A.J., Hughes, J.T., and Collinge, J. 1991. Homozygous prion protein genotype predisposes to sporadic Creutzfeldt-Jakob disease. Nature 352: 340-342.
Prusiner, S.B. 1991. Molecular biology of prion diseases. Science 252: 1515-1522.
Przeworski, M. 2002. The signature of positive selection at randomly chosen loci. Genetics 160: 1179-1189.
Ptak, S.E. and Przeworski, M. 2002. Evidence for population growth in humans is confounded by fine-scale population structure. Trends Genet. 18: 559-563.
Rogers, A.R. and Harpending, H. 1992. Population growth makes waves in the distribution of pairwise genetic differences. Mol. Biol. Evol. 9: 552-569.
Seabury, C.M., Honeycutt, R.L., Rooney, A.P., Halbert, N.D., and Derr, J.N. 2004. Prion protein gene (PRNP) variants and evidence for strong purifying selection in functionally important regions of bovine exon 3. Proc. Natl. Acad. Sci. 101: 15142-15147.
Shibuya, S., Higuchi, J., Shin, R.W., Tateishi, J., and Kitamoto, T. 1998. Codon 219 Lys allele of PRNP is not found in sporadic Creutzfeldt-Jakob disease. Ann. Neurol. 43: 826-828.
Soldevila, M., Calafell, F., Andres, A.M., Yague, J., Helgason, A., Stefansson, K., and Bertranpetit, J. 2003. Prion susceptibility and protective alleles exhibit marked geographic differences. Hum. Mutat. 22: 104-105.
Soldevila, M., Calafell, F., Helgason, A., Stefansson, K., and Bertranpetit, J. 2005. Assessing the signatures of selection in PRNP from polymorphism data: Results support Kreitman and Di Rienzo's opinion. Trends Genet. 21: 389-391.
Stephens, J.C., Schneider, J.A., Tanguay, D.A., Choi, J., Acharya, T., Stanley, S.E., Jiang, R., Messer, C.J., Chew, A., Han, J.H., et al. 2001. Haplotype variation and linkage disequilibrium in 313 human genes. Science 293: 489-493.
Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585-595.
Takahata, N., Satta, Y., and Klein, J. 1995. Divergence time and population size in the lineage leading to modern humans. Theor. Pop. Biol. 48: 198-221.
Tavare, S., Balding, D.J., Griffiths, R.C., and Donnelly, P. 1997. Inferring coalescence times from DNA sequence data. Genetics 145: 505-518.
Thompson, E.E., Kuttab-Boulos, H., Witonsky, D., Yang, L., Roe, B.A., and Di Rienzo, A. 2004. CYP3A variation and the evolution of salt-sensitivity variants. Am. J. Hum. Genet. 75: 1059-1069.
Valleron, A.J., Boelle, P.Y., Will, R., and Cesbron, J.Y. 2001. Estimation of epidemic size and incubation time based on age characteristics of vCJD in the United Kingdom. Science 294: 1726-1728.
Wooding, S., Kim, U.K., Bamshad, M.J., Larsen, J., Jorde, L.B., and Drayna, D. 2004. Natural selection and molecular evolution in PTC, a bitter-taste receptor gene. Am. J. Hum. Genet. 74: 637-646.
Received June 27, 2005; accepted in revised format November 7, 2005.