Genome Res. 15:1222-1231, 2005
©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05 $5.00
OPEN ACCESS ARTICLE
Letter
Why do human diversity levels vary at a megabase scale?
1 Max-Planck-Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
2 Department of Statistics, Harvard University, Cambridge, Massachusetts 02138, USA
3 Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02141, USA
Levels of diversity vary across the human genome. This variation is caused by two forces: differences in mutation rates and the differential impact of natural selection. Pertinent to the question of the relative importance of these two forces is the observation that both diversity within species and interspecies divergence increase with recombination rates. This suggests that mutation and recombination are either directly coupled or linked through some third factor. Here, we test these possibilities using the recently generated sequence of the chimpanzee genome and new estimates of human diversity. We find that measures of GC and CpG content, simple-repeat structures, as well as the distance from the centromeres and the telomeres predict diversity as well as divergence. After controlling for these factors, large-scale recombination rates measured from pedigrees are still significant predictors of human diversity and human-chimpanzee divergence. Furthermore, the correlation between human diversity and recombination remains significant even after controlling for human-chimpanzee divergence. Two plausible and non-mutually exclusive explanations are, first, that natural selection has shaped the patterns of diversity seen in humans and, second, that recombination rates across the genome have changed since humans and chimpanzees shared a common ancestor, so that current recombination rates are a better predictor of diversity than of divergence. Because there are indications that recombination rates may have changed rapidly during human evolution, we favor the latter explanation.
Levels of nucleotide diversity within humans vary substantially across the genome at the megabase scale (The International SNP MAP Working Group 2001
To begin with, if mutation rates vary, so will levels of diversity. Moreover, for neutrally evolving regions, the divergence rate (total divergence between two species divided by the divergence time) is equal to the mutation rate (Li 1997
Some of this variation in mutation rates is due to sequence properties. On a local scale, only CpG dinucleotides have a strong impact on mutation rates (Hwang and Green 2004
Natural selection may also lead to variation in diversity levels. Indeed, while the rate of divergence at neutral sites does not depend on selection at linked sites, diversity levels do. In particular, both selective sweeps, that is, positive selection driving one allele to fixation (Smith and Haigh 1974
As an alternative to this selective explanation, Lercher and Hurst (2002
Although the association of recombination and mutation can be explained if recombination is mutagenic (and there is some evidence to support this; see, e.g., Strathern et al. 1995
In this study, we examine which factors might contribute to variation in diversity rates across the human genome using the recently assembled chimpanzee genome and human diversity estimates from the recent shotgun re-sequencing effort at the Baylor College of Medicine Human Genome Sequencing Center (BCM-HGSC) and the Broad Institute (The International HapMap Consortium 2003
Sequence features, diversity, and divergence
It has been known for some time that mutation rates vary across the genome and that this variation is not purely stochastic but reflects inherent differences in genome structure. For example, in a previous study we showed an association of diversity and divergence with recombination rates. Recombination rates, however, also show a significant correlation with 12 out of 18 other sequence features that we consider here as predictors of divergence (Supplemental Table 1). In an attempt to tease apart the effects of recombination from those of other genome properties, we analyzed all these factors jointly using multiple linear regression implemented in a stepwise procedure (see Methods). This technique can help unravel relationships that are masked by other factors, and it keeps only factors that remain significant after inclusion of other variables. That is, we asked questions such as: Do we find a significant correlation between recombination rates and divergence just because both features vary with gene content?
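The logic of a stepwise multiple regression can be illustrated with a minimal sketch. This is not necessarily the procedure used in this study: it is a plain forward selection with a partial F test, the threshold `f_enter` is an assumed value (roughly the 5% critical value for one degree of freedom with several hundred windows), and the synthetic predictors named in the comments are hypothetical stand-ins rather than the real data.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def forward_stepwise(X, y, f_enter=3.85):
    """Add predictors one at a time while the best candidate passes a partial F test.

    f_enter is an assumed entry threshold, not a value taken from the paper.
    """
    n, p = X.shape
    selected, r2_current = [], 0.0
    while len(selected) < p:
        # Find the unused predictor that raises R^2 the most.
        candidates = [j for j in range(p) if j not in selected]
        r2_by_candidate = {j: r_squared(X[:, selected + [j]], y) for j in candidates}
        best = max(r2_by_candidate, key=r2_by_candidate.get)
        r2_new = r2_by_candidate[best]
        # Partial F statistic for adding one predictor to the current model.
        df_resid = n - (len(selected) + 1) - 1
        f_stat = (r2_new - r2_current) / ((1.0 - r2_new) / df_resid)
        if f_stat < f_enter:
            break
        selected.append(best)
        r2_current = r2_new
    return selected, r2_current

# Synthetic example (hypothetical variables, arbitrary units).
rng = np.random.default_rng(0)
n_windows = 834
recombination = rng.normal(size=n_windows)
gene_content = 0.7 * recombination + 0.7 * rng.normal(size=n_windows)  # confounded with recombination
simple_repeats = rng.normal(size=n_windows)
divergence = 0.5 * recombination + 0.6 * simple_repeats + rng.normal(size=n_windows)

X = np.column_stack([recombination, gene_content, simple_repeats])
selected, r2 = forward_stepwise(X, divergence)
print("selected columns:", selected, "R^2 = %.3f" % r2)
```

In this toy setting gene content correlates with divergence only through recombination, so it should usually add little once recombination is in the model. That is the kind of confounding the joint analysis is meant to expose.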
In order to exclude variance that is due to differing selective constraints, we tried to measure divergence and diversity for neutrally evolving sites only. Thus, we used two mutually exclusive measures of intergenic divergence and diversity. First, we counted only fixed differences or polymorphisms outside of interspersed repeats, and as a second measure, only inside interspersed repeats. Using the first measure, there might still be a non-negligible fraction of functionally important sites. However, although we are most confident that interspersed repeats are not functional, the second measure might not be as representative of the genomic region (Gu et al. 2000
We found that 12 sequence features explain 53% of the variance of our nonrepeat divergence and 11 sequence features explain 32% of the variance in diversity (Table 1). Recombination rates, GC content, CpG content, poly(A/T) content, simple-repeat content, poly(R/Y) content, CpG-island content, and poly(G/C) content, as well as distance to telomeres and centromeres are significant predictors of human diversity and human-chimpanzee divergence (Table 1; Fig. 1; Supplemental Fig. 1). For divergence, although not for diversity, poly(CA) and gene content also improve the model significantly. Similarly, SINE count improves the prediction of diversity, but not divergence. Poly(CA) content and SINE count are, however, only weak predictors, whereas gene content is a rather strong one. The single best predictor for divergence is simple-repeat content (R2 = 0.209); for diversity, it is recombination rates (R2 = 0.164). For both divergence and diversity, most of the variance is explained by only three predictors, namely, simple-repeat content, recombination rates, and poly(R/Y) content.
When divergence and diversity are measured in interspersed repeats, the overall amount of variance (divergence, R2 = 0.45; diversity, R2 = 0.26) explained by the 13 and eight significant predictors is lower than if nonrepetitive sequences are considered (Table 2). This might be explained by the higher levels of noise due to the fact that less sequence per window is scored. The finding that the best set of predictors for repeat diversity contains three fewer predictors than the set for the nonrepeat measure might also be attributed to more noise. For repeat divergence, two additional factors are significant: LINE count and male germ-line expression. Both are only marginally significant predictors of repeat divergence and are nearly significant when divergence is measured using intergenic, nonrepetitive sites. Thus, this model difference is quantitative rather than qualitative. Another difference between the two measures of divergence (diversity) is the importance of poly(R/Y) content. Poly(R/Y) is one of the three best predictors of intergenic divergence, whereas it is not significant for repeat divergence. In this respect, it is important to mention that there is considerable overlap among the various simple-repeat motifs that are significant predictors of divergence (diversity): poly(R/Y) contains part of poly(A/T) and poly(G/C), and the majority of all three is contained in the simple-repeat category. Thus, since these various repeat structures are highly correlated, the fact that different measures of divergence (diversity) retain different repeat structures should not be overinterpreted. Since the two measures of divergence (diversity) do not lead to differences in our conclusions, we only discuss the results from the measure excluding repeats.
We also tried several predictors related to gene expression, namely, average expression breadth, the average expression strength across 63 tissues or expression strength in the germ line (testes and/or ovaries), and the average number of genes expressed in the germ line in a given window. None of these variables explains any additional part of the variance in divergence or diversity, except male germ-line expression for divergence measured in interspersed repeats only. The rank correlation of these predictors with human diversity and human-chimpanzee divergence is listed in Supplemental Table 2. The nonsignificance of these predictors does not prove that they have no effect on mutation rate, since predictors may fail to be significant if the magnitude of their effect is swamped by measurement error.
Recombination, diversity, and divergence
It has been hypothesized that recombination could be mutagenic (Lercher and Hurst 2002
Since the results suggest a link between mutation and recombination, this raises the possibility that the correlation between recombination and diversity is simply the result of this association, rather than of natural selection. To examine this, we assessed the relationship between diversity and human recombination rates in a multiple linear regression, including divergence as a predictor (Fig. 2B). This effectively asks if recombination predicts diversity beyond what is expected from the relationship of recombination and divergence. In contrast to our previous findings (Hellmann et al. 2003
This observation reopens the question of whether the variation in diversity levels could be, in part, due to variation-reducing selection. In order to gain a sense of how plausible a selective explanation would be, we assessed the effect of different selective models on the relationship between recombination and diversity using parameters that seem realistic for humans. In particular, we ran simulations to determine how three models of selection influence diversity at neutral loci: (1) recurrent selective sweeps, (2) background selection, and (3) the combination of these two. The results of the simulations were then compared to the relationship between the residuals of diversity and recombination rates after a regression on divergence. Based on a comparison of the standardized regression slopes for different sets of selection parameters, we cannot rule out any of the three scenarios (Supplemental Table 3), and any conclusions can only be tentative as the models are simplistic (see Methods). This said, the selective sweep model predicts a logarithmic relationship between diversity and recombination, while the relationship of our data appears to be linear. In particular, for the selective sweep model, we would expect much lower diversity in regions of low recombination than we observe. Furthermore, it predicts little effect on levels of diversity for all but the lowest rates of recombination (Fig. 3A). On the other hand, the model of background selection, as well as the model including both background selection and selective sweeps, predict an approximately linear relationship between diversity and recombination rates (Fig. 3B,C). However, we note that the frequency of positively and negatively selected alleles in a given window increases with the number of functionally important sites. Thus, we might expect gene-rich regions to show lower diversity. 
Thus, in addition to recombination rates, gene content should remain a significant predictor of diversity after correction for divergence. This is not the case (p = 0.114). Moreover, assuming that similar selective forces (i.e., similar frequencies and strength of selective events) acted to shape human and chimpanzee diversity, we would expect to find a correlation between chimpanzee diversity and recombination as well. Yet human recombination rates are not a significant predictor of chimpanzee diversity, after correction for human-chimpanzee divergence (Clint-diversity: R2 = 0.0016, p = 0.13; Central-Western diversity: R2 = 0.0022, p = 0.10) (Fig. 2D). This could reflect lack of power, as the chimpanzee diversity estimates are based on less data and possibly have more errors than the human estimates. However, when the amount of human diversity data is reduced to what is available for the chimpanzee, the relationship of recombination rates with human diversity only becomes as weak as observed for chimpanzee diversity if unrealistic amounts of error are added (Table 3). Thus, we should see a significant relationship in chimpanzees if the putative effect of selection on diversity were as strong as it is in humans.
In summary, while models including background selection are roughly consistent with the data, the absence of a significant relationship of diversity with gene content (controlling for recombination rates and divergence) and the absence of a similar relationship in chimpanzees cast doubt on their plausibility.
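The comparison described above, in which diversity is first regressed on divergence and the residuals are then related to recombination rates, can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic and in arbitrary units, the function names are hypothetical, and the effect sizes were chosen only to make the example run, not to match any estimate from this study.

```python
import numpy as np

def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)  # polyfit returns the highest degree first
    return intercept, slope

def standardized_residual_slope(diversity, divergence, recombination):
    """Standardized slope of divergence-corrected diversity on recombination."""
    # Step 1: remove the part of diversity that is predicted by divergence.
    intercept, slope = ols_fit(divergence, diversity)
    residuals = diversity - (intercept + slope * divergence)
    # Step 2: regress the residuals on recombination rate.
    _, resid_slope = ols_fit(recombination, residuals)
    # Step 3: rescale so slopes are comparable across data sets and simulations.
    return resid_slope * recombination.std() / residuals.std()

# Synthetic windows (hypothetical values, arbitrary units).
rng = np.random.default_rng(1)
n_windows = 834
recombination = rng.normal(size=n_windows)
divergence = 0.4 * recombination + rng.normal(size=n_windows)
diversity = 0.5 * divergence + 0.3 * recombination + rng.normal(size=n_windows)

beta = standardized_residual_slope(diversity, divergence, recombination)
print("standardized residual slope: %.3f" % beta)
```

A positive value here means that recombination still predicts diversity after the shared dependence on divergence has been removed, which is the pattern reported for the human data.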
Predictors of mutation rates
Altogether, we considered 19 features of DNA sequences that could have an influence on mutation rates. Of these, 12 and 11 were significant predictors of divergence and diversity, respectively (Table 1; Methods). The sequence features are also correlated with each other. Out of the 171 possible pairwise comparisons, 117 show a significant correlation (Supplemental Table 1). Nevertheless, the fact that 12 (11) predictors are retained in the regression model indicates that each explains a unique aspect of divergence (diversity). The use of multiple linear regression allows us to gauge the extent to which the relationship between divergence (or diversity) and a second factor is confounded by the fact that they both correlate with additional variables. For example, GC content shows a significant pairwise correlation with 16 of the 18 other traits that might affect divergence and/or diversity. In fact, the variation in GC content has even been suggested to be causatively linked to recombination via biased gene conversion (Meunier and Duret 2004
At first, the finding of a linear and negative relationship of divergence (diversity) and GC content in the full model seems to contradict previous reports: The International SNP Map Working Group (2001
An intuitive explanation for the quadratic relationship between divergence and GC content is that the probability of observing a CpG is a function of GC-squared. Since CpGs are mutational hotspots (Shen et al. 1994
As with CpG dinucleotides, it is not clear whether the other significant predictors covary with mutation rate or together predict a higher-level feature such as chromatin structure or replication timing, which might be more directly linked to mutation rates (Goldman et al. 1984
The distances from centromeres and telomeres also predict sequence divergence and diversity. Telomeres are unusual genomic areas in many respects. They are relatively GC-rich, have a high gene density, and exhibit higher recombination rates. However, even with these features added to the model, distance to the telomeres is retained as a predictor of sequence divergence (Table 1). Furthermore, unlike in most other regions of the genome, open chromatin is not necessarily associated with replication timing in telomeres (Gilbert et al. 2004
Altogether, we find that the 12 parameters explain 53% of the variance in human-chimpanzee divergence and the 11 parameters explain 32% of the variation in human diversity (Table 1). Most of the variance for both divergence and diversity is explained by just three predictors: simple-repeat content, poly(R/Y) content, and recombination rate. Of these parameters, 10 are in common to divergence and diversity. Moreover, the pairwise correlations of those parameters with divergence and diversity are not dissimilar, suggesting that the difference in predictors kept is quantitative rather than qualitative (Tables 1 and 2).
Recombination, divergence, and diversity
Given that divergence and recombination are positively correlated, we wanted to ask whether this could explain the correlation between diversity and recombination. Contrary to our previous study (Hellmann et al. 2003
One longstanding hypothesis for the positive relationship between diversity and recombination is that it reflects a signature of variation-reducing selection in the human genome. We reexamined the three main models for variation-reducing selection and found that only models including background selection are consistent with the observed data (Fig. 3). Moreover, two lines of evidence suggest that selection may not be the best explanation for this correlation. First, windows with a high gene density should be more likely to experience selection, yet gene density does not explain a significant portion of the variance in diversity levels (after correction for recombination rates). Second, we would have to postulate that selection is more frequent or stronger in humans than in chimpanzees, since we find no correlation between chimpanzee diversity and human recombination rates (Fig. 2). Thus, it seems unlikely that selection alone explains our results.
An alternative explanation for our finding is that large-scale recombination rates across the genome changed during primate evolution so that they differ between humans and chimpanzees. Unfortunately, this hypothesis is difficult to evaluate, as we do not have a genetic map for chimpanzees. However, it seems possible that recombination rates could have changed rapidly during human and chimpanzee evolution. Heritable variation for recombination does exist (see Brooks 1988
Obviously, the above explanation does not exclude the action of natural selection. In fact, we know that natural selection has acted on particular regions of the genome (e.g., Harding et al. 2000
In conclusion, we demonstrate that a wide variety of sequence motifs are correlated with human diversity and human-chimpanzee divergence and are likely to influence mutation rates. Many of these motifs had not been previously implicated as potential factors in explaining variation in mutation rates. Despite the inclusion of these additional factors, recombination remains an important predictor of both diversity and divergence. Moreover, the correlation between recombination and diversity cannot be explained solely by the link between mutation and recombination. This suggests some salient feature is missing from our model, such as natural selection and/or the rapid evolution of large-scale recombination rates. Given our data and recent comparisons of fine-scale recombination rates between humans and chimpanzees, the latter explanation seems more plausible.
Recombination, diversity, and divergence
We determined the average human-chimpanzee divergence and human diversity within 834 consecutive 3-Mb windows across autosomes for which sex-averaged recombination rates could be estimated (Kong et al. 2002
Estimates of human diversity for the windows were taken from the alignment of sequences from whole-genome shotgun libraries of eight African-American individuals generated at the BCM-HGSC and sequenced at the BCM-HGSC and the Broad Institute (http://www.cardiogene.org/bpr/background.htm; http://www.hapmap.org) to the human genome (build 34). Only high-quality bases were taken into account (Altshuler et al. 2000
Similarly, we estimated human-chimpanzee divergence from the alignment of shotgun reads to the human genome, using the same quality criteria as for the alignments of the human reads. Again, if multiple reads were aligned and disagreed with respect to a substitution, we picked one read at random. This approach provided us with the number of differences and the total number of base pairs compared. Since humans and chimpanzees are very similar at the nucleotide level, we did not apply a correction for multiple substitutions per site.
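To see why omitting a multiple-substitution correction matters little at this level of similarity, the following sketch compares the uncorrected proportion of differences with a Jukes-Cantor-corrected distance. The Jukes-Cantor model is used here only as one common correction for illustration, and the counts are made-up values in the general range of human-chimpanzee divergence, not numbers taken from this study.

```python
import math

def raw_divergence(differences, bases_compared):
    """Uncorrected divergence: number of differences per base pair compared."""
    return differences / bases_compared

def jukes_cantor(p):
    """Jukes-Cantor distance for an observed proportion of differing sites p."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Illustrative counts for one window (assumed values, not from the paper).
p = raw_divergence(12_000, 1_000_000)
d = jukes_cantor(p)
relative_increase = (d - p) / p

print("uncorrected: %.5f  corrected: %.5f  relative change: %.2f%%"
      % (p, d, 100.0 * relative_increase))
```

For a proportion of differences near 1%, the corrected distance exceeds the raw value by less than 1% of that value, so the choice should have little effect on window-to-window comparisons.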
Estimates of chimpanzee diversity were obtained from two sources. One was based on overlapping alignments of shotgun reads from the different alleles of one chimpanzee (Clint) to the human genome. A second was obtained by comparing any of the two other Western Chimpanzees with any of the three Central Chimpanzees sequenced to 0.1x coverage each for the chimpanzee genome project (The Chimpanzee Sequencing and Analysis Consortium 2005
We generated two sets of divergence and diversity estimates: (1) unique sequence outside of genes and (2) interspersed repeats outside of genes as annotated in the UCSC Genome Browser (http://genome.ucsc.edu). In both cases, we excluded regions annotated in the human genome as duplicated regions (Bailey et al. 2002
Potential linking factors between recombination and diversity and divergence
To consider the effect of gene expression on divergence, we took the count of expressed genes in testis germ cells or ovaries (Su et al. 2004
Multiple linear regression
Modeling recurrent selective sweeps
Modeling background selection
To model both background selection and recurrent sweeps, we applied this reduction (fb) to two parameters in the selective sweep program:
Accounting for possible errors in the chimpanzee sequence
To test how much sequencing errors decrease the underlying relationship of diversity and recombination rates, we drew an error rate from the
We thank Molly Przeworski for many helpful suggestions, critical reading of the manuscript, and for helping us with her software msHH; Laurent Duret for helpful comments; Michael L. Frigge for providing an updated human genetic map; and all members of the chimpanzee genome consortium for helpful discussions. Furthermore, we thank the members of the Baylor College of Medicine Human Genome Sequencing Center and the Broad Institute for generation and early release of the DNA sequence data used to derive the estimates of human diversity in this study. We also thank the Max Planck Society and the Bundesministerium für Forschung for financial support.
4 Corresponding author. E-mail hellmann{at}eva.mpg.de; fax 49-341-3550-555. [Supplemental material is available online at www.genome.org.] Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3461105. Freely available online through the Genome Research Immediate Open Access option.
Altshuler, D., Pollara, V.J., Cowles, C.R., Van Etten, W.J., Baldwin, J., Linton, L., and Lander, E.S. 2000. An SNP map of the human genome generated by reduced representation shotgun sequencing. Nature 407: 513-516.
Andolfatto, P. 2001. Adaptive hitchhiking effects on genome variability. Curr. Opin. Genet. Dev. 11: 635-641.
Aquadro, C.F., Bauer DuMont, V., and Reed, F.A. 2001. Genome-wide variation in the human and fruitfly: A comparison. Curr. Opin. Genet. Dev. 11: 627-634.
Bailey, J.A., Gu, Z., Clark, R.A., Reinert, K., Samonte, R.V., Schwartz, S., Adams, M.D., Myers, E.W., Li, P.W., and Eichler, E.E. 2002. Recent segmental duplications in the human genome. Science 297: 1003-1007.
Begun, D.J. and Aquadro, C.F. 1994. Evolutionary inferences from DNA variation at the 6-phosphogluconate dehydrogenase locus in natural populations of Drosophila: Selection and geographic differentiation. Genetics 136: 155-171.
Bolstad, B.M., Irizarry, R.A., Astrand, M., and Speed, T.P. 2003. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19: 185-193.
Brooks, L.D. 1988. The evolution of recombination rates. In The evolution of sex: An examination of current ideas (eds. R.E. Michod and B.R. Levin), pp. 87-105. Sinauer Associates, Sunderland, MA.
Charlesworth, B., Morgan, M.T., and Charlesworth, D. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134: 1289-1303.
Charlesworth, D., Charlesworth, B., and Morgan, M.T. 1995. The pattern of neutral molecular variation under the background selection model. Genetics 141: 1619-1632.
Chen, F.C. and Li, W.H. 2001. Genomic divergences between humans and other hominoids and the effective population size of the common ancestor of humans and chimpanzees. Am. J. Hum. Genet. 68: 444-456.
The Chimpanzee Sequencing and Analysis Consortium. 2005. Initial sequencing of the chimpanzee genome and comparison with the human genome. Nature (in press).
Cooper, G.M., Brudno, M., Stone, E.A., Dubchak, I., Batzoglou, S., and Sidow, A. 2004. Characterization of evolutionary rates and constraints in three mammalian genomes. Genome Res. 14: 539-548.
Ebersberger, I., Metzler, D., Schwarz, C., and Paabo, S. 2002. Genomewide comparison of DNA sequences between humans and chimpanzees. Am. J. Hum. Genet. 70: 1490-1497.
Enard, W., Przeworski, M., Fisher, S.E., Lai, C.S., Wiebe, V., Kitano, T., Monaco, A.P., and Paabo, S. 2002. Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418: 869-872.
Gilbert, N., Boyle, S., Fiegler, H., Woodfine, K., Carter, N.P., and Bickmore, W.A. 2004. Chromatin architecture of the human genome; gene-rich domains are enriched in open chromatin fibers. Cell 118: 555-566.
Goldman, M.A., Holmquist, G.P., Gray, M.C., Caston, L.A., and Nag, A. 1984. Replication timing of genes and middle repetitive sequences. Science 224: 686-692.
Gu, Z., Wang, H., Nekrutenko, A., and Li, W.H. 2000. Densities, length proportions, and other distributional features of repetitive sequences in the human genome estimated from 430 megabases of genomic sequence. Gene 259: 81-88.
Hamblin, M.T., Thompson, E.E., and Di Rienzo, A. 2002. Complex signatures of natural selection at the Duffy blood group locus. Am. J. Hum. Genet. 70: 369-383.
Harding, R.M., Healy, E., Ray, A.J., Ellis, N.S., Flanagan, N., Todd, C., Dixon, C., Sajantila, A., Jackson, I.J., Birch-Machin, M.A., et al. 2000. Evidence for variable selective pressures at MC1R. Am. J. Hum. Genet. 66: 1351-1361.
Hardison, R.C., Roskin, K.M., Yang, S., Diekhans, M., Kent, W.J., Weber, R., Elnitski, L., Li, J., O'Connor, M., Kolbe, D., et al. 2003. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13: 13-26.
Hellmann, I., Ebersberger, I., Ptak, S.E., Paabo, S., and Przeworski, M. 2003. A neutral explanation for the correlation of diversity with recombination rates in humans. Am. J. Hum. Genet. 72: 1527-1535.
Holmquist, G.P. 1992. Chromosome bands, their chromatin flavors, and their functional features. Am. J. Hum. Genet. 51: 17-37.
Holmquist, G.P. and Caston, L.A. 1986. Replication time of interspersed repetitive DNA sequences in hamsters. Biochim. Biophys. Acta 868: 164-177.
Hudson, R.R. 1994. How can the low levels of DNA sequence variation in regions of the Drosophila genome with low recombination rates be explained? Proc. Natl. Acad. Sci. 91: 6815-6818.
Hudson, R.R. and Kaplan, N.L. 1995. Deleterious background selection with recombination. Genetics 141: 1605-1617.
Hwang, D.G. and Green, P. 2004. Bayesian Markov chain Monte Carlo sequence analysis reveals varying neutral substitution patterns in mammalian evolution. Proc. Natl. Acad. Sci. 101: 13994-14001.
The International HapMap Consortium. 2003. The International HapMap Project. Nature 426: 789-796.
The International SNP MAP Working Group. 2001. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409: 928-933.
Jensen-Seaman, M.I., Furey, T.S., Payseur, B.A., Lu, Y., Roskin, K.M., Chen, C.F., Thomas, M.A., Haussler, D., and Jacob, H.J. 2004. Comparative recombination rates in the rat, mouse, and human genomes. Genome Res. 14: 528-538.
Jorgenson, E., Tang, H., Gadde, M., Province, M., Leppert, M., Kardia, S., Schork, N., Cooper, R., Rao, D.C., Boerwinkle, E., et al. 2005. Ethnicity and human genetic linkage maps. Am. J. Hum. Genet. 76: 276-290.
Kaplan, N.L., Hudson, R.R., and Langley, C.H. 1989. The "hitchhiking effect" revisited. Genetics 123: 887-899.
Kim, Y. and Stephan, W. 2000. Joint effects of genetic hitchhiking and background selection on neutral variation. Genetics 155: 1415-1427.
Kong, A., Gudbjartsson, D.F., Sainz, J., Jonsdottir, G.M., Gudjonsson, S.A., Richardsson, B., Sigurdardottir, S., Barnard, J., Hallbeck, B., Masson, G., et al. 2002. A high-resolution recombination map of the human genome. Nat. Genet. 31: 241-247.
Kong, A., Barnard, J., Gudbjartsson, D.F., Thorleifsson, G., Jonsdottir, G., Sigurdardottir, S., Richardsson, B., Jonsdottir, J., Thorgeirsson, T., Frigge, M.L., et al. 2004. Recombination rate and reproductive success in humans. Nat. Genet. 36: 1203-1206.
Lercher, M.J. and Hurst, L.D. 2002. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet. 18: 337-340.
Li, W.H. 1997. Molecular evolution. p. 49. Sinauer, Sunderland, MA.
Makova, K.D., Ramsay, M., Jenkins, T., and Li, W.H. 2001. Human DNA sequence variation in a 6.6-kb region containing the melanocortin 1 receptor promoter. Genetics 158: 1253-1268.
Malcom, C.M., Wyckoff, G.J., and Lahn, B.T. 2003. Genic mutation rates in mammals: Local similarity, chromosomal heterogeneity, and X-versus-autosome disparity. Mol. Biol. Evol. 20: 1633-1641.
Meunier, J. and Duret, L. 2004. Recombination drives the evolution of GC-content in the human genome. Mol. Biol. Evol. 21: 984-990.
Montgomery, D.C., Peck, E.A., and Vining, G.G. 2001. Introduction to linear regression analysis. John Wiley, New York.
Mouse Genome Sequencing Consortium. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562.
Nachman, M.W. 1997. Patterns of DNA variability at X-linked loci in Mus domesticus. Genetics 147: 1303-1316.
Nachman, M.W. 2001. Single nucleotide polymorphisms and recombination rate in humans. Trends Genet. 17: 481-485.
Nachman, M.W. and Crowell, S.L. 2000. Estimate of the mutation rate per nucleotide in humans. Genetics 156: 297-304.
Nachman, M.W., Bauer, V.L., Crowell, S.L., and Aquadro, C.F. 1998. DNA variability and recombination rates at X-linked loci in humans. Genetics 150: 1133-1141.
Nordborg, M., Charlesworth, B., and Charlesworth, D. 1996. The effect of recombination on background selection. Genet. Res. 67: 159-174.
Przeworski, M. 2002. The signature of positive selection at randomly chosen loci. Genetics 160: 1179-1189.
Przeworski, M., Hudson, R.R., and Di Rienzo, A. 2000. Adjusting the focus on human variation. Trends Genet. 16: 296-302.
Ptak, S.E., Roeder, A.D., Stephens, M., Gilad, Y., Paabo, S., and Przeworski, M. 2004a. Absence of the TAP2 human recombination hotspot in chimpanzees. PLoS Biol. 2: 849-855.
Ptak, S.E., Voelpel, K., and Przeworski, M. 2004b. Insights into recombination from patterns of linkage disequilibrium in humans. Genetics 167: 387-397.
Ptak, S.E., Hinds, D.A., Koehler, K., Nickel, B., Patil, N., Ballinger, D.G., Przeworski, M., Frazer, K.A., and Paabo, S. 2005. Fine-scale recombination patterns differ between chimpanzees and humans. Nat. Genet. 37: 429-434.
Rattray, A.J., McGill, C.B., Shafer, B.K., and Strathern, J.N. 2001. Fidelity of mitotic double-strand-break repair in Saccharomyces cerevisiae: A role for SAE2/COM1. Genetics 158: 109-122.
Rogers, J., Mahaney, M.C., Witte, S.M., Nair, S., Newman, D., Wedel, S., Rodriguez, L.A., Rice, K.S., Slifer, S.H., Perelygin, A., et al. 2000. A genetic linkage map of the baboon (Papio hamadryas) genome based on human microsatellite polymorphisms. Genomics 67: 237-247.
Sabeti, P.C., Reich, D.E., Higgins, J.M., Levine, H.Z., Richter, D.J., Schaffner, S.F., Gabriel, S.B., Platko, J.V., Patterson, N.J., McDonald, G.J., et al. 2002. Detecting recent positive selection in the human genome from haplotype structure. Nature 419: 832-837.
Shen, J.C., Rideout III, W.M., and Jones, P.A. 1994. The rate of hydrolytic deamination of 5-methylcytosine in double-stranded DNA. Nucleic Acids Res. 22: 972-976.
Simonsen, K.L., Churchill, G.A., and Aquadro, C.F. 1995. Properties of statistical tests of neutrality for DNA polymorphism data. Genetics 141: 413-429.
Smith, J.M. and Haigh, J. 1974. The hitch-hiking effect of a favourable gene. Genet. Res. 23: 23-35.
Strathern, J.N., Shafer, B.K., and McGill, C.B. 1995. DNA synthesis errors associated with double-strand-break repair. Genetics 140: 965-972.
Su, A.I., Wiltshire, T., Batalov, S., Lapp, H., Ching, K.A., Block, D., Zhang, J., Soden, R., Hayakawa, M., Kreiman, G., et al. 2004. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl. Acad. Sci. 101: 6062-6067.
Surralles, J., Ramirez, M.J., Marcos, R., Natarajan, A.T., and Mullenders, L.H. 2002. Clusters of transcription-coupled repair in the human genome. Proc. Natl. Acad. Sci. 99: 10571-10574.
Wall, J.D., Frisse, L.A., Hudson, R.R., and Di Rienzo, A. 2003. Comparative linkage-disequilibrium analysis of the
Wolfe, K.H., Sharp, P.M., and Li, W.H. 1989. Mutation rates differ among regions of the mammalian genome. Nature 337: 283-285.
Woodfine, K., Fiegler, H., Beare, D.M., Collins, J.E., McCann, O.T., Young, B.D., Debernardi, S., Mott, R., Dunham, I., and Carter, N.P. 2004. Replication timing of the human genome. Hum. Mol. Genet. 13: 191-202.
http://genome.ucsc.edu/; UCSC human genome browser.
http://www.cardiogene.org/bpr/background.htm; Baylor Polymorphism Resource.
http://www.hapmap.org/; International HapMap Project.
http://www.r-project.org/; The R project for statistical computing.
Received November 10, 2004; accepted in revised form February 26, 2005.