Genome Res. 15:1576-1583, 2005
©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05 $5.00
Methods
Calibrating a coalescent simulation of human genome sequence variation
Stephen F. Schaffner1,5, Catherine Foo1, Stacey Gabriel1, David Reich1,2, Mark J. Daly1, and David Altshuler1,2,3,4
1 Program in Medical and Population Genetics, The Broad Institute, Cambridge, Massachusetts 02139, USA
2 Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
3 Department of Medicine, Harvard Medical School, Boston, Massachusetts 02115, USA
4 Department of Molecular Biology and Diabetes Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
Population genetic models play an important role in human genetic research, connecting empirical observations about sequence variation with hypotheses about underlying historical and biological causes. More specifically, models are used to compare empirical measures of sequence variation, linkage disequilibrium (LD), and selection to expectations under a "null" distribution. In the absence of detailed information about human demographic history, and about variation in mutation and recombination rates, simulations have of necessity used arbitrary models, usually simple ones. With the advent of large empirical data sets, it is now possible to calibrate population genetic models with genome-wide data, permitting for the first time the generation of data that are consistent with empirical data across a wide range of characteristics. We present here the first such calibrated model and show that, while still arbitrary, it successfully generates simulated data (for three populations) that closely resemble empirical data in allele frequency, linkage disequilibrium, and population differentiation. No assertion is made about the accuracy of the proposed historical and recombination model, but its ability to generate realistic data meets a long-standing need among geneticists. We anticipate that this model, for which software is publicly available, and others like it will have numerous applications in empirical studies of human genetics.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3709305. Freely available online through the Genome Research Immediate Open Access option.
5 Corresponding author. E-mail sfs{at}broad.mit.edu; fax (617) 252-1902.
