Published online before print October 31, 2007, doi:10.1101/gr.6554007
Genome Res. 17:1787-1796, 2007. ©2007 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/07 $5.00. OPEN ACCESS ARTICLE
Resource
Sequence-based estimation of minisatellite and microsatellite repeat variability
1 FAS Center for Systems Biology, Harvard University, Cambridge, Massachusetts 02138, USA; 2 Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; 3 Centre of Microbial and Plant Genetics, Department of Molecular and Microbial Systems, Katholieke Universiteit Leuven, Faculty of Applied Bioscience and Engineering, B-3001 Leuven (Heverlee), Belgium
Variable tandem repeats are frequently used for genetic mapping, genotyping, and forensics studies. Moreover, variation in some repeats underlies rapidly evolving traits or certain diseases. However, mutation rates vary greatly from repeat to repeat, and as a consequence, not all tandem repeats are suitable genetic markers or interesting unstable genetic modules. We developed a model, "SERV," that predicts the variability of a broad range of tandem repeats in a wide range of organisms. The nonlinear model uses three basic characteristics of the repeat (number of repeated units, unit length, and purity) to produce a numeric "VARscore" that correlates with repeat variability. SERV was experimentally validated using a large set of different artificial repeats located in the Saccharomyces cerevisiae URA3 gene. Further in silico analysis shows that SERV outperforms existing models and accurately predicts repeat variability in bacteria and eukaryotes, including plants and humans. Using SERV, we demonstrate significant enrichment of variable repeats within human genes involved in transcriptional regulation, chromatin remodeling, morphogenesis, and neurogenesis. Moreover, SERV allows identification of known and candidate genes involved in repeat-based diseases. In addition, we demonstrate the use of SERV for the selection and comparison of suitable variable repeats for genotyping and forensic purposes. Our analysis indicates that tandem repeats used for genotyping should have a VARscore between 1 and 3. SERV is publicly available from http://hulsweb1.cgr.harvard.edu/SERV/.
4 These authors contributed equally to this work. E-mail kverstrepen@cgr.harvard.edu; fax (617) 495-2196. [Supplemental material is available online at www.genome.org.] Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6554007