Genome Res. 14:693-699, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00
Methods
MAVID: Constrained Ancestral Alignment of Multiple Sequences
Nicolas Bray and Lior Pachter¹
Department of Mathematics, University of California at Berkeley, Berkeley, California 94720, USA
We describe a new global multiple-alignment program capable of aligning a large number of genomic regions. Our progressive-alignment approach incorporates the following ideas: maximum-likelihood inference of ancestral sequences, automatic guide-tree construction, protein-based anchoring of ab initio gene predictions, and constraints derived from a global homology map of the sequences. We have implemented these ideas in the MAVID program, which accurately aligns multiple genomic regions up to megabases long. MAVID effectively aligns divergent sequences as well as incomplete, unfinished sequences. We demonstrate the capabilities of the program on the benchmark CFTR region, which consists of 1.8 Mb of human sequence and 20 orthologous regions in marsupials, birds, fish, and mammals. Finally, we describe two large MAVID alignments: an alignment of all the available HIV genomes and a multiple alignment of the entire human, mouse, and rat genomes.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1960404.
1 Corresponding author. E-MAIL lpachter{at}math.berkeley.edu; FAX (510) 642-8204.
[Supplemental material is available online at http://baboon.math.berkeley.edu/mavid/data.]
