Published online before print
December 30, 2002, 10.1101/gr.731003
Vol 13, Issue 1, 81-90, January 2003
METHODS
The Phusion Assembler
James C. Mullikin¹ and
Zemin Ning
Informatics Department, The Wellcome Trust Sanger Institute,
Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
The Phusion assembler has assembled the mouse genome from the
whole-genome shotgun (WGS) dataset collected by the Mouse Genome
Sequencing Consortium, at 7.5x sequence coverage, producing a
high-quality draft assembly 2.6 gigabases in size, with 90% of its
bases in 479 scaffolds. For the mouse genome, which is a
large and repeat-rich genome, the input dataset was designed to include
a high proportion of paired-end sequences from size-selected inserts,
2–200 kbp in length, cloned into various host vector templates.
Phusion uses sequence data, called reads, and information about reads
that share common templates, called read pairs, to drive the assembly
of this large genome to highly accurate results. The preassembly stage,
which clusters the reads into sensible groups, is a key element of the
entire assembler, because it permits a simple approach to
parallelization of the assembly stage, as each cluster can be treated
independently of the others. In addition to the application of Phusion
to the mouse genome, we also present results from the WGS assembly of
Caenorhabditis briggsae sequenced to about 11x coverage. The
C. briggsae assembly was accessioned through EMBL,
http://www.ebi.ac.uk/services/index.html, using the series
CAAC01000001–CAAC01000578; however, the Phusion mouse assembly
described here was not accessioned. The mouse data was generated by the
Mouse Genome Sequencing Consortium. The C. briggsae sequence
was generated at The Wellcome Trust Sanger Institute and the Genome
Sequencing Center, Washington University School of Medicine.
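
The role of the preassembly stage described above can be illustrated with a minimal sketch. It is not Phusion's actual procedure: the grouping criterion (two reads land in the same cluster if they share at least one k-mer), the function names, and the toy value of k are all assumptions made for illustration only. The sketch shows why clustering helps: once reads are partitioned, each group can be handed to a separate assembly job.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-length substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def cluster_reads(reads, k=12):
    """Group reads that share at least one k-mer.

    Illustrative only: the shared-k-mer rule is an assumed stand-in
    for a real preassembly clustering criterion.
    Returns a sorted list of clusters, each a list of read indices.
    """
    parent = list(range(len(reads)))  # union-find forest over read indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_seen = {}  # k-mer -> index of the first read that contained it
    for idx, seq in enumerate(reads):
        for km in kmers(seq, k):
            if km in first_seen:
                ra, rb = find(first_seen[km]), find(idx)
                if ra != rb:
                    parent[rb] = ra  # merge the two clusters
            else:
                first_seen[km] = idx

    clusters = defaultdict(list)
    for idx in range(len(reads)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


# Toy reads: the first two overlap each other, as do the last two.
toy_reads = ["ACGTACGTAA", "GTACGTAATT", "CCCCCGGGGG", "CCGGGGGTTT"]
toy_clusters = cluster_reads(toy_reads, k=5)
```

Because the clusters are disjoint sets of reads, a driver could pass each one to its own worker process without any coordination between workers, which is the simple form of parallelization the text refers to.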
1 Corresponding author.
E-MAIL jcm{at}sanger.ac.uk; FAX 44-1223-494-919
Article and publication are at
http://www.genome.org/cgi/doi/10.1101/gr.731003. Article published online before print in December
2002.
