Genome Research

Published online before print December 14, 2007, 10.1101/gr.7088808
Genome Res. 18:324-330, 2008
©2008 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/08 $5.00

Methods

Short read fragment assembly of bacterial genomes

Mark J. Chaisson1 and Pavel A. Pevzner2,3

1 Bioinformatics Program, University of California San Diego, La Jolla, California 92093, USA; 2 Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA

In the last year, high-throughput sequencing technologies have progressed from proof-of-concept to production quality. While these methods produce high-quality reads, they have yet to produce reads comparable in length to Sanger-based sequencing. Current fragment assembly algorithms have been implemented and optimized for mate-paired Sanger-based reads, and thus do not perform well on the short reads these technologies produce. We present a new Eulerian assembler that generates nearly optimal short read assemblies of bacterial genomes, and we describe an approach for assembling reads under the popular hybrid protocol, in which short reads are combined with long Sanger-based reads.
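
The Eulerian idea named in the abstract can be illustrated with a toy sketch. This is not EULER-SR: it assumes error-free reads, a genome with no repeated (k-1)-mers, and no mate-pair information, and the function names (`de_bruijn_edges`, `eulerian_path`, `spell`) are hypothetical. Each distinct k-mer in the reads becomes an edge between its prefix and suffix (k-1)-mers, and a path that uses every edge once spells the assembled sequence.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Build a de Bruijn graph: each distinct k-mer is an edge prefix -> suffix."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:  # count each distinct k-mer once
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes the graph actually has an Eulerian path."""
    in_deg = defaultdict(int)
    for successors in graph.values():
        for w in successors:
            in_deg[w] += 1
    # Start at a vertex with one more outgoing than incoming edge, if any.
    start = next(
        (v for v in graph if len(graph[v]) - in_deg[v] == 1),
        next(iter(graph)),
    )
    remaining = {v: list(ws) for v, ws in graph.items()}
    stack, path = [start], []
    while stack:
        v = stack[-1]
        if remaining.get(v):
            stack.append(remaining[v].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def spell(path):
    """Concatenate the vertices of a path into the sequence it spells."""
    return path[0] + "".join(v[-1] for v in path[1:])


reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
assembled = spell(eulerian_path(de_bruijn_edges(reads, k=4)))
```

With real data, repeats create branching vertices and sequencing errors create spurious edges, so a practical assembler needs considerably more machinery than this sketch shows.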


3 Corresponding author.

E-mail ppevzner{at}cs.ucsd.edu; fax (858) 534-7029.

Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.7088808

4 Most reads in this study are shorter than 120 bases.

5 The paper by Pevzner et al. (1989) illustrates some potential advantages of the Eulerian path approach over the "overlap-layout-consensus" approach for fragment assembly. For example, while the study by Pevzner et al. (1989) described a simple algorithm for constructing the SBH repeat graph, it is not immediately clear how to generalize the approaches in the studies by Huang et al. (2003) and Jaffe et al. (2003) for efficient construction of the repeat graph even in the simple case of SBH "reads."

6 E. coli is 4,639,675 bp long, S. pneumoniae is 2,160,837 bp long, and BAC is 173,427 bp long (chromosome 6, bases 30537344–30710771).

7 As described above, EULER-SR does not try to estimate the multiplicities of tandem repeats and misses three copies of tandem repeats in E. coli (approximately 1600, 1000, and 300 nucleotides) and one copy of a tandem repeat in S. pneumoniae (~500 nt).

8 The surprising deterioration of N50 statistics for 10-kb spacing (as compared with 2.5-kb spacing) reflects ambiguities in mapping longer paths between mate-pairs in highly tangled de Bruijn graphs.

9 123, 72, 31, 9, and 3 mate-pairs map to 3, 4, 5, 6, and more than 6 edges, respectively.

10 Also, in cases where coverage across the tandem repeat is low, tandem repeats may be collapsed into a single copy.
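
The N50 statistic mentioned in footnote 8 can be computed as in the minimal sketch below. It assumes the common definition (the largest length L such that contigs of length at least L together cover at least half of the total assembled bases); this section does not state the exact convention the authors used, and some studies measure against the genome length instead. The function name `n50` is illustrative.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running total to half the
        # assembly size defines N50.
        if running >= half:
            return length
    return 0


example = n50([2, 3, 4, 5, 6])  # total 20; 6 + 5 = 11 reaches half, so N50 is 5
```

A few long contigs raise N50 sharply, which is why ambiguous mate-pair paths that break contigs can lower it even when total coverage is unchanged.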

