Genome Research

Vol. 11, Issue 10, 1725-1729, October 2001

METHODS
SSAHA: A Fast Search Method for Large DNA Databases

Zemin Ning,1 Anthony J. Cox,1 and James C. Mullikin2

Informatics Division, The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK

We describe an algorithm, SSAHA (Sequence Search and Alignment by Hashing Algorithm), for performing fast searches on databases containing multiple gigabases of DNA. Sequences in the database are preprocessed by breaking them into consecutive k-tuples of k contiguous bases and then using a hash table to store the position of each occurrence of each k-tuple. Searching for a query sequence in the database is done by obtaining from the hash table the "hits" for each k-tuple in the query sequence and then performing a sort on the results. We discuss the effect of the tuple length k on the search speed, memory usage, and sensitivity of the algorithm and present the results of computational experiments which show that SSAHA can be three to four orders of magnitude faster than BLAST or FASTA, while requiring less memory than suffix tree methods. The SSAHA algorithm is used for high-throughput single nucleotide polymorphism (SNP) detection and very large scale sequence assembly. Also, it provides Web-based sequence search facilities for Ensembl projects.
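The preprocessing and search steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `build_hash_table` and `search` are hypothetical, the table is a plain dictionary rather than a memory-optimized structure, and the sketch assumes that "consecutive" means the database is cut into non-overlapping k-tuples while the query is scanned at every offset. Each hit is stored as (sequence index, shift, database offset), where shift is the database offset minus the query offset, so that after sorting, hits from the same matching region sit next to each other.

```python
from collections import defaultdict


def build_hash_table(sequences, k):
    """Map each k-tuple to the (sequence index, offset) pairs where it occurs."""
    table = defaultdict(list)
    for seq_index, seq in enumerate(sequences):
        # Assumption: the database is cut into non-overlapping k-tuples,
        # so the offset advances k bases at a time.
        for offset in range(0, len(seq) - k + 1, k):
            table[seq[offset:offset + k]].append((seq_index, offset))
    return table


def search(table, query, k):
    """Collect the hits for every k-tuple of the query and sort them."""
    hits = []
    # Assumption: the query is scanned at every base offset (overlapping tuples).
    for q_offset in range(len(query) - k + 1):
        word = query[q_offset:q_offset + k]
        # .get avoids adding empty entries to the table for unseen k-tuples.
        for seq_index, db_offset in table.get(word, ()):
            shift = db_offset - q_offset
            hits.append((seq_index, shift, db_offset))
    # After sorting, runs with the same sequence index and shift are adjacent;
    # such runs indicate a candidate match region.
    hits.sort()
    return hits


database = ["ACGTACGGTTCA"]
table = build_hash_table(database, k=2)
print(search(table, "GTAC", k=2))
# [(0, -2, 0), (0, 2, 2), (0, 2, 4)]
```

In this toy run the two hits with shift 2 (database offsets 2 and 4) form a run, which corresponds to the query matching the database sequence starting at position 2. The isolated hit with shift -2 is a chance match of a single 2-tuple. Larger values of k reduce such chance hits and speed up the search, at the cost of sensitivity, which is the trade-off the abstract refers to.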


1 Both authors contributed equally to this paper.

2 Corresponding author.


11:1725-1729 ©2001 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/01 $5.00






