Vol. 9, Issue 11, 1116-1127, November 1999
METHODS
Detecting and Analyzing DNA Sequencing Errors: Toward a Higher Quality of the Bacillus subtilis Genome Sequence
Claudine Médigue,1,2,4 Matthias Rose,3 Alain Viari,2 and Antoine Danchin1
1 Institut Pasteur REG, F-75724 Paris Cedex 15, France;
2 Atelier de BioInformatique, Université Paris VI, 75005 Paris, France;
3 Goethe-Universität Frankfurt, Institut für Mikrobiologie, D-60439 Frankfurt/Main, Germany
During the determination of a DNA sequence, the introduction of
artifactual frameshifts and/or in-frame stop codons in putative genes
can lead to misprediction of gene products. Detection of such errors
with a method based on protein similarity matching is only possible
when related sequences are available in databases. Here, we present a
method to detect frameshift errors in DNA sequences that is based on
the intrinsic properties of the coding sequences. It combines the
results of two analyses, the search for translational initiation/termination sites and the prediction of coding regions. This
method was used to screen the complete Bacillus subtilis genome sequence and the regions flanking putative errors were resequenced for verification. This procedure allowed us to correct the
sequence and to analyze in detail the nature of the errors. Interestingly, in several cases in-frame termination codons or frameshifts were not sequencing errors but were confirmed to be present in
the chromosome, indicating that the genes are either nonfunctional (pseudogenes) or subject to regulatory processes such as programmed translational frameshifts. The method can be used for checking the quality of
the sequences produced by any prokaryotic genome sequencing project.
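As a rough illustration of the idea of using intrinsic sequence properties, the Python sketch below flags places where a long stop-free stretch in one reading frame ends close to where a long stop-free stretch in another frame begins. This is a toy heuristic only, not the method described in the article: it ignores translational initiation sites and coding-region prediction, scans the forward strand only, and the names `open_frames`, `candidate_frameshifts`, `min_codons` and `max_gap` are hypothetical choices made for this example.

```python
# Toy sketch (assumption: not the article's algorithm). It only looks at
# stop codons on the forward strand; thresholds are arbitrary.
STOPS = {"TAA", "TAG", "TGA"}


def open_frames(seq, min_codons=30):
    """Return (frame, start, end) for stop-free stretches of >= min_codons codons."""
    seq = seq.upper()
    segments = []
    for frame in range(3):
        start = frame
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOPS:
                if (pos - start) // 3 >= min_codons:
                    segments.append((frame, start, pos))
                start = pos + 3  # next stretch begins after the stop codon
        # Close the last stretch at the final complete codon of this frame.
        end = frame + ((len(seq) - frame) // 3) * 3
        if (end - start) // 3 >= min_codons:
            segments.append((frame, start, end))
    return segments


def candidate_frameshifts(segments, max_gap=30):
    """Pair stretches in different frames where the second starts before or
    shortly after the first ends and extends beyond it."""
    ordered = sorted(segments, key=lambda s: s[1])
    candidates = []
    for i, first in enumerate(ordered):
        f1, s1, e1 = first
        for second in ordered[i + 1:]:
            f2, s2, e2 = second
            if f1 != f2 and s2 <= e1 + max_gap and e2 > e1:
                candidates.append((first, second))
    return candidates


# Synthetic demo: a repeat that is open in frame 0 but has stop codons
# in frames 1 and 2, then the same sequence with one base deleted.
UNIT = "CTGATTAAC"
gene = UNIT * 20
shifted = gene[:90] + gene[91:]

print(candidate_frameshifts(open_frames(gene)))
print(candidate_frameshifts(open_frames(shifted)))
```

In this sketch each reported pair marks a region that one might choose to resequence; on real genomes such a stop-codon-only scan would produce many false positives, which is why a coding-region predictor and initiation-site search are needed to make the signal useful.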
4 Corresponding author.
9:1116-1127 ©1999 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/99 $5.00

This article has been cited by other articles:

N. Gupta, J. Benhamida, V. Bhargava, D. Goodman, E. Kain, I. Kerman, N. Nguyen, N. Ollikainen, J. Rodriguez, J. Wang, et al. Comparative proteogenomics: Combining mass spectrometry and comparative genomics to analyze multiple genomes. Genome Res., July 1, 2008; 18(7): 1133-1142.
E. Perrodou, C. Deshayes, J. Muller, C. Schaeffer, A. Van Dorsselaer, R. Ripp, O. Poch, J.-M. Reyrat, and O. Lecompte. ICDS database: interrupted CoDing sequences in prokaryotic genomes. Nucleic Acids Res., January 1, 2006; 34(suppl_1): D338-D343.
S. Cruveiller, J. Le Saux, D. Vallenet, A. Lajus, S. Bocs, and C. Medigue. MICheck: a web tool for fast checking of syntactic annotations of bacterial genomes. Nucleic Acids Res., July 1, 2005; 33(suppl_2): W471-W479.
S. Bocs, S. Cruveiller, D. Vallenet, G. Nuel, and C. Medigue. AMIGene: Annotation of MIcrobial Genes. Nucleic Acids Res., July 1, 2003; 31(13): 3723-3726.
I. Moszer, L. M. Jones, S. Moreira, C. Fabry, and A. Danchin. SubtiList: the reference database for the Bacillus subtilis genome. Nucleic Acids Res., January 1, 2002; 30(1): 62-65.
Y. Fukunishi and Y. Hayashizaki. Amino acid translation program for full-length cDNA sequences with frameshift errors. Physiol Genomics, March 8, 2001; 5(2): 81-87.