Vol. 10, Issue 9, 1333-1341, September 2000
Detection of Spurious Interruptions of Protein-Coding Regions in Cloned cDNA Sequences by GeneMark Analysis
Kazusa DNA Research Institute, Kisarazu, Chiba 292-0812, Japan
cDNA is an artificial copy of mRNA and, therefore, no cDNA can be
completely free from suspicion of cloning errors. Because overlooking
these cloning errors results in serious misinterpretation of cDNA
sequences, development of an alerting system targeting spurious
sequences in cloned cDNAs is an urgent requirement for massive cDNA
sequence analysis. We describe here the application of a modified
GeneMark program, originally designed for prokaryotic gene finding, for
detection of artifacts in cDNA clones. This program serves to provide a
warning when any spurious split of protein-coding regions is detected
through statistical analysis of cDNA sequences based on Markov models.
In this study, 817 cDNA sequences that we had deposited in public
databases were analyzed with this alerting system to assess its
sensitivity and specificity. The results indicated that any spurious
split of protein-coding regions in cloned cDNAs could be sensitively
detected by this system and, after experimental validation of the
alerts, systematically revised. Furthermore, this study offered
us, for the first time, statistical data regarding the rates and types
of errors causing protein-coding splits in cloned cDNAs obtained by
conventional cloning methods.
1 Corresponding author.
10:1333-1341 ©2000 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/00 $5.00