Genome Res. 14:1756-1766, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00

Methods

EGPred: Prediction of Eukaryotic Genes Using Ab Initio Methods After Combining With Sequence Similarity Approaches

Institute of Microbial Technology, Sector 39A, Chandigarh-160036, India

EGPred is a Web-based server that combines ab initio methods and similarity searches to predict genes, particularly exon regions, with high accuracy. The EGPred program proceeds in the following steps: (1) an initial BLASTX search of genomic sequence against the RefSeq database is used to identify protein hits with an E-value <1; (2) a second BLASTX search of genomic sequence against the hits from the previous run with relaxed parameters (E-values <10) helps to retrieve all probable coding exon regions; (3) a BLASTN search of genomic sequence against the intron database is then used to detect probable intron regions; (4) the probable intron and exon regions are compared to filter/remove wrong exons; (5) the NNSPLICE program is then used to reassign splicing signal site positions in the remaining probable coding exons; and (6) finally ab initio predictions are combined with exons derived from the fifth step based on the relative strength of start/stop and splice signal sites as obtained from ab initio and similarity search. The combination method increases the exon level performance of five different ab initio programs by 4%-10% when evaluated on the HMR195 data set. Similar improvement is observed when ab initio programs are evaluated on the Burset/Guigo data set. Finally, EGPred is demonstrated on an
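
The E-value filtering in steps (1) and (2) and the exon/intron comparison in step (4) can be illustrated with a minimal Python sketch. This is not the EGPred implementation: the `Hit` record, the function names, the example coordinates, and the 50% overlap cutoff are all hypothetical, since the abstract does not say how probable exon and intron regions are compared. Only the two E-value thresholds (<1 and <10) come from the text above.

```python
from dataclasses import dataclass

# Thresholds stated in the abstract for the two BLASTX passes.
FIRST_PASS_MAX_E = 1.0    # step (1): protein hits with E-value < 1
SECOND_PASS_MAX_E = 10.0  # step (2): relaxed search, E-value < 10


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one similarity hit on the genomic sequence."""
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    evalue: float


def keep_hits(hits, max_evalue):
    """Keep only hits whose E-value is strictly below the threshold."""
    return [h for h in hits if h.evalue < max_evalue]


def overlap(a, b):
    """Number of positions shared by two inclusive intervals."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def filter_exons(exons, introns, max_intron_fraction=0.5):
    """Drop exons mostly covered by a single intron hit.

    The 0.5 cutoff is an assumption made for illustration only.
    """
    kept = []
    for exon in exons:
        length = exon.end - exon.start + 1
        covered = max((overlap(exon, i) for i in introns), default=0)
        if covered / length <= max_intron_fraction:
            kept.append(exon)
    return kept


# Made-up example data, not taken from the paper.
exon_hits = [Hit(100, 250, 1e-20), Hit(400, 520, 5.0), Hit(900, 990, 12.0)]
intron_hits = [Hit(380, 600, 1e-5)]

candidates = keep_hits(exon_hits, SECOND_PASS_MAX_E)
probable_exons = filter_exons(candidates, intron_hits)
print([(e.start, e.end) for e in probable_exons])  # [(100, 250)]
```

In this toy run the third hit fails the relaxed E-value threshold, and the second lies entirely inside an intron hit, so only the first region survives as a probable coding exon.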

1 Corresponding author. E-MAIL raghava@imtech.res.in; FAX +91-172-269-0632 or +91-172-269-0585.
[Supplemental material is available online at www.genome.org and http://www.imtech.res.in/raghava/egpred/supl/.] Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.2524704.