Genome Research

Genome Res. 13:1542-1551, 2003
©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00

Resources

Development and Evaluation of an Automated Annotation Pipeline and cDNA Annotation System

Takeya Kasukawa1,2, Masaaki Furuno1, Itoshi Nikaido1, Hidemasa Bono1, David A. Hume3, Carol Bult4, David P. Hill4, Richard Baldarelli4, Julian Gough5, Alexander Kanapin6, Hideo Matsuda7, Lynn M. Schriml8, Yoshihide Hayashizaki1,9, Yasushi Okazaki1,11 and John Quackenbush10,11

1 Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
2 Multimedia Development Center, Advanced Technology Development Department, NTT Software Corporation, Yokohama, Kanagawa 231-8554, Japan
3 Institute for Molecular Bioscience and ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
4 Mouse Genome Informatics Group, The Jackson Laboratory, Bar Harbor, Maine 04609, USA
5 Structural Studies, MRC Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, UK
6 The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
7 Graduate School of Information Science and Technology, Osaka University, Toyonaka, Osaka 560-8531, Japan
8 The National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20894, USA
9 Genome Science Laboratory, RIKEN, Hirosawa, Wako, Saitama 351-0198, Japan
10 The Institute for Genomic Research, Rockville, Maryland 20850, USA

Manual curation has long been held to be the "gold standard" for functional annotation of DNA sequence. Our experience with the annotation of more than 20,000 full-length cDNA sequences revealed problems with this approach, including inaccurate and inconsistent assignment of gene names, as well as many good assignments that were difficult to reproduce using only computational methods. For the FANTOM2 annotation of more than 60,000 cDNA clones, we developed a number of methods and tools to circumvent some of these problems, including an automated annotation pipeline that provides high-quality preliminary annotation for each sequence by introducing an "uninformative filter" that eliminates uninformative annotations, controlled vocabularies to accurately reflect both the functional assignments and the evidence supporting them, and a highly refined, Web-based manual annotation tool that allows users to view a wide array of sequence analyses and to assign gene names and putative functions using a consistent nomenclature. The ultimate utility of our approach is reflected in the low rate of reassignment of automated assignments by manual curation. Based on these results, we propose a new standard for large-scale annotation, in which the initial automated annotations are manually investigated and then computational methods are iteratively modified and improved based on the results of manual curation.
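The "uninformative filter" mentioned in the abstract can be illustrated with a minimal sketch. It is not the FANTOM2 implementation: the pattern list, the `Hit` record, and the function names below are hypothetical, chosen only to show the general idea of discarding database hits whose descriptions carry no functional information before picking the best remaining hit as the preliminary annotation.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical examples of descriptions that say nothing about function.
# The real filter's vocabulary is not given in the abstract.
UNINFORMATIVE_PATTERNS = [
    r"\bhypothetical protein\b",
    r"\bunnamed protein product\b",
    r"\bunknown\b",
    r"\buncharacterized\b",
    r"\bcDNA clone\b",
]
_UNINFORMATIVE_RE = re.compile("|".join(UNINFORMATIVE_PATTERNS), re.IGNORECASE)


@dataclass
class Hit:
    """One similarity-search hit (hypothetical record layout)."""
    description: str
    score: float


def is_uninformative(description: str) -> bool:
    """Return True if the description matches any uninformative pattern."""
    return bool(_UNINFORMATIVE_RE.search(description))


def best_informative_hit(hits: Iterable[Hit]) -> Optional[Hit]:
    """Drop uninformative hits, then return the highest-scoring survivor.

    Returns None when every hit is uninformative or the input is empty.
    """
    informative = [h for h in hits if not is_uninformative(h.description)]
    return max(informative, key=lambda h: h.score, default=None)


if __name__ == "__main__":
    example_hits = [
        Hit("hypothetical protein XP_1", 900.0),
        Hit("Unknown EST", 800.0),
        Hit("protein kinase C alpha", 750.0),
    ]
    print(best_informative_hit(example_hits))
```

In this sketch the top-scoring hit is skipped because its description is uninformative, so a lower-scoring but meaningful name is proposed instead. Under the iterative approach the abstract proposes, a pattern list of this kind would be revised as manual curation reveals new uninformative names.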


Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.992803.

11 Corresponding authors.
E-MAIL rgscerg{at}gsc.riken.go.jp; FAX 81-45-503-9216.
E-MAIL johnq{at}tigr.org; FAX +1-301-838-0208.

[Supplemental material is available online at www.genome.org.]





