Genome Res. 14:934-941, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00
ENSEMBL Special
The Ensembl Analysis Pipeline
Simon C. Potter1,
Laura Clarke1,
Val Curwen1,
Stephen Keenan1,
Emmanuel Mongin2,
Stephen M.J. Searle1,
Arne Stabenau2,
Roy Storey1 and
Michele Clamp3,4
1 The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
2 EMBL European Bioinformatics Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
3 The Broad Institute, Cambridge, Massachusetts 02141, USA
ABSTRACT
The Ensembl pipeline is an extension to the Ensembl system that allows automated annotation of genomic sequence. The software comprises two parts. The first is a set of Perl modules ("Runnables" and "RunnableDBs") that act as "wrappers" for a variety of commonly used analysis tools. These retrieve sequence data from a relational database, run the analysis, and write the results back to the database. They inherit from a common interface, which simplifies the writing of new wrapper modules. On top of this sits a job submission system (the "RuleManager") that allows efficient and reliable submission of large numbers of jobs to a compute farm. Here we describe the fundamental software components of the pipeline and highlight some features of the Sanger installation that were necessary to enable the pipeline to scale to whole-genome analysis.
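The two-part design described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the pipeline's own Perl, and it is not the Ensembl code: every class, method, and analysis name below (InMemoryDB, Wrapper, GCContent, GCFlag, run_pipeline) is hypothetical. The dependency rule between analyses is also an assumption made for illustration, since the abstract does not say how the RuleManager decides what to submit. Jobs run in-process here instead of being sent to a compute farm.

```python
from abc import ABC, abstractmethod


class InMemoryDB:
    """Stand-in for the relational database (hypothetical, for illustration)."""

    def __init__(self, sequences):
        self.sequences = dict(sequences)  # input_id -> sequence string
        self.results = {}                 # (analysis name, input_id) -> result

    def fetch_sequence(self, input_id):
        return self.sequences[input_id]

    def store_result(self, analysis, input_id, result):
        self.results[(analysis, input_id)] = result

    def has_result(self, analysis, input_id):
        return (analysis, input_id) in self.results


class Wrapper(ABC):
    """Common interface: fetch the input, run the analysis, write the result back."""

    name = "abstract"
    depends_on = ()  # assumed rule: analyses that must already be stored for this input

    def __init__(self, db, input_id):
        self.db = db
        self.input_id = input_id

    @abstractmethod
    def analyse(self, sequence):
        """Run the wrapped tool on one sequence and return its result."""

    def execute(self):
        sequence = self.db.fetch_sequence(self.input_id)
        result = self.analyse(sequence)
        self.db.store_result(self.name, self.input_id, result)


class GCContent(Wrapper):
    """Toy analysis: fraction of G and C bases in the sequence."""

    name = "gc_content"

    def analyse(self, sequence):
        gc = sum(1 for base in sequence.upper() if base in "GC")
        return gc / len(sequence) if sequence else 0.0


class GCFlag(Wrapper):
    """Toy analysis that needs the stored gc_content result for the same input."""

    name = "gc_flag"
    depends_on = ("gc_content",)

    def analyse(self, sequence):
        return self.db.results[("gc_content", self.input_id)] > 0.5


def run_pipeline(db, wrapper_classes):
    """Keep running every job whose dependencies are met until nothing is left to do."""
    progressed = True
    while progressed:
        progressed = False
        for input_id in db.sequences:
            for cls in wrapper_classes:
                if db.has_result(cls.name, input_id):
                    continue  # already done for this input
                if all(db.has_result(dep, input_id) for dep in cls.depends_on):
                    cls(db, input_id).execute()
                    progressed = True
```

Adding a new analysis in this sketch only requires a subclass that sets `name` and implements `analyse`, which mirrors the abstract's point that a shared interface simplifies writing new wrappers.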
Footnotes
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1859804.
4 Corresponding author. E-MAIL mclamp{at}broad.mit.edu; FAX (617) 258-0903.
