Genome Research

Published online before print May 7, 2008, 10.1101/gr.069104.107
Genome Res. 18:888-899, 2008
©2008 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/08 $5.00

Letter

De novo search for non-coding RNA genes in the AT-rich genome of Dictyostelium discoideum: Performance of Markov-dependent genome feature scoring

Pontus Larsson1, Andrea Hinas2,4, David H. Ardell3,5,6, Leif A. Kirsebom1, Anders Virtanen1, and Fredrik Söderbom2,6

1 Department of Cell and Molecular Biology, Biomedical Center, Uppsala University, SE-75124 Uppsala, Sweden; 2 Department of Molecular Biology, Biomedical Center, Swedish University of Agricultural Sciences, SE-75124 Uppsala, Sweden; 3 Linnaeus Centre for Bioinformatics, Biomedical Center, SE-751 24 Uppsala, Sweden

Genome data are increasingly important in the computational identification of novel regulatory non-coding RNAs (ncRNAs). However, most ncRNA gene-finders are either specialized to well-characterized ncRNA gene families or require comparisons of closely related genomes. We developed a method for de novo screening for ncRNA genes with a nucleotide composition that stands out against the background genome based on a partial sum process. We compared the performance when assuming independent and first-order Markov-dependent nucleotides, respectively, and used Karlin-Altschul and Karlin-Dembo statistics to evaluate the significance of hits. We hypothesized that a first-order Markov-dependent process might have better power to detect ncRNA genes since nearest-neighbor models have been shown to be successful in predicting RNA structures. A model based on a first-order partial sum process (analyzing overlapping dinucleotides) had better sensitivity and specificity than a zeroth-order model when applied to the AT-rich genome of the amoeba Dictyostelium discoideum. In this genome, we detected 94% of previously known ncRNA genes (at this sensitivity, the false positive rate was estimated to be 25% in a simulated background). The predictions were further refined by clustering candidate genes according to sequence similarity and/or searching for an ncRNA-associated upstream element. We experimentally verified six out of 10 tested ncRNA gene predictions. We conclude that higher-order models, in combination with other information, are useful for identification of novel ncRNA gene families in single-genome analysis of D. discoideum. Our generalizable approach extends the range of genomic data that can be searched for novel ncRNA genes using well-grounded statistical methods.
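As an illustration of the first-order partial sum idea described above, the following is a minimal Python sketch and not the authors' implementation. It makes several assumptions that the abstract does not state: each overlapping dinucleotide is scored as the log-ratio of a first-order Markov transition probability estimated from example ncRNA sequences to the one estimated from background sequence, probabilities are smoothed with a pseudocount, and the highest-scoring segment is located with a standard running-sum (Kadane-style) scan. All function and parameter names are hypothetical, and the significance evaluation (Karlin-Altschul and Karlin-Dembo statistics) is not covered.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: sequences are uppercase DNA without ambiguity codes


def transition_probs(seqs, pseudocount=1.0):
    """Estimate first-order Markov transition probabilities P(b | a)
    from overlapping dinucleotide counts, smoothed with a pseudocount."""
    counts = Counter()
    for s in seqs:
        for a, b in zip(s, s[1:]):
            counts[(a, b)] += 1
    probs = {}
    for a in ALPHABET:
        total = sum(counts[(a, b)] for b in ALPHABET) + len(ALPHABET) * pseudocount
        for b in ALPHABET:
            probs[(a, b)] = (counts[(a, b)] + pseudocount) / total
    return probs


def log_odds(foreground, background):
    """Per-dinucleotide score: log of foreground over background probability."""
    return {pair: math.log(foreground[pair] / background[pair]) for pair in foreground}


def max_scoring_segment(seq, score):
    """Scan the partial sums of dinucleotide scores and return
    (best_score, start, end) so that seq[start:end] is the best segment.
    Returns (0.0, 0, 0) if no dinucleotide has a positive score."""
    best = (0.0, 0, 0)
    running = 0.0
    start = 0
    for i in range(len(seq) - 1):
        if running <= 0.0:
            # A non-positive prefix can never help; restart the sum here.
            running = 0.0
            start = i
        running += score[(seq[i], seq[i + 1])]
        if running > best[0]:
            # Dinucleotide i covers nucleotides i and i + 1.
            best = (running, start, i + 2)
    return best


# Toy usage with made-up training sequences (not data from the study).
fg = transition_probs(["GCGCGCGCGC"])
bg = transition_probs(["AAAAATTTTTAAAATTTT"])
scores = log_odds(fg, bg)
segment = max_scoring_segment("AAAAGCGCGCAAAA", scores)
```

In this toy setup the returned segment covers the GC-rich stretch embedded in the A-rich flanks. A zeroth-order variant would score single nucleotides instead of overlapping pairs, which is the comparison the abstract reports.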


4 Present addresses: Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Room 3050, Cambridge, MA 02138, USA;

5 School of Natural Sciences, University of California, Merced, CA 95344, USA.

6 Corresponding authors.

E-mail dardell{at}ucmerced.edu; fax (209) 228-4060.

E-mail fredde{at}xray.bmc.uu.se; fax 46-18-536971.

[Supplemental material is available online at http://www.genome.org. The sequence data from this study have been submitted to GenBank under accession nos. EF551319 and EF551320.]

Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.069104.107.

