Genome Research

Published online before print February 12, 2004, 10.1101/gr.1481104
Genome Res. 14:463-471, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00

Resources

Numerous Novel Annotations of the Human Genome Sequence Supported by a 5'-End–Enriched cDNA Collection

Betina M. Porcel (1,5), Olivier Delfour (1), Vanina Castelli (1), Veronique De Berardinis (1), Lucie Friedlander (1,2), Corinne Cruaud (1), Abel Ureta-Vidal (1,3), Claude Scarpelli (1), Patrick Wincker (1), Vincent Schächter (1), William Saurin (1,4), Gabor Gyapay (1), Marcel Salanoubat (1) and Jean Weissenbach (1)

1 Genoscope-Centre National de Séquençage and CNRS UMR-8030, 91000 Evry, France

A collection of 90,000 human cDNA clones, generated to increase the fraction of "full-length" cDNAs available, was analyzed by sequence alignment to the human genome assembly. Five hundred fifty-two gene models not found in LocusLink, with coding regions of at least 300 bp, were defined using this collection. The exon composition proposed for the novel genes showed an average of 4.7 exons per gene. In 20% of the cases, at least half of the exons predicted for new genes coincided with evolutionarily conserved regions defined by sequence comparisons with the pufferfish Tetraodon nigroviridis. Among this subset, CpG islands were observed at the 5' end in 75% of cases. In-frame stop codons upstream of the initiator ATG were present in 49% of the new genes, and 16% contained a coding region comprising at least 50% of the cDNA sequence. This cDNA resource also provided candidate small protein-coding genes, which are usually not included in genome annotations. In addition, analysis of a sample from this cDNA collection indicates that ~380 gene models described in LocusLink could be extended at their 5' end by at least one new exon. Finally, this cDNA resource provided experimental support for annotations based exclusively on predictions, and thus represents a resource that substantially improves the human genome annotation.
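Two of the criteria mentioned above, an in-frame stop codon upstream of the initiator ATG and the share of the cDNA occupied by the coding region, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy sequences, and the choice to count the coding region from the ATG through the first in-frame stop codon are all assumptions made for illustration.

```python
# Illustrative sketch only; names and conventions here are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_upstream_inframe_stop(cdna: str, atg_pos: int) -> bool:
    """Return True if a stop codon lies upstream of the ATG in its reading frame."""
    seq = cdna.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("atg_pos does not point at an ATG codon")
    # Step backwards one codon at a time, staying in the ATG's frame.
    for start in range(atg_pos - 3, -1, -3):
        if seq[start:start + 3] in STOP_CODONS:
            return True
    return False


def coding_fraction(cdna: str, atg_pos: int) -> float:
    """Fraction of the cDNA covered by the ORF starting at atg_pos.

    Assumption: the coding region runs from the ATG through the first
    in-frame stop codon (inclusive), or to the end of the sequence if
    no stop codon is found.
    """
    seq = cdna.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("atg_pos does not point at an ATG codon")
    for start in range(atg_pos, len(seq) - 2, 3):
        if seq[start:start + 3] in STOP_CODONS:
            return (start + 3 - atg_pos) / len(seq)
    return (len(seq) - atg_pos) / len(seq)


# Toy sequence: TAG CCC ATG AAA TAA, with the ATG at index 6.
toy = "TAGCCCATGAAATAA"
upstream_stop = has_upstream_inframe_stop(toy, 6)
fraction = coding_fraction(toy, 6)
```

An upstream in-frame stop suggests that the annotated ATG is not simply an internal methionine of a longer, 5'-truncated open reading frame, which is why it serves as supporting evidence that a cDNA covers the true start of the coding region.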


Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1481104. Article published online before print in February 2004.

5 Corresponding author.
E-MAIL betina{at}genoscope.cns.fr; FAX 33-1-60-87-25-14.

2 Present address: LGI-BioInformatic, Aventis Pharma S.A., 94400, Vitry-Sur-Seine, France

3 Present address: European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

4 Present address: Genomining, 92120, Montrouge, France.

[The sequence data from this study have been submitted to EMBL under accession nos. BX323813, BX323814, BX324295–BX465182, AL513551–AL583711.]







