Genome Research


Vol. 9, Issue 11, 1143-1155, November 1999

RESOURCE
A Comprehensive Approach to Clustering of Expressed Human Gene Sequence: The Sequence Tag Alignment and Consensus Knowledge Base

Robert T. Miller,1 Alan G. Christoffels,1 Chella Gopalakrishnan,1 John Burke,2 Andrey A. Ptitsyn,1 Tania R. Broveak,3 and Winston A. Hide1,4

1 South African National Bioinformatics Institute, Private Bag X17, Bellville 7535, University of the Western Cape, South Africa; 2 Pangea Systems, Oakland, California 94612 USA; 3 Electric Genetics, Observatory, 7925, Cape Town, South Africa

The expressed human genome is being sequenced and analyzed by disparate groups producing disparate data. The majority of the identified coding portion is in the form of expressed sequence tags (ESTs). The need to discover exonic representation and expression forms of full-length cDNAs for each human gene is frustrated by the partial and variable quality nature of this data delivery. A highly redundant human EST data set has been processed into integrated and unified expressed transcript indices that consist of hierarchically organized human transcript consensi reflecting gene expression forms and genetic polymorphism within an index class. The expression index and its intermediate outputs include cleaned transcript sequence, expression, and alignment information and a higher fidelity subset, SANIGENE. The STACK_PACK clustering system has been applied to dbEST release 121598 (GenBank version 110). Sixty-four percent of 1,313,103 Homo sapiens ESTs are condensed into 143,885 tissue level multiple sequence clusters; linking through clone-ID annotations produces 68,701 total assemblies, such that 81% of the original input set is captured in a STACK multiple sequence or linked cluster. Indexing of alignments by substituent EST accession allows browsing of the data structure and its cross-links to UniGene. STACK metaclusters consolidate a greater number of ESTs by a factor of 1.86 with respect to the corresponding UniGene build. Fidelity comparison with genome reference sequence AC004106 demonstrates consensus expression clusters that reflect significantly lower spurious repeat sequence content and capture alternate splicing within a whole body index cluster and three STACK v.2.3 tissue-level clusters. Statistics of a staggered release whole body index build of STACK v.2.0 are presented.
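The clone-ID linking step mentioned above (joining clusters whose ESTs come from the same clone) can be illustrated with a small sketch. This is not the STACK_PACK implementation: the function name `link_clusters_by_clone`, the dictionary input format, and the cluster and clone identifiers are all hypothetical, and a standard union-find grouping is assumed here purely for illustration.

```python
from collections import defaultdict


def link_clusters_by_clone(cluster_clones):
    """Group cluster IDs that share at least one clone ID.

    cluster_clones: dict mapping a cluster ID to an iterable of clone IDs
    (a hypothetical input format, not the STACK_PACK file layout).
    Returns a sorted list of sorted lists of cluster IDs, one per linked group.
    """
    parent = {c: c for c in cluster_clones}

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Remember the first cluster seen for each clone; later clusters that
    # carry the same clone ID are merged into that cluster's group.
    first_seen = {}
    for cluster, clones in cluster_clones.items():
        for clone in clones:
            if clone in first_seen:
                union(first_seen[clone], cluster)
            else:
                first_seen[clone] = cluster

    groups = defaultdict(list)
    for cluster in cluster_clones:
        groups[find(cluster)].append(cluster)
    return sorted(sorted(g) for g in groups.values())


# Toy data (invented): c1, c2 and c3 are chained through shared clones.
example = {
    "c1": ["cloneA"],
    "c2": ["cloneA", "cloneB"],
    "c3": ["cloneB"],
    "c4": ["cloneC"],
}
linked = link_clusters_by_clone(example)
```

In this toy input, `c1`, `c2` and `c3` end up in one linked group and `c4` stays on its own, which mirrors how linking reduces the number of assemblies relative to the number of clusters.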


4 Corresponding author.


9:1143-1155 ©1999 by Cold Spring Harbor Laboratory Press  ISSN 1088-9051/99 $5.00



This article has been cited by other articles:

S. Hazelhurst, W. Hide, Z. Liptak, R. Nogueira, and R. Starfield
An overview of the wcd EST clustering tool
Bioinformatics, July 1, 2008; 24(13): 1542 - 1546.

E. Garrett-Mayer, G. Parmigiani, X. Zhong, L. Cope, and E. Gabrielson
Cross-study validation and combined analysis of gene expression microarray data
Biostat., April 1, 2008; 9(2): 333 - 354.

K.-S. Chow, K.-L. Wan, Mohd. N. M. Isa, A. Bahari, S.-H. Tan, K Harikrishna, and H.-Y. Yeang
Insights into rubber biosynthesis from transcriptome analysis of Hevea brasiliensis latex
J. Exp. Bot., July 1, 2007; 58(10): 2429 - 2440.

A. P.M. Weber, K. L. Weber, K. Carr, C. Wilkerson, and J. B. Ohlrogge
Sampling the Arabidopsis Transcriptome with Massively Parallel Pyrosequencing
Plant Physiology, May 1, 2007; 144(1): 32 - 42.

S. H. Nagaraj, R. B. Gasser, and S. Ranganathan
A hitchhiker's guide to expressed sequence tag (EST) analysis
Brief Bioinform, January 1, 2007; 8(1): 6 - 21.

K. Malde, K. Schneeberger, E. Coward, and I. Jonassen
RBR: library-less repeat detection for ESTs
Bioinformatics, September 15, 2006; 22(18): 2232 - 2236.

S. Park, N. Sugimoto, M. D. Larson, R. Beaudry, and S. van Nocker
Identification of Genes with Potential Roles in Apple Fruit Development and Biochemistry through Large-Scale Statistical Analysis of Expressed Sequence Tags
Plant Physiology, July 1, 2006; 141(3): 811 - 824.

R. Stepanyan, K. Day, J. Urban, D. L. Hardin, R. S. Shetty, C. D. Derby, B. W. Ache, and T. S. McClintock
Gene expression and specificity in the mature zone of the lobster olfactory organ.
Physiol Genomics, April 13, 2006; 25(2): 224 - 233.

M. Lemaire-Chamley, J. Petit, V. Garcia, D. Just, P. Baldet, V. Germain, M. Fagard, M. Mouassite, C. Cheniclet, and C. Rothan
Changes in Transcriptional Profiles Are Associated with Early Fruit Tissue Specialization in Tomato
Plant Physiology, October 1, 2005; 139(2): 750 - 769.

C. Yu, M. Dong, X. Wu, S. Li, S. Huang, J. Su, J. Wei, Y. Shen, C. Mou, X. Xie, et al.
Genes "Waiting" for Recruitment by the Adaptive Immune System: The Insights from Amphioxus
J. Immunol., March 15, 2005; 174(6): 3493 - 3500.

W. Nobis, X. Ren, S. P. Suchyta, T. R. Suchyta, A. J. Zanella, and P. M. Coussens
Development of a porcine brain cDNA library, EST database, and microarray resource
Physiol Genomics, December 16, 2003; 16(1): 153 - 159.

A. Kalyanaraman, S. Aluru, S. Kothari, and V. Brendel
Efficient clustering of large EST data sets on parallel computers
Nucleic Acids Res., June 1, 2003; 31(11): 2963 - 2974.

R. Sorek and H. M. Safer
A novel algorithm for computational identification of contaminated EST libraries
Nucleic Acids Res., February 1, 2003; 31(3): 1067 - 1074.

N. Osato, M. Itoh, H. Konno, S. Kondo, K. Shibata, P. Carninci, T. Shiraki, A. Shinagawa, T. Arakawa, S. Kikuchi, et al.
A Computer-Based Method of Selecting Clones for a Full-Length cDNA Project: Simultaneous Collection of Negligibly Redundant and Variant cDNAs
Genome Res., July 1, 2002; 12(7): 1127 - 1134.

L. Skrabanek and F. Campagne
TissueInfo: high-throughput identification of tissue expression profiles and specificity
Nucleic Acids Res., November 1, 2001; 29(21): e102 - e102.

A. Christoffels, A. v. Gelder, G. Greyling, R. Miller, T. Hide, and W. Hide
STACK: Sequence Tag Alignment and Consensus Knowledgebase
Nucleic Acids Res., January 1, 2001; 29(1): 234 - 238.

J. Muilu, P. Rodriguez-Tomé, and A. Robinson
GBuilder: An Application for the Visualization and Integration of EST Cluster Data
Genome Res., January 1, 2001; 11(1): 179 - 184.

I. Abdrakhmanov, D. Lodygin, P. Geroth, H. Arakawa, A. Law, J. Plachy, B. Korn, and J.-M. Buerstedde
A Large Database of Chicken Bursal ESTs as a Resource for the Analysis of Vertebrate Gene Function
Genome Res., December 1, 2000; 10(12): 2062 - 2069.

S. Y. Hsu and A. J. W. Hsueh
Discovering New Hormones, Receptors, and Signaling Mediators in the Genomic Era
Mol. Endocrinol., May 1, 2000; 14(5): 594 - 604.

H. Konno, Y. Fukunishi, K. Shibata, M. Itoh, P. Carninci, Y. Sugahara, and Y. Hayashizaki
Computer-Based Methods for the Mouse Full-Length cDNA Encyclopedia: Real-Time Sequence Clustering for Construction of a Nonredundant cDNA Library
Genome Res., February 1, 2001; 11(2): 281 - 289.

D. Zhuo, W. D. Zhao, F. A. Wright, H.-Y. Yang, J.-P. Wang, R. Sears, T. Baer, D.-H. Kwon, D. Gordon, S. Gibbs, et al.
Assembly, Annotation, and Integration of UNIGENE Clusters into the Human Genome Draft
Genome Res., May 1, 2001; 11(5): 904 - 918.



