Published online before print June 18, 2002, 10.1101/gr.75202
Vol. 12, Issue 7, 1127-1134, July 2002
METHODS
A Computer-Based Method of Selecting Clones for a Full-Length cDNA Project: Simultaneous Collection of Negligibly Redundant and Variant cDNAs
Naoki Osato,1 Masayoshi Itoh,2 Hideaki Konno,1 Shinji Kondo,1 Kazuhiro Shibata,2 Piero Carninci,2 Toshiyuki Shiraki,2 Akira Shinagawa,1 Takahiro Arakawa,1 Shoshi Kikuchi,3 Kouji Sato,3 Jun Kawai,1,2,4 and Yoshihide Hayashizaki1,2
1 Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), Yokohama, 230-0045, Japan;
2 Genome Science Laboratory, RIKEN Wako Main Campus, Wako, 351-0198, Japan;
3 Department of Molecular Biology, National Institute of Agrobiological Sciences, Tsukuba, 305-8602, Japan
We describe a computer-based method that selects representative
clones for full-length sequencing in a full-length cDNA project. Our
method classifies end sequences using two kinds of criteria: grouping
and clustering. Grouping places together variant cDNAs, family genes,
and cDNAs with sequencing errors. Clustering separates those cDNA
clones into distinct clusters. The full-length sequences of the clones
selected by grouping are determined preferentially, and then the
sequences selected by clustering are determined. Grouping reduced the
number of rice cDNA clones for full-length sequencing to 21% and mouse
cDNA clones to 25%. Rice full-length sequences selected by grouping
showed a 1.07-fold redundancy. Mouse full-length sequences showed a
1.04-fold redundancy, a reduction of ~30% relative to
selection with our previous method. To estimate the coverage of unique
genes, we used FANTOM (Functional Annotation of RIKEN Mouse cDNA
Clones) clusters (the RIKEN Genome Exploration Research Group 2001).
Grouping covered almost all unique genes (93% of FANTOM clusters), and
clustering covered all genes. Therefore, our method is useful for the
selection of appropriate representative clones for full-length
sequencing, thereby greatly reducing the cost, labor, and time
necessary for this process.
[The programs used in this paper
are available online at http://genome.gsc.riken.go.jp/software/2C.]
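The two-pass selection described above can be illustrated with a minimal sketch. This is not the authors' program: the abstract does not state the actual criteria, so the sketch assumes that grouping and clustering can be modeled as single-linkage partitions of clones under a loose and a strict pairwise identity threshold. The function names, the thresholds `loose` and `strict`, and the clone identifiers are all hypothetical.

```python
from collections import defaultdict


def partition(clones, pairs, min_identity):
    """Single-linkage partition: link two clones when identity >= min_identity."""
    parent = {c: c for c in clones}

    def find(x):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity in pairs:
        if identity >= min_identity:
            parent[find(b)] = find(a)

    sets = defaultdict(list)
    for c in clones:
        sets[find(c)].append(c)
    # Sort members and sets so that the result is deterministic.
    return sorted(sorted(members) for members in sets.values())


def select_two_pass(clones, pairs, loose=0.80, strict=0.98):
    """Return (first_pass, second_pass) lists of representative clones.

    ASSUMPTION: 'grouping' is modeled by the loose threshold and
    'clustering' by the strict one; both values are illustrative only.
    """
    groups = partition(clones, pairs, loose)     # variants and families merged
    clusters = partition(clones, pairs, strict)  # variants kept apart

    # Pass 1: one representative per group, sequenced preferentially.
    first_pass = [g[0] for g in groups]
    chosen = set(first_pass)

    # Pass 2: one representative per cluster not yet covered by pass 1.
    second_pass = [c[0] for c in clusters if chosen.isdisjoint(c)]
    return first_pass, second_pass


# Hypothetical end-sequence comparisons: (clone, clone, fractional identity).
clones = ["a1", "a2", "a3", "b1"]
pairs = [("a1", "a2", 0.99), ("a2", "a3", 0.85)]
first, second = select_two_pass(clones, pairs)
print(first, second)
```

In this toy input, `a3` resembles `a2` only loosely, so it shares a group with `a1` and `a2` but forms its own cluster. It is therefore deferred to the second pass, which mirrors the idea of sequencing group representatives first and recovering variants afterward.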
4 Corresponding author.
12:1127-1134 ©2002 by Cold Spring Harbor Laboratory Press ISSN 1088-9051/02 $5.00