Genome Res. 13:1455-1465, 2003
©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00
Letter
Kinesin Superfamily Proteins (KIFs) in the Mouse Transcriptome
1Department of Cell Biology and Anatomy, Graduate School of Medicine, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
2Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
3Genome Science Laboratory, RIKEN, Hirosawa, Wako, Saitama 351-0198, Japan
In the postgenomic era, in which virtually all genes and proteins are known, an important task is to provide a comprehensive analysis of the expression of important classes of genes, such as those required for intracellular transport. We report a comprehensive analysis of the Kinesin Superfamily, which is the first and only large protein family whose constituents have been completely identified and confirmed in silico and at the cDNA and mRNA level. In FANTOM2, we found 90 clones from 33 Kinesin Superfamily Protein (KIF) gene loci. The clones were analyzed with respect to sequence state, library of origin, detection methods, and alternative splicing. More than half of the representative transcriptional units (TUs) were full length. The FANTOM2 library also contains previously unreported splice variants. We have compared and evaluated various protein classification tools and protein search methods using this data set. This report provides a foundation for future research on intracellular transport along microtubules and demonstrates the significance of intracellular transport protein transcripts as part of the transcriptome.
The mouse has been proven to be an excellent genetic model for the understanding of human biology. The availability of the genomic sequence of both organisms also allows for a comprehensive analysis of the catalog of classes of genes (Hattori et al. 2000
The trafficking of proteins is tightly regulated and various different types of proteins are known to be involved. Members of the Kinesin Superfamily Proteins (KIFs) have been shown to transport organelles, protein complexes, and mRNAs to specific destinations in a microtubule- and ATP-dependent manner (Hirokawa 1996
KIFs can be divided into three classes depending on the location of the motor domain in the molecule. N kinesins and M kinesins, which contain motor domains close to their N terminus or center, have been reported to possess microtubule plus end-directed motility. There are three KIFs that contain a motor domain proximal to the C terminus and possess minus end-directed motility. Microtubule plus end-directed transport is mainly driven by KIFs, whereas cytoplasmic dynein is responsible for the bulk of microtubule minus end-directed transport. The Kinesin Superfamily is the first and only large protein family whose constituents have been completely identified and confirmed in silico and at the cDNA or mRNA level (Miki et al. 2001). To set the foundation for functional genomics of the intracellular transport network in the transcriptome, we have analyzed the Kinesin Superfamily, an essential component of the microtubule (MT)-dependent transport system, in the largest cDNA library to date, the FANTOM2 library.
KIF Clones in FANTOM2
Of the 45 KIF loci identified in the genome, representative transcripts of 33 loci were found in the FANTOM2 library (Table 1). The 33 representative sequences arise from a total of 90 clones deriving from 49 libraries.
Seventeen representative transcripts were full length (51.5%); two sequences had problems other than truncation (6.1%), specifically, one had a 1.5-kb deletion in the middle of the coding region and one locus was represented by an unspliced genomic fragment (Fig. 1). Seven representative transcripts were 3' truncated (21.2%) and six were 5' truncated (18.2%). One representative clone was 5' and 3' truncated. Twenty out of the 90 KIF clones did not contain the signature motor domain motif.
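The sequence-state percentages above can be rechecked with a short tally. The following is only an illustrative sketch: the counts are taken from the paragraph above, while the dictionary keys and the helper name `percentages` are labels chosen for this example and do not come from the FANTOM2 annotation pipeline.

```python
# Counts of representative transcripts by sequence state, as reported in the text.
# The category labels are illustrative names chosen for this sketch.
states = {
    "full length": 17,
    "other problems": 2,        # 1.5-kb internal deletion; unspliced genomic fragment
    "3' truncated": 7,
    "5' truncated": 6,
    "5' and 3' truncated": 1,
}

def percentages(counts):
    """Return each count as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {state: round(100 * n / total, 1) for state, n in counts.items()}

total_loci = sum(states.values())
pct = percentages(states)

for state, value in pct.items():
    print(f"{state}: {states[state]}/{total_loci} = {value}%")
```

The five categories sum to the 33 representative loci, and the rounded values match the percentages quoted in the text.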
KIF Clones in Phase I
Detection in Neural Tissue
Alternative Splicing
Of the two KIF3B transcripts, C130035P16 and D030068F10, the former is an unreported isoform and the latter is identical to the sequence deposited in GenBank (NM_008444). The new isoform has only two exons. The first exon is shared up to base 1095. There the novel form splices to the second exon, which is unique to the variant and is located in the genome between the sixth and seventh exons of the conventional form. The intron between the first and second exons starts with the nucleotide sequence GA and ends with AG. Twenty-three ESTs in the public database specifically support the previously reported form, whereas one EST is specific for the novel form. The open reading frame (ORF) of the original form translates into 747 amino acids and that of the new form into 329 residues, excluding a one-base insertion that is supported by neither the original clone nor the genomic sequence.
Clone E030019L05 is identical to the KIF9 sequence in GenBank (NM_010628). Clone 4921509F14 is identical up to the 774th amino acid residue, after which the conventional KIF9 sequence has 16 residues whereas the novel one has a different 36 residues. The two isoforms share the first 17 exons. The conventional form has an 18th and a 19th exon, which are located downstream of the last exon of the variant in the genome. Eleven ESTs in the NCBI mouse EST database support the original isoform; 29 support the novel isoform.
KIF17 has a previously unpublished variant, 5930435E01. This splice form lacks the 8th, 9th, and 15th exons of the published form (accession no. AB008867). As a result, the first 940 amino acids are shared, excluding residues 411-649. Because of a frameshift resulting from the deletion of the 15th exon, the last 8 amino acid residues are specific to the novel isoform. The presence of the 8th and 9th exons is supported by 2 ESTs, and the 15th exon is supported by an additional 2 ESTs. One deposited EST lacks the 15th exon.
In the FANTOM2 library, there are three KIF24 clones, all of which contain different sequences resulting from alternative splicing. Clone 430019P19, the longest of the three, is encoded by 10 exons. Clone D030003D17 ends in the 8th exon of clone 430019P19 without any in-frame stop codon but contains four exons between the 3rd and 4th exons of the former clone. Clone 4933425J19 shares the first seven exons with clone D030003D17. However, the 7th exon is extended beyond the splice site for D030003D17 and yields an in-frame stop codon. Clone 4933425J19 contains three separate bases dispersed throughout the transcript that are found neither in the other two clones nor in the genome and are not considered in this study. The 3' end of the longest clone, 4933425J19, matches eight ESTs in the public database. One EST supports the four exons included in clone 4933425J19; in contrast, no EST was found that agreed with D030003D17 in leaving out the four exons. Clone D030003D17 encodes an 862-amino-acid protein, clone 430019P19 encodes 747 residues, and clone 4933425J19 encodes 371 residues, ignoring the three inserted bases.
Identification of KIF Clones
BLASTN and TBLASTN searches using the nucleotide and amino acid sequences of KIFs did not reveal any further clones in the FANTOM2 set.
Phylogeny of KIFs in FANTOM2
When including KIFs found in Phase I, all classes and subfamilies are represented.
Two sets of molecular motors, KIFs and dyneins, use the microtubule cytoskeleton as rails. Of the 45 KIF loci in mouse, representative transcripts from 33 loci were found in FANTOM2, along with 5 novel isoforms. Adding the 2 isoforms of KIF1B, the resulting TU coverage for KIFs in FANTOM2 is 86.7%. When the Phase I clones are considered, the coverage rises to 94.1%; both values are in good agreement with the overall FANTOM2 TU coverage of 90.1%. Twelve KIFs were not found in FANTOM2. The lack of KIFs that are normally abundant in other cDNA libraries may reflect the thorough subtraction of abundant transcripts conducted during the development of FANTOM2. Despite subtraction, 25 of the 33 KIFs found in FANTOM2 (75.8%) were derived from neural tissue or mixtures of neural and other tissue. Including sequences found in the Phase I set, the percentage is 78.6%. Previously, we reported that a similar percentage, 84.4%, of the KIFs (38 out of 45) were detected in neural tissue (Miki et al. 2001
The Phase I data set comprises 547,149 5' end sequences and 1,442,236 3' end sequences collected to select clones with unique 5' and 3' end sequences for the FANTOM2 clone set (Okazaki et al. 2002
There are two previously reported isoforms of KIF1B (Nangaku et al. 1994
The KIF motor domain comprises highly conserved ATP-binding and microtubule-binding motifs, which are required for motility. The p-loop binds ATP, whereas switch 1 and switch 2 form a salt bridge that is broken upon release of
Of the 57 KIFs identified by Pfam, 28 were identified by all other methods. These numbers indicate the accuracy of Pfam in identifying KIFs. However, Pfam did not detect 33 clones, including 12 clones that contain an intact switch 2 consensus sequence, which is used for identifying KIFs. The reason these 12 clones were not selected cannot be inferred from the amino acid sequence. It is possible that the algorithm can be improved to increase sensitivity. InterPro hits contained one kinesin light chain clone, which belongs to a separate category. InterPro uses several criteria, including p-loop, switch 1, and switch 2 sequences. These two protein motif search engines had a very low false positive rate but a high false negative rate, even for clones containing an intact signature motif. Pfam and InterPro search for and identify protein motifs. These motifs are then used to classify proteins by Gene Ontology. Gene Ontology classification falsely categorized seven clones as KIFs but detected more KIF sequences than the two motif search engines. Auto-annotation picked up the most false positives, detecting 18 false KIFs. These hits include kinesin light chains and GenBank entries containing words such as "similar to Rab6 kinesin". These 18 false auto-annotations were decreased to 8 by human annotation. The decrease in false positives and the high detection rate imply the need for human curation. These two methods correctly chose 84 and 72 clones out of 90, respectively. Auto-annotation missed two full-length clones with identical sequences deposited in GenBank. By human annotation, no clone containing the signature motif was neglected.
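The Pfam figures quoted above can be restated as a detection rate and a false negative rate over the 90 KIF clones. This is a minimal sketch for illustration only: the function name `detection_rate` and the variable names are assumptions made for this example, and only the counts 57, 33, and 90 come from the text.

```python
TOTAL_KIF_CLONES = 90  # KIF clones found in FANTOM2, from the text

def detection_rate(detected, total=TOTAL_KIF_CLONES):
    """Fraction of the known KIF clones that a search method recovered."""
    if not 0 <= detected <= total:
        raise ValueError("detected must lie between 0 and total")
    return detected / total

# Pfam counts as stated in the text.
pfam_detected = 57
pfam_missed = 33

# The two Pfam counts should account for every KIF clone.
pfam_counts_consistent = (pfam_detected + pfam_missed == TOTAL_KIF_CLONES)

pfam_rate = detection_rate(pfam_detected)
pfam_false_negative_rate = pfam_missed / TOTAL_KIF_CLONES

print(f"Pfam detection rate: {100 * pfam_rate:.1f}%")
print(f"Pfam false negative rate: {100 * pfam_false_negative_rate:.1f}%")
```

The same helper could be applied to the counts for the other methods once each count is matched unambiguously to its method.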
All other clones that were not selected by the two methods, the false negatives, contained only UTR sequences, needed to be reversed and complemented, or lacked exons present in the GenBank sequence. These clones are difficult to identify without thorough familiarity with the various KIF sequences. Therefore, to identify pre-existing KIFs deposited in databases such as GenBank, human annotation may be the best method: it has a high detection rate, does not require motor domain consensus sequences in truncated clones, and reduces false positives through human curation. However, protein motif search engines may be better at categorizing new full-length clones not previously deposited in any database, for which there would be no exactly matching reference. False positives can be reduced by excluding kinesin light chains, which do not contain motor domains, and GenBank deposit sequences that are titled "similar to kinesin," etc.
Twenty of the 90 KIF clones (22.2%) did not contain the signature motif. This percentage is similar to the 25% lower protein motifs found in the overall CDS in FANTOM2. More clones contained the full-length sequence than not. The percentages of 5'-truncated and 3'-truncated clones were approximately equal, suggesting that there is no preference for truncation of either end. Only one locus was represented by a 5' and 3' truncation, adding evidence for the quality of this clone set. Only one locus was represented by a clone with other problems. Although some of the clones not containing the motor domain may have become truncated during reverse transcription or other technical steps, it is possible that these clones exist in vivo. As described for the KIF17 splice variant above, these transcripts would function as dominant negative regulators of cargo binding. Intracellular transport of cargoes could be controlled by competitive binding of the cargo-binding domains of intact and truncated KIFs.
The 3' truncations may also be due to technical causes, though the possibility that they exist in vivo cannot be excluded. These transcripts would function in the cell as a result of transcriptional regulation and/or may bind alternative binding partners by exposing domains that are hidden in longer transcripts.
Seven of the 14 classes of KIFs and 10 of the 18 subfamilies had clones from all loci. This is a high number for a cDNA library and reflects the high coverage of TUs in FANTOM2. In addition, all members of five subfamilies were represented by full-length clones. The KIFs reflect the representation of the transcriptome in FANTOM2, and this representation is in good agreement with predictions for all transcripts. It is highly possible that the predictions are accurate, as indicated by the highly similar indicators for the KIFs. The high occurrence of KIFs and the abundance of full-length clones indicate the value of FANTOM2 and suggest that a complete catalog of the transcriptome is within reach.
Summary and Future Implications
Recent studies have begun to reveal that KIFs use scaffolding and adaptor protein complexes for this purpose (Nakagawa et al. 2000
The answers to these basic cell biological questions should be pursued using FANTOM resources of the mouse transcriptome. The FANTOM2 library will set the standard by serving as an encyclopedia for the future analysis of all transcribed molecules.
Identification of All KIFs Contained in FANTOM2
We have screened for KIFs by using Pfam, InterPro, and Gene Ontology domain searches and auto-annotation and annotation by assigned curators from The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase II Team, 2002. The screen was confirmed by comprehensive BLASTN and TBLASTN searches using nucleotide and protein sequences of all KIFs. Results obtained by each method were recorded and compared (Table 1). KIFs with transcripts in FANTOM2 are indicated by a yellow-green underline in Figure 5.
Phase I Clones and EST Analysis
Sequence State Comparisons
The KIF motor domain was defined by the following criteria: conservation of upstream p-loop motifs and a switch 2 sequence approximately 150-200 amino acid residues downstream, a YXXXXXDLL motif where X is any amino acid, and a switch 1 motif located between the p-loop and switch 2 (Kikkawa et al. 2000
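A motif scan of this kind can be sketched with regular expressions. The following is a hypothetical illustration, not the procedure used in the study. The YXXXXXDLL pattern is taken from the criteria above. The p-loop pattern is the generic Walker A consensus (GXXXXGK followed by S or T), used here as an assumption because the text does not spell out the exact p-loop sequence. The function name `scan_motor_motifs` and the toy sequence are invented for this example, and no switch 1 or switch 2 pattern is included because their consensus sequences are not given in the text.

```python
import re

# Assumption: generic Walker A (P-loop) consensus, not necessarily the study's criterion.
P_LOOP = re.compile(r"G.{4}GK[ST]")
# YXXXXXDLL motif from the text, where X is any amino acid.
Y_DLL = re.compile(r"Y.{5}DLL")

def scan_motor_motifs(protein):
    """Report 0-based start positions of the first P-loop and YXXXXXDLL matches.

    Both motifs are searched over the whole sequence; no positional
    constraint between them is enforced in this sketch.
    """
    p = P_LOOP.search(protein)
    y = Y_DLL.search(protein)
    return {
        "p_loop_start": p.start() if p else None,
        "y_dll_start": y.start() if y else None,
        "spacing": (y.start() - p.start()) if (p and y) else None,
    }

# Synthetic toy sequence (not a real KIF): a P-loop-like stretch, a 160-residue
# spacer, and a YXXXXXDLL-like stretch.
toy = "MA" + "GQTGSGKT" + "A" * 160 + "YAAAAADLL" + "K"
hits = scan_motor_motifs(toy)
print(hits)
```

A clone for which one of the positions is `None` would be flagged as lacking that motif, which is how a truncated clone without the motor domain would appear in such a scan.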
Splice Variant Identification
Phylogenic Analysis
The authors are deeply indebted to other members of The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase II Team, 2002 and the Hirokawa lab. This work was funded by the Center of Excellence Grant-in-Aid from the Ministry of Education, Science, Sports, Culture and Technology of Japan to N. Hirokawa.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.984503.
4 Takahiro Arakawa,2 Piero Carninci,2,3 Jon Kawai,2,3 and Yoshihide Hayashizaki2,3
5 Corresponding author.
Aizawa, H., Sekine, Y., Takemura, R., Zhang, Z., Nangaku, M., and Hirokawa, N. 1992. Kinesin family in murine central nervous system. J. Cell Biol. 119: 1287-1296.
Brendza, R.P., Serbus, L.R., Duffy, J.B., and Saxton, W.M. 2000. A function for kinesin I in the posterior transport of oskar mRNA and Staufen protein. Science 289: 2120-2122.
Guillaud, L., Setou, M., and Hirokawa, N. 2003. KIF17 dynamics and regulation of NR2B trafficking in hippocampal neurons. J. Neurosci. 23: 131-140.
Ha, M.J., Yoon, J., Moon, E., Lee, Y.M., Kim, H.J., and Kim, W. 2000. Assignment of the kinesin family member 4 genes (KIF4A and KIF4B) to human chromosome bands Xq13.1 and 5q33.1 by in situ hybridization. Cytogenet. Cell Genet. 88: 41-42.
Hattori, M., Fujiyama, A., Taylor, T.D., Watanabe, H., Yada, T., Park, H.S., Toyoda, A., Ishii, K., Totoki, Y., Choi, D.K., et al. 2000. The DNA sequence of human chromosome 21. Nature 405: 283-284.
Hirokawa, N. 1996. Organelle transport along microtubules: the role of KIFs. Trends Cell Biol. 6: 135-141.
Hirokawa, N. 1998. Kinesin and dynein superfamily proteins and the mechanism of organelle transport. Science 279: 519-526.
Hirokawa, N., Noda, Y., and Okada, Y. 1998. Kinesin and dynein superfamily proteins in organelle transport and cell division. Curr. Opin. Cell Biol. 10: 60-73.
Kawai, J., Shinagawa, A., Shibata, K., Yoshino, M., Itoh, M., Ishii, Y., Arakawa, T., Hara, A., Fukunishi, Y., Konno, H., et al. 2001. Functional annotation of a full-length mouse cDNA collection. Nature 409: 685-690.
Kikkawa, M., Okada, Y., and Hirokawa, N. 2000. 15 Å resolution model of the monomeric kinesin motor, KIF1A. Cell 100: 241-252.
Kikkawa, M., Sablin, E.P., Okada, Y., Yajima, H., Fletterick, R.J., and Hirokawa, N. 2001. Switch-based mechanism of kinesin motors. Nature 411: 439-445.
Kim, A.J. and Endow, S.A. 2000. A kinesin family tree. J. Cell Sci. 113: 3681-3682.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.
Miki, H., Setou, M., Kaneshiro, K., and Hirokawa, N. 2001. All kinesin superfamily protein, KIF, genes in mouse and human. Proc. Natl. Acad. Sci. 98: 7004-7011.
Nakagawa, T., Tanaka, Y., Matsuoka, E., Kondo, S., Okada, Y., Noda, Y., Kanai, Y., and Hirokawa, N. 1997. Identification and classification of 16 new kinesin superfamily (KIF) proteins in mouse genome. Proc. Natl. Acad. Sci. 94: 9654-9659.
Nakagawa, T., Setou, M., Seog, D., Ogasawara, K., Dohmae, N., Takio, K., and Hirokawa, N. 2000. A novel motor, KIF13A, transports mannose-6-phosphate receptor to plasma membrane through direct interaction with AP-1 complex. Cell 103: 569-581.
Nakajima, K., Takei, Y., Tanaka, Y., Nakagawa, T., Nakata, T., Noda, Y., Setou, M., and Hirokawa, N. 2002. Molecular motor KIF1C is not essential for mouse survival and motor-dependent retrograde Golgi Apparatus-to-endoplasmic reticulum transport. Mol. Cell. Biol. 22: 866-873.
Nangaku, M., Sato-Yoshitake, R., Okada, Y., Noda, Y., Takemura, R., Yamazaki, H., and Hirokawa, N. 1994. KIF1B, a novel microtubule plus end-directed monomeric motor protein for transport of mitochondria. Cell 79: 1209-1220.
Okada, Y., Yamazaki, H., Sekine-Aizawa, Y., and Hirokawa, N. 1995. The neuron-specific kinesin super family protein KIF1A is a unique monomeric motor for anterograde axonal transport of synaptic vesicle precursors. Cell 87: 769-780.
Okamoto, S., Matsushima, M., and Nakamura, Y. 1998. Identification, genomic organization, and alternative splicing of KNSL 3, a novel human gene encoding a kinesin-like protein. Cytogenet. Cell Genet. 83: 25-29.
Okazaki, Y., Furuno, M., Kasukawa, T., Adachi, J., Bono, H., Kondo, S., Nikaido, I., Osato, N., Saito, R., Suzuki, H., et al. 2002. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420: 563-573.
Olivier, M., Aggarwal, A., Allen, J., Almendras, A.A., Bajorek, E.S., Beasley, E.M., Brady, S.D., Bushard, J.M., Bustos, V.I., Chu, A., et al. 2001. A high-resolution radiation hybrid map of the human genome draft sequence. Science 291: 1298-1302.
Page, R.D. 1996. TreeView: An application to display phylogenetic trees on personal computers. Comput. Applic. Biosci. 12: 357-358.
Piddini, E., Schmid, J.A., de Martin, R., and Dotti, C.G. 2001. The Ras-like GTPase Gem is involved in cell shape remodelling and interacts with the novel kinesin-like protein KIF9. EMBO J. 20: 4076-4087.
Reddy, A.S.N. and Day, I.S. 2001. Kinesins in the Arabidopsis genome: A comparative analysis among eukaryotes. BMC Genomics 2: 2-14.
Setou, M., Nakagawa, T., Seog, D.H., and Hirokawa, N. 2000. Kinesin superfamily motor protein KIF17 and mLin-10 in NMDA receptor-containing vesicle transport. Science 288: 1796-1802.
Setou, M., Seog, D.H., Tanaka, Y., Kanai, Y., Takei, Y., Kawagishi, M., and Hirokawa, N. 2002. Glutamate-receptor-interacting protein GRIP1 directly steers kinesin to dendrites. Nature 417: 83-87.
Sharp, D.J., Rogers, G.C., and Scholey, J.M. 2000. Microtubule motors in mitosis. Nature 407: 41-47.
Tanaka, Y., Zhang, Z., and Hirokawa, N. 1995. Identification and molecular evolution of new dynein-like protein sequences in rat brain. J. Cell Sci. 108: 1883-1893.
Vale, R.D. and Fletterick, R.J. 1997. The design plan of kinesin motors. Annu. Rev. Cell Develop. Biol. 13: 745-777.
Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., et al. 2001. The sequence of the human genome. Science 291: 1304-1351.
Verhey, K.J., Meyer, D., Deehan, R., Blenis, J., Schnapp, B.J., Rapoport, T.A., and Margolis, B. 2001. Cargo of kinesin identified as JIP scaffolding proteins and associated signaling molecules. J. Cell Biol. 152: 959-970.
Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., An, P., et al. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562.
Yamazaki, H., Nakata, T., Okada, Y., and Hirokawa, N. 1995. KIF3A/B: a heterodimeric kinesin superfamily protein that works as a microtubule plus end-directed motor for membrane organelle transport. J. Cell Biol. 130: 1387-1399.
Yang, Z., Hanlon, D.W., Marszalek, J.R., and Goldstein, L.S. 1997. Identification, partial characterization, and genetic mapping of kinesin-like protein genes in mouse. Genomics 45: 123-131.
Yang, Z., Roberts, E.A., and Goldstein, L.S. 2001. Functional analysis of mouse C-terminal kinesin motor KifC2. Mol. Cell. Biol. 21: 2463-2466.
Zhao, C., Takita, J., Tanaka, Y., Setou, M., Nakagawa, T., Takeda, S., Yang, H.W., Terada, S., Nakata, T., Takei, Y., et al. 2001. Charcot-Marie-Tooth disease type 2A caused by mutation in a microtubule motor KIF1B
Received November 25, 2002; accepted in revised form March 24, 2003.