Genome Res. 13:1430-1442, 2003
©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00
Letter
Systematic Characterization of the Zinc-Finger-Containing Proteins in the Mouse Transcriptome
1 Institute for Molecular Bioscience, Brisbane, Australia
2 ARC Special Research Centre for Functional and Applied Genomics, Brisbane, Australia
3 CRC for Chronic Inflammatory Diseases, Brisbane, Australia
4 Computational Biology and Bioinformatics Environment ComBinE, Brisbane, Australia
5 University of Queensland, Brisbane, Australia
6 Laboratory of Computational Genomics, The Rockefeller University, New York, New York 10021, USA
7 Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
8 Genome Science Laboratory, RIKEN, Hirosawa, Wako, Saitama 351-0198, Japan
Zinc-finger-containing proteins can be classified into evolutionarily and functionally divergent protein families that share one or more domains in which a zinc ion is tetrahedrally coordinated by cysteines and histidines. The zinc finger domain defines one of the largest protein superfamilies in mammalian genomes; 46 different conserved zinc finger domains are listed in InterPro (http://www.ebi.ac.uk/InterPro
Zinc-finger-containing proteins constitute the most abundant protein superfamily in the mammalian genome, and are best known as transcriptional regulators. They are involved in a variety of cellular activities such as development, differentiation, and tumor suppression. The first zinc finger domain to be identified in Xenopus laevis, basal transcription factor TFIIIA (Miller et al. 1985
Association of many zinc finger proteins with DNA- and/or protein-binding domains allows the formation of multiprotein complexes in which DNA-binding motifs recognize a target sequence in a specific manner or protein-protein interaction domains allow the assembly of multiprotein regulatory complexes, commonly involved in chromatin remodeling (Aasland et al. 1995
Zinc fingers are among the most common structural motifs in the proteome predicted from the genome sequences of Saccharomyces cerevisiae, Drosophila melanogaster, and Caenorhabditis elegans (Rubin et al. 2000
The RIKEN Mouse Gene Encyclopedia Project has provided the most comprehensive collection of full-length mammalian complementary DNAs (cDNAs; Okazaki et al. 2002
A total of 1573 protein sequences were extracted based on the presence of one or more zinc finger domains as recognized by InterPro (http://www.ebi.ac.uk/InterPro
Generation of Nonredundant Zinc Finger Protein Set
The RIKEN Genome Science Center in collaboration with the FANTOM consortium (http://genome.gsc.riken.go.jp 20,000 protein sequences (http://fantom2.gsc.riken.go.jp/InterPro searches for 46 conserved zinc finger domains against the RTPS-extracted 1573 zinc-finger-containing proteins. These represent 7.5% of the entire RTPS (Table 2). All 46 classifications of zinc finger domains were represented in the RTPS, with the five most frequent zinc finger domains being the C2H2 (506), RING finger (196), KRAB-box (134), LIM domain (60), and the PHD finger (52). Comparative analysis with other eukaryotes confirms similar frequencies of the zinc finger domains in other genomes (Supplementary Table 1; available online at www.genome.org).
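The proportions quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the RTPS size of 21,019 proteins given in the Methods, and it treats each domain count as the number of proteins carrying that domain. Because one protein can carry several domains, the per-domain shares are not expected to sum to 100%.

```python
# Counts taken from the text; RTPS_SIZE is the figure quoted in the Methods.
RTPS_SIZE = 21_019
ZF_PROTEINS = 1_573
top_domains = {
    "C2H2": 506,
    "RING finger": 196,
    "KRAB-box": 134,
    "LIM domain": 60,
    "PHD finger": 52,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of the whole RTPS that contains at least one zinc finger domain.
zf_share = percent(ZF_PROTEINS, RTPS_SIZE)

# Share of zinc finger proteins that carry each of the five most frequent domains.
domain_share = {name: percent(n, ZF_PROTEINS) for name, n in top_domains.items()}

print(f"zinc finger proteins in RTPS: {zf_share}%")
for name, share in domain_share.items():
    print(f"{name}: {share}% of zinc finger proteins")
```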
A comparison of the profiles with nonmammalian genomes revealed lineage-specific evolution in the zinc-finger-containing proteins. Certain zinc finger domains are vertebrate-specific. The KRAB (IPR001909), KRAB-related (IPR003655), Nuclear transition protein 2 (IPR000678), SCAN domain (IPR003309), and the subfamily of Nuclear receptor ROR (IPR003079) have not been identified in the D. melanogaster
(http://www.fruitfly.org/ In contrast, comparison of the predicted mouse and human zinc finger sets shows minimal lineage-specific evolution, although there are some examples of structural domain differences in putative mouse and human ortholog pairs. RNF6, RNF13, and G1RP1 are such examples (protein architectures of the mouse and human RNF6, RNF13, and G1RP1 are shown in Supplementary Fig. 4).
Cluster Analysis of the Zinc-Finger-Containing Proteins
The ZFPS contains 677 proteins that have not been identified previously. In this analysis, we consider as novel all those proteins that were annotated in the MATRICS computational pipeline (Kawai et al. 2001
We found that 33 of the 46 zinc finger families we have analyzed have at least one new member in the RTPS (Table 2). The majority of new proteins belong to the C2H2 family. Among 506 C2H2-containing proteins, 208 are new mouse transcripts (41%). The zinc finger family that presents the highest proportion of newly described proteins is the recently discovered DHHC-type zinc finger (IPR001594; Putilina et al. 1999
In our classification we noted a small group of structurally related newly described proteins that appear entirely novel. An example is cluster 24, which contains four new proteins sharing in common a central array of six C2H2 zinc fingers, one N-terminal C2H2 zinc finger, and an array of two to three C-terminal C2H2 zinc fingers. BLAST analysis of the proteins in cluster 24 (http://www.ncbi.nlm.nih.gov/BLAST) reveals no homologous proteins with functional annotation (Fig. 1). The name Fzf (Fantom zinc finger protein) has been proposed for this new family of C2H2 zinc fingers. The murine Fzf1 (9530006B08Rik) encodes a 1409-amino-acid protein with a predicted molecular mass of 155 kD. The murine Fzf2 (B130043A04Rik) encodes a 951-amino-acid protein with a predicted molecular mass of 88.5 kD. The murine Fzf3 (AAH28839Rik) encodes a 707-amino-acid protein with a predicted molecular mass of 77.8 kD. Finally, Fzf4 (6030407P18Rik) encodes a 534-amino-acid protein with a predicted molecular mass of
Although there is no functional or structural information regarding these proteins, there are human orthologs of the Fzf family, and proteins with sequence similar to the Fzf family are also evident in other eukaryotes such as Xenopus laevis, D. melanogaster, and C. elegans. We also identified a conserved stretch of 16 amino acids immediately N-terminal to the central zinc finger array that does not show similarity with other previously described conserved domains, KLIMLV-[D/N/S]-[D/N/S]-FYYG-[K/R/Q]-[H/Y/D]-[E/K/G]-G (Fig. 1B). This new conserved domain, named Fantom family associated box (FFAB), is highly conserved in all FZF proteins and together with the characteristic distribution of C2H2 zinc finger domains can be considered as the signature domain of this new family.
The ENSEMBL gene prediction program Genscan (http://www.ensembl.org To determine substructures within the major clusters and better characterize the new genes present in this data set, Neighbor Joining phylogenetic trees were calculated from multiple sequence alignments (see Methods; Figs. 1A, 2A, and 3A). To illustrate the importance of this analysis in gene discovery and annotation, clusters 5 and 7, containing proteins of the Sp/Krüppel-like factors and RING-H2, E3 ubiquitin-protein ligase families, respectively, are discussed in detail below.
The Sp/Krüppel-Like Factors Family: Identification of a New Sp Family Member
Sp/Krüppel-like factors are transcriptional regulators involved in development, cell growth, and differentiation (Lania et al. 1997
Sequence-based hierarchical clustering segregates the Sp proteins from the Krüppel-like factors to form a clearly distinct subfamily of transcriptional regulators (Fig. 2A). This segregation revealed a new member of the Sp subfamily, named Sp8 (Bouwman and Philipsen 2002 The 13.30-kb-long murine Sp8 locus is found at Chromosome 12 band f2 with a structure of 4 exons and 3 introns, and encodes a 486-amino-acid protein with a predicted molecular mass of 48 kD (Table 5).
The N-terminal part of Sp1 can be divided into five domains: the Sp-box (Harrison et al. 2000
BLAST analysis reveals that the three C-terminal zinc fingers of Sp8 have 95% homology with Sp5 and 97% with the D. melanogaster Sp1 (NP_572579). Outside the zinc finger domain, Sp8 has a serine/alanine-rich region in the very N terminus of the protein (amino acids 11-116) and a glycine-rich region in the central region (amino acids 132-149). This region of the protein shows 23% homology with osterix/Sp7 with which Sp8 clusters in the hierarchical tree. Osterix/Sp7 has been shown to be a transcription factor required for osteoblast differentiation and hence for bone formation (Nakashima et al. 2002 The mouse and human protein architectures of the Sp/KLF family including different isoforms generated by alternative splicing are shown in Supplementary Figure 3.
Treichel et al. (2001
The different homologies of the zinc finger domain and the non-zinc finger domain found in the Sp/KLF family are evidence of their different evolutionary history. This family of transcriptional regulators most likely evolved novel proteins by modular evolution in which domains were created by gene duplication and translocated by domain shuffling events (Morgenstern and Atchley 1999
RING-H2 and the E3 Ubiquitin-Protein Ligase Family
Cluster analysis identified a group of 14 proteins that share in common a C-terminal RING-H2-type finger (Table 3, cluster 7; Fig. 3A,B). Five of the 14 proteins are newly identified mouse proteins. RNF50 (NP_598825) encodes a 339-amino-acid protein with a predicted molecular mass of 37.9 kD with a central proline-rich region (amino acids 56-228). RNF51 (2500002L14Rik) encodes a 166-amino-acid protein with a predicted molecular mass of 19.1 kD. RNF52 (AAH16543) encodes a 313-amino-acid protein with a predicted molecular mass of 34.08 kD, with a C-terminal serine-rich region (amino acids 293-313). RNF53 (0610009J22Rik) encodes a 380-amino-acid protein with a predicted molecular mass of 41.57 kD. A proline-rich region is present in the very N-terminal part of the protein (amino acids 7-33). Names for these four proteins are proposed based on the conventional nomenclature for ring finger proteins (RNFX; Table 6).
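As a rough sanity check on the predicted masses quoted above, protein mass can be approximated from length alone. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue; this is not the method used in this study, which is not stated, and a composition-based calculation will differ by a few percent. The function name is hypothetical.

```python
# Rule-of-thumb average residue mass; an assumption, not a value from the text.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kd(n_residues):
    """Approximate protein mass in kD from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# (length in residues, mass in kD as reported in the text)
reported = {
    "RNF50": (339, 37.9),
    "RNF51": (166, 19.1),
    "RNF52": (313, 34.08),
}

estimates = {name: round(estimate_mass_kd(n), 1) for name, (n, _) in reported.items()}

for name, (n, mass) in reported.items():
    print(f"{name}: {n} aa, reported {mass} kD, estimated {estimates[name]} kD")
```

An estimate that differs from the reported value by an order of magnitude would point to a transcription error rather than an unusual amino-acid composition.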
The fifth newly identified mouse protein 1700042K15Rik shares 61% of protein identity with the g1-related protein (G1RP1), a homolog to the D. melanogaster g1 (Baker and Reddy 2000
An emerging role of RING-finger-containing proteins is in ubiquitination pathways, where they play a central role in the transfer of ubiquitin (Ub) to a heterologous substrate, thereby targeting the substrate for destruction by the proteasome (Joazeiro and Weissman 2000
The ubiquitination pathway is crucial for cells to maintain protein homeostasis and to allow proteins that are folded incorrectly to be targeted for degradation. Ubiquitination is also important in chromatin remodeling and transcriptional regulation by histone ubiquitination. Ubiquitination of histones H2A and H2B might tag them for the recruitment of the histone acetyl-transferases necessary for chromatin remodeling during transcriptional activation or for histone displacement by protamines during spermatogenesis (Jason et al. 2002
Alternative Splicing in the Zinc-Finger-Containing Proteins Set (ZFPS)
The high rate of alternative splicing in the zinc finger superfamily could reflect the modular domain architecture, and the fact that individual domains commonly occur as single exons within a gene.
Detailed analysis of individual transcripts confirmed that isoforms generated by alternative splicing are likely to have different functions (Supplementary Figs. 3-6). For example, the murine transcription factor Krüppel-like factor 13 (mKLF13; Scohy et al. 2000
Another example of likely functional plasticity is found in the RIKEN transcript C330026E23Rik, which encodes a protein with a C-terminal C2H2-type finger and an N-terminal KRAB-repressor domain (IPR001909). Two isoforms were identified, encoding proteins that contain only the C2H2 fingers and lack the KRAB domain (variants cluster scl11314). The two different structural isoforms could compete with the full-length protein to relieve transcriptional repression, because they lack the repressor domain KRAB (Friedman et al. 1996 In the RING finger family, alternative splicing may modulate the cellular localization of different isoforms. In the case of the membrane-bound protein Ring finger protein 13 (RNF13; NM_011883; variants cluster scl7546), we found six isoforms of this transcript (Supplementary Fig. 6), encoding proteins from 381 to 200 amino acids long. The 200-amino-acid isoform f (C230033M15Rik) generated by alternative use of a cryptic exon lacks a membrane domain and is presumably soluble (Supplementary Fig. 4).
Conclusion
The evolution of the zinc finger proteins has occurred in a modular fashion (Morgenstern and Atchley 1999 The RIKEN full-length, Representative Transcript and Protein Set (RTPS), represents the most complete transcriptome available in higher eukaryotes. The full-length cDNA and protein sequences allow us to better map each individual transcript to the mouse genome and define human homologs and possible splice variants generated from a single genetic locus. Gene prediction algorithms used in the mouse and human genome projects are imperfect. The availability of large full-length sequence sets reduces this imprecision in gene structure prediction. The high incidence of newly described genes present in the RTPS will allow a more thorough and systematic approach in characterizing protein families. In overview, we have analyzed 46 structurally related zinc finger families in the mouse transcriptome, and placed the first part of the analysis in the public domain. We have looked in detail at three of these families and started to suggest nomenclature based on family relationships. Annotation of the remaining families may provide a rational basis for future nomenclature, and also a basis for prioritization of functional characterization of members of this key family.
To facilitate future characterization of this superfamily, we generated a Web-based interface (http://cassandra.visac.uq.edu.au/zf
Zinc Finger Classification
Zinc-finger-containing proteins were identified in the RTPS of 21,019 protein sequences using the InterPro protein domain searching tool version 5.0, resulting in a data set of 1573 proteins having at least one zinc finger domain. Specific subsets were selected from this data set based on two different classifications. The first classification is by distinct zinc finger domains as defined by the 46 distinct PROSITE sequence signatures. Obviously, a protein with more than one zinc finger domain can be present in more than one class, and proteins in the same class may have completely different domain compositions and are not necessarily functionally related.
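The first, domain-based classification amounts to inverting a protein-to-domain table. The sketch below illustrates that idea under stated assumptions: the protein names and domain assignments are invented for illustration, and the function name `classify_by_domain` is hypothetical rather than part of any pipeline described here.

```python
from collections import defaultdict


def classify_by_domain(protein_domains):
    """Group proteins into one class per domain.

    protein_domains maps a protein identifier to the list of zinc finger
    domains detected in it. A protein with several domains is placed in
    several classes, as described in the text.
    """
    classes = defaultdict(set)
    for protein, domains in protein_domains.items():
        for domain in domains:
            classes[domain].add(protein)
    return dict(classes)


# Hypothetical example input, not data from the RTPS.
example = {
    "protA": ["C2H2", "KRAB"],
    "protB": ["C2H2"],
    "protC": ["RING"],
}

classes = classify_by_domain(example)
for domain in sorted(classes):
    print(domain, sorted(classes[domain]))
```

Here `protA` lands in both the C2H2 and KRAB classes, which is why class sizes from this classification cannot simply be summed to recover the number of proteins.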
The second classification was much more rigorous and attempted to identify protein families that are truly functionally related. An all-against-all sequence comparison was performed using the BLASTP 2.1.3 program (Altschul et al. 1990
Alignments and Phylogenetic Construction
CLUSTALX version 1.6.6 (Thompson et al. 1997
Mapping of the New Mouse and Human Zinc-Finger-Containing Proteins
Alternative Spliced Variants in the Zinc Finger Data Set
TR is funded by the Cooperative Research Centre for Chronic Inflammatory Diseases, Australia. The authors thank the RIKEN Genome Science Center Institute; the FANTOM2 consortium; and Matthew J. Sweet and S. Roy Himes for critical comments on the manuscript. The data set (RTPSv2) used for these analyses has been generated by the Genomic Sciences Center, RIKEN Yokohama Institute and by the Functional Annotation of the Mouse Genome (FANTOM) consortium, during the RIKEN Mouse cDNA Encyclopedia Project.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.949803.
9 Corresponding author.
10 Takahiro Arakawa, Piero Carninci, Jun Kawai, and Yoshihide Hayashizaki. [Supplemental material is available online at www.genome.org. To facilitate future characterization of this superfamily, we generated a Web-based interface, http://cassandra.visac.uq.edu.au/zf, containing the structural classification of the entire zinc finger data set discussed in this study.]
Aasland, R., Gibson, T.G., and Stewart, A.F. 1995. The PHD finger: Implications for chromatin-mediated transcriptional regulation. Trends Biochem. Sci. 20:56-59.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.
Bach, I. 2000. The LIM domain: Regulation by association. Mech. Dev. 91:5-17.
Bach, I., Rodriguez-Esteban, C., Carriere, C., Bhushan, A., Krones, A., Rose, D.W., Glass, C.K., Andersen, B., Izpisua Belmonte, J.C., and Rosenfeld, M.G. 1999. RLIM inhibits functional activity of LIM homeodomain transcription factors via recruitment of the histone deacetylase complex. Nat. Genet. 22:394-399.
Baker, S.J. and Reddy, E.P. 2000. Cloning of murine G1RP, a novel gene related to Drosophila melanogaster g1. Gene 248:33-40.
Bono, H., Kasukawa, T., Hayashizaki, Y., and Okazaki, Y. 2002. READ: RIKEN Expression Array Database. Nucleic Acids Res. 30:211-213.
Bouwman, P. and Philipsen, S. 2002. Regulation and activity of Sp1-related transcription factors. Mol. Cell. Endocrinol. 195:27-38.
Choo, Y., Castellanos, A., Garcia-Hernandez, B., Sanchez-Garcia, I., and Klug, A. 1997. Promoter-specific activation of gene expression directed by bacteriophage-selected zinc fingers. J. Mol. Biol. 273:525-532.
Dang, D.T., Pevsner, J., and Yang, V.W. 2000. The biology of the mammalian Kruppel-like family of transcription factors. Int. J. Biochem. Cell Biol. 32:1103-1121.
David, G., Alland, L., Hong, S.H., Wong, C.W., DePinho, R.A., and Dejean, A. 1998. Histone deacetylase associated with mSin3A mediates repression by the acute promyelocytic leukemia-associated PLZF protein. Oncogene 16:2549-2556.
Friedman, J.R., Fredericks, W.J., Jensen, D.E., Speicher, D.W., Huang, X.P., Neilson, E.G., and Rauscher III, F.J. 1996. KAP-1, a novel corepressor for the highly conserved KRAB repression domain. Genes & Dev. 10:2067-2078.
Hanawa, H., Watanabe, K., Nakamura, T., Ogawa, Y., Toba, K., Fuse, I., Kodama, M., Kato, K., Fuse, K., and Aizawa, Y. 2002. Identification of cryptic splice site, exon skipping, and novel point mutations in type I CD36 deficiency. J. Med. Genet. 39:286-291.
Harrison, S.M., Houzelstein, D., Dunwoodie, S.L., and Beddington, R.S. 2000. Sp5, a new member of the Sp1 family, is dynamically expressed during development and genetically interacts with Brachyury. Dev. Biol. 227:358-372.
Jason, L.J., Moore, S.C., Lewis, J.D., Lindsey, G., and Ausio, J. 2002. Histone ubiquitination: A tagging tail unfolds? Bioessays 24:166-174.
Joazeiro, C.A. and Weissman, A.M. 2000. RING finger proteins: Mediators of ubiquitin ligase activity. Cell 102:549-552.
Kawai, J., Shinagawa, A., Shibata, K., Yoshino, M., Itoh, M., Ishii, Y., Arakawa, T., Hara, A., Fukunishi, Y., Konno, H., et al. 2001. Functional annotation of a full-length mouse cDNA collection. Nature 409:685-690.
Kolell, K.J. and Crawford, D.L. 2002. Evolution of Sp transcription factors. Mol. Biol. Evol. 19:216-222.
Laity, J.H., Lee, B.M., and Wright, P.E. 2001. Zinc finger proteins: New insights into structural and functional diversity. Curr. Opin. Struct. Biol. 11:39-46.
Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.
Lania, L., Majello, B., and De Luca, P. 1997. Transcriptional regulation by the Sp family proteins. Int. J. Biochem. Cell Biol. 29:1313-1323.
Lorick, K.L., Jensen, J.P., Fang, S., Ong, A.M., Hatakeyama, S., and Weissman, A.M. 1999. RING fingers mediate ubiquitin-conjugating enzyme (E2)-dependent ubiquitination. Proc. Natl. Acad. Sci. 96:11364-11369.
Miller, J., McLachlan, A.D., and Klug, A. 1985. Repetitive zinc-binding domains in the protein transcription factor IIIA from Xenopus oocytes. EMBO J. 4:1609-1614.
Modrek, B. and Lee, C. 2002. A genomic view of alternative splicing. Nat. Genet. 30:13-19.
Modrek, B., Resch, A., Grasso, C., and Lee, C. 2001. Genome-wide detection of alternative splicing in expressed sequences of human genes. Nucleic Acids Res. 29:2850-2859.
Morgenstern, B. and Atchley, W.R. 1999. Evolution of bHLH transcription factors: Modular evolution by domain shuffling? Mol. Biol. Evol. 16:1654-1663.
Nakashima, K., Zhou, X., Kunkel, G., Zhang, Z., Deng, J.M., Behringer, R.R., and de Crombrugghe, B. 2002. The novel zinc finger-containing transcription factor osterix is required for osteoblast differentiation and bone formation. Cell 108:17-29.
Nomura, A. and Sugiura, Y. 2002. Contribution of individual zinc ligands to metal binding and peptide folding of zinc finger peptides. Inorg. Chem. 41:3693-3698.
Okazaki, Y., Furuno, Y., Kasukawa, T., Adachi, J., Bono, H., Kondo, S., Nikaido, I., Osato, N., Saito, R., and Suzuki, H. 2002. Analysis of the mouse transcriptome based upon functional annotation of 60,770 full length cDNAs. Nature 420:563-573.
Putilina, T., Wong, P., and Gentleman, S. 1999. The DHHC domain: A new highly conserved cysteine-rich motif. Mol. Cell Biochem. 195:219-226.
Rubin, G.M., Yandell, M.D., Wortman, J.R., Gabor Miklos, G.L., Nelson, C.R., Hariharan, I.K., Fortini, M.E., Li, P.W., Apweiler, R., Fleischmann, W., et al. 2000. Comparative genomics of the eukaryotes. Science 287:2204-2215.
Scohy, S., Gabant, P., Van Reeth, T., Hertveldt, V., Dreze, P.L., Van Vooren, P., Riviere, M., Szpirer, J., and Szpirer, C. 2000. Identification of KLF13 and KLF14 (SP6), novel members of the SP/XKLF transcription factor family. Genomics 70:93-101.
Senti, K., Keleman, K., Eisenhaber, F., and Dickson, B.J. 2000. brakeless is required for lamina targeting of R1-R6 axons in the Drosophila visual system. Development 127:2291-2301.
Thompson, J.D., Gibson, T.G., Plewniak, F., Jeanmougin, F., and Higgins, D.G. 1997. The CLUSTAL_X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876-4882.
Treichel, D., Becker, M.B., and Gruss, P. 2001. The novel transcription factor gene Sp5 exhibits a dynamic and highly restricted expression pattern during mouse embryogenesis. Mech. Dev. 101:175-179.
Tucker, P., Laemle, L., Munson, A., Kanekar, S., Oliver, E.R., Brown, N., Schlecht, H., Vetter, M., and Glaser, T. 2001. The eyeless mouse mutation (ey1) removes an alternative start codon from the Rx/rax homeobox gene. Genesis 31:43-53.
Yang, E., Henriksen, M.A., Schaefer, O., Zakharova, N., and Darnell Jr., J.E. 2002. Dissociation time from DNA determines transcriptional function in a STAT1 linker mutant. J. Biol. Chem. 277:13455-13462.
Yang, P., Shaver, S.A., Hilliker, A.J., and Sokolowski, M.B. 2000. Abnormal turning behavior in Drosophila larvae. Identification and molecular analysis of scribbler (sbb). Genetics 155:1161-1174.
Zavolan, M., Van Nimwegen, E., and Gaasterland, T. 2002. Splice variation in mouse full-length cDNAs identified by mapping to the mouse genome. Genome Res. 12:1377-1385.
ftp://ftp.ncbi.nih.gov/refseq/; The NCBI Reference Sequence project (RefSeq).
http://cassandra.visac.uq.edu.au/zf; RTPS zinc finger data set.
http://fantom2.gsc.riken.go.jp/; FANTOM 2.
http://genome.gsc.riken.go.jp; Genome Exploration Research Group.
http://genomes.rockefeller.edu; Laboratory of Computational Genomics.
http://genome-www.stanford.edu/Saccharomyces; Assembly of the Saccharomyces cerevisiae whole genome sequence.
http://prodes.toulouse.inra.fr/ESPript; ESPript 2.0 beta.
http://smart.embl-heidelberg.de; Simple Modular Architecture Research Tool (SMART).
http://www.ebi.ac.uk/InterPro; InterPro.
http://www.ensembl.org; assembly of the mouse whole genome sequence data.
http://www.fruitfly.org/; assembly of the Drosophila melanogaster whole genome sequence data.
http://www.informatics.jax.org/mgihome; Mouse Genome Informatics Database.
http://www.ncbi.nlm.nih.gov; National Center for Biotechnology Information.
http://www.sanger.ac.uk/Software/analysis/SSAHA; Sequence Search and Alignment by Hashing Algorithm.
http://www.wormbase.org/; assembly of the Caenorhabditis elegans whole genome sequence data.
http://www.tigr.org; The Institute for Genomic Research's home page.
http://taxonomy.zoology.gla.ac.uk/rod/treeview.html; Tree View software.
Received November 1, 2002; accepted in revised form February 19, 2003.