Genome Res. 13:1416-1429, 2003 ©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00
Letter
Identification and Analysis of Chromodomain-Containing Proteins Encoded in the Mouse Transcriptome
1 ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St. Lucia, Queensland 4072, Australia
2 School of Molecular and Microbial Sciences, University of Queensland, St. Lucia, Queensland 4072, Australia
3 Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
4 Genome Science Laboratory, RIKEN, Hirosawa, Wako, Saitama 351-0198, Japan
The chromodomain is 40-50 amino acids in length and is conserved in a wide range of chromatic and regulatory proteins involved in chromatin remodeling. Chromodomain-containing proteins can be classified into families based on their broader characteristics, in particular the presence of other types of domains, which correlate with different subclasses of the chromodomains themselves. Hidden Markov model (HMM)-generated profiles of different subclasses of chromodomains were used here to identify sequences encoding chromodomain-containing proteins in the mouse transcriptome and genome. A total of 36 different loci encoding proteins containing chromodomains, including 17 novel loci, were identified. Six of these loci (including three apparent pseudogenes, a novel HP1 ortholog, and two novel Msl-3 transcription factor-like proteins) are not present in the human genome, whereas the human genome contains four loci (two CDY orthologs and two apparent CDY pseudogenes) that are not present in mouse. A number of these loci exhibit alternative splicing to produce different isoforms, including 43 novel variants, some of which lack the chromodomain. The likely functions of these proteins are discussed in relation to the known functions of other chromodomain-containing proteins within the same family.
The chromodomain (CD) is a domain 40-50 amino acids long that is found in various proteins involved in chromatin remodeling and the regulation of gene expression in eukaryotes during development (Cavalli and Paro 1998
Known CD sequences from all organisms were clustered into discrete subclasses using the Protein Distance Method (Felsenstein 1989 We identified 13 families of CD-containing proteins that are present in the genomically well studied eukaryotes Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, human, and mouse (K. Tajul-Arifin, R. Teasdale, and J.S. Mattick, in prep.). These families include the chromodomain-helicase-DNA-binding (CHD) family, the histone methyltransferase family, the HP1 family, the Polycomb family, the Msl-3 homolog family, the histone acetyltransferase (HAT) family, the retinoblastoma-binding protein 1 (RBBP1) family, the enoyl-CoA hydratase family, the SWI3 family, and the plant-specific chromomethylase family (Table 1). We also found that particular subclasses of CDs are associated with particular families of CD-containing proteins, indicating that the function and/or specificity of the CD are subtly different in the different types of CD proteins.
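The idea of grouping CD sequences into subclasses by pairwise protein distance can be illustrated with a minimal Python sketch. It is not the procedure used in this study: it substitutes a simple p-distance (fraction of differing aligned residues) for the distance models of the Protein Distance Method, and uses single-linkage grouping with an arbitrary threshold. The function names, the threshold, and the toy sequences are all hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over aligned columns without gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no comparable columns: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def cluster(seqs: dict, threshold: float) -> list:
    """Single-linkage grouping: join sequences whose distance <= threshold."""
    parent = {name: name for name in seqs}

    def find(n):
        # Follow parent links to the group representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Made-up aligned fragments, for illustration only (not real chromodomains).
toy = {
    "cd_a1": "EYVVEKIL",
    "cd_a2": "EYVVEKVL",
    "cd_b1": "GWAKRRSS",
    "cd_b2": "GWAKRR-S",
}
subclasses = cluster(toy, threshold=0.3)
```

With these toy inputs the two "a" fragments and the two "b" fragments fall into separate groups. A real analysis would start from a multiple alignment of full CD sequences and a substitution-model-based distance.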
In the present study we used the HMM profiles of the various subclasses of CDs (see Supplementary information) to identify all CD-containing proteins and their alternatively spliced products in mouse. The HMM profiles were used to query the Variant-based Proteome Set (VPS) of RIKEN's FANTOM2 Representative Transcripts and Protein Sets (RTPS; The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I and II Team, 2002
Table 2 lists the source of the identified CD-containing proteins in the RIKEN VPS database, with cross-reference information where relevant. The RIKEN VPS database contains most but not all mouse CD-containing proteins. Table 3 shows all of the CD-containing loci identified in the mouse genome, as well as in the human genome, derived from analysis of both the RIKEN VPS and Ensembl databases. A total of 36 loci encoding CD-containing proteins, including 17 novel loci, were identified in mouse, which included representatives of 10 of the 13 eukaryotic CD-containing protein families. Our analyses also show that particular subclasses of chromodomains are associated with particular CD-containing protein families (Table 4), which presumably reflects different substrate specificity of different subclasses of CDs and their associated protein families (K. Tajul-Arifin, R. Teasdale, and J.S. Mattick, in prep.). A large number of alternative splice variants, including novel variants, were also identified, and these are discussed in more detail below in terms of the different CD-containing protein families (Table 4). A number of other transcripts from these loci were also found in the RIKEN and Ensembl databases, but were not included in this analysis unless they exhibited a different exon structure consistent with alternative splicing, as opposed to 3'- or 5'-truncated reverse transcripts.
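The distinction drawn above between genuine splice variants and 3'- or 5'-truncated transcripts can be sketched in Python. This is only an illustration under stated assumptions, not the procedure used in this study: exons are given as hypothetical (start, end) genomic coordinates on one strand, and a transcript is treated as consistent with truncation when its chain of introns is a contiguous run within the reference transcript's intron chain.

```python
def intron_chain(exons):
    """Return introns as (donor, acceptor) pairs from exon (start, end) pairs."""
    exons = sorted(exons)
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def is_contiguous_sublist(short, long):
    """True if `short` occurs as an unbroken run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def classify(transcript, reference):
    """Compare two transcripts of one locus by their intron chains."""
    a, b = intron_chain(transcript), intron_chain(reference)
    if a == b:
        return "same structure"
    short, long = (a, b) if len(a) <= len(b) else (b, a)
    if is_contiguous_sublist(short, long):
        return "consistent with truncation"
    return "candidate splice variant"


# Hypothetical coordinates for illustration only.
reference = [(100, 200), (300, 400), (500, 600), (700, 800)]
truncated = [(350, 400), (500, 600), (700, 800)]   # 5' end missing
skipped = [(100, 200), (500, 600), (700, 800)]     # second exon skipped

label_truncated = classify(truncated, reference)
label_skipped = classify(skipped, reference)
```

Comparing intron chains rather than exon boundaries tolerates ragged transcript ends, which is why the 5'-shortened example is not flagged as a variant while the exon-skipping example is.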
Chromodomain-Helicase-DNA-Binding (CHD) Family
CHD-1/2 Subfamily Two genes encoding CHD-1 and CHD-2 homologs were identified in mouse. The CHD-1 gene is located on chromosome 17 band A2, with the full-length protein encoded by 35 exons. This protein had already been identified in the public database as well as in the Ensembl mouse database (protein No. 1a). A protein from FANTOM2 was identified as a possible alternative splicing product for this locus (No. 1b). The transcript contains 23 exons, the first of which is located about 1.6 kb upstream of the first exon of the protein 1a transcript. This strongly supports the use of an alternative promoter in transcribing the mRNA for this protein. The translated protein contains all of the recognized structural domains of the full-length protein, but lacks the C-terminal region, due to an in-frame stop codon encoded by a 3' extension of exon 23. The second CHD protein identified in this subfamily is CHD-2. It is described as a novel mouse protein in Ensembl, with a homolog in human, and the mouse gene is located on chromosome 7 band D1. Four alternatively spliced transcripts were identified, each of which contains one or more exons not present in one or more of the other transcripts (Table 4). Each transcript codes for an isoform of CHD-2 protein that contains only one CD of subclass H (Table 4), whereas the human CHD-2 homolog has two CD domains (O and H), which suggests that the recorded mouse CHD-2 transcripts are incomplete. Indeed, analysis of the genomic region surrounding this locus in mouse revealed sequences located between the recorded exons 8 and 9 capable of encoding a second CD, suggesting that these sequences represent a cryptic exon, which presumably is included in other splice variants. This suggests that different isoforms of this protein may include one or both CD domains, with potentially important functional consequences.
CHD-3/4 Subfamily The first of three CHD-3/4 proteins, also known as Mi-2a, is encoded by a gene located on chromosome 11 band B4. The protein in Ensembl is encoded by 31 exons, and domain analyses identified a RING-PHD Zn-finger domain overlap near the N-terminus, one CD of subclass J, and an ATP-dependent helicase region, which is typical of this subfamily. The FANTOM2 protein, encoded by 14 exons, contains two PHD domains with two CDs of subclass P and J. It seems that exons 10 and 11 encode the CD of subclass P, whereas exon 8 may encode part of the PHD domain sequence, whereby splicing the exon creates a sequence that constructs a RING-PHD domain overlap (Table 4). The second CHD-3/4 protein, Mi-2b, is encoded on chromosome 6 band F2, and is represented in the FANTOM2 and Ensembl databases by several different transcripts (Table 4), none of which may be full-length. Indeed, two of these transcripts (4b and 4c) are presented as originating from separate adjacent loci in Ensembl, but comparison with other known Mi-2b homologs indicates that they probably arise from a single locus. This conclusion is also supported by examination of the FANTOM2 transcript (4a), which spans both 4b and 4c. Proteins 4a and 4b contain two PHD domains and an ATP helicase domain and either one or two CDs, via alternative splicing, similar to that observed in CHD-2. A third gene encodes another member of this subfamily (which we have termed Mi-2c), and is located on mouse chromosome 4 band E2. The Ensembl transcript includes four exons and encodes a short protein of 229 amino acids that contains just two CDs. However, the transcript is not full-length, and comparison with the human homolog indicates that the full-length protein would contain two PHD domains, two CDs, and an ATP-helicase domain. No transcripts from this locus were present in FANTOM2.
CHD-5 Subfamily The second novel CHD-5 locus is designated CHD-5b (No. 7). It is located on chromosome 8 at band C5, and four alternatively spliced products were identified for this locus, two from FANTOM2 and two from Ensembl. The longest protein (No. 7c) is encoded by 30 exons (1666 aa), and contains two CDs, an ATP-dependent helicase domain, and two BRK domains within an extended C-terminal region. The function of BRK domains is unknown, but SMART describes the BRK domain as one found in transcriptional and CD helicase proteins. The second isoform identified in Ensembl (No. 7d) is encoded by 18 exons (1283 aa) and contains just two BRK domains. The proteins identified in FANTOM2 are probable alternative splicing products which include two CDs and the ATP-dependent helicase domain (Table 4). The third and fourth CHD-5 genes in mouse (CHD-5c and CHD-5d, Nos. 8 and 9, respectively) are not represented in FANTOM2. The CHD-5c gene is located on chromosome 14 band C1, with four alternatively spliced transcripts identified in Ensembl encoding various isoforms of the protein, none of which may be full-length (Table 4). The CHD-5d locus is on chromosome 4 band A1, with four alternatively spliced transcripts identified in Ensembl. The possible full-length protein (No. 9a) is encoded by 34 exons, and contains two CDs of subclasses Q and I near the N-terminus, an ATP-dependent helicase region, a SANT DNA-binding domain, and two BRK domains near the C-terminus (Table 4).
The functions of CHD-5 proteins in mammals have not been elucidated, but by analogy with homologs such as kismet in D. melanogaster (Daubresse et al. 1999
Histone Methyltransferase Family
The histone methyltransferase CD family contains a single CD of subclass X together with PreSET, SET, and Post-SET domains. The SET domain confers the catalytic activity of the histone methyltransferase proteins. The name itself comes from the proteins in which the domain was first identified; the strongest PEV suppressor gene Su(var)3-9 (Tschiersch et al. 1994
Two Suv39h1 proteins were identified in FANTOM2 VPS (Table 2) and two from Ensembl. As the two proteins in FANTOM2 VPS share 99% identity, differing only at aa position 364, they are considered to be synonymous (No. 10a). The Ensembl proteins are encoded by seven and six exons, producing proteins of 452 aa and 412 aa in length, respectively. Analysis shows that the FANTOM2 proteins correspond to the smaller-sized Ensembl protein (Table 4). The Suv39h2 locus is represented by four proteins, three from the FANTOM2 VPS and one from Ensembl. Three of the proteins are similar (No. 11a) except for one or two mismatches in the aa sequence, whereas the fourth (No. 11b) appears to be a novel alternative splicing product, deduced from the transcript from FANTOM2. The longer proteins are encoded by six exons, and the shorter protein is encoded by five exons, with exon 2 spliced out. Exon 2 is 146 bases long and codes for part of the CD; its removal shortens the CD region by 14 aa.
Suv39h1 and Suv39h2 genes in mouse are homologs of the suppressor of variegation 3-9 gene in D. melanogaster (Su[var]3-9; Table 5). Mutation of this gene in Drosophila suppresses position-effect variegation (PEV), whereby an active gene is silenced when it is physically translocated near a repressive region of the chromosome (Tschiersch et al. 1994
The functions of the mouse and human Suv39h2 genes are not known, but mouse Suv39h2 transcripts are specifically expressed in adult testis (O'Carroll et al. 2000
Heterochromatin Protein 1 (HP1) Family
The HP1
The HP1
The HP1
Another HP1
HP1 proteins in mouse have not been extensively studied. In D. melanogaster, HP1
Polycomb (Pc) Family
Chromobox Protein Homolog 6 (Cbx6)
Cbx2 (M33)
Cbx4 (MPc2)
Cbx8 (MPc3)
Cbx7
Msl-3 Homologs
A mouse Msl-3 protein which is similar to the Drosophila Msl-3 protein was identified in Ensembl and the FANTOM2 VPS set, but does not contain a CD, unlike other known Msl-3 homologs. The gene is located on chromosome X band F5, and the corresponding cDNA is encoded by 11 exons. We presume that the cDNA is not full-length but an alternatively spliced variant that lacks a CD, because its human homolog contains a CD. Potential CD encoding sequences are present at this locus in the mouse and rat genomes.
Another Msl-3 homolog was identified, the MORF-related gene 15 (MRG15), also known as Tex-189. It is 323 aa in length, and is encoded by 12 exons. The gene is located on chromosome 9 band E3.2. The two proteins identified in FANTOM2 VPS and Ensembl are synonymous. A likely pseudogene for Tex-189 was also identified on chromosome 19 band D1. It codes for the full-length protein but consists of just one exon. Studies have shown that mouse MRG15/Tex-189 is localized in dendrites as well as in the nuclei of Purkinje cells (Matsuoka et al. 2002 In addition to the two proteins described above, two novel Msl-3 homologs were identified in Ensembl, both encoded by three exons separately on chromosome 18 bands A2 and C. Neither gene has a homolog in human, which represents one of the relatively rare cases of such an occurrence. These two proteins were not identified in FANTOM2.
Histone Acetyltransferase (HAT) Family
The first protein identified is the Myst1 (Kawai et al. 2001
The second CD-containing HAT gene identified is the mouse homolog of the human protein Tip60 (Table 3). The mouse homolog gene is located on chromosome 19 band A and comprises 13 exons. This gene is not represented in FANTOM2. Mouse Tip60 has been suggested to have a developmental function in early embryogenesis and organ development (McAllister et al. 2002
Retinoblastoma-Binding Protein 1 (RBBP1) Family
The first mouse RBBP1 gene is located on chromosome 12 band C3. Two alternatively spliced transcripts were identified in Ensembl for this gene, one coding for the full-length protein containing all three CD, TUDOR, and BRIGHT domains (No. 30a) and a shorter transcript that codes for a protein which contains just the CD (No. 30b; Table 4). The gene for RBBP1-related protein is located on chromosome 13 band A2. In Ensembl, two isoforms of the protein were identified, with the long isoform containing a CD, as well as TUDOR and BRIGHT domains, and the shorter isoform containing just the CD. Interestingly, because of the splicing out of exon 15 in the long isoform, which encodes part of the CD sequence, the CD subclass is transformed into subclass U. Perhaps the longer isoform protein is targeted to a region that is different from that of the shorter isoform.
Enoyl-CoA Hydratase (EnoylCoAH) Family
The first locus encodes a protein that is identified in FANTOM2 VPS with two corresponding proteins identified in Ensembl. The gene is located on chromosome 8 band E1, with both protein isoforms encoded by six exons. The full-length proteins identified in both FANTOM2 and Ensembl are synonymous. They contain a CD near the N-terminus and an EnoylCoAH domain (No. 32a). The shorter transcript encodes a protein that lacks the CD as a result of the use of an alternative first exon. Presumably this indicates the use of an alternative promoter for the transcription of mRNA for both proteins. The function of this protein in mouse is not yet known, and thus far this family has only been identified in mammals (Table 5).
The second gene identified that encodes a CD-containing EnoylCoAH protein is located on chromosome 13 band A5. The gene encodes the Cdyl protein, which plays a major role in spermatogenesis in mouse and shares 93% identity with its human homolog, CDYL. Cdyl is deduced to activate genes in spermatogenesis by acetylating histone H4 (Lahn et al. 2002
SWI3 Family
The first SWI3 family member in mouse is SRG3, also known as SMARCC1. Mouse SRG3 is essential for early embryogenesis and plays an important role in mouse brain development (Kim et al. 2001 The second gene of the mouse SWI3 family is on chromosome 10 band D3, the mouse homolog of the human SMARCC2 gene (No. 39; Table 3). Four alternatively spliced transcripts were identified at the locus in Ensembl, three of them coding for very similar proteins, and the fourth transcript of four exons coding for a short peptide (No. 39d). Proteins 39a, 39b, and 39c are very similar to each other except for their protein lengths. Alternative splicing of the exons does not cause any major differences in proteins 39b and 39c, but the splicing out of exon 7 in 39a causes the protein to lose the BRCT protein-binding domain and shortens the CD region slightly (Table 3). Exon 7 therefore presumably encodes part of the BRCT domain and CD sequence. Transcripts for this gene were not identified in FANTOM2.
Ankyrin Family
CD Families Not Detected in the Mouse Transcriptome
CMT family proteins contain a CD of subclass M embedded within an extended DNA methylase domain, with a BAH (bromo adjacent homology) domain near the N-terminal of the protein. These proteins are unique to plants and have not been identified in animals (Genger et al. 1999
The integrase family members also appear to be plant-specific and have not thus far been identified in animals (K. Tajul-Arifin, R. Teasdale, and J.S. Mattick, in prep.). These proteins contain reverse transcriptase and integrase core domains, with a CD of subclass W located at the C-terminal end of the protein. The function of these proteins is unknown. The integrases form a novel group within Ty3/Gypsy retrotransposons, for which the name 'Chromovirus' has been proposed (Marin and Llorens 2000
The last CD family which has not been identified in either mouse or human is the AAA family. The proteins have been identified in S. cerevisiae and S. pombe, but not in plants or animals (K. Tajul-Arifin, R. Teasdale, and J.S. Mattick, in prep.). The proteins are characterized by two AAA domains, with a CD of subclass A embedded within the C-terminal AAA domain. Yef3 (yeast elongation factor 3) of S. cerevisiae, a component of yeast protein elongation machinery, is proposed to play a role in the ribosomal optimization of the accuracy of fungal protein synthesis by altering the conformation and activity of a ribosomal 'accuracy center' (Sandbaken et al. 1990
CD-Containing Proteins in Human
By combining data from FANTOM2 and Ensembl, we identified 36 loci encoding CD-containing proteins in mouse, including 17 novel loci. In total, 65 CD-related alternatively spliced proteins were identified, with 43 being novel. All of the mouse genes except four are conserved in human, whereas there are six human genes encoding CD-containing proteins that are not conserved in mouse.
CD-containing proteins are generally localized in the nucleus as part of large complexes and are involved in either activating or repressing genes or regions of the chromosome by altering chromatin architecture. Evidence suggests that the CD determines the target specificity of the protein/complex, as domain swapping of CD regions of the Pc and HP1 proteins of D. melanogaster, which normally localize to the Polycomb and heterochromatin region, respectively, targeted the Pc proteins/complexes to the heterochromatin region, and vice versa (Platero et al. 1995 The absence of three CD families in mouse and human indicates that some CD-containing proteins have evolved to carry out specific function(s) in a particular class of organism. Consequently, three of the CD families described here are only found in mammals (RBBP, EnoylCoAH, and Ankyrin families; Table 5), and mammals lack the three CD families which are found in plants or fungi. Evolutionary and domain accretion studies will provide further insight into the evolution and function of the different CD families.
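The way accompanying domains distinguish the CD families discussed above can be sketched as a simple lookup in Python. The domain combinations are taken from the descriptions in this text (for example, PreSET, SET, and Post-SET for the histone methyltransferases; TUDOR and BRIGHT for RBBP1), but the domain labels, the subset-matching rule, and the function name are hypothetical simplifications rather than the classification procedure used in this study. Families whose accompanying domains are not described in this text are omitted.

```python
# Signature domain sets per family, as described in the text.
# Labels are illustrative and do not correspond to SMART or Pfam identifiers.
SIGNATURES = {
    "CHD": {"CD", "HELICASE"},
    "Histone methyltransferase": {"CD", "PRESET", "SET", "POSTSET"},
    "RBBP1": {"CD", "TUDOR", "BRIGHT"},
    "Enoyl-CoA hydratase": {"CD", "ENOYL_COA_HYDRATASE"},
    "Chromomethylase": {"CD", "DNA_METHYLASE", "BAH"},
    "Integrase": {"CD", "REVERSE_TRANSCRIPTASE", "INTEGRASE_CORE"},
    "AAA": {"CD", "AAA"},
}


def assign_family(domains):
    """Return the family whose signature is fully present, or None.

    If several signatures match, the one with the most domains wins
    (ties are broken alphabetically by family name).
    """
    present = set(domains)
    matches = [(len(sig), fam) for fam, sig in SIGNATURES.items() if sig <= present]
    return max(matches)[1] if matches else None


# Domain lists modelled on isoforms described in the text.
full_chd = assign_family(["PHD", "PHD", "CD", "CD", "HELICASE"])
suv39 = assign_family(["CD", "PRESET", "SET", "POSTSET"])
brk_only = assign_family(["BRK", "BRK"])  # isoform lacking the CD
```

Because every signature requires a CD, isoforms that have lost the chromodomain through alternative splicing (such as one containing only two BRK domains) are left unassigned by this rule, which mirrors why such variants have to be linked to a family through their locus rather than their domain content.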
Creating CD Hidden Markov Models (HMM) Profiles
CD-containing protein sequences were obtained from publicly available databases. Proteins were filtered to avoid multiple representation of a protein from the same organism, except for alternatively spliced products. The proteins were analyzed with SMART and Pfam to obtain the CD sequence for each protein. CD sequences were then clustered using the Protein Distance Method in BioManager (http://bn2.angis.org.au
Identification and Mapping of Mouse and Human CD-Containing Proteins
Alternative Splicing Analysis
Domain Analysis
The Centre for Functional and Applied Genomics is a Special Research Centre of the Australian Research Council. The sequence information used in this study was provided by the RIKEN Phase I and II Sequencing Team and FANTOM2 Consortium.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1015703.
5 Takahiro Arakawa, Piero Carninci, Jun Kawai, and Yoshihide Hayashizaki.
6 Corresponding author. [Supplemental material is available online at www.genome.org.]
Aagard, L., Laible, G., Selenko, P., Schmid, M., Dorn, R., Schotta, G., Kuhfittig, S., Wolf, A., Lebersorger, A., Singh, P.B., et al. 1999. Functional mammalian homologs of the Drosophila PEV-modifier Su(var)3-9 encode centromere-associated proteins which complex with the heterochromatin component M31. EMBO J. 18:1923-1938.
Aasland, R. and Stewart, A.F. 1995. The chromo shadow domain, a second chromodomain in heterochromatin-binding protein 1, HP1. Nucleic Acids Res. 23:3168-3174.
Akhtar, A., Zink, D., and Becker, P.B. 2000. Chromodomains are protein-RNA interaction modules. Nature 407:405-409.
Alkema, M.J., Jacobs, J., Voncken, J.W., Jenkins, N.A., Copeland, N.G., Satijn, D.P., Otte, A.P., Berns, A., and van Lohuizen, M. 1997. MPc2, a new murine homolog of the Drosophila polycomb protein is a member of the mouse polycomb transcriptional repressor complex. J. Mol. Biol. 273:993-1003.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
Bannister, A.J. and Miska, E.A. 2000. Regulation of gene expression by transcription factor acetylation. Cell. Mol. Life Sci. 57:1184-1192.
Bardos, J.I., Saurin, A.J., Tissot, C., Duprez, E., and Freemont, P.S. 2000. HPC3 is a new human polycomb ortholog that interacts and associates with RING1 and Bmi1 and has transcriptional repression properties. J. Biol. Chem. 275:28785-28792.
Bateman, A., Birney, E., Durbin, R., Eddy, S.R., Howe, K.L., and Sonnhammer, E.L.L. 2000. The Pfam protein family database. Nucleic Acids Res. 28:263-266.
Belfield, G.P., Ross-Smith, N.J., and Tuite, M.F. 1995. Translation elongation factor-3 (EF-3): An evolving eukaryotic ribosomal protein? J. Mol. Evol. 41:376-387.
Bertram, M.J. and Pereira-Smith, O.M. 2001. Conservation of the MORF4 related gene family: Identification of a new chromo domain subfamily and novel protein motif. Gene 266:111-121.
Bouazoune, K., Mitterweger, A., Langst, G., Imhof, A., Akhtar, A., Becker, P.B., and Brehm, A. 2002. The dMi-2 chromodomains are DNA binding modules important for ATP-dependent nucleosome mobilization. EMBO J. 21:2430-2440.
Brehm, A., Langst, G., Kehle, J., Clapier, C.R., Imhof, A., Eberharter, A., Muller, J., and Becker, P.B. 2000. dMi-2 and ISWI chromatin remodeling factors have distinct nucleosome binding and mobilization properties. EMBO J. 19:4332-4341.
Bultman, S. and Magnuson, T. 2000. Molecular and genetic analysis of the mouse homolog of the Drosophila suppressor of position-effect variegation 39 gene. Mamm. Genome 11:251-254.
Callebaut, I., Courvalin, J.C., and Mornon, J.P. 1999. The BAH (bromo-adjacent homology) domain: A link between DNA methylation, replication and transcriptional regulation. FEBS Lett. 446:189-193.
Carninci, P., Shibata, Y., Hayatsu, N., Sugahara, Y., Shibata, K., Itoh, M., Konno, H., Okazaki, Y., Muramatsu, M., and Hayashizaki, Y. 2000. Normalization and subtraction of cap-trapper-selected cDNAs to prepare full-length cDNA libraries for rapid discovery of new genes. Genome Res. 10:1617-1630.
Cavalli, G. and Paro, R. 1998. Chromo-domain proteins: Linking chromatin structure to epigenetic regulation. Curr. Opin. Cell Biol. 10:354-360.
Chakraburtty, K. and Triana-Alonso, F.J. 1998. Yeast elongation factor 3: Structure and function. Biol. Chem. 379:831-840.
Cowell, I.G. and Austin, C.A. 1997. Self-association of chromo domain peptides. Biochim. Biophys. Acta 1337:198-206.
Crosby, M.A., Miller, C., Alon, T., Watson, K.L., Verrijzer, C.P., Goldman-Levi, R., and Zak, N.B. 1999. The trithorax group gene moira encodes a brahma-associated putative chromatin-remodeling factor in Drosophila melanogaster. Mol. Cell. Biol. 19:1159-1170.
Czvitkovich, S., Sauer, S., Peters, A.H., Deiner, E., Wolf, A., Laible, G., Opravil, S., Beug, H., and Jenuwein, T. 2001. Over-expression of the SUV39H1 histone methyltransferase induces altered proliferation and differentiation in transgenic mice. Mech. Dev. 107:141-153.
Daubresse, G., Deuring, R., Moore, L., Papoulas, O., Zakrajsek, I., Waldrip, W.R., Scott, M.P., Kennison, J.A., and Tamkun, J.W. 1999. The Drosophila kismet gene is related to chromatin-remodeling factors and is required for both segmentation and segment identity. Development 126:1175-1187.
Delmas, V., Stokes, D.G., and Perry, R.P. 1993. A mammalian DNA-binding protein that contains a chromodomain and an SNF2/SWI2-like helicase domain. Proc. Natl. Acad. Sci. 90:2414-2418.
Ding, D.Q., Tomita, Y., Yamamoto, A., Chikashige, Y., Haraguchi, T., and Hiraoka, Y. 2000. Large-scale screening of intracellular protein localization in living fission yeast cells by the use of a GFP-fusion genomic DNA library. Genes Cells 5:169-190.
Eddy, S.R. 1996. Hidden Markov models. Curr. Opin. Struct. Biol. 6:361-365.
Eddy, S.R. 1998. Profile hidden Markov models. Bioinformatics 14:755-763.
Felsenstein, J. 1989. PHYLIP: Phylogeny Inference Package (Version 3.2). Cladistics 5:164-166.
The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase I and II Team. 2002. Analysis of the mouse transcriptome based upon functional annotation of 60,770 full-length cDNAs. Nature 420:563-573.
Genger, R.K., Kovac, K.A., Dennis, E.S., Peacock, W.J., and Finnegan, E.J. 1999. Multiple DNA methyltransferase genes in Arabidopsis thaliana. Plant Mol. Biol. 41:269-278.
Gorman, M., Franke, A., and Baker, B.S. 1995. Molecular characterization of the male-specific lethal-3 gene and investigations of the regulation of dosage compensation in Drosophila. Development 121:463-475.
Hashimoto, N., Brock, H.W., Nomura, M., Kyba, M., Hodgson, J., Fujita, Y., Takihara, Y., Shimada, K., and Higashinakagawa, T. 1998. RAE28, BMI1, and M33 are members of heterogeneous multimeric mammalian Polycomb group complexes. Biochem. Biophys. Res. Commun. 245:356-365.
Hemenway, C.S., Halligan, B.W., Gould, G.C., and Levy, L.S. 2000. Identification and analysis of a third mouse Polycomb gene, MPc3. Gene 242:31-40.
Henikoff, S. and Comai, L. 1998. A DNA methyltransferase homolog with a chromodomain exists in multiple polymorphic forms in Arabidopsis. Genetics 149:307-318.
Hilfiker, A., Hilfiker-Kleiner, D., Pannuti, A., and Lucchesi, J.C. 1997. mof, a putative acetyl transferase gene related to the Tip60 and MOZ human genes and to the SAS genes of yeast, is required for dosage compensation in Drosophila. EMBO J. 16:2054-2060.
Horsley, D., Hutchings, A., Butcher, G.W., and Singh, P.B. 1996. M32, a murine homologue of Drosophila heterochromatin protein 1 (HP1), localises to euchromatin within interphase nuclei and is largely excluded from constitutive heterochromatin. Cytogenet. Cell Genet. 73:308-311.
Hubbard, T., Barker, D., Birney, E., Cameron, G., Chen, Y., Clark, L., Cox, T., Cuff, J., Curwen, V., Down, T., et al. 2002. The Ensembl genome database project. Nucleic Acids Res. 30:38-41.
James, T.C. and Elgin, S.C. 1986. Identification of a nonhistone chromosomal protein associated with heterochromatin in Drosophila melanogaster and its gene. Mol. Cell. Biol. 6:3862-3872.
Jenuwein, T., Laible, G., Dorn, R., and Reuter, G. 1998. SET domain proteins modulate chromatin domains in eu- and heterochromatin. Cell. Mol. Life Sci. 54:80-93.
Jeon, S.H., Kang, M.G., Kim, Y.H., Jin, Y.H., Lee, C., Chung, H.Y., Kwon, H., Park, S.D., and Seong, R.H. 1997. A new mouse gene, SRG3, related to the SWI3 of Saccharomyces cerevisiae, is required for apoptosis induced by glucocorticoids in a thymoma cell line. J. Exp. Med. 185:1827-1836.
Jin, Y.H., Yoo, E.J., Jang, Y.K., Kim, S.H., Kim, M.J., Shim, Y.S., Lee, J.S., Choi, I.S., Seong, R.H., Hong, S.H., et al. 1998. Isolation and characterization of hrp1+, a new member of the SNF2/SWI2 gene family from the fission yeast Schizosaccharomyces pombe. Mol. Gen. Genet. 257:319-329.
Jones, D.O., Cowell, I.G., and Singh, P.B. 2000. Mammalian chromodomain proteins: Their role in genome organisation and expression. Bioessays 22:124-137.
Jones, R.S. and Gelbart, W.M. 1993. The Drosophila Polycomb-group gene Enhancer of zeste contains a region with sequence similarity to trithorax. Mol. Cell. Biol. 13:6357-6366.
Kawai, J., Shinagawa, A., Shibata, K., Yoshino, M., Itoh, M., Ishii, Y., Arakawa, T., Hara, A., Fukunishi, Y., Konno, H., et al. 2001. Functional annotation of a full-length mouse cDNA collection. Nature 409:685-690.
Kehle, J., Beuchle, D., Treuheit, S., Christen, B., Kennison, J.A., Bienz, M., and Muller, J. 1998. dMi-2, a hunchback-interacting protein that functions in Polycomb repression. Science 282:1897-1900.
Kelley, D.E., Stokes, D.G., and Perry, R.P. 1999. CHD1 interacts with SSRP1 and depends on both its chromodomain and its ATPase/helicase-like domain for proper association with chromatin. Chromosoma 108:10-25.
Kim, J.K., Huh, S.O., Choi, H., Lee, K.S., Shin, D., Lee, C., Nam, J.S., Kim, H., Chung, H., Lee, H.W., et al. 2001. Srg3, a mouse homolog of yeast SWI3, is essential for early embryogenesis and involved in brain development. Mol. Cell. Biol. 21:7787-7795.
Lachner, M., O'Carroll, D., Rea, S., Machtler, K., and Jenuwein, T. 2001. Methylation of histone H3 lysine 9 creates a binding site for HP1 proteins. Nature 410:116-120.
Lahn, B.T. and Page, D.C. 1999. Retroposition of autosomal mRNA yielded testis-specific gene family on human Y chromosome. Nat. Genet. 21:429-433.
Lahn, B.T., Tang, Z.L., Zhou, J., Barndt, R.J., Parvinen, M., Allis, C.D., and Page, D.C. 2002. Previously uncharacterized histone acetyltransferases implicated in mammalian spermatogenesis. Proc. Natl. Acad. Sci. 99:8707-8712.
Lai, A., Marcellus, R.C., Corbeil, H.B., and Branton, P.E. 1999. RBP1 induces growth arrest by repression of E2F-dependent transcription. Oncogene 18:2091-2100.
Le Douarin, B., Nielsen, A.L., Garnier, J.M., Ichinose, H., Jeanmougin, F., Losson, R., and Chambon, P. 1996. A possible involvement of TIF1
Letunic, I., Goodstadt, L., Dickens, N.J., Doerks, T., Schultz, J., Mott, R., Ciccarelli, F., Copley, R.R., Ponting, C.P., and Bork, P. 2002. Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res. 30:242-244.
Li, Y.J., Pak, B.J., Higgins, R.R., Lu, S.J., and Ben-David, Y. 2001. Contiguous arrangement of p45 NFE2, HnRNPA1, and HP1
Lindroth, A.M., Cao, X., Jackson, J.P., Zilberman, D., McCallum, C.M., Henikoff, S., and Jacobsen, S.E. 2001. Requirement of CHROMOMETHYLASE3 for maintenance of CpXpG methylation. Science 292:2077-2080.
Lopez-Fernandez, L.A. and del Mazo, J. 1996. Characterization of genes expressed early in mouse spermatogenesis, isolated from a subtractive cDNA library. Mamm. Genome 7:698-700.
Lucchesi, J.C. 1998. Dosage compensation in flies and worms: The ups and downs of X-chromosome regulation. Curr. Opin. Genet. Dev. 8:179-184.
Maison, C., Bailly, D., Peters, A.H., Quivy, J.-P., Roche, D., Taddei, A., Lachner, M., Jenuwein, T., and Almouzni, G. 2002. Higher-order structure in pericentric heterochromatin involves a distinct pattern of histone modification and an RNA component. Nat. Genet. 30:329-334.
Marin, I. and Baker, B.S. 2000. Origin and evolution of the regulatory gene male-specific lethal-3. Mol. Biol. Evol. 17:1240-1250.
Marin, I. and Llorens, C. 2000. Ty3/Gypsy retrotransposons: Description of new Arabidopsis thaliana elements and evolutionary perspectives derived from comparative genomic data. Mol. Biol. Evol. 17:1040-1049.
Matsuoka, Y., Shibata, S., Ban, T., Toratani, N., Shigekawa, M., Ishida, H., and Yoneda, Y. 2002. A chromodomain-containing nuclear protein, MRG15 is expressed as a novel type of dendritic mRNA in neurons. Neurosci. Res. 42:299-308.
Mattick, J.S. 2001. Noncoding RNAs: The architects of eukaryotic complexity. EMBO Rep. 2:986-991.
McAllister, D., Merlo, X., and Lough, J. 2002. Characterization and expression of the mouse tat interactive protein 60 kD (TIP60) gene. Gene 289:169-176.
Messmer, S., Franke, A., and Paro, R. 1992. Analysis of the functional role of the Polycomb chromo domain in Drosophila melanogaster. Genes & Dev. 6:1241-1254.
Muchardt, C., Guillemé, M., Seeler, J.-S., Trouche, D., Dejean A., and Yaniv, M. 2002. Coordinated methyl and RNA binding is required for heterochromatin localization of mammalian HP1
O'Carroll, D., Scherthan, H., Peters, A.H., Opravil, S., Haynes, A.R., Laible, G., Rea, S., Schmid, M., Lebersorger, A., Jerratsch, M., et al. 2000. Isolation and characterization of Suv39h2, a second histone H3 methyltransferase gene that displays testis-specific expression. Mol. Cell. Biol. 20:9423-9433.
Pearce, J.J., Singh, P.B., and Gaunt, S.J. 1992. The mouse has a Polycomb-like chromobox gene. Development 114:921-929.
Platero, J.S., Hartnett, T., and Eissenberg, J.C. 1995. Functional analysis of the chromo domain of HP1. EMBO J. 14:3977-3986.
Prakash, S.K., Van den Veyver, I.B., Franco, B., Volta, M., Ballabio, A., and Zoghbi, H.Y. 1999. Characterization of a novel chromo domain gene in Xp22.3 with homology to Drosophila msl-3. Genomics 59:77-84.
Sandbaken, M.G., Lupisella, J.A., DiDomenico, B., and Chakraburtty, K. 1990. Protein synthesis in yeast. Structural and functi