Published online before print
March 21, 2005, 10.1101/gr.3155905 Genome Res. 15:463-474, 2005 ©2005 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/05 $5.00
Disclosing hidden transcripts: Mouse natural sense-antisense transcripts tend to be poly(A) negative and nuclear localized
1 Technology and Development Team for Mammalian Cellular Dynamics, BioResource Center (BRC), RIKEN Tsukuba Institute, Tsukuba, Ibaraki, Japan 305-0074
2 BioResource Information Division, BioResource Center (BRC), RIKEN Tsukuba Institute, Tsukuba, Ibaraki, Japan 305-0074
3 Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan 305-0006
4 Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Tsurumi-ku, Yokohama, Kanagawa, Japan 230-0045
5 Genome Science Laboratory, RIKEN, Wako, Saitama, Japan 351-0198
6 Division of Genomic Information Resource Exploration, Science of Biological Supramolecular Systems, Yokohama City University, Graduate School of Integrated Science, Tsurumi-ku, Yokohama, Japan 230-0045
Genome-wide in silico analysis identified thousands of natural sense-antisense transcript (SAT) pairs in the mouse transcriptome. We investigated their expression using a strand-specific oligo-microarray that distinguishes expression of sense and antisense RNA from 1947 SAT pairs. The majority of the predicted SATs are expressed at various steady-state levels in various tissues, and cluster analysis of the array data demonstrated that the ratio of sense to antisense expression fluctuated markedly among these tissues for some of the SATs, while it was unchanged for the rest. Surprisingly, further analyses indicated that vast amounts of multiple-sized transcripts are expressed from the SAT loci, and that these transcripts tended to be poly(A) negative and nuclear localized. The tendency of SATs to lack polyadenylation is conserved, even in randomly chosen SAT genes in the plant Arabidopsis thaliana. Such common characteristics imply general roles of the SATs in regulation of gene expression.
Recently, increasing numbers of natural antisense transcripts have been identified in a variety of eukaryotic organisms using large-scale transcriptome analysis. The sense-antisense transcript (SAT) pair is a pair of transcripts produced from the same chromosomal locus, but from opposite DNA strands. These include 2481 SAT pairs in mice (The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I & II Team 2002 Although we now have massive amounts of data gleaned from genome/transcriptome analyses, the experimental evidence defining the extent of most SAT expression remains very limited, in part because the majority of the described data have been obtained exclusively by in silico analysis. Here, we report actual transcriptional analyses of sense and antisense genes and show that these transcripts tend to be both poly(A) negative and nuclear localized.
Expression analysis of SATs by Oligo DNA microarray
To determine whether SATs are actually transcribed, we performed expression analysis of available mouse SATs by using custom-made oligo DNA (60-mer) chips that distinguish the expression of sense versus antisense transcripts. From 2481 pairs of SATs identified in the mouse transcriptome (The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I & II Team 2002
We found that most of the sense and antisense genes are indeed expressed at various levels in various cells and tissues (Fig. 1) (all of the processed signal data are found in Supplemental Table S1). Figure 1, A and B, show the expression of the SATs in ES cells (Tada et al. 2001
We next performed clustering analysis of the microarray data, in which the SAT pairs were grouped according to the log ratio of the expression levels of the sense and antisense genes (when both members of the pair are coding genes, or both are noncoding genes), or of the coding and noncoding genes (Fig. 2A-F). If the expression level of the noncoding gene is more than threefold that of the coding gene (in noncoding/coding pairs) (Fig. 2C), or if the expression of the sense gene is more than threefold that of the antisense gene (in coding/coding or noncoding/noncoding pairs) (Fig. 2A,B), the ratio is shown in red. In the reverse situation, it is shown in green. As shown in these expression heat maps, there are SAT pairs whose expression ratio is relatively unchanged among the five tissue/cell samples tested, whereas some of the SAT pairs exhibit tissue-specific expression patterns. For example, as shown in Figure 2C, 8.5% of the coding/noncoding SAT pairs correspond to a uniformly red block (indicated by the arrow), while 26% belong to a green block, suggesting that the expression ratio of these SAT pairs is fixed among the five different samples, and that there are SAT pairs in which expression of the noncoding gene is higher than that of the coding gene. On the other hand, the expression ratio of the rest of the SAT pairs fluctuated to varying extents. Some of the SAT pairs showed expression-ratio differences specific to each tissue sample (arrowhead in Fig. 2C), implying that noncoding RNA may be involved in tissue-specific regulation of the expression of their coding partners. The same trends apply to coding/coding and noncoding/noncoding SAT pairs; expression ratios of 30%-40% of the SAT pairs were relatively unchanged, while the rest fluctuated significantly among the tissues/cells (Fig. 2A,B).
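The colour rule used for the heat maps can be sketched in a few lines. This is only an illustration of the threefold cutoff described above, not the authors' R analysis: the function names, the "neutral" label for ratios inside the cutoff, and the choice of log base 2 are assumptions made for the example.

```python
import math

THRESHOLD = 3.0  # threefold cutoff stated in the text


def classify_ratio(numerator_signal, denominator_signal, threshold=THRESHOLD):
    """Return (log2 ratio, colour) for one SAT pair in one sample.

    numerator_signal is the sense gene (or the noncoding gene in a
    noncoding/coding pair); denominator_signal is its partner.
    Log base 2 and the "neutral" label are assumptions of this sketch.
    """
    if numerator_signal <= 0 or denominator_signal <= 0:
        raise ValueError("signals must be positive")
    log_ratio = math.log2(numerator_signal / denominator_signal)
    cutoff = math.log2(threshold)
    if log_ratio > cutoff:
        return log_ratio, "red"      # more than threefold higher
    if log_ratio < -cutoff:
        return log_ratio, "green"    # more than threefold lower
    return log_ratio, "neutral"


def is_uniform_block(colours):
    """True when a pair is red in every sample, or green in every sample."""
    return bool(colours) and len(set(colours)) == 1 and colours[0] != "neutral"


# Hypothetical signals for one pair across five samples.
samples = [(400.0, 100.0), (900.0, 200.0), (350.0, 100.0), (800.0, 100.0), (500.0, 120.0)]
colours = [classify_ratio(a, b)[1] for a, b in samples]
print(colours, is_uniform_block(colours))
```

A pair that is red (or green) in all five samples corresponds to the fixed-ratio blocks described for Figure 2C; any mixture of colours corresponds to a fluctuating or tissue-specific ratio.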
Northern hybridization analysis of selected SATs
We next performed Northern analysis of six randomly chosen pairs of SATs. To distinguish the sense and antisense transcripts, we transcribed single-stranded riboprobes and used them in the analysis. The results from two SAT pairs are shown in Figures 3 and 4. The first pair comprises 6330439J10 (FANTOM2 clone ID, 3-oxoacid CoA transferase) and A230019L24 (unknown EST). A230019L24 lacks significant ORFs and appears to represent ncRNA. Clone 6330439J10 is 2099 bp long, which corresponds to the lower band on both the Northern blot loaded with total RNA and that loaded with poly(A)+ RNA (Fig. 3D). The microarray data were in agreement with the signal intensities in the Northern blots (Fig. 3B). Very faint bands of 1.8 kb and 3 kb corresponding to A230019L24 were detected on the poly(A)+ RNA blot. However, the total RNA blot for A230019L24 showed an extremely large band (>10 kb), as well as a strong hybridization smear between 18S and 28S rRNA (Fig. 3E). This hybridization smear was not due to repetitive sequences in the probe, as the same probe gave a single band on a genomic Southern blot (data not shown).
In the second SAT pair (G430028I15, unknown EST; D930007L12, Metaxin1), G430028I15 appears to encode ncRNA. It is intriguing that the ncRNA probe again gave a strong hybridization smear on the total RNA blot, but not on the poly(A)+ blot (Fig. 4D). The size of the Metaxin1 mRNA is 1.8 kb (Bornstein et al. 1995
To confirm that these banding patterns were not due to nonspecific cross-hybridization of the RNA probes used, we designed 30-mer oligo DNA probes complementary to either 6330439J10/A230019L24 or G430028I15 (the probe positions are indicated in Figs. 3A and 4A) and performed Northern analysis again (Fig. 5A,B). The hybridization patterns with 30-mer probes were very similar to those obtained with the full-length RNA probes, confirming that both kinds of probes specifically detected multiple-sized transcripts in the total RNA fraction. The multiple bands thus detected are unlikely to be degradation products produced during the RNA preparation, since other probes such as 6330439J10 detected bands with distinct sizes on the same blot (Fig. 3D). These results suggest that multiple-sized transcripts can be generated from a single SAT locus. As shown in the lower part of Figure 5A, a small band of
The SAT gene probes often detected many bands or strong hybridization smears in total RNA, but not in poly(A)+ RNA. These hybridization patterns were not specific to ncRNA, but appeared to be associated with the SAT loci in general. Of the six SAT pairs analyzed (6 SATs; 2 probes per SAT = 12 probes total), we found the smear hybridization pattern with eight (66.7%) probes, five of which came from protein-coding genes (Table 1).
Because we obtained different results with total RNA and poly(A)+ RNA, we prepared poly(A)+ and poly(A)- RNA from fibroblasts and repeated the Northern analysis. The transcripts revealed by the A230019L24 and G430028I15 probes occurred in the poly(A)- fraction, indicating that they were not polyadenylated (Fig. 5D,E). We also found that these transcripts were localized predominantly in the nuclear fraction (Fig. 5D,E); in particular, G430028I15 transcripts occurred exclusively in the nucleus. We performed a similar Northern analysis with nuclear/cytoplasmic RNA fractions for the genes that produced smears on the total RNA blots. Including A230019L24 and G430028I15 above, four genes produced almost exclusively nuclear transcripts like those of G430028I15 (Fig. 5E), and another four genes produced both nuclear and cytoplasmic transcripts like those of A230019L24 (Fig. 5D) (data not shown). Thus, nuclear localization may be an intrinsic property of the SAT genes. It is generally believed that primary transcripts destined to be mRNAs move quickly to the cytoplasm after their synthesis (Jackson et al. 2000
Estimations of expression levels of SATs by dot blots
Identification of start sites of SATs
Global comparison of microarray data: Difference between Oligo dT and random priming methods
As described in the Methods section, our custom microarray contains 2697 probe sequences derived from ESTs that are unrelated to the SAT gene sequences. Homology search analysis showed that the EST sequences did not overlap with the SAT sequences used in this study (data not shown). Therefore, this EST set likely was derived from typical polyadenylated mRNA and could serve as a convenient control against the SATs. As expected, the sum of the signals detected by these EST probes was reduced markedly when random priming was used (Fig. 6). In contrast, the opposite trend was apparent for SATs; random-primed targets gave higher signals in all cases tested (Fig. 6). It is therefore likely that the SAT transcripts generally are enriched in the poly(A)- fraction.
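The comparison behind Figure 6 amounts to summing the signals of a probe set under each priming method and comparing the totals. A minimal sketch follows; the function name and all signal values are hypothetical and are not taken from Supplemental Table S1.

```python
def total_signal_ratio(oligo_dt_signals, random_primed_signals):
    """Summed random-primed signal divided by summed oligo dT-primed signal.

    A value below 1 means the probe set loses signal with random priming
    (as reported for the control ESTs); a value above 1 means it gains
    signal (as reported for the SATs).
    """
    dt_total = sum(oligo_dt_signals)
    if dt_total <= 0:
        raise ValueError("oligo dT total must be positive")
    return sum(random_primed_signals) / dt_total


# Hypothetical processed signals, for illustration only.
est_dt, est_random = [100.0, 200.0, 300.0], [50.0, 80.0, 120.0]
sat_dt, sat_random = [100.0, 50.0, 150.0], [200.0, 150.0, 250.0]

print("EST control:", total_signal_ratio(est_dt, est_random))
print("SAT probes: ", total_signal_ratio(sat_dt, sat_random))
```

Because oligo dT priming labels only polyadenylated RNA while random priming labels RNA regardless of its 3' end, a ratio above 1 is what would be expected for transcripts enriched in the non-polyadenylated fraction.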
Cluster analysis of the expression ratio for the SAT pairs was again performed using the data obtained with random-primed targets (Fig. 2D,E,F). Global patterns of the heat maps were similar to those obtained with the oligo dT priming method; there were clusters with fixed ratios and clusters showing ratio fluctuations or tissue-specific ratio differences. However, the members belonging to each cluster differed considerably between the data obtained with the two priming methods. Side-by-side comparisons of cluster analysis data clearly showed that hybridization patterns for a given probe pair often disagreed when a different priming method was chosen (see Supplemental Fig. 2).
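One simple way to quantify how often the two priming methods disagree is to compare, pair by pair, the category each SAT pair falls into under each method. The sketch below is an assumption-laden illustration rather than the side-by-side comparison of Supplemental Fig. 2: the labels ("fixed-red", "fixed-green", "fluctuating"), the pair identifiers, and the function name are all hypothetical.

```python
def agreement_fraction(labels_a, labels_b):
    """Fraction of shared SAT pairs given the same label by both methods.

    labels_a and labels_b map a pair identifier to a category label.
    Only pairs present in both mappings are compared.
    """
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        raise ValueError("no SAT pairs in common")
    matches = sum(1 for pair_id in shared if labels_a[pair_id] == labels_b[pair_id])
    return matches / len(shared)


# Hypothetical category assignments under each priming method.
oligo_dt = {"pair1": "fixed-red", "pair2": "fixed-green",
            "pair3": "fluctuating", "pair4": "fluctuating"}
random_primed = {"pair1": "fluctuating", "pair2": "fixed-red",
                 "pair3": "fluctuating", "pair4": "fixed-green"}

print(agreement_fraction(oligo_dt, random_primed))
```

A low agreement fraction alongside similar overall heat-map patterns would match the observation above: the same kinds of clusters appear with both methods, but with different members.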
Some of the plant SATs are also present in poly(A)- fraction
In this study, we present the first experimental evidence for the expression of SATs identified by genome-wide analyses in silico. The steady-state RNA levels of the SATs generally were high. On the basis of dot-blot hybridizations, we estimated the amounts of the noncoding transcripts, A230019L24 and G430028I15, to be 10% and 100% of the level of the β-actin RNA, respectively. More importantly, we disclosed several previously hidden characteristics of SAT transcripts; they generate multiple-sized transcripts that are not polyadenylated and tend to be nuclear localized. These traits suggest that SATs belong to a hitherto unknown category of transcripts. In this context, it is interesting to note that a smear hybridization pattern on the Northern blot with a noncoding gene probe was recently reported by Imamura et al. (2004
The processing pathways from primary transcript to mRNA or other classes of RNA are not yet fully understood. For example,
The regulatory roles played by the SAT genes may occur at the chromatin-domain level. Expression of imprinted genes is subject to such domain-level regulation. Imprinted genes often are accompanied by a large antisense transcript, which is involved in the regional regulation of expression (e.g., Air-Igf2r, LIT1-KvLQT1) (Mitsuya et al. 1999
Another interesting aspect of SATs is their relationship to RNA interference (RNAi). It is of great interest whether naturally occurring SATs can form dsRNA and become targets of RNAi machinery. Recently, dependence on natural RNAi via dsRNA formation in transposon silencing was demonstrated in Caenorhabditis elegans (Sijen and Plasterk 2003
Studies on poly(A)- RNAs have been limited, because conventional belief in molecular biology holds that poly(A)+ mRNAs are the major mediators in the flow of genetic information. However, the information obtained from this study implies that some class of poly(A)- nuclear RNA may have important biological functions, and it should promote a new field of research on the regulatory mechanisms mediated by poly(A)- RNA.
DNA microarray experiment and analysis
The sequences of 60-mer DNA specific to the sense and antisense genes were chosen by K.K. DNAFORM (Japan), and Agilent Technologies manufactured custom oligo DNA microarray chips by using this information. RNA was labeled and hybridized using Fluorescent Direct Label Kits (oligo dT-primed labeling) (Agilent Technologies), according to the manufacturer's protocols. For labeling with random nonamers, we used the CyScribe First-Strand cDNA Labeling Kit (Amersham). The RNA samples used for microarray experiments were from mouse ES cells, SL10 cells (fibroblast cell line), brain, heart, and testis. The total RNA of brain, heart, and testis for array experiments was purchased from Ambion. The total RNA of ES and fibroblast cells was isolated using Trizol reagent (Invitrogen). The same total RNA samples were reciprocally labeled with Cy3 or Cy5, hybridized to the oligo DNA on the chip, and dye-normalized, and processed signals were obtained using Feature Extraction software (Agilent Technologies). The custom oligo DNA microarray chip contained 1947 pairs of the sense-antisense genes and 592 pairs of nonantisense bidirectional genes (total 5078 genes), along with an additional 3013 genes unrelated to natural antisense analysis. These 3013 genes include the 2697 EST sequences mentioned in the "Global comparison of microarray data" section of the Results. For the Feature Extraction software to produce the processed signals, the data for all genes on the chip were used. For further analysis, the average of the Cy3-labeled and Cy5-labeled processed signals was used as the processed signal for a particular gene. The correlation coefficient between the Cy3- and Cy5-labeled processed signals from oligo dT-primed samples was 0.951 (ES cells), 0.944 (fibroblast cells), 0.982 (brain), 0.972 (heart), and 0.973 (testis). The correlation coefficient for random-primed samples was 0.923 (ES cells), 0.970 (fibroblast cells), 0.985 (brain), 0.983 (heart), and 0.970 (testis). The total gross signal on the chip in each hybridization experiment was adjusted to that of the ES cell sample, so that the relative differences in gene expression among cell lines and tissues could be compared with one another. The processed signals in the Supplemental data (Table S1) are these averaged signals. The microarray data were analyzed using cluster analysis implemented in the statistical package R. A MIAME-compliant description of our array analysis is provided as an online Supplemental document. The raw data of the array experiments were deposited in the Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) under accession number GSE2185 (http://www.ncbi.nlm.nih.gov/geo/).
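The three numerical steps described above (averaging the dye-swapped signals, checking their correlation, and scaling each sample's total signal to the ES cell sample) can be sketched as follows. This is not the Feature Extraction or R workflow the authors used: the function names and signal values are hypothetical, and the use of the Pearson coefficient on untransformed signals is an assumption, since the text says only "correlation coefficient".

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length signal lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_dyes(cy3, cy5):
    """Per-gene mean of the Cy3- and Cy5-labeled processed signals."""
    return [(a + b) / 2 for a, b in zip(cy3, cy5)]


def scale_to_reference(signals, reference_total):
    """Rescale signals so that their sum equals the reference sample's total."""
    factor = reference_total / sum(signals)
    return [s * factor for s in signals]


# Hypothetical processed signals for four genes.
es_cy3, es_cy5 = [100.0, 220.0, 310.0, 40.0], [110.0, 200.0, 330.0, 50.0]
brain_cy3, brain_cy5 = [60.0, 500.0, 150.0, 20.0], [70.0, 480.0, 170.0, 30.0]

es = average_dyes(es_cy3, es_cy5)
brain = scale_to_reference(average_dyes(brain_cy3, brain_cy5), sum(es))
print(round(pearson(es_cy3, es_cy5), 3), round(sum(brain), 1), round(sum(es), 1))
```

Scaling every sample to the same total makes signals comparable across tissues only under the assumption that overall signal should be similar between samples, which is the rationale given in the text.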
Northern hybridization
To obtain the probes for the -20 kb, +20 kb, and +40 kb regions of the Metaxin1 locus, DNA fragments were cloned into the pGEM-T Easy vector (Promega) after being amplified by PCR using C57BL/6J mouse genomic DNA and the following primer sequences: -20 kb, AGTCTTCTGTTGCCACTTGCCG and AGAGAAGGTGGCAGGTTGGCTG; +20 kb, ACACAGCAGTGATAAGCCAGGG and TGCTTTACCTATCCAGCACCC; and +40 kb, TGTGAGGAGGGAACCTCAAGGC and TGTCTCTTTTCCACTGTCTCCC. 32P-labeled probes were generated by in vitro transcription to detect the transcripts of the same direction as G430028I15. The sequences of the synthetic 30-mer DNA probes were CGAAGCCCAGAGGACAGGAGTTTGAGAGCC (FANTOM2 ID, 6330439J10), GGCTCTCAAACTCCTGTCCTCTGGGCTTCG (FANTOM2 ID, A230019L24), and AATGACACCCTCTTTCCGCCTCTCCTTTTG (FANTOM2 ID, G430028I15). The sequences of the 30-mer probes used for Fig. 3C were CTCCAAAACGATGAAGTTAACCCACCACCG (#1), CCCTCCAGGCCCTTGGCAACCCGAAACCCT (#2), GGCACAACCCGCGGGGCTCGTCCTAGCTGT (#3), GGCTCTCAAACTCCTGTCCTCTGGGCTTCG (#4), CCGGGGGTTGGGGGTGGGGGCTTGGAGACG (#5), and GCTGGAGGAGGGTGGGGAGGAGGGTGGATT (#6). These 30-mer DNAs were labeled with [32P]ATP and polynucleotide kinase (Takara) and hybridized using ULTRAhyb-Oligo hybridization solution (Ambion).
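Because 6330439J10 and A230019L24 are transcribed from opposite strands of the same locus, strand-specific probes for the two should be reverse complements of each other. The short check below uses the two 30-mer sequences listed above, with the spaces introduced by line breaks removed; the helper name is chosen for this example only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence; whitespace is ignored."""
    cleaned = "".join(seq.split()).upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# 30-mer probes from the Methods text (FANTOM2 clone IDs).
PROBE_6330439J10 = "CGAAGCCCAGAGGACAGGAGTTTGAGAGCC"
PROBE_A230019L24 = "GGCTCTCAAACTCCTGTCCTCTGGGCTTCG"

print(len(PROBE_6330439J10), len(PROBE_A230019L24))
print(reverse_complement(PROBE_6330439J10) == PROBE_A230019L24)
```

The two probes therefore target the same 30-nt stretch of the overlap region on opposite strands, which is what allows each to report on only one member of the pair.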
cDNA synthesis, PCR, and 5' RACE
We thank Misako Yuzuriha for her excellent technical assistance. We also thank Drs. Kazuo Shinozaki and Motoaki Seki for sharing A. thaliana sense and antisense gene data. Whole-rosette A. thaliana plants were kindly provided by Dr. Masatomo Kobayashi at RIKEN BRC. This work was supported in part by a Grant-in-aid of Ministry of Education, Science and Culture of Japan to H.K. and K.A. and by the Special Coordinating Funds for Promoting Science and Technology to K.A.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.3155905. Article published online before print in March 2005.
7 Corresponding author. [Supplemental material is available online at www.genome.org. The expression data from this study have been submitted to Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) under accession no. GSE2185.]
The Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815.
Bornstein, P., McKinney, C.E., LaMarca, M.E., Winfield, S., Shingu, T., Devarayalu, S., Vos, H.L., and Ginns, E.I. 1995. Metaxin, a gene contiguous to both thrombospondin 3 and glucocerebrosidase, is required for embryonic development in the mouse: Implications for Gaucher disease. Proc. Natl. Acad. Sci. 92: 4547-4551.
Collins, M., Rojnuckarin, P., Zhu, Y.H., and Bornstein, P. 1998. A far upstream, cell type-specific enhancer of the mouse thrombospondin 3 gene is located within intron 6 of the adjacent metaxin gene. J. Biol. Chem. 273: 21816-21824.
The FANTOM Consortium, and The RIKEN Genome Exploration Research Group Phase I & II Team. 2002. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420: 563-573.
Hall, I.M., Shankaranarayana, G.D., Noma, K., Ayoub, N., Cohen, A., and Grewal, S.I. 2002. Establishment and maintenance of a heterochromatin domain. Science 297: 2232-2237.
Herbert, A. 2004. The four Rs of RNA-directed evolution. Nat. Genet. 36: 19-25.
Imamura, T., Yamamoto, S., Ohgane, J., Hattori, N., Tanaka, S., and Shiota, K. 2004. Non-coding RNA directed DNA demethylation of Sphk1 CpG island. Biochem. Biophys. Res. Commun. 322: 593-600.
Jackson, D.A., Pombo, A., and Iborra, F. 2000. The balance sheet for transcription: An analysis of nuclear RNA metabolism in mammalian cells. FASEB J. 14: 242-254.
Kiyosawa, H. and Abe, K. 2002. Speculations on the role of natural antisense transcripts in mammalian X chromosome evolution. Cytogenet. Genome Res. 99: 151-156.
Kiyosawa, H., Yamanaka, I., Osato, N., Kondo, S., and Hayashizaki, Y. 2003. Antisense transcripts with FANTOM2 clone set and their implications for gene regulation. Genome Res. 13: 1324-1334.
Kuwabara, T., Hsieh, J., Nakashima, K., Taira, K., and Gage, F.H. 2004. A small modulatory dsRNA specifies the fate of adult neural stem cells. Cell 116: 779-793.
Misra, S., Crosby, M.A., Mungall, C.J., Matthews, B.B., Campbell, K.S., Hradecky, P., Huang, Y., Kaminker, J.S., Millburn, G.H., Prochnik, S.E., et al. 2002. Annotation of the Drosophila melanogaster euchromatic genome: A systematic review. Genome Biol. 3: research0083.
Mitsuya, K., Meguro, M., Lee, M.P., Katoh, M., Schulz, T.C., Kugoh, H., Yoshida, M.A., Niikawa, N., Feinberg, A.P., and Oshimura, M. 1999. LIT1, an imprinted antisense RNA in the human KvLQT1 locus identified by screening for differentially expressed transcripts using monochromosomal hybrids. Hum. Mol. Genet. 8: 1209-1217.
Mouse Genome Sequencing Consortium. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562.
Nelson, P., Kiriakidou, M., Sharma, A., Maniataki, E., and Mourelatos, Z. 2003. The microRNA world: Small is mighty. Trends Biochem. Sci. 28: 534-540.
Nikaido, I., Saito, C., Mizuno, Y., Meguro, M., Bono, H., Kadomura, M., Kono, T., Morris, G.A., Lyons, P.A., Oshimura, M., et al. 2003. Discovery of imprinted transcripts in the mouse transcriptome using large-scale expression profiling. Genome Res. 13: 1402-1409.
Osato, N., Yamada, H., Satoh, K., Ooka, H., Yamamoto, M., Suzuki, K., Kawai, J., Carninci, P., Ohtomo, Y., Murakami, K., et al. 2003. Antisense transcripts with rice full-length cDNAs. Genome Biol. 5: R5.
Salditt-Georgieff, M. and Darnell Jr., J.E. 1982. Further evidence that the majority of primary nuclear RNA transcripts in mammalian cells do not contribute to mRNA. Mol. Cell. Biol. 2: 701-707.
Seki, M., Satou, M., Sakurai, T., Akiyama, K., Iida, K., Ishida, J., Nakajima, M., Enju, A., Narusaka, M., Fujita, M., et al. 2005. Full-length cDNAs for the discovery and annotation of genes in A. thaliana. In Plant functional genomics (ed. D. Leister), pp. 3-22. The Haworth Press, Inc., Binghamton, NY.
Shinagawa, T. and Ishii, S. 2003. Generation of Ski-knockdown mice by expressing a long double-strand RNA from an RNA polymerase II promoter. Genes & Dev. 17: 1340-1345.
Sijen, T. and Plasterk, R.H. 2003. Transposon silencing in the Caenorhabditis elegans germ line by natural RNAi. Nature 426: 310-314.
Sleutels, F., Zwart, R., and Barlow, D.P. 2002. The non-coding Air RNA is required for silencing autosomal imprinted genes. Nature 415: 810-813.
Tada, M., Takahama, Y., Abe, K., Nakatsuji, N., and Tada, T. 2001. Nuclear reprogramming of somatic cells by in vitro hybridization with ES cells. Curr. Biol. 11: 1553-1558.
Vance, V. and Vaucheret, H. 2001. RNA silencing in plants: Defense and counterdefense. Science 292: 2277-2280.
Volpe, T.A., Kidner, C., Hall, I.M., Teng, G., Grewal, S.I., and Martienssen, R.A. 2002. Regulation of heterochromatic silencing and histone H3 lysine-9 methylation by RNAi. Science 297: 1833-1837.
Yamada, K., Lim, J., Dale, J.M., Chen, H., Shinn, P., Palm, C.J., Southwick, A.M., Wu, H.C., Kim, C., Nguyen, M., et al. 2003. Empirical analysis of transcriptional activity in the Arabidopsis genome. Science 302: 842-846.
Yelin, R., Dahary, D., Sorek, R., Levanon, E.Y., Goldstein, O., Shoshan, A., Diber, A., Biton, S., Tamir, Y., Khosravi, R., et al. 2003. Widespread occurrence of antisense transcription in the human genome. Nat. Biotechnol. 21: 379-386.
http://www.ncbi.nlm.nih.gov/geo/; National Center for Biotechnology Information.
http://genome.ucsc.edu/; The UCSC Bioinformatics site.
Received August 18, 2004; accepted in revised form January 26, 2005.