Genome Res. 13:1360-1365, 2003 ©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00
Letter
Continued Discovery of Transcriptional Units Expressed in Cells of the Mouse Mononuclear Phagocyte Lineage
1 Institute for Molecular Bioscience and ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
2 The Institute for Genomic Research, Rockville, Maryland 20850, USA
3 Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
4 Genome Science Laboratory, RIKEN, Hirosawa, Wako, Saitama 351-0198, Japan
5 JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 2XY, United Kingdom
The current RIKEN transcript set represents a significant proportion of the mouse transcriptome, but transcripts expressed in the innate and acquired immune systems are poorly represented. In the present study we have assessed the complexity of the transcriptome expressed in mouse macrophages before and after treatment with lipopolysaccharide (LPS), a global regulator of macrophage gene expression, using existing RIKEN 19K arrays. By comparison to array profiles of other cells and tissues, we identify a large set of macrophage-enriched genes, many of which have obvious functions in endocytosis and phagocytosis. In addition, a significant number of LPS-inducible genes were identified. The data suggest that macrophages are a complex source of mRNA for transcriptome studies. To assess complexity and identify additional macrophage-expressed genes, cDNA libraries were created from purified populations of macrophages and dendritic cells, a functionally related cell type. Sequence analysis revealed a high incidence of novel mRNAs within these cDNA libraries. These studies provide insights into the depths of transcriptional complexity still untapped among products of inducible genes, and identify macrophage and dendritic cell populations as a starting point for sampling the inducible mammalian transcriptome.
The elucidation of mammalian genome sequences (Lander et al. 2001
The ultimate success of this approach is still dependent on the selection of tissues/cell types chosen to survey. Many transcripts will only be expressed in certain tissues, at specific time points, or in response to particular stimuli. This constraint is particularly relevant to cells of the immune system, and as a consequence many known, inducible immune response genes are poorly represented in the public EST databases (Staudt and Brown 2000
Macrophages and dendritic cells (DC) are related cell types with many unique functions in innate and acquired immunity, one of which is the presentation of antigen, together with costimulatory signals, to initiate T cell responses to pathogens. The function of both cell types is acutely regulated by many stimuli, with activation often being associated with extensive remodeling of their transcriptomes (Hashimoto et al. 2000
In the present study, we have assessed the complexity of the mouse transcriptome expressed in macrophages before and after LPS stimulation using existing RIKEN 19K microarrays. Additionally, given the low representation of known macrophage/DC-expressed genes in the current RIKEN set, we have looked for additional genes by creating cDNA libraries from purified macrophage and DC populations.
The Macrophage Transcriptome
To assess the complexity of the transcriptome expressed in macrophages, RNA was extracted from primary bone marrow-derived macrophages cultured in the presence or absence of LPS and hybridized in duplicate to RIKEN 19K microarrays. Because of the diversity in macrophage function that is known to exist between mouse strains, and to provide additional power to the clustering, we used three different strains. We also examined multiple time points to follow the temporal cascade and to determine the optimal time for cDNA library construction (see below). To permit comparison with other tissues studied in the RIKEN Expression Array Database, all hybridizations were performed using 17.5-dpc C57Bl/6J whole-embryo RNA as a reference. Following normalization, the gene expression data obtained were clustered together with data obtained from 49 other tissues (Miki et al. 2001
Stimulation with LPS led to a remodeling of macrophage gene expression: 373 probes were LPS-inducible in the LPS-responsive strains, BALB/c and C3H/ARC, but not in the hyporesponsive C3H/HeJ strain (Fig. 2, Suppl. Table 2). Interestingly, the majority (86%) of LPS-inducible transcripts were not restricted to the macrophage-enriched set (Fig. 2A). Most of the LPS-responsive probes have no annotated function, but those that could be classified were consistent with a role in macrophage activation. For example, 18% could be classified as playing a role in cell signaling and 10% as involved in antigen presentation (Fig. 2B), whereas only 4% of the LPS-responsive probes on the array fell into the cytokine/chemokine category. The paucity of macrophage-specific genes that were LPS-inducible may reflect the underrepresentation of inflammatory genes in the probe set. Detailed analysis of the function of the known macrophage-specific and LPS-inducible genes on this array is not the core focus of this study. The key observations that can be made from the data are that macrophages are a distinct cell type that has not been sampled adequately in the RIKEN transcriptome project; that their mRNA profiles are complex (75% of elements on the array gave a detectable signal) and are not dominated by a small number of transcripts; and that LPS causes a significant shift in the profile of expressed genes. These findings indicated to us that in-depth sequencing of cDNA libraries from cells of the macrophage lineage was required to ensure comprehensive sampling of the mouse transcriptome in the RIKEN project.
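The selection rule described above (induced in both LPS-responsive strains but not in the hyporesponsive strain) can be illustrated with a minimal sketch. The probe names, the log2 ratios, and the 2-fold cutoff below are hypothetical assumptions made for illustration; the actual thresholds and normalization used in the study are not stated in this section.

```python
# Hypothetical log2(LPS-treated / untreated) ratios per probe and strain.
# Probe names, values, and the 2-fold cutoff are illustrative assumptions.
RESPONSIVE = ("BALB/c", "C3H/ARC")
HYPORESPONSIVE = "C3H/HeJ"
LOG2_CUTOFF = 1.0  # assumed threshold, equivalent to a 2-fold induction


def lps_inducible(log2_ratios, cutoff=LOG2_CUTOFF):
    """Return probes induced in every responsive strain but not in C3H/HeJ."""
    selected = []
    for probe, by_strain in log2_ratios.items():
        induced_in_responsive = all(by_strain[s] >= cutoff for s in RESPONSIVE)
        induced_in_hyporesponsive = by_strain[HYPORESPONSIVE] >= cutoff
        if induced_in_responsive and not induced_in_hyporesponsive:
            selected.append(probe)
    return sorted(selected)


example = {
    "probe_A": {"BALB/c": 2.3, "C3H/ARC": 1.8, "C3H/HeJ": 0.1},  # passes
    "probe_B": {"BALB/c": 2.0, "C3H/ARC": 2.1, "C3H/HeJ": 1.9},  # also up in HeJ
    "probe_C": {"BALB/c": 0.2, "C3H/ARC": 1.5, "C3H/HeJ": 0.0},  # one strain only
}
print(lps_inducible(example))
```

Requiring agreement between two responsive strains, with the Tlr4-mutant strain as a negative control, is what lets a filter of this kind separate LPS-dependent induction from changes caused by culture conditions alone.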
Identification of Novel Macrophage/Dendritic Cell Transcripts
To assess the novelty of the data obtained, we determined how many of this set of unknown ESTs are represented in the FANTOM2 data set. Of the 29,301 ESTs that do not match known genes, 20,225 (69%) were also not found in the FANTOM2 set (Table 1). This observation is not restricted to the singletons; a significant proportion of the unknown clusters, 816 of the 2068 (39%), were not represented in the FANTOM2 set (Table 1). The identification of a large number of novel singletons is not surprising, for two reasons. First, this mRNA source has not previously been widely sampled. Second, the normalization and subtraction strategy employed during library construction is designed to identify rare transcripts and increase diversity in the library, with the consequential identification of many singletons. To provide additional evidence as to whether these are truly expressed sequences, we compared the unknown unique sequences to the TIGR mouse gene indices (MGI). Of the unknown unique sequences, 1583 clusters and 5585 singleton ESTs matched TIGR MGI, which provides independent evidence that these represent genuine transcripts. Surprisingly, 23% (485/2068) of the unknown clusters were not represented in TIGR MGI (Table 1). Of these clusters, 349 (72%) could be mapped to either the mouse or human genome sequences (Table 1), providing additional support that they represent genuinely transcribed sequences. Although the RIKEN full-length clone collection represents the most comprehensive set of cDNAs assembled to date, the high novelty rate among the sequences obtained in the present study demonstrates the need for continued sequencing of cDNA libraries from specialized cell populations if a complete picture of the transcriptome is to be obtained.
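The proportions quoted above follow directly from the reported counts. As a quick arithmetic check, the short sketch below recomputes each percentage from the numbers given in the text; the dictionary keys and the helper name `pct` are illustrative labels only, not terms from the study.

```python
# Counts taken from the text; the keys are illustrative labels only.
counts = {
    "unknown_ests": 29301,
    "unknown_ests_not_in_fantom2": 20225,
    "unknown_clusters": 2068,
    "unknown_clusters_not_in_fantom2": 816,
    "unknown_clusters_matching_mgi": 1583,
    "unmatched_clusters_mapped_to_genome": 349,
}


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Clusters without a TIGR MGI match are the remainder of the unknown clusters.
not_in_mgi = counts["unknown_clusters"] - counts["unknown_clusters_matching_mgi"]

print(pct(counts["unknown_ests_not_in_fantom2"], counts["unknown_ests"]))          # 69
print(pct(counts["unknown_clusters_not_in_fantom2"], counts["unknown_clusters"]))  # 39
print(not_in_mgi, pct(not_in_mgi, counts["unknown_clusters"]))                     # 485 23
print(pct(counts["unmatched_clusters_mapped_to_genome"], not_in_mgi))              # 72
```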
Functional Analysis of Macrophage/Dendritic Cell-Derived Transcripts
This study provides a snapshot of the transcriptome in a single inducible cell system. We have demonstrated a molecular lineage marker for primary macrophages. The cluster of genes highly expressed in this set included a large number of full-length RIKEN clones with no annotated function. Interestingly, this set was not dependent on the activation status of the cells, as the LPS-hyporesponsive mouse C3H/HeJLpsd clustered with the LPS-responsive strains BALB/c and C3H/ARC even after LPS activation. Indeed, we showed that many LPS-inducible genes are not macrophage-restricted, an observation that itself has broader implications for how we target treatments for inflammatory disease. The RIKEN 19K full-length cDNA set did not contain many known inflammatory mediators. This observation extended to the public databases generally, and is not surprising given that the large mouse EST projects have focussed on libraries from healthy tissues derived from specific pathogen-free animal house facilities. Our preliminary data set from three activated inflammatory cell populations shows a very high proportion of novel transcripts, even among the large tentative consensus sequences. We were able to show broad representation of known cellular processes, including signaling, transcription factors, receptors, and enzymes, within these libraries. The emphasis of this work was gene discovery, and as such a more detailed analysis of the macrophage and dendritic cell gene expression profiles is beyond the scope of the current manuscript. The high degree of novelty found in this study clearly demonstrates the requirement for continued sampling of inducible cell types such as these if a complete picture of the transcriptome, the ultimate aim of projects such as FANTOM2, is to be realized.
Mouse Strains, Cell Culture and Total RNA Extraction
Bone marrow-derived macrophages (BMM) were differentiated from primary mouse bone marrow cells obtained from the femurs of 6-wk-old mice, collected from pools of male siblings from each of three mouse strains: BALB/c, C3H/ARCLpsn, and C3H/HeJLpsd. The Tlr4 P691H polymorphism in C3H/HeJLpsd was confirmed by sequencing. Cells were differentiated in the presence of 10% Serum Supreme (Gibco-BRL) in RPMI media (Gibco-BRL) and 10^4 U/mL (10 ng/mL) recombinant human CSF-1 (Chiron). At day 6, cells were harvested and plated in 90 mm^2 tissue culture dishes at a concentration of 10^6 cells/plate. At day 7, cells were treated with LPS from Salmonella minnesota (Sigma-Aldrich) for the times shown in the Results section. CD11c+ DC were differentiated from bone marrow-derived progenitors by culturing in Iscove's medium containing 10% FCS and 10% GM-CSF (PeproTech). After culturing for 11 d, CD11c+ cells were purified by FACS. Total RNA was extracted using RNeasy Midi columns (Qiagen) following the vendor's protocol.
Microarray Studies
cDNA Library Construction and Sequencing
Sequence Analysis
The pipeline for clustering and assembly included:
This process resulted in two kinds of output: assemblies, termed tentative consensus sequences ("TCs"), and "singleton ESTs" (sequences that did not cluster with anything else). The TCs were classified into those that contain both genes and ESTs and those that contain only ESTs. A nonredundant set of "unknown" sequences was built using the TCs containing only ESTs together with the singleton ESTs.
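The classification step just described can be sketched in a few lines. This is an illustrative reconstruction only: the data structure (a mapping from TC identifier to its member sequences, each tagged as "gene" or "EST"), the function name, and the identifiers are assumptions, and the sketch does not reproduce the TIGR TGI clustering tools themselves.

```python
def build_unknown_set(tcs, singleton_ests):
    """Split TCs into gene-containing and EST-only, then build the 'unknown' set.

    tcs: dict mapping a TC id to a list of (member_id, kind) tuples,
         where kind is "gene" or "EST" (assumed representation).
    singleton_ests: iterable of EST ids that did not cluster with anything.
    """
    gene_tcs = sorted(
        tc for tc, members in tcs.items()
        if any(kind == "gene" for _, kind in members)
    )
    est_only_tcs = sorted(tc for tc in tcs if tc not in gene_tcs)
    # The nonredundant "unknown" set: EST-only TCs plus singleton ESTs.
    unknown = est_only_tcs + sorted(singleton_ests)
    return gene_tcs, est_only_tcs, unknown


# Hypothetical identifiers, for illustration only.
tcs = {
    "TC1": [("geneX", "gene"), ("est1", "EST")],
    "TC2": [("est2", "EST"), ("est3", "EST")],
}
gene_tcs, est_only_tcs, unknown = build_unknown_set(tcs, ["est9"])
print(gene_tcs, est_only_tcs, unknown)
```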
This set was searched using BLAST (E = 1e-5) against the TIGR Mouse Gene Index (http://www.tigr.org/tdb/tgi/mgi
For mapping of sequences to the mouse genome, Jim Kent's BLAT program was used because of its high speed and its high sensitivity when aligning transcripts to genomic DNA of the same species (Kent 2002
To evaluate the overlap between the "unknown" set and the sequences produced by the FANTOM2 project, this set was searched against a nonredundant set of sequences from FANTOM2, built in a similar fashion using the TIGR TGI clustering tools. The clones were annotated through human curation of a list of key words that were extracted from the sequence definitions ascribed to each TC or EST.
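The database comparisons in this section (the BLAST search at E = 1e-5 against TIGR MGI, and the search against the FANTOM2 set) amount to partitioning the "unknown" sequences into those with and without a significant hit. The sketch below shows one way to do this, assuming the search results are available in the 12-column tab-separated BLAST tabular format, in which the query id is the first column and the E-value the eleventh. The output format, function names, and sequence identifiers are assumptions; the format actually used in the study is not stated.

```python
import csv
import io


def queries_with_hits(tabular_text, max_evalue=1e-5):
    """Return the set of query ids with at least one hit at or below max_evalue."""
    matched = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if float(row[10]) <= max_evalue:  # column 11 holds the E-value
            matched.add(row[0])
    return matched


def partition(all_queries, tabular_text, max_evalue=1e-5):
    """Split query ids into (matched, unmatched) lists, both sorted."""
    matched = queries_with_hits(tabular_text, max_evalue)
    return (sorted(q for q in all_queries if q in matched),
            sorted(q for q in all_queries if q not in matched))


# Hypothetical hits: query, subject, %id, length, mismatches, gap opens,
# q.start, q.end, s.start, s.end, E-value, bit score.
hits = "\n".join("\t".join(fields) for fields in [
    ["est1", "TC100", "98.5", "400", "6", "0", "1", "400", "10", "409", "3e-40", "700"],
    ["est2", "TC200", "85.0", "40", "6", "0", "1", "40", "5", "44", "0.002", "38"],
])
matched, unmatched = partition(["est1", "est2", "est3"], hits)
print(matched, unmatched)
```

Sequences such as `est2`, whose only hit falls above the cutoff, are counted as unmatched, as are sequences with no hit at all.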
PAL is funded by the Juvenile Diabetes Research Foundation and the Wellcome Trust. CAW and TR are funded by the CRC for Chronic Inflammatory Diseases.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1056103.
7 Takahiro Arakawa, Jun Kawai, and Yoshihide Hayashizaki.
6 Corresponding author.
[Supplemental material is available online at www.genome.org.] The sequence data from this study have been submitted to DDBJ under accession nos. AK089166-AK089912, BY153905-BY223868, BY681767-BY576025, BY742706-BY765561, BY767554-BY752495, BY761159-BY761576, and BY763617-BY766105. The expression data from this study have been submitted to GEO under accession nos. GPL256, GSM4635-GSM4669, and GSE324-GSE326.
Carninci, P., Shibata, Y., Hayatsu, N., Sugahara, Y., Shibata, K., Itoh, M., Konno, H., Okazaki, Y., Muramatsu, M., and Hayashizaki, Y. 2000. Normalization and subtraction of cap-trapper-selected cDNAs to prepare full-length cDNA libraries for rapid discovery of new genes. Genome Res. 10: 1617-1630.
Ehrt, S., Schnappinger, D., Bekiranov, S., Drenkow, J., Shi, S., Gingeras, T.R., Gaasterland, T., Schoolnik, G., and Nathan, C. 2001. Reprogramming of the macrophage transcriptome in response to interferon-
Florea, L., Hartzell, G., Zhang, Z., Rubin, G.M., and Miller, W. 1998. A computer program for aligning a cDNA sequence with a genomic DNA sequence. Genome Res. 8: 967-976.
Hashimoto, S.-I., Suzuki, T., Nagai, S., Yamashita, T., Toyoda, N., and Matsushima, K. 2000. Identification of genes specifically expressed in human activated and mature dendritic cells through serial analysis of gene expression. Blood 96: 2206-2214.
Hogenesch, J., Ching, K., Batalov, S., Su, A., Walker, J., Zhou, Y., Kay, S., Schultz, P., and Cooke, M. 2001. A comparison of the Celera and Ensembl predicted gene sets reveals little overlap in novel genes. Cell 106: 413-415.
Hume, D.A., Ross, I.L., Himes, S.R., Sasmono, R.T., Wells, C.A., and Ravasi, T. 2002. The mononuclear phagocyte system revisited. J. Leukoc. Biol. 72: 621-627.
Kapranov, P., Cawley, S., Drenkow, J., Bekiranov, S., Strausberg, R., Fodor, S., and Gingeras, T. 2002. Large-scale transcriptional activity in chromosomes 21 and 22. Science 296: 916-919.
Kawai, J., Shinagawa, A., Shibata, K., Yoshino, M., Itoh, M., Ishii, Y., Arakawa, T., Hara, A., Fukunishi, Y., Konno, H., et al. 2001. Functional annotation of a full-length mouse cDNA collection. Nature 409: 685-690.
Kent, W.J. 2002. BLAT: The BLAST-like alignment tool. Genome Res. 12: 656-664.
Lander, E., Linton, L., Birren, B., Nusbaum, C., Zody, M., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. 2001. Initial sequencing and analysis of the human genome. Nature 409: 860-921.
Miki, R., Kadota, K., Bono, H., Mizuno, Y., Tomaru, Y., Carninci, P., Itoh, M., Shibata, K., Kawai, J., Konno, H., et al. 2001. Delineating developmental and metabolic pathways in vivo by expression profiling using the RIKEN set of 18,816 full-length enriched mouse cDNA arrays. Proc. Natl. Acad. Sci. 98: 2199-2204.
Okazaki, Y., Furuno, M., Kasukawa, T., Adachi, J., Bono, H., Kondo, S., Nikaido, I., Osato, N., Saito, R., Suzuki, H., et al. 2002. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420: 563-573.
Ravasi, T., Wells, C., Forest, A., Underhill, D.M., Wainwright, B.J., Aderem, A., Grimmond, S., and Hume, D.A. 2002. Generation of diversity in the innate immune system: Macrophage heterogeneity arises from gene-autonomous transcriptional probability of individual inducible genes. J. Immunol. 168: 44-50.
Staudt, L. and Brown, P. 2000. Genomic views of the immune system. Annu. Rev. Immunol. 18: 829-859.
Venter, J., Adams, M., Myers, E., Li, P., Mural, R., Sutton, G., Smith, H., Yandell, M., Evans, C., Holt, R., et al. 2001. The sequence of the human genome. Science 291: 1304-1351.
Waterston, R.H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., An, P., et al. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562.
http://www.imb.uq.edu.au/groups/hume/; access to indirect labeling protocol.
http://genome-www5.stanford.edu/cgi-bin/SMD/source//sourceBatchSearch; Stanford SOURCE batch server.
http://www.tigr.org/tdb/tgi/software/; TGI clustering tools.
http://www.tigr.org/tdb/tgi/mgi; TIGR Mouse Gene Index.
Received December 11, 2002; accepted in revised form February 25, 2003.