Published online before print
October 15, 2001, 10.1101/gr.184501
Vol. 11, Issue 11, 1861-1870, November 2001
LETTER
ABSTRACT
We investigated the changes in gene expression accompanying the development and progression of kidney cancer by use of 31,500-element complementary DNA arrays. We measured expression profiles for paired neoplastic and noncancerous renal epithelium samples from 37 individuals. Using an experimental design optimized for factoring out technological and biological noise, and an adapted statistical test, we found 1738 differentially expressed cDNAs with an expected number of six false positives. Functional annotation of these genes provided views of the changes in the activities of specific biological pathways in renal cancer. Cell adhesion, signal transduction, and nucleotide metabolism were among the biological processes with a large proportion of genes overexpressed in renal cell carcinoma. Down-regulated pathways in the kidney tumor cells included small molecule transport, ion homeostasis, and oxygen and radical metabolism. Our expression profiling data uncovered gene expression changes shared with other epithelial tumors, as well as a unique signature for renal cell carcinoma.
[Expression data for the differentially expressed cDNAs are available as a Web supplement at http://www.dkfz-heidelberg.de/abt0840/whuber/rcc. The array data have been submitted to the GEO data repository under accession no. GSE3.]
INTRODUCTION
Renal cell carcinoma (RCC) is one of the 10 most frequent
malignancies in Western societies. Advances in the
understanding of the genetics underlying the development of renal
epithelial tumors have led to the recognition of distinctive types of
tumors. Genetic alterations play a role in determining both the
morphology and the behavior of tumors and underlie the most recent
classifications (Kovacs et al. 1997; Störkel et al. 1997). The most
common histological subtypes of RCC include clear cell (80%),
papillary (~10%), and chromophobe (<5%) carcinoma. Previous
studies have shown that these histological subtypes are genetically and
biologically different (Presti et al. 1991; Kovacs et al. 1997). Human
RCCs are derived from epithelial cells in the proximal and connecting
tubuli. Like many solid tumors, they contain other cell types in
addition to carcinoma cells. Clear cell RCC in particular is generally
well vascularized, and infiltrating immune cells are frequently seen on
histological sections.
Many genes and signaling pathways are known to be involved in RCC
initiation and progression (Presti et al. 1991; Linehan et al. 1993).
Genes potentially involved in kidney cancer include the genes for von
Hippel-Lindau (Seizinger et al. 1988; Gnarra et al. 1994), vascular
endothelial growth factor (VEGF; Brieger et al. 1999; Takahashi et al.
1999), epidermal growth factor receptor (EGFR; Ishikawa et al. 1990;
Moch et al. 1998), transforming growth factor alpha (TGFA; Ishikawa et
al. 1990; Lager et al. 1994; Uhlman et al. 1995; Moch et al. 1998),
c-myc proto-oncogene (Drabkin et al. 1985; Yao et al. 1988), and
vimentin (Moch et al. 1999). However, these molecular markers have not
yet gained general use in RCC diagnostics and prognosis. Only tumor
stage, determined by tumor extension, regional lymph node involvement,
and distant metastases, has gained widespread acceptance among
pathologists and urologists as an indicator of patient prognosis
(Guinan et al. 1997). Moreover, it is likely that many of the genes
involved in the initiation and progression of renal cancer are
currently unknown. The identification of differentially expressed genes in renal cell carcinoma could lead to the identification of markers for
biological phenomena such as invasiveness or metastasis, which would be
of significant value for diagnosis, prognosis, and treatment.
The highly parallel analysis of gene expression made possible by the
development of cDNA array technology provides a powerful tool for the
molecular dissection of cancer. A better understanding of the molecular
changes associated with tumor formation and progression could improve
the classification of cancer and provide clues to the development of
specific therapies for pathogenetically distinct tumor types. The
possibility of cancer classification based solely on gene expression
monitoring was shown for human acute leukemias (Golub et al. 1999).
Gene expression profiling of diffuse large B-cell lymphomas identified
two molecularly distinct subtypes with significantly different overall
survival (Alizadeh et al. 2000). Moreover, in a variety of solid human
tumors and tumor cell lines, variation in gene expression observed by
use of this mode of analysis has been correlated to phenotypic
characteristics (DeRisi et al. 1996; Alon et al. 1999; Perou et al.
1999, 2000; Bittner et al. 2000; Ross et al. 2000).
We used macroscopically selected samples of RCC and corresponding normal renal tissue. Microscopically, the estimated proportion of non-neoplastic cells in the tumor samples was typically <5%. The choice to use solid tumors, rather than cell lines or microdissected material, was motivated by the fact that it yields important insights into the origin, development, and progression of tumors and immune responses against tumor formation, which would not be available otherwise. Furthermore, it avoids possible artifacts caused by cell line immortalization or RNA amplification technology.
To identify genes that are differentially expressed in different types and stages of epithelial kidney cancer, we analyzed gene expression profiles of primary tumors, metastases, and normal renal tissues. Labeled single-stranded cDNA target was derived from tumor and normal mRNA and hybridized to 31,500-element nylon cDNA arrays. By quantification of the resulting signal from each spot, we obtained a measure for the relative abundance in the tissue samples of mRNA corresponding to each gene. Our study has yielded a well-annotated list of genes that are differentially expressed in renal cell carcinoma. These results should lead toward the identification of kidney tumor-specific marker genes and potential targets for new therapeutic strategies.
RESULTS
Analysis of Gene Expression in Renal Cell Carcinoma
To measure variation in gene expression between renal cell carcinoma
and normal renal tissue, we designed a cDNA array carrying a global
human cDNA set. We developed an algorithm to select 1 representative
clone from each of 41,120 UniGene clusters (Build 17, NCBI). This
resulted in 33,792 physically available, noncontaminated I.M.A.G.E.
(Lennon et al. 1996) cDNA clones (www.rzpd.de), and the derivation of
consensus sequences for each UniGene cluster. Approximately 30% of the
clones represented known genes; the remaining 70% were unknown ESTs.
Approximately 31,500 cDNA clones from this set were amplified by PCR by
use of vector-specific primers and spotted in duplicate to a set of two
22 × 22-cm nylon membranes (Human UniGene 1; www.rzpd.de). An
estimated 30% of the UniGene clusters were represented by more than
one clone (see Methods), thereby providing internal controls for the
reproducibility of gene expression quantitation. We used these cDNA
arrays to generate expression profiles from 32 primary RCC samples,
matched with normal renal tissue from the same patients, and five liver
and pancreas metastases of RCC (Table 1).
Radioactively labeled cDNA representations prepared from each
patient's tumor and normal messenger RNA sample were hybridized in
parallel onto arrays from the same production batch. For the
metastases, normal renal tissue from another patient with a primary RCC
was used for comparison. Each hybridization was performed twice by use
of independently labeled cDNA target from the same mRNA isolation. In
total, more than 8,000,000 gene expression measurements were made in 69 malignant and normal samples by use of 50 reusable arrays.
Selection of Differentially Expressed Genes
Despite several reports indicating that tumor (sub)types can be
distinguished with DNA microarray data, the methods of data analysis in
the presence of considerable technical noise are far from being
established (Brazma and Vilo 2000). The identification of
differentially expressed genes is biologically important in its own
right, as well as an essential step toward classification. Figure 1
shows the histograms of measured
tumor/normal ratios for two genes. They indicate systematic
differential expression for these genes, but also reflect considerable
inter-individual variation, as well as experimental noise. We surveyed
the histograms for a large number of genes and ESTs. Ratio-voting
criteria are often used to select differentially expressed genes. We
used a criterion that marks genes for which 30% of the ratios are
>3.5 as up-regulated, and those in which 30% are below a ratio of
1/3.5 as down-regulated. Applied to the data from 31 tumors of stages I, III, or IV, this yielded 230 differentially expressed genes, of
which 90 were up-regulated and 140 down-regulated.
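The ratio-voting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `ratio_voting`, the genes × samples array layout, the use of "at least 30%" as the cutoff, and the tie-breaking choice (a gene meeting both conditions is reported as down-regulated) are assumptions. Only the fold threshold of 3.5 and the 30% fraction come from the text.

```python
import numpy as np

def ratio_voting(ratios, fold=3.5, fraction=0.30):
    """Label each gene from a (genes x samples) matrix of tumor/normal ratios.

    A gene is called "up" if at least `fraction` of its ratios exceed `fold`,
    and "down" if at least `fraction` fall below 1/fold (assumed rule).
    """
    ratios = np.asarray(ratios, dtype=float)
    frac_up = np.mean(ratios > fold, axis=1)          # share of samples above 3.5
    frac_down = np.mean(ratios < 1.0 / fold, axis=1)  # share of samples below 1/3.5
    calls = np.full(ratios.shape[0], "unchanged", dtype=object)
    calls[frac_up >= fraction] = "up"
    calls[frac_down >= fraction] = "down"  # assumption: "down" wins if both hold
    return calls

# Hypothetical toy data: three genes measured in ten patients.
example = [
    [4.0, 5.0, 6.0, 4.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],   # 4/10 above 3.5
    [0.1, 0.2, 0.25, 0.2, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],  # 4/10 below 1/3.5
    [1.0] * 10,                                            # no change
]
calls = ratio_voting(example)
```

Because the rule depends only on the fraction of samples beyond a fixed fold change, it ignores small but consistent shifts, which is the gap the sign-based test is meant to close.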
Higher sensitivity and better control over the rate of false positives
were obtained through a statistical test based on the sign statistic,
which we call the adapted sign test in the following. For each gene, it
counts the number of times that the measured intensity in the set of
repeated pair-wise comparisons is higher in tumor than in normal
tissue. This number is compared with what is expected if the gene is
not differentially expressed, and a P value is calculated. In
the calculation of the null distribution, correlations between
repetitions were taken into account. At an approximate significance
level of 2 × 10⁻⁴ (~6 false positive calls expected in
the total data set of 31,500 clones), we found 1023 clones
up-regulated, and 715 clones down-regulated in tumor tissue (Fig. 2).
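To make the logic of a sign-based test concrete, the sketch below computes a two-sided P value from the plain binomial null distribution and reproduces the expected number of false positives quoted above. It is a simplified stand-in, not the adapted sign test itself: it treats every tumor/normal comparison as independent, whereas the test used here estimated the null distribution in the presence of correlations between repetitions. The function names are hypothetical, and ties are assumed to have been discarded beforehand.

```python
from math import comb

def sign_test_p(n_up, n_pairs):
    """Two-sided sign-test P value under H0: P(tumor > normal) = 0.5.

    Assumes independent comparisons (a simplification of the adapted test).
    """
    k = max(n_up, n_pairs - n_up)  # the more extreme of the two counts
    tail = sum(comb(n_pairs, i) for i in range(k, n_pairs + 1)) / 2 ** n_pairs
    return min(1.0, 2 * tail)

def expected_false_positives(alpha, n_clones):
    """Expected number of null clones called significant at level alpha."""
    return alpha * n_clones

# At a per-clone level of 2e-4 over 31,500 clones, about six false calls
# are expected, in line with the figure given in the text.
n_false = expected_false_positives(2e-4, 31_500)
```

Under this independence assumption, a gene that is higher in tumor in every one of 31 independent pairs would have a P value of 2/2³¹, far below the threshold; the correction for correlated repeats makes the real test more conservative than this sketch.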
The two criteria select genes for quite different properties. Whereas the ratio-voting criterion selects genes that are differentially expressed by a large factor in at least a certain fraction of the population, the adapted sign test selects genes that are nearly always differentially expressed, however small the amount. For our data and with the given parameters, we found the gene selection from ratio voting to be an almost strict subset of the selection from the adapted sign test. It is instructive to consider the power of the tests. To determine how the number of selected genes depends on the number of patient samples profiled, random subsets of the total data set were drawn, and the size of the selected gene list was calculated for each subset. The number of genes identified by the adapted sign test increases continuously with the number of experiments, and does not reach saturation with the present data set size of 37 patients (Fig. 3A). In contrast, the mean number of genes found by the ratio-voting criterion transiently decreases before reaching a plateau, whereas its variation drops remarkably (Fig. 3B).
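The subsampling procedure can be sketched as follows. This is an illustrative outline under stated assumptions rather than the analysis actually run: `selection_size_curve`, the `select` callback (any function that returns the number of genes selected from a genes × patients matrix), the number of draws, and the toy selection rule are all hypothetical.

```python
import numpy as np

def selection_size_curve(data, select, sizes, n_draws=20, seed=0):
    """Mean and SD of the selected-gene count for random patient subsets.

    data   : (genes x patients) array, e.g. log tumor/normal ratios
    select : callable returning how many genes a criterion selects
    sizes  : iterable of subset sizes (number of patients) to try
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    curve = {}
    for size in sizes:
        counts = []
        for _ in range(n_draws):
            # Draw a random subset of patients without replacement.
            cols = rng.choice(data.shape[1], size=size, replace=False)
            counts.append(int(select(data[:, cols])))
        curve[size] = (float(np.mean(counts)), float(np.std(counts)))
    return curve

def count_always_up(log_ratios):
    """Toy criterion (assumption): genes whose log ratio is positive in every patient."""
    return np.sum(np.all(log_ratios > 0, axis=1))

# Hypothetical data: 2 genes always up, 3 genes always down, 8 patients.
toy = np.vstack([np.ones((2, 8)), -np.ones((3, 8))])
curve = selection_size_curve(toy, count_always_up, sizes=[3, 5, 8])
```

Plotting the mean and standard deviation against subset size for each criterion gives curves of the kind summarized in Figure 3.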
To identify genes that are correlated with different types and stages
of renal cell carcinoma, we performed two-sample permutation t-tests for differential expression across different subsets
of tumor samples (Table 2). The tests were
based on the log ratios between normalized tumor and normal intensities
of each patient for the 20,000 clones that did not show consistently
low intensities, and were conducted at a significance level of
10⁻³ (~20 false positives expected). The comparison of
chromophobe versus clear cell RCC yielded the largest set of 123 clones, probably reflecting the different cellular origin, histological
characteristics, and cytogenetic background of these 2 tumor types. The
class distinction between clear cell carcinoma of stages I-III as
opposed to stage IV and metastasis, yielded 44 candidate clones. Even
taking into account the expected proportion of false positives, it is
likely that we are beginning to identify genes that are involved in
tumor progression that could be used in tumor stage diagnosis.
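A two-sample permutation t-test for a single clone can be sketched as below. The details are assumptions made for illustration: the Welch form of the t statistic, the number of permutations, the two-sided comparison, and the function names are not taken from the text, and the input values are invented.

```python
import numpy as np

def welch_t(x, y):
    """Welch t statistic for two samples (assumed form of the statistic)."""
    vx = x.var(ddof=1) / len(x)
    vy = y.var(ddof=1) / len(y)
    return (x.mean() - y.mean()) / np.sqrt(vx + vy)

def permutation_t_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation P value for a difference between two groups."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])
    t_obs = abs(welch_t(x, y))
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)  # shuffle the class labels
        if abs(welch_t(perm[:len(x)], perm[len(x):])) >= t_obs:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Hypothetical log ratios for one clone in two tumor classes.
class_a = [2.1, 1.9, 2.3, 2.0, 2.2, 1.8]
class_b = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2]
p_value = permutation_t_test(class_a, class_b)
```

Resolving P values as small as 10⁻³ requires well over a thousand permutations per clone, and at that level about 20 of the 20,000 tested clones are expected to pass by chance, which is why the sizes of the candidate lists have to be read against that background.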
Renal Cell Carcinoma Expression Patterns
A comprehensive list of differentially expressed genes in primary
RCC was assembled by applying the adapted sign test and the
ratio-voting criterion to the data grouped by tumor stages, as well as
to the pooled data. The identities of 892 cDNAs have been sequence
verified, including all of those referred to here by name. Of these
cDNAs, 584 were annotated genes and 308 were ESTs. After excluding
different clones representing the same known gene, we found 167 transcripts to be up-regulated and 154 down-regulated in RCC. These
genes were classified into the categories biological pathway and
cellular component using the terminology proposed by the Gene Ontology
Consortium (Ashburner et al. 2000). We based the classification mainly
on the functional information available in the GeneCards
(http://bio-www.ba.cnr.it:8000/GeneCards/index.html) and Genatlas
(http://www.citi2.fr/GENATLAS/welcome.html) databases. Several groups
of coexpressed genes provided views of the activities of specific
biological pathways (Fig. 4).
A large group of genes involved in cell adhesion was consistently
expressed higher in tumor, including fibronectin 1,
collagen 4A, and laminin A4 (Fig. 5A). The latter two are major structural components of glomerular basement membranes. Fibronectin is also overexpressed in prostate cancer cells (Sonmez et al. 1995; Suer et al. 1996). Transcripts encoding gene products functioning in different
signal transduction pathways, such as thymic hormones (prothymosin α
and thymosin β-4), GTP-binding proteins (e.g., guanylate-binding
protein 2), kinases (e.g., tyk2), and zinc finger transcriptional
regulators (e.g., ZNF76), were more often found to be overexpressed in
the tumors. Other biological processes that showed a large proportion
of genes overexpressed in renal cell carcinoma were nucleotide and
nucleic acid metabolism (encoding gene products involved in mRNA
transcription and stability as well as in DNA replication), protein
metabolism and modification (mostly ribosomal subunits), cell shape and
cell size (several actin-interacting and remodeling proteins;
Fig. 5B), and immune response (e.g., MHC molecules). Tumor markers
described for renal cell carcinoma that were up-regulated in our data
set include vimentin, VEGF, EGF-B, and EGFR.
Down-regulated biological pathways in the kidney tumor cells included
transport (e.g., renal-specific transport of small molecules; Fig. 5C),
ion homeostasis (metallothioneins; Fig. 5D), oxygen and radical
metabolism (e.g., glutathione S transferases; Fig. 5E), and electron
transport (cytochrome oxidase complex components). Remarkably, the gene
for free radical detoxification enzyme superoxide dismutase 2 is
strongly overexpressed. Characteristic changes occurred in the
carbohydrate metabolism of renal cell carcinoma, confirming and
complementing earlier studies (Steinberg et al. 1992). Glycolysis was
activated (phosphoglycerate kinase, enolase, and phosphofructokinase)
and gluconeogenesis was reduced (fructose-1,6-bisphosphatase and
aldolase B).
Categorizing by cellular compartment showed less pronounced trends for up-regulation or down-regulation. Gene products localized in the extracellular matrix and nucleus were more frequently up-regulated, whereas secreted and mitochondrial proteins were relatively more often down-regulated in tumor tissue (data not shown).
Activation of Cell Communication Pathways
Cell communication seems to be the major target for activation, with
many genes involved in cell adhesion and signal transduction being
overexpressed (Fig. 4). In carcinomas, including RCC, the interactions
of cells with the extracellular matrix are disturbed (Lohi et al.
1998). The interstitial matrix collagens and fibronectin appear to be
widely distributed in RCC. As a dynamic component of basement
membranes, laminins are important in kidney development and are
involved in RCC progression. Two up-regulated metalloproteinase inhibitors, TIMP1 and TIMP2, are involved in embryonal development and
in the invasive phenotype of acute myelogenous leukemia
(Janowska-Wieczorek et al. 1999). In addition to the extracellular
matrix components, the expression of cell surface receptors for
extracellular matrix components is disturbed in RCC. We found
overexpression of the tumor-associated transmembrane proteins
epithelial membrane protein 3, cell differentiation antigen CD68,
melanoma adhesion molecule, and GP110. The fibronectin receptor
integrin α5 was overexpressed as well. Extracellular matrix proteins
may function in RCC progression by binding and regulating the activity
of growth factors, such as transforming growth factor β1 and basic
fibroblast growth factor. Some of the observed changes in signal
transduction pathways could reflect cellular responses to these stimuli.
Down-Regulated Pathways and Genes
Kidney-specific pathways appear to be repressed as the tumor cells
dedifferentiate, with lower expression of genes involved in small
molecule transport, ion homeostasis, and oxygen and radical metabolism
(Fig. 4). Down-regulation of specific members of the metallothionein
family has been described in RCC (Izawa et al. 1998; Nguyen et al.
2000). In addition, we found genes that are involved in other types of
cancer. Mutations of CDKN1C (p57kip2) are
associated with sporadic cancers and Beckwith-Wiedemann syndrome, suggesting that it is a tumor suppressor candidate (Lee et al. 1995;
Matsuoka et al. 1995, 1996; Hatada et al. 1996). Another down-regulated
gene involved in cell cycle regulation is GADD45A (growth
arrest and DNA-damage-inducible, α), which inhibits the entry of
cells into the S phase. The MPP3 protein (membrane protein, palmitoylated 3) is a membrane-associated guanylate kinase involved in
coupling the cytoskeleton to the cell membrane, and the human homolog
of Drosophila lethal discs large tumor-suppressor protein. Syndecan, a cell surface proteoglycan that links the cytoskeleton to
the interstitial matrix, is underexpressed in squamous cell carcinoma of
the head and neck, whereas melastatin 1 was found down-regulated in a
murine melanoma cell line with an aggressive phenotype. The S100
calcium-binding protein A2 gene product may play a role in suppressing
tumor cell growth. The latter two genes belong to the biological
process of ion homeostasis, which is prominently down-regulated in the
RCCs. On the other hand, we found down-regulation of some genes that
were described to be up-regulated in other tumor types. The PDZ
domain-containing protein (PDZK1) is overexpressed in selected tumors
of epithelial origin (Kocher et al. 1999). It is potentially involved
in cell proliferation, differentiation, and ion transport. However, in
many of the kidney tumors PDZK1 was specifically down-regulated. M1S1
(membrane component, chromosome 1, surface marker 1) is a glycoprotein,
identified by monoclonal antibody GA733. This tumor-associated antigen
may function as a growth factor receptor and is expressed in normal trophoblast cells, in multistratified epithelia, and in carcinomas. One of
the most strongly down-regulated genes was kininogen, which has
many physiological functions, including inhibition of cysteine proteases, which could result in extracellular matrix degradation (Muller-Esterl et al. 1985).
Molecular Dissection of Renal Cell Carcinoma
To decide whether clusters of tumor-specific genes were derived from
the RCC cells or from other cell types present in the tumor, we
compared our expression data with published sets. Ross et al. (2000)
found that most cell lines derived from RCCs were characterized by
genes whose products are involved in stromal cell functions, such as
synthesis and modification of the extracellular matrix.
Genes defining this mesenchymal cluster that were also activated in our set of RCCs include melanoma adhesion molecule MUC18,
vascular cell adhesion molecule 1, fibronectin 1, caveolin 1, collagen
type IV α-1, collagen type V α-2, collagen binding protein 2, lysyl
oxidase, and annexin II. Genes characteristically expressed by cell
types other than carcinoma cells were detected as well, as follows: (1)
von Willebrand factor, strongly expressed in endothelial cells, also
observed in breast tumors (Perou et al. 2000); (2) markers of
macrophage/monocytes, in common with the macrophage cluster expressed
in breast tumors (Perou et al. 2000), including CD68 and lysozyme; (3)
B lymphocyte-specific genes, for example the B cell activation gene
BL34 and the cytokine pre-B cell-enhancing factor; (4) genes
involved in antigen processing and presentation, such as MHC class II
genes, peptide transporter TAP 1 (RING4), TAP binding protein, and
proteasome subunit β type 8 (PSMB8). These results confirm the
histological finding of infiltration of the tumors with cells involved
in the immune and inflammatory response. Several interferon-induced
genes were also specifically expressed in the tumor tissue, including
interferon-inducible protein 9-27, interferon α-inducible protein 27 (IFI27), guanylate-binding protein 2 (GBP2), monokine induced by
interferon γ (MIG), interferon-γ-induced protein (IFI 16), PSMB8,
and TAP1.
Supplementary Information and Array Data
The array data are reported in the Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) under accession no. GSE3. In addition, the authors' Web site (http://www.dkfz-heidelberg.de/abt0840/whuber/rcc) presents a comprehensive collection of the original data and the results, in particular the list of 1738 clones (including the distributions of their estimated fold changes), documentation of the sequence verification, and the expression patterns associated with different biological processes.
DISCUSSION
We have generated a general set of >31,500 human cDNA clones, which represents one of the largest gene sets interrogated in array-based gene expression studies to date. The consensus sequences for all clusters, the cDNA clones, their PCR products, and high-density spotted nylon cDNA arrays are available at http://www.rzpd.de. Currently, the clone set is being enlarged to 75,000 clones based on UniGene build 90. Here, we have characterized variation in gene expression in a set of surgical specimens of primary and metastasized renal cell carcinoma and normal renal epithelium from 37 different individuals. To eliminate patient-specific variations in gene expression, the primary tumors were directly compared with normal renal tissue from the same patients, and our statistical analysis was optimized for this experimental setup. We present a list of genes that are differentially expressed in neoplasms of renal epithelium compared with normal epithelium (see on-line supplementary information as noted above). Generally, there are two reasons for finding differential expression between primary tissues: different expression levels of genes within the cells, or varying cell type composition. According to histological examination, both the renal cell carcinoma and the normal renal cortex consist largely (an estimated 90%-95%) of epithelial cells. Still, part of the observed differences in gene expression may be influenced by differences in cell type proportion.
The classification according to biological process gives insights into the molecular changes occurring in tumor development and progression (Fig. 5; on-line supplementary information). In addition, we find evidence for a distinction based on gene expression between two different subtypes of RCC, and between different stages of malignancy (Table 2). The largest set of candidate discriminator genes was found in a two-class comparison of clear cell versus chromophobe RCC. These differences in gene expression probably reflect the biological and clinical differences between the histological subtypes. The clear cell carcinoma originates from proximal tubuli that have a mesodermal origin, whereas the less malignant chromophobe carcinoma is derived from connecting ducts with an endodermal origin. Future studies including more chromophobe tumors may elucidate gene expression patterns specific for this difference in embryonal origin.
We used two different criteria to select differentially expressed
genes. A ratio-voting criterion selects genes that show a fold change
above a threshold in a defined number of experiments (Eisen et al.
1998; Perou et al. 2000). A robust test with a controlled type I error,
and power that increases with data set size, is based on the sign
statistic. We estimated its null distribution in the presence of
correlations. On the basis of data from 37 patients, this resulted in
1738 cDNAs representing differentially expressed genes, with 6 false
positives expected. For cells as drastically different as tumor and
normal cells, the concept of a well-defined set of truly differentially
expressed genes may be elusive. Whereas the ratio-voting criterion may
identify the most important candidate genes for robust molecular
diagnosis, the more subtle changes in gene expression discovered by the
adapted sign test may lead to better insights into the molecular
changes involved in cancer development and progression.
The translation of gene expression data to potentially useful targets
for molecular diagnosis and treatment depends largely on correct and
complete functional annotation. Genes involved in the same biological
process often group together in experimental expression clusters (Eisen
et al. 1998; Ashburner et al. 2000). In contrast, molecular function
and cellular component annotations correlate less well with clustered
expression patterns. However, by use of this information, intelligent
predictions can be made concerning cancer detection and therapy. Our
expression data indicate that RCCs display a mostly mesenchymal
expression pattern, in accordance with the mesodermal origin of the
tumor cells (Ross et al. 2000). The RCC expression profiling data show
both gene expression changes shared with other epithelial tumors, and a unique signature for RCC. The identification and classification of
differentially expressed genes is the beginning of a more complete understanding of kidney cancer. An annotated list of the expression data for the categorized genes is presented in the web supplement. On
the basis of these results, we have designed a kidney tumor-specific array that will enable higher throughput screening of additional samples and ultimately lead to a clinical classification of RCC on the
basis of their expression profiles.
METHODS
Selection of a Global Human Clone Set
To generate a nonredundant human clone set, we post-processed the
UniGene clusters (Build 17, NCBI), which represent a large number of
human genes. All processing steps are part of the GeneNest software (Haas et al. 2000), which additionally provides an interactive graphical interface to the post-processed UniGene database
(http://www.dkfz.de/tbi/services/GeneNest/index). To identify the most
reliable, representative clone from each cluster, we analyzed the
following criteria (sorted according to their importance): (1)
availability of clones at the RZPD; (2) quality of cDNA library of
origin; (3) presence of more than one read of the same clone in a
cluster, ensuring a higher confidence in the sequence-clone
relationship; (4) calculated insert size, selecting for larger inserts;
(5) presence of a poly(A) signal. We used a fuzzy logic-based rule
system to combine all clone selection criteria to obtain a quality
measure of each clone in the entire UniGene set. For each cluster, we
selected the clone with the highest quality as representative. To
estimate the redundancy of the global clone set, we resequenced >2700
clones of the set and found 12.8% wrongly assigned clones. All of
these belonged to clusters that were already represented by another
clone. Therefore, the overall redundancy of the clone set was estimated
to be 1.44-fold, and the 31,500 clones on the cDNA array represent an
estimated 21,875 different transcripts.
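The final estimate follows directly from the stated redundancy, as the short check below shows. The function name is hypothetical, and the sketch only reproduces the last step of the arithmetic; how the 1.44-fold redundancy was derived from the resequencing results is not reconstructed here.

```python
def estimated_distinct_transcripts(n_clones, redundancy):
    """Number of distinct transcripts implied by a fold redundancy."""
    return n_clones / redundancy

# 31,500 arrayed clones at 1.44-fold redundancy.
distinct = estimated_distinct_transcripts(31_500, 1.44)
print(round(distinct))
```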
The 31,500-Clone Human cDNA Array
From the Human UniGene 1 clone set, cDNA inserts of the clones from
the first 82 384-well microtiter plates were amplified by PCR in a
384-well format (MJ Research) by use of M13 forward (5'-CGTTGTAAAACGACGGCCAGT-3') and reverse primers
(5'-TTTCACACAGGAAACAGCTATGAC-3'). The 31,488 PCR products were
transferred in a 4 × 4 pattern onto a set of two 22 × 22-cm
Hybond N+ nylon membranes (Amersham Pharmacia Biotech) soaked in 0.4 M
NaOH using a Picking-Spotting-Robot (Linear Drives LDT) with 400-µm
pins (Genetix). After spotting, the arrays were carefully floated for 2 min on 0.4 M NaOH and 5 × SSC (pH 7.5) successively, air-dried
and cross-linked by UV. Every 4 × 4 block contained one spot with
PCR product from the bacterial kanamycin resistance gene to serve
as a guide spot in automated image analysis, one empty spot to serve as
background measurement and DNA from seven clones spotted in
duplicate. On each filter, a control plate containing putative
housekeeping genes and positive and negative controls was spotted. To
assure even quality between subsequent rounds of hybridization,
cDNA arrays were prestripped before their first use (Hauser et al.
1998). To check for filter quality, M13 forward oligonucleotide
hybridizations were carried out.
Patient Samples
Macroscopically selected samples of the RCC and normal
corresponding renal tissue were snap frozen in liquid nitrogen by
pathologists within 30 min after dissection. The tumors were staged
according to TNM classification (Sobin and Wittekind 1997), graded
according to the Fuhrmann grading system (Fuhrmann et al. 1982), and
histologically subtyped according to the recommendations of the World
Health Organization (Mostofi and Davis 1998). Microscopically, the
estimated proportion of non-neoplastic cells in the tumor samples was
typically <5%. Total RNA was extracted by use of the standard Trizol
method (Life Technologies) and poly(A)+ RNA was selected
using Dynabeads according to the manufacturer's recommendations (Dynal).
Hybridization
A total of 500-1000 ng of poly(A)+ RNA was reverse
transcribed using 500 ng (dT)18V primer and 50 µCi
α-33P-dCTP (Amersham Pharmacia Biotech), 400 µM each of
dGTP, dATP, and dTTP, and 200 units of Superscript II reverse
transcriptase (Life Technologies). The RNA was removed by hydrolysis,
and the first strand cDNA cleaned up by gel filtration on a S-300 spin column (Mobitec). The incorporation rates were usually >30% and similar for reactions carried out simultaneously, yielding ~10-30 million counts/min. Each prestripped cDNA array was prewetted in 7.5 mL
of demineralized water and inserted into a glass tube (24 × 7 cm).
An equal volume of prewarmed 2× hybridization solution was added and
prehybridization was carried out for 2 h at 65°C in a total volume of
15 mL of 6× SSC, 5× Denhardt's, 0.5% SDS, 50 µg/mL salmon sperm
DNA, and 0.5 µg/mL each of freshly denatured Cot-1 DNA (Life
Technologies) and (dA)40 oligonucleotide. The first strand
cDNA was heat denatured for 5 min at 100°C and divided equally over
the two cDNA arrays belonging to one set. For guide spot hybridization,
the bacterial kanamycin resistance gene was amplified from a cosmid
cloning vector by PCR with the forward primer 5'-AGTGCCGGGGCAGGATCT-3'
and the reverse primer 5'-TCGTGATGGCAGGTTGGG-3'. A total of 5 × 10⁵
cpm of random-primed 33P-labeled kanamycin cDNA was added to
each filter. The arrays were hybridized for 20-24 h at 65°C. The
wash steps were all performed at 65°C for 10 min in 1× SSC/0.1%
SDS, 0.3× SSC/0.1% SDS, 0.3× SSC/0.1% SDS, and 0.1× SSC/0.1% SDS,
respectively. Membranes were blotted dry briefly, wrapped in SaranWrap,
exposed to PhosphorImager screens for 18-24 h and scanned with the
Storm 860 PhosphorImager (Amersham Pharmacia Biotech). The resulting
gray level images were partitioned and quantized with Xdigitise version
3.1 (H. Lehrach, MPI for Molecular Genetics, Berlin). After
hybridization, membranes were stripped (Hauser et al. 1998) and stored
dry at room temperature for 6-8 weeks (approximately two half-lives of 33P) before being rehybridized. Membranes were reused up to
six times without significant loss of signal intensities.
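The storage interval can be checked with a short decay calculation. The sketch below assumes a physical half-life of about 25.3 days for 33P, a value that is not stated in the text, and the function name is illustrative only.

```python
# Assumption: half-life of 33P of about 25.3 days (not given in the text).
HALF_LIFE_33P_DAYS = 25.3


def remaining_fraction(days, half_life_days=HALF_LIFE_33P_DAYS):
    """Fraction of the initial activity left after `days` of storage."""
    return 0.5 ** (days / half_life_days)


for weeks in (6, 8):
    days = 7 * weeks
    print(f"{weeks} weeks: {days / HALF_LIFE_33P_DAYS:.2f} half-lives, "
          f"{remaining_fraction(days):.0%} of the activity remains")
```

Under this assumption, 6-8 weeks corresponds to roughly 1.7-2.2 half-lives, in line with the "approximately two half-lives" quoted above.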
Experimental Design, Quality Control, and Normalization
Measurements were repeated in a threefold hierarchical manner.
First, clones were spotted in duplicate onto the arrays. Second, each
mRNA was labeled and hybridized to two arrays. Third, multiple tissue
samples with the same pathological classification were investigated. To
obtain a good contrast between the expression levels for tumor and
normal tissue, a Latin square design was chosen for the repeated
same-patient hybridizations. Hybridizations were done in pairs, with
mRNA from tumor and normal tissue of the same patient being prepared at
the same time under identical conditions, and hybridized to filters
produced during the same spotting run. The intensities were adjusted
through affine transformations: Denoting by $N$ the set of
background-corrected intensities for normal tissue, and by $T$
the set for tumor tissue, normalization was performed through the
transformations $N \mapsto N + \delta$ and $T \mapsto aT + \delta$ (Beißbarth et al. 2000).
The multiplicative factor $a$ was determined such that the
estimated mode of the distribution of $\log(aT/N)$, conditional on
$TN > \theta$, came to lie at zero. The threshold $\theta$ was set to the 90% quantile of
$TN$. This robust estimator is based on the assumption that
most genes represented on the filters have unchanged expression levels,
whereas smaller, but possibly different numbers of genes may be
up-regulated and down-regulated. The pseudocount $\delta$ was used
to regularize the ratio estimation. For the present analysis, we set
$\delta$ to the 50% quantile of the pooled intensities $(N, aT)$. The hybridization data reported here are available from the
GEO database (http://www.ncbi.nlm.nih.gov/geo/) under the accession
numbers GSM81 to GSM422.
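The normalization described above can be sketched for a single tumor/normal pair as follows. This is a minimal illustration, not the authors' software. It assumes that TN denotes the spot-wise product of the two intensities and that the mode is estimated from a fixed-width histogram; the text does not say how the mode was estimated. The names `theta` (intensity threshold), `delta` (pseudocount), and both function names are hypothetical, and the input data are simulated.

```python
import math
import random
import statistics


def histogram_mode(values, bin_width=0.1):
    """Crude mode estimate: centre of the most populated fixed-width bin."""
    counts = {}
    for v in values:
        k = math.floor(v / bin_width)
        counts[k] = counts.get(k, 0) + 1
    best = max(counts, key=counts.get)
    return (best + 0.5) * bin_width


def normalize_pair(normal, tumor, bin_width=0.1):
    """Return (a, delta, N + delta, a*T + delta) for one paired hybridization."""
    products = [t * n for t, n in zip(tumor, normal)]
    theta = statistics.quantiles(products, n=10)[-1]   # 90% quantile of T*N
    log_ratios = [math.log(t / n)
                  for t, n, tn in zip(tumor, normal, products)
                  if tn > theta and t > 0 and n > 0]
    # Choose a so that the estimated mode of log(a*T/N) is shifted to zero.
    a = math.exp(-histogram_mode(log_ratios, bin_width))
    scaled = [a * t for t in tumor]
    delta = statistics.median(list(normal) + scaled)   # 50% quantile of pooled (N, aT)
    return a, delta, [n + delta for n in normal], [s + delta for s in scaled]


# Simulated data (not from the study): the tumor channel is twice as bright.
rng = random.Random(0)
normal = [rng.lognormvariate(5.0, 1.0) for _ in range(5000)]
tumor = [2.0 * n * rng.lognormvariate(0.0, 0.1) for n in normal]
a, delta, n_norm, t_norm = normalize_pair(normal, tumor)
print(f"a = {a:.2f}, delta = {delta:.1f}")
```

In this simulation the estimated factor should come out close to 0.5, undoing the twofold brightness difference; the histogram bin width limits how precisely the mode, and hence the factor, is recovered.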
Gene Selection Statistics
According to the ratio-voting criterion, a gene is considered
differentially expressed if at least 30% of the ratios
$(aT_{phr} + \delta)/(N_{phr} + \delta)$, in which $\delta$ is the normalization pseudocount, are observed to lie above 3.5 or below 1/3.5. The three indices $p$,
$h$, and $r$ stand for the three levels of repetition: patient ($p$), multiple hybridization of the same RNA isolation ($h$), and duplicate spotting ($r$). The sign statistic
is given by
$$S = \sum_{p,h,r} S_{phr}, \qquad S_{phr} = \operatorname{sgn}\left(aT_{phr} - N_{phr}\right).$$
Under the null hypothesis of no differential expression, S is
approximately normally distributed with mean 0 and a variance that we
estimated from the data as follows: The
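$S_{ph} = \sum_r S_{phr}$ can be
seen as identically distributed random variables with
$S_{ph}$ and $S_{p'h'}$ uncorrelated for
different patients $p \neq p'$. There may be
correlations between different hybridization rounds of the same patient:
$$E[S_{ph} S_{p'h'}] = \begin{cases} V & \text{if } p = p',\ h = h' \\ C & \text{if } p = p',\ h \neq h' \\ 0 & \text{if } p \neq p'. \end{cases}$$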
The variance of $S$ is $E[S^2] = N_1 V + N_2 C$, in which $N_1$ is the total
number of hybridizations and $N_2$ is the number of pairs of
repeated hybridizations from the same sample. $V$ and $C$
were estimated for each gene by the standard unbiased estimators for variance and covariance from the data sample given by the $S_{ph}$. Global estimates for $V$ and $C$, with standard deviation
reduced by a factor of
|
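The two selection statistics of this section can be sketched for a single cDNA as follows. This is an illustration under stated assumptions, not the authors' implementation. The ratio-voting rule is read as "at least 30% of ratios above 3.5, or at least 30% below 1/3.5". Each spot contributes the sign of its log ratio. The per-gene variance follows the formula N1 V + N2 C given above, with N2 counted as ordered pairs of same-patient hybridizations (an assumption). Function names and example ratios are hypothetical.

```python
import statistics


def sign(x):
    return (x > 0) - (x < 0)


def ratio_vote(ratios, fold=3.5, min_fraction=0.30):
    """ratios[p][h][r]: normalized tumor/normal ratios for one cDNA."""
    flat = [x for patient in ratios for hyb in patient for x in hyb]
    up = sum(x > fold for x in flat) / len(flat)
    down = sum(x < 1.0 / fold for x in flat) / len(flat)
    return up >= min_fraction or down >= min_fraction


def sign_statistic(ratios):
    """Return (S, estimated variance of S) for one cDNA."""
    # S_ph: sum over duplicate spots r of sign(log ratio) = sign(ratio - 1).
    s_ph = [[sum(sign(x - 1.0) for x in hyb) for hyb in patient]
            for patient in ratios]
    all_s = [s for row in s_ph for s in row]
    s_total = sum(all_s)
    n1 = len(all_s)                                      # total number of hybridizations
    n2 = sum(len(row) * (len(row) - 1) for row in s_ph)  # ordered same-patient pairs (assumption)
    v = statistics.variance(all_s)                       # unbiased sample variance of S_ph
    # Unbiased sample covariance between first and second hybridization.
    pairs = [(row[0], row[1]) for row in s_ph if len(row) >= 2]
    mx = statistics.fmean(x for x, _ in pairs)
    my = statistics.fmean(y for _, y in pairs)
    c = sum((x - mx) * (y - my) for x, y in pairs) / (len(pairs) - 1)
    return s_total, n1 * v + n2 * c


# Hypothetical ratios: 4 patients x 2 hybridizations x 2 duplicate spots.
ratios = [
    [[4.0, 5.0], [3.8, 4.2]],
    [[3.9, 0.9], [1.2, 4.1]],
    [[1.1, 0.8], [0.9, 1.3]],
    [[6.0, 3.6], [0.7, 2.0]],
]
s, var_s = sign_statistic(ratios)
print(ratio_vote(ratios), s, var_s)
```

For a cDNA whose signs all agree, the per-gene variance estimate is zero, so per-gene estimates alone cannot be used to standardize S for such genes.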
Two-sample two-sided t-tests for differential expression across different subsets of tumor samples were performed for ~20,000 cDNAs that did not have consistently low intensity. The t statistic was computed on the mean log ratios $R_p$ (obtained from $R_{phr}$ by averaging over $h$ and $r$). Avoiding normality assumptions, the tests were carried out as permutation tests.
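A permutation version of the two-sample t-test can be sketched as below. The text does not specify the variant of the t statistic or the number of permutations, so the pooled-variance Student statistic, the Monte Carlo sampling of label permutations, and the add-one p-value estimate are all assumptions. The function names and the per-patient mean log ratios are hypothetical.

```python
import math
import random
import statistics


def t_statistic(x, y):
    """Two-sample Student t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    diff = statistics.fmean(x) - statistics.fmean(y)
    pooled = ((nx - 1) * statistics.variance(x)
              + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    if pooled == 0:  # both groups constant: avoid division by zero
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(pooled * (1 / nx + 1 / ny))


def permutation_t_test(x, y, n_perm=2000, seed=0):
    """Two-sided p-value from random relabelling of the pooled samples."""
    rng = random.Random(seed)
    observed = abs(t_statistic(x, y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(t_statistic(pooled[:len(x)], pooled[len(x):])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one estimate, never exactly zero


# Hypothetical mean log ratios R_p for two tumor subsets.
p_diff = permutation_t_test([1.8, 2.1, 1.6, 2.4, 1.9, 2.2],
                            [0.1, -0.3, 0.4, 0.0, 0.2, -0.1])
p_null = permutation_t_test([0.1, 0.5, -0.2, 0.3],
                            [0.2, 0.4, -0.1, 0.0])
print(p_diff, p_null)
```

Because the reference distribution is built by relabelling the observed values, the p-value does not rely on the log ratios being normally distributed.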
ACKNOWLEDGMENTS
We thank G. Jakse and R.-H. Ringert for patient samples; J. O'Brien for the first sets of Unigene filters; K. Fellenberg, T. Beißbarth, B. Brors, and A. Frischauf for initial data analysis software and discussion; S. Kirby, M. Peters, and D. Schneider for re-arraying and streaking of contaminated clones; R. Will and R. Wittig for optimizing and performing PCR; S. Wiemann and his team for outstanding sequences; and B. Kornacker and M. Stauch for excellent technical assistance. This work was partly supported by a grant from the German Human Genome Project.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
Present addresses: 5 Department of Human and Clinical Genetics, Leiden University Medical Center, The Netherlands; 6 Max-Planck-Institute for Molecular Genetics, Berlin, Germany.
7 Corresponding author.
E-MAIL a.poustka{at}dkfz-heidelberg.de; FAX 49-6221-423454.
Article published on-line before print: Genome Res., 10.1101/gr.184501.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.184501.
REFERENCES
Received February 15, 2001; accepted in revised form August 7, 2001.