Vol. 12, Issue 3, 379-390, March 2002
LETTER
ABSTRACT
The ribosome, as a catalyst for protein synthesis, is universal and essential for all organisms. Here we describe the structure of the genes encoding human ribosomal proteins (RPs) and compare this class of genes among several eukaryotes. Using genomic and full-length cDNA sequences, we characterized 73 RP genes and found that (1) transcription starts at a C residue within a characteristic oligopyrimidine tract; (2) the promoter region is GC rich, but often has a TATA box or similar sequence element; (3) the genes are small (4.4 kb), but have as many as 5.6 exons on average; (4) the initiator ATG is in the first or second exon and is within ± 5 bp of the first intron boundaries in about half of cases; and (5) 5'- and 3'-UTRs are significantly smaller (42 bp and 56 bp, respectively) than the genome average. Comparison of RP genes from humans, Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae revealed the coding sequences to be highly conserved (63% homology on average), although gene size and the number of exons vary. The positions of the introns are also conserved among these species as follows: 44% of human introns are present at the same position in either D. melanogaster or C. elegans, suggesting RP genes are highly suitable for studying the evolution of introns.
[The sequence data described in this paper have been submitted to the DDBJ/EMBL/GenBank databases under accession nos. AB055762-AB055780, AB056456, AB061820-AB061859, AB062066-AB062071, and AB070559.]
INTRODUCTION
The ribosome is the cellular organelle responsible for protein synthesis in all cells. Recent analyses of the ribosome's structure using X-ray crystallography have enhanced our understanding of the structural basis of ribosome function (Ban et al. 2000; Schluenzen et al. 2000; Wimberly et al. 2000; Yusupov et al. 2001). In contrast, comparatively little is known about ribosome biogenesis, especially in higher eukaryotes. In mammalian cells, the biogenesis of cytoplasmic ribosomes requires assembly of 4 RNA molecules and 79 different proteins (Wool 1979). With the exception of two proteins, all of these components are present as single copies within the ribosome. Typically, mammalian cells contain ~4 × 10^6 cytoplasmic ribosomes, which account for 80% of all cellular RNA and 5%-10% of cellular proteins.
Investigation of the mechanism that controls the coordinated expression of these components is a challenge. Three different RNA polymerases are involved in production of these RNAs and proteins: RNA polymerase I (POL I) is involved in production of the 28S, 18S, and 5.8S rRNAs, POL II in production of ribosomal proteins (RPs), and POL III in production of the 5S rRNA. The amino acid sequences of all rat and human RPs have been deduced (Wool et al. 1996), and the nucleotide sequences of thousands of eukaryotic rRNAs are now known (The Ribosome Database Project; Maidak et al. 2001). On the other hand, only a handful of mammalian RP genes have been studied in terms of their genomic structure. Unlike rRNAs, which are encoded by several hundred copies of genes, each mammalian RP is typically encoded by a single gene. These single functional genes, however, generate large numbers of processed pseudogenes (Dudov and Perry 1984; Wagner and Perry 1985; Kuzumaki et al. 1987), which have hampered the cloning of the functional genes and, hence, analysis of their genomic structure. Even though some enhancer/promoter sites have been identified (Rhoads et al. 1986; Hariharan et al. 1989; Kenmochi et al. 1992; Toku and Tanaka 1996), we are far from understanding the basis of the coordinated expression of RP genes.
Despite the central role played by cytoplasmic ribosomes in organismal growth and development, the effects of their mutation have been largely ignored, particularly with respect to human disease. One might predict that genetic defects in the components of ribosomes would invariably result in early embryonic death. However, there is strong evidence in Drosophila that a quantitative deficiency of any one of the cytoplasmic RPs can yield the viable but abnormal Minute phenotype (Kongsuwan et al. 1985; Lambertsson 1998). Moreover, heterozygous mutations in the ribosomal protein S19 gene (RPS19) have been found in a subset of patients with Diamond-Blackfan anemia (Draptchinskaia et al. 1999; Willig et al. 1999), a rare form of chronic anemia characterized by the absence or low levels of erythroid precursors in the bone marrow (Diamond et al. 1976; Halperin et al. 1989). It has been suggested that RPS4, encoded by both the X and Y chromosomes, is an important factor for Turner syndrome (Fisher et al. 1990; Watanabe et al. 1993), a complex human phenotype associated with monosomy X (Zinn et al. 1994). Finally, RPL6 was mapped to a critical region for Noonan syndrome (Jamieson et al. 1994; Kenmochi et al. 2000), and because of similarities between the Noonan and Turner phenotypes (Noonan 1968; Allanson 1987), the gene is considered an attractive candidate for the disease. Although involvement of the RP genes in the pathogenesis of the aforementioned diseases has yet to be proved, we are intrigued by the possibility that defects in other RP genes might also underlie certain pathological conditions.
To explore this possibility, we mapped all human RP genes to the chromosomes and then compared the assigned positions with candidate regions for Mendelian disorders (Kenmochi et al. 1998; Uechi et al. 2001). The results emphasize the need to conduct systematic analysis of the genomic sequences of these genes to screen for mutations that could disturb ribosomal function. In the present study, we determined the genomic sequences of human RP genes, as well as the full-length cDNA sequences. Combining these with the previously determined sequences, we analyzed the characteristics of 73 RP genes with respect to intron/exon structure, transcription start site, promoter region, and the 5' and 3' noncoding regions. Comparative analysis of these genes among several eukaryotes was also carried out. Finally, we evaluated the currently available draft genome sequence using our data set of RP gene sequences.
RESULTS
Gene Structure
The human RP genes were cloned from the Keio BAC library by PCR using sequence-tagged sites (STSs) originally developed from partial genomic sequences of the genes (Kenmochi et al. 1998; Uechi et al. 2001). These STSs enabled us to distinguish the intron-containing functional genes from the processed pseudogenes, which, in turn, enabled us to clone 73 of the 80 human RP genes. Of these, 44 were newly sequenced in the present study by use of the shotgun method. The full-length cDNA sequences were also determined to analyze the transcription start sites. Together with the previously determined sequences, we analyzed 70 complete (including at least 400 bp of the 5'-flanking region) and 3 partial human RP gene sequences. The accession numbers for these genes, including both newly and previously determined sequences and the 5'-UTR sequences of the full-length cDNAs, are listed in Table 1.
Figure 1 shows the intron/exon structures and the positions corresponding to the translation start and stop sites. The average size of the genes from the transcription start site was ~4.4 kb; RPS4Y was the largest (25 kb), whereas RPS28 was the smallest (only 0.9 kb, Table 1). Each gene contained an average of 5.6 exons, ranging from 3 (RPS29 and RPL39) to 10 (RPL3 and RPL4). The translation initiator ATG was present either in the first or second exon, whereas the stop codon was in the last exon in all but four genes (RPS3, RPS25, RPS28, and RPL9). Interestingly, the ATG was often located close to the splice sites of the first intron (within ± 5 bp in about half of the genes) and, in 20 cases, was exactly at the 3' end of the first exon (Table 1).
Summarized in Figure 2 are various features of these genes, including the sizes of the genes and coding sequences (CDSs), the sizes of the 5'- and 3'-noncoding regions, and the size and number of exons. According to the draft sequence of the human genome, the average sizes of genes, CDSs, exons, and the 5'- and 3'-noncoding regions are 27 kb, 1340 bp, 145 bp, 300 bp, and 770 bp, respectively (International Human Genome Sequencing Consortium 2001). RP genes, in contrast, were fairly small; with introns of only 760 bp on average, most were <5 kb in length. The first exons were also small, 45 bp on average, although the others averaged 124 bp, which is comparable with the genome average of 145 bp. The 5'- and 3'-noncoding regions were 42 and 56 bp, respectively, which are also significantly smaller (14 times smaller in the case of the 3'-noncoding region) than the genome averages (Table 1; Fig. 2). Similar features were reported in Xenopus laevis RP genes (Amaldi et al. 1995), suggesting they are common among vertebrate RP genes.
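The size comparison above reduces to a few ratios. As a minimal sketch (the dictionary and function names are illustrative and not part of the original analysis), the fold differences between the genome-wide averages and the RP gene averages quoted in the text can be recomputed as follows:

```python
# Average sizes in bp, as quoted in the text. Names are illustrative only.
GENOME_AVG = {"gene": 27_000, "exon": 145, "utr5": 300, "utr3": 770}
# "exon" here is the RP average for exons other than the first (124 bp).
RP_AVG = {"gene": 4_400, "exon": 124, "utr5": 42, "utr3": 56}


def fold_smaller(feature: str) -> float:
    """Return how many times smaller the RP average is than the genome average."""
    return GENOME_AVG[feature] / RP_AVG[feature]


for feature in GENOME_AVG:
    print(f"{feature}: {fold_smaller(feature):.1f}x smaller in RP genes")
```

The 3'-noncoding ratio (770/56) rounds to the 14-fold difference stated in the text, while the non-first exons differ from the genome average by only about 1.2-fold.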
During our sequencing efforts, we found that many small nucleolar RNAs (snoRNAs) were encoded within the introns of the RP genes (Fig. 1). snoRNAs function as guide RNAs, mostly in the modification of pre-ribosomal RNA, that is, site-specific ribose methylation and pseudouridylation through base pairing with the target RNA (Maxwell and Fournier 1995; Nicoloso et al. 1996; Smith and Steitz 1997; Huttenhofer et al. 2001). To date, 106 methylations and 91 pseudouridylations have been identified in human rRNA, and about one-half of these have been tentatively assigned to known snoRNAs. Together with the putative genes, 54 copies of 38 snoRNA genes were identified within introns of 26 RP genes, accounting for about one-third of the known snoRNAs.
Promoter Features
To determine the transcription start sites of the genes, we analyzed the 5'-UTR sequences of full-length cDNAs obtained using the oligo-capping method (Kato et al. 1994) and identified the start sites on the genomic sequences. As shown in Figure 3, transcription always started at a C residue within a characteristic oligopyrimidine tract that varied from 5 bp to 25 bp in length (12 bp on average). Most often, it was the second C residue that served as the transcription start site (Fig. 4). However, full-length cDNA analysis revealed that the position of the start site C residue can vary within a gene (Fig. 5): in some cases, transcription can begin at different C (or T) residues within a given oligopyrimidine tract (e.g., RPL32); transcription can also begin at different C residues within separate oligopyrimidine tracts (e.g., RPL39); finally, even when transcription always begins at the same C residue, its position may vary due to the presence of T stretches of variable length (e.g., RPS20). With respect to the last, the observed variation in the length of the T stretches does not appear to be an artifact of the oligo-capping method, as it was only present within a T stretch at the 5' end of the gene and was also detected in cDNA prepared by a different method (Kato et al. 1994). These sequence variations will appear in the DDBJ/EMBL/GenBank DNA databases under accession numbers listed in Table 1 (5' UTR).
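To make the notion of an oligopyrimidine tract around a start site concrete, the following is a minimal sketch that, given a genomic sequence and a start-site index, returns the maximal run of C/T residues containing that index. The function name and the toy sequence are hypothetical; this is not the procedure used in the study, which located start sites by comparing full-length cDNAs with the genomic sequences.

```python
PYRIMIDINES = frozenset("CT")


def pyrimidine_tract(seq: str, tss: int) -> tuple[int, int]:
    """Return (start, end) of the maximal C/T run containing index tss.

    Coordinates are 0-based and end is exclusive. Raises ValueError if
    the base at tss is not a pyrimidine.
    """
    seq = seq.upper()
    if seq[tss] not in PYRIMIDINES:
        raise ValueError("start site is not within a pyrimidine tract")
    start = tss
    while start > 0 and seq[start - 1] in PYRIMIDINES:
        start -= 1
    end = tss + 1
    while end < len(seq) and seq[end] in PYRIMIDINES:
        end += 1
    return start, end


# Toy sequence invented for illustration (not a real RP promoter).
toy = "GGAGACTCTTTCCTGCCGAG"
start, end = pyrimidine_tract(toy, 5)
print(toy[start:end], end - start)
```

Applied to a set of annotated start sites, the tract lengths returned by such a function could be summarized to give a range and mean of the kind reported above.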
Although the average GC content in the 70 complete RP genes was 49%, that in the promoter regions (−250 to +250 bp) was 61% (Table 1), which is significantly higher than the genome-wide average of 41% (International Human Genome Sequencing Consortium 2001). The promoter region of RPS21 had the highest GC content, 73%. We found CpG islands in the promoter regions of all RP genes except RPL7 (data not shown), which is consistent with the characteristics of the housekeeping genes described by Gardiner-Garden and Frommer (1987).
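GC content and a CpG island check of the kind referred to above can be sketched as follows. The thresholds (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6) are the commonly cited Gardiner-Garden and Frommer criteria and are an assumption here: the text does not state which parameters or software were used to call CpG islands, and the function names are illustrative.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the assumed length, GC, and obs/exp thresholds to one window."""
    return (len(seq) >= min_len
            and gc_content(seq) > min_gc
            and cpg_obs_exp(seq) > min_ratio)
```

In practice the test would be applied to sliding windows across the −250 to +250 bp promoter region rather than to a single fixed sequence.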
To investigate the coordinated control of RP gene expression at the transcriptional level, the 5'-flanking regions were examined for sequence elements that might serve as transcription factor binding sites. We analyzed a region extending from the transcription start site up to the −400-bp position in all 73 RP genes using TFSEARCH (http://molsun1.cbrc.aist.go.jp/research/db/TFSEARCH.html). In general, the 5'-flanking region of housekeeping genes is GC rich and the promoter lacks TATA sequences. Likewise, this region of RP genes was highly GC rich, as described above; however, there were also many TATA or TATA-like sequences around the −30-bp position. TATA box consensus sequences were seen in 7 cases, and TATA-like sequences were seen in 52 cases (Fig. 3). The presence of TATA boxes or related sequences has also been reported in other vertebrate RP genes (Hariharan et al. 1989; Nakasone et al. 1993; Higa et al. 1999), suggesting that they are a characteristic feature of RP genes.
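As a rough illustration of scanning for a TATA element near the −30 position, the sketch below searches an upstream sequence for a simplified TATA pattern inside a window relative to the transcription start site. This is a crude stand-in and not how TFSEARCH works: the exact-match pattern `TATA[AT]A`, the −40 to −20 window, and the function name are all assumptions made for the example.

```python
import re

# Simplified TATA pattern (assumption); a lookahead lets overlapping hits be found.
TATA_PATTERN = re.compile(r"(?=TATA[AT]A)")


def tata_hits(upstream: str, window: tuple[int, int] = (-40, -20)) -> list[int]:
    """Return positions (relative to the TSS) of TATA-pattern matches.

    upstream is the sequence ending immediately before the TSS, so its
    last base is position -1. Only matches starting inside the half-open
    window [window[0], window[1]) are reported.
    """
    n = len(upstream)
    lo, hi = n + window[0], n + window[1]
    return [m.start() - n
            for m in TATA_PATTERN.finditer(upstream.upper())
            if lo <= m.start() < hi]


# Toy upstream region with a TATAAA motif starting at -30 (invented sequence).
toy_upstream = "G" * 70 + "TATAAA" + "G" * 24
print(tata_hits(toy_upstream))
```

Lowering the stringency for "TATA-like" sequences, as the authors did by reducing the TFSEARCH threshold, would correspond here to relaxing the pattern, which an exact-match regular expression handles far less gracefully than a scored matrix.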
Although no single element in the 5'-flanking region was found in all 73 RP genes, possible transcription factor binding sites commonly seen included those for the GATA-binding protein family (45 cases), for CdxA (chicken homeobox protein) (43 cases), for the Ets protein family (34 cases), and for Sp1 (20 cases). Among these, the key roles played by Ets and Sp1 in the transcription of RP genes have been reported previously (Hariharan and Perry 1989; Maeda et al. 1993; Genuario and Perry 1996; Higa et al. 1999). Possible Ets-binding sites in the upstream region (up to −50 bp) are shown in Figure 3.
Interspersed Repeats
We found 381 interspersed repeats in the sequences of 70 RP genes (partial sequences were excluded from this analysis). The sequences, including 400 bp of the 5'- and 3'-flanking regions of the individual genes, were searched for repeats using RepeatMasker at the University of Washington (http://repeatmasker.genome.washington.edu). Alu elements were the most common; on average, they appeared 3.0 times in each gene, accounting for 13% of the entire sequence (211 copies in total). In an extreme case, 23 Alu repeats were found in introns 2 and 3 of RPL22, which accounted for 46% of the entire gene, significantly more than the genome average of 10.6% (International Human Genome Sequencing Consortium 2001).
Comparative Analysis
The structures of human RP genes were compared with those from the fruitfly D. melanogaster, the nematode worm C. elegans, and the budding yeast S. cerevisiae, all of which are eukaryotes whose entire genomes have been sequenced. Although the CDSs were comparable in both size and sequence and showed 59% to 69% homology between any two of these species, the genomic structures varied, indicating that significant changes have occurred during evolution (Tables 2 and 3). The human RP genes were 4-5 times larger than those from the other species because of increases in the size and number of introns. In contrast, the exons were somewhat smaller (Table 2). All human RP genes had at least two introns, whereas 36% of the yeast genes had no introns. Interestingly, nematode worm RP genes had more introns than fruitfly genes; a single worm gene and 13 fly genes had no introns.
We also compared the positions of the introns among these species. In humans, we found 249 introns within the coding regions of the genes. Among them, the insertion sites of 136 were unique to the human genes, 77 were the same in humans and flies, and 60 were the same in humans and worms. Of these, 26 introns (10% of the total) were common to all three species (Fig. 6). In contrast, only 7 introns shared the same insertion sites in humans and yeast, and the position of only one, the second intron of RPL14, was conserved among all four species. About 80% of fruitfly introns were present in human RP genes, but only 30% of these introns appeared in worms. A comparison of the intron insertion sites in eukaryotic RPL8 genes is summarized in Figure 7.
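The intron counts above are related by simple inclusion-exclusion, which can be checked with a short calculation. This is a sketch using only the numbers quoted in the text; the constant and function names are illustrative.

```python
# Counts quoted in the text for introns within human RP coding regions.
TOTAL_HUMAN = 249
HUMAN_FLY = 77    # same position in human and fruitfly
HUMAN_WORM = 60   # same position in human and nematode worm
ALL_THREE = 26    # same position in all three species


def shared_with_fly_or_worm() -> int:
    """Inclusion-exclusion: human introns shared with fly, worm, or both."""
    return HUMAN_FLY + HUMAN_WORM - ALL_THREE


def percent_of_human(count: int) -> float:
    """Express a count as a percentage of all human CDS introns."""
    return 100.0 * count / TOTAL_HUMAN


union = shared_with_fly_or_worm()
print(union, f"{percent_of_human(union):.1f}%")
print(ALL_THREE, f"{percent_of_human(ALL_THREE):.1f}%")
```

The union comes to 111 introns, roughly 44%-45% of the 249 human introns, in line with the 44% figure given in the Abstract, and the 26 introns shared by all three species correspond to about 10%.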
DISCUSSION
Evaluation of the Draft Genome Sequence
To evaluate the publicly available draft sequence of the human genome, our data set of RP gene sequences was compared with those appearing in the draft sequence. We found 32 RP genes in the finished sequence and 43 genes in the unfinished sequence, which together accounted for 94% of the RP genes, the same value as the claimed coverage of the human genome (International Human Genome Sequencing Consortium 2001). Although sequences that appeared in the finished sequence are accurate, those in the unfinished sequences still have some minor problems (as of July 10, 2001), including misassembled sequences and/or sequence gaps in five cases (data not shown). This suggests that, for the time being, the draft sequence should be carefully interpreted. Moreover, even if the sequence is accurate, we need to know the transcription start sites to determine the complete gene structures and identify the promoters. Generation of full-length cDNA sequences, as was done in the present study, should facilitate this analysis.
Promoter Structure and Gene Expression
In prokaryotes, the RP genes are organized into a small number of operons, each containing genes for up to 11 RPs under the control of a single promoter (Nomura et al. 1984). In contrast, in humans, RP genes are scattered over the genome (Kenmochi et al. 1998; Uechi et al. 2001). Although encoded at widely dispersed genomic sites, RPs are assembled into the ribosome with stoichiometric precision; thus, clustering of RP genes into operons, as in bacteria, is not an important means of regulated coproduction of RPs in humans. The situation is similar in other eukaryotes, such as D. melanogaster, C. elegans, and S. cerevisiae. It has been argued that the translational control of RP gene expression is the most prevalent regulatory mechanism operating in higher eukaryotes (Amaldi et al. 1995; Meyuhas et al. 1996). Nevertheless, in yeast, regulation at the transcriptional level seems to dominate RP production (Warner 1999). Recent experiments using DNA array technology have shown that expression of RP mRNAs in yeast is strictly regulated in a manner responsive to changes in growth conditions (Brown and Botstein 1999). Systematic analysis of the human transcriptome also suggests that transcriptional regulation plays an essential part in the expression of this class of genes (Kawamoto et al. 2000; N. Kenmochi and K. Okubo, unpubl.). Although we found possible binding sites for various transcription factors in the 5'-flanking regions, common regulatory factors such as Rap1, which controls most yeast RP gene expression (Lascaris et al. 1999), have not yet been identified elsewhere. The only sequence element that has emerged from our studies so far is the oligopyrimidine tract, which is located at the transcription start site of the genes. Searches for additional regulatory elements, combined with analyses of the expression profiles under various conditions, will need to be carried out if a better understanding of the coordinated control of RP production in humans is to be achieved.
Evolution of Introns
By comparing the positions of introns in RP genes from various species (Fig. 6), we found that about one-half of nematode worm introns (60 of 123) are represented at the same position in the corresponding human gene, but 34 of these introns are not present in fruitflies. Moreover, 26 of these introns apparently disappeared from the fruitfly genome, resulting in a reduced number of introns in the corresponding gene (e.g., RPL8; see Fig. 7). It would be interesting to know whether, during evolution, these introns were deleted from the fruitfly genome after the three species had separated, or whether they were inserted into the same positions in the human and worm genomes.
In that regard, RP genes are well suited for studying the evolution of introns. Advantages they offer include a large number of family members (79 proteins), a large number of introns per length of CDS (8 introns/kb), and highly conserved CDS sequences (e.g., human and fruitfly CDSs share 69% homology; see Table 3). The size and sequence of CDSs are very similar among eukaryotes; consequently, they are highly homologous. Furthermore, the amino acid sequences are nearly identical in mammals, and one can find a yeast homolog for all but one of the human RPs. Therefore, it is fairly easy to compare the intron positions within RP gene sequences. In fact, we identified 26 introns located at the same position in human, fruitfly, and nematode worm RP genes, although a large fraction of the introns are unique to the individual species. In addition, many snoRNAs are encoded within the RP gene introns (Maxwell and Fournier 1995), and transcriptional control elements are also found there (Chung and Perry 1989), perhaps indicative of new roles for introns in eukaryotic gene expression. RP genes thus provide a large data set useful for investigating the evolution of introns and their function.
Implications for Human Disease
Evolutionary and genetic considerations allow us to predict the roles of RP genes in human disease. Among multicellular animals, the consequences of mutations in RP genes have been studied most thoroughly in Drosophila. Here, mutations resulting in reduced expression of individual RPs yield the Minute phenotype, characterized by short and thin bristles, reduced body size, diminished fertility, and recessive lethality (Schultz 1929; Lambertsson 1998). Because a full complement of RPs is required to assemble a functional ribosome, Minute cells are thought to contain fewer ribosomes and thus have less capacity for protein synthesis (Kay and Jacobs-Lorena 1987). As RPs are highly conserved between Drosophila and humans, it is likely that defects in human RPs will also result in ribosomal dysfunction leading to pathological conditions. In fact, as mentioned earlier, RPS4 and RPL6 are postulated to be candidate genes for Turner and Noonan syndromes, respectively (Fisher et al. 1990; Jamieson et al. 1994; Kenmochi et al. 2000). Moreover, RPS19 is mutated in patients with Diamond-Blackfan anemia, so far the only reported case in which RP gene mutation is associated with human disease (Draptchinskaia et al. 1999; Willig et al. 1999). Nevertheless, it remains unclear how phenotypes arise from RP defects. It would be of great interest to know the mechanism by which specific RP mutations disturb normal cell function and lead to abnormal phenotypes.
Meanwhile, transcriptome analysis of Ts65Dn, a segmental trisomy mouse and a model of Down syndrome, has shown that expression patterns of 14 RP genes in the brains of 30-day-old mice are significantly different from those of normal mice: nine are underexpressed, and the others are overexpressed (Chrast et al. 2000). This implicates abnormal ribosome biogenesis in the development and maintenance of Down syndrome phenotypes. Although no RP genes are present on chromosome 21 (Uechi et al. 2001), we found that these 14 RP genes have potential recognition sites for the GA-binding protein (GABP) in the promoter region and/or the first intron (data not shown). Because the gene encoding a subunit of GABP is located in the Down syndrome locus, near the APP gene in 21q21-q22.1 (Baxter et al. 2000), and because GABP is thought to act as both an activator and repressor of RP gene transcription (Genuario and Perry 1996), this protein might be involved in the pathogenesis of Down syndrome through abnormal ribosome biogenesis.
Recent reports indicate that RPL38 is essential for early embryogenesis and skeletal development, as shown in studies using the mouse skeletal mutations Tail-short (Ts), Tail-short shionogi (Tss), and Rabo torcido (Rbt). The phenotypes of these mice are similar and are characterized by a shortened kinky tail, neural tube defects, and various skeletal abnormalities including homeotic transformation of the axial skeleton (Hustert et al. 1996; Ishijima et al. 1998; Tsukahara et al. 2000). Heterozygous mutations in the Rpl38 gene were detected in all of these mice, and a wild-type Rpl38 transgene rescued the Ts phenotype, confirming the direct involvement of RPL38 deficiency in abnormal mouse skeletal development (T. Shiroishi, pers. comm.). In addition, Volarevic et al. (2000) further implicated RP defects in abnormal phenotypes in mice when they conditionally deleted the gene encoding RPS6 in mouse liver and found that cell cycle progression was blocked in hepatocytes after partial hepatectomy.
We recently completed chromosomal mapping of the human RP genes and found certain genes that might be involved in disease by comparing their assigned positions with candidate regions for Mendelian disorders (Uechi et al. 2001). The sequence data presented here now allow us to screen for mutations in patients. Although RPS19 is at present the only RP gene in which mutations have been found in patients, more mutations in other RP genes may be identified from such screening. Thus, together with the mapping data, our sequence data should serve as a powerful tool for studying ribosomapathy, a new class of human disease.
METHODS
Cloning
cDNA clones were isolated from the full-length cDNA libraries prepared from mRNAs of human tissues and cell lines using the DNA-RNA chimeric oligo-capping method described by Kato et al. (1994). BAC clones were isolated from the Keio BAC library by the PCR screening method (Asakawa et al. 1997) using STSs specific to the human RP genes (Kenmochi et al. 1998; Uechi et al. 2001). BAC DNAs were sheared with a nebulizer, as in the shotgun method (Kawasaki et al. 1997), and the 3-5-kb fragments were subcloned into XL1-Blue Escherichia coli cells using the pUC19 plasmid vector. Subclones containing the RP genes were selected by colony PCR and then sequenced.
Sequencing
Nucleotide sequences were determined by use of the shotgun sequencing method as described previously (Kawasaki et al. 1997). Plasmid DNAs from the isolated subclones were fragmented (1.1-1.3 kb) and inserted into the pHSG398 vector. After electroporation into XL1-Blue cells, DNAs from 48-96 clones were sequenced from both ends using ABI PRISM DNA sequencers. These sequencing conditions provide 2.0-9.6 times redundancy. Sequencing data were edited and assembled using the Staden or Phred/Phrap/Consed software packages (Bonfield et al. 1995; Ewing and Green 1998; Ewing et al. 1998). When necessary, sequencing primers were designed within the cDNA sequences and used for primer walking to determine ambiguous nucleotides and to fill unsequenced gaps. These sequences will appear in the DDBJ/EMBL/GenBank DNA databases under accession numbers AB055762-AB055780, AB056456, AB061820-AB061859, AB062066-AB062071, and AB070559, which are listed in Table 1.
Sequence Analysis
Intron/exon boundaries were determined by comparing the genomic sequences with the corresponding cDNA sequences. Transcription start sites were deduced from the 5'-UTR sequences of the full-length cDNAs. GC contents and sequence homologies were calculated using GENETYX version 11 (Software Development). We searched the 5'-flanking regions for possible binding sites of transcription factors using TFSEARCH at http://molsun1.cbrc.aist.go.jp/research/db/TFSEARCH.html. Regions up to the −400-bp position were analyzed with a threshold score of 90. When searching for TATA-like sequences, however, the threshold score was reduced to 50.
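The first step described above, locating intron/exon boundaries by comparing a genomic sequence with its cDNA, can be illustrated with a toy exact-match mapper. This is only a sketch under strong assumptions (the cDNA matches the genomic exons exactly and in order, with no sequencing errors); the function name, the `min_seed` parameter, and the toy sequences are hypothetical, and the text does not describe the procedure or software actually used for this comparison.

```python
def map_exons(genomic: str, cdna: str, min_seed: int = 8) -> list[tuple[int, int]]:
    """Greedily map cDNA onto genomic DNA; return exon (start, end) pairs.

    Coordinates are 0-based on the genomic sequence, end exclusive.
    Each exon is seeded by an exact match of up to min_seed bases and
    extended while the two sequences agree.
    """
    exons = []
    g_pos = c_pos = 0
    while c_pos < len(cdna):
        seed = cdna[c_pos:c_pos + min_seed]
        start = genomic.find(seed, g_pos)
        if start < 0:
            raise ValueError(f"cDNA position {c_pos} not found in genomic sequence")
        length = 0
        while (c_pos + length < len(cdna)
               and start + length < len(genomic)
               and genomic[start + length] == cdna[c_pos + length]):
            length += 1
        exons.append((start, start + length))
        c_pos += length
        g_pos = start + length
    return exons


def introns_between(exons: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Introns are the genomic gaps between consecutive exons."""
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


# Toy gene: two exons separated by a GT...AG intron (invented sequences).
exon1, intron, exon2 = "ATGGCTAAGC", "GTAAGTCCCCCCTTTTTTAG", "CATTCTGGTCGT"
toy_genomic = "TTTT" + exon1 + intron + exon2 + "AAAA"
toy_exons = map_exons(toy_genomic, exon1 + exon2)
print(toy_exons, introns_between(toy_exons))
```

A greedy extension like this can overrun an exon by a base or two when the intron happens to begin with the same bases as the next exon, so a practical implementation would also check splice-site consensus (GT...AG) when placing the boundary.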
The RP gene sequences appearing in the draft sequence of the human genome were obtained by BLASTN search at NCBI (as of July 10, 2001) using the human cDNA sequences as the query. Sequences of the nematode worm C. elegans were obtained by BLASTP search from the C. elegans Genome Project at the Sanger Center (http://www.sanger.ac.uk/Projects/C_elegans/), and sequences of the fruitfly D. melanogaster were from the Berkeley Drosophila Genome Project (BDGP, http://www.fruitfly.org/). Sequences of the yeast S. cerevisiae were obtained by keyword search from SGD at Stanford University (http://genome-www.stanford.edu/Saccharomyces/).
WEB SITE REFERENCES
http://genome-www.stanford.edu/Saccharomyces/; sequences of the yeast S. cerevisiae were obtained by keyword search from SGD at Stanford University.
http://molsun1.cbrc.aist.go.jp/research/db/TFSEARCH.html; TFSEARCH.
http://repeatmasker.genome.washington.edu; Repeat Masker.
http://www.fruitfly.org/; the Berkeley Drosophila Genome Project (BDGP).
http://www.sanger.ac.uk/Projects/C_elegans/; the C. elegans Genome Project at the Sanger Center.
ACKNOWLEDGMENTS
We thank Dr. Atsushi Shimizu and the genome sequencing team at Keio University School of Medicine. This work was supported in part by grants from the Ministry of Education, Culture, Sports, Science and Technology of Japan, and Fund for "Research for the Future" Program from the Japan Society for the Promotion of Science.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
Present addresses: 4Department of Rehabilitation Engineering, Research Institute of National Rehabilitation Center for the Disabled, Tokorozawa, Saitama, 359-8555, Japan; 5Central Research Laboratories, Miyazaki Medical College, 5200 Kihara, Kiyotake, Miyazaki 889-1692, Japan.
6 Corresponding author.
E-MAIL kenmochi{at}post.miyazaki-med.ac.jp; FAX 81-985-85-1514.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.214202.
REFERENCES
gene locus. Genome Res. 7: 250-261.
Received September 7, 2001; accepted in revised form December 21, 2001.