Genome Research

Genome Res. 13:1466-1477, 2003
©2003 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/03 $5.00

Letter

G Protein-Coupled Receptor Genes in the FANTOM2 Database

Yuka Kawasawa1,6, Louise M. McKenzie2, David P. Hill2, Hidemasa Bono3, RIKEN GER Group3, GSL Members 4,5 and Masashi Yanagisawa1

1 Howard Hughes Medical Institute, Department of Molecular Genetics, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas 75390-9050, USA
2 The Jackson Laboratory, Bar Harbor, Maine 04609, USA
3 Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
4 Genome Science Laboratory, RIKEN, Hirosawa, Wako, Saitama 351-0198, Japan


    ABSTRACT
G protein-coupled receptors (GPCRs) comprise the largest family of receptor proteins in mammals and play important roles in many physiological and pathological processes. Gene expression of GPCRs is temporally and spatially regulated, and many splicing variants have also been described. In many instances, different expression profiles of a GPCR gene account for changes in its biological function. Therefore, it is intriguing to assess the complexity of the transcriptome of GPCRs in various mammalian organs. In this study, we took advantage of the FANTOM2 (Functional Annotation Meeting of Mouse cDNA 2) project, which aimed to collect full-length cDNAs inclusively from mouse tissues, and found 410 candidate GPCR cDNAs. Clustering of these clones into transcriptional units (TUs) reduced this number to 213. Out of these, 165 genes were represented within the known 308 GPCRs in the Mouse Genome Informatics (MGI) resource. The remaining 48 genes were new to mouse, and 14 of them had no clear mammalian ortholog. To dissect the detailed characteristics of each transcript, tissue distribution patterns and alternative splicing were also examined. We found many splicing variants of GPCRs that may be relevant to disease occurrence. In addition, the difficulty in cloning tissue-specific and infrequently transcribed GPCRs is discussed.


G protein-coupled receptors (GPCRs) bind to and transduce a large variety of extracellular stimuli (ligands) such as hormones, neurotransmitters, autacoids, chemokines, enzymes, odorants, tastants, and even light, thus mediating many physiological functions through interaction with heterotrimeric G proteins. Given that a very significant proportion of known drugs interact with GPCRs (Wise et al. 2002), identification of mouse orthologs of human GPCRs is an important contribution to future development of human therapeutic agents. GPCRs are membrane-integrated receptor proteins and possess a unique seven membrane-spanning region. Sequences are highly conserved among members of the GPCR family, and the membrane-spanning region often shows the highest similarity, by which these receptor proteins are discriminated from any other proteins. Moreover, GPCR family genes are categorized into six subgroups based on sequence similarity (Table 1; Kolakowski Jr. 1994; Horn et al. 1998). Family A is a very large family containing rhodopsin, olfactory, biogenic amine, nucleic acid, bioactive lipid, and peptide receptors. Family B consists of secretin, calcitonin, parathyroid hormone, glucagon, vasoactive intestinal peptide receptors, etc. Family C contains metabotropic glutamate receptors (mGluRs), γ-aminobutyric acid type B receptors (GABA-B), Ca2+-sensing receptor, and vomeronasal receptors type 2. Family D comprises fungal pheromone P- and α-factor receptors (STE2/MAM2). Family E comprises fungal pheromone A- and M-factor receptors (STE3/MAP3). Family F is related to slime mold cyclic adenosine monophosphate (cAMP) receptors. Recently, a growing number of new GPCR families have been reported. These include the frizzled family (Vinson and Adler 1987), smoothened (Alcedo et al. 1996), vomeronasal receptors type 1 (Dulac and Axel 1995), ocular albinism (Schiaffino et al. 1996), and Arabidopsis thaliana receptor GCR1 (Josefsson and Rask 1997).
Although a receptor function or G protein coupling has not been experimentally demonstrated in some cases, we focused on collecting any probable cDNAs of GPCRs that fulfill the criteria mentioned above.


Table 1. Family Classification of GPCRs

 

Recent genomic analyses in human (Lander et al. 2001; Venter et al. 2001) reported that there are ~600 GPCR genes that belong to the Families A, B, and C. This number, however, excluded several putative GPCR families, such as a large family of odorant receptors (nearly 350 odorant receptor genes are estimated), taste receptors, frizzled/smoothened receptors, and Family D, E, and F receptors, implying that there are nearly a thousand GPCRs in the human genome (Conklin et al. 2000). Although the achievement of sequencing the entire genome provides much information for exploring areas such as gene number, polymorphisms, and gene structure analysis, it is also quite important to acquire an overall view of expressed sequences, the transcriptome. Gene expression of GPCRs is regulated in a temporally and spatially specific manner and can also be altered by physiological and pathological conditions. Moreover, various alternative splice products are described for many GPCRs, but their biological significance often remains elusive. Therefore, the GPCR family is one of the most interesting gene families to assess with respect to the complexity of the transcriptome in mammals.

The RIKEN Mouse Gene Encyclopaedia project involves the development of a cap-trapper method to acquire full-length cDNA libraries from various mouse tissues, the creation of automated systems for DNA sequencing, and a fully computed system to infer other information such as chromosomal locations and gene expression patterns. This effort to catalog overall transcriptional units in mouse is called FANTOM (Bono et al. 2002), and its principal aims are to create meaningful names for clones, identify coding regions, and categorize clones based on the vocabularies of the Gene Ontology Consortium (Ashburner et al. 2000). The initial results and validation of the FANTOM approach were reported previously and generated the functional annotation of 21,076 full-length cDNAs (Kawai et al. 2001). The second phase of this project (FANTOM2) resulted in an additional 39,694-cDNA set (total 60,770) and a more global analysis of the mouse transcriptome (The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I and II Team 2002). Along with the achievement of the mouse genome sequencing project (Mouse Genome Sequencing Consortium 2002), these results provide a comprehensive grasp of the widespread transcripts encoded in the mouse genome.

By exploiting the various search systems provided with the FANTOM2 data set, we retrieved all cDNAs that were predicted as GPCR genes. Clustering of those sequences (total 410) led to the identification of 213 individual transcriptional units (TUs). Out of these, 165 TUs were already represented in the set of 308 known GPCRs in the Mouse Genome Informatics resource (MGI: http://www.informatics.jax.org). The remaining 48 TUs represented novel mouse genes, and 14 of them had no clear mammalian ortholog. In the present work, we classified these GPCRs into subgroups based on their similarities and focused on describing the 14 novel genes that have been newly found in mammals. Moreover, tissue-specific expression and alternative splicing of GPCRs were also analyzed.


    RESULTS AND DISCUSSION
Data Acquisition From the FANTOM2 Database
The detailed annotation process for the FANTOM2 clone set was described elsewhere (The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I and II Team 2002). A distinct feature of this process is the combination of two annotation strategies. First is automated annotation, which uses computational searches against the majority of publicly accessible databases, followed by automatic assignments of controlled nomenclature vocabulary transferred from the original literature and/or Gene Ontology (GO) terms to clones (Ashburner et al. 2000). The second is manual annotation, which aims to qualify the automated annotation by assigning the most informative name and coding sequence to each transcript. During an international consortium (the Mouse Annotation Teleconference for RIKEN cDNA sequences, MATRICS) many experts in bioinformatics and biology worked to verify and enhance the computational annotations. The Web-based FANTOM2 interface provided integrated graphical summaries of sequence similarity, motif search results, ortholog search results, alignments against the public draft mouse genome assemblies, and so on, and was used by MATRICS curators to manually assess and refine the automated annotations (Kasukawa et al. 2003). Furthermore, it included various search systems allowing the retrieval of genes of interest based on GO terms, Pfam name, protein motif, source of cDNA library, gene length, etc. Taking advantage of this system, we obtained probable GPCR genes. Among these preselected cDNAs, we verified 410 clones as putative GPCRs and annotated them using the guidelines designed for the FANTOM2 interface.

Coverage of Mouse Transcriptome
The entire set of 60,770 FANTOM2 sequences was clustered into 17,594 protein-coding TUs, excluding a substantial number of noncoding TUs (The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I and II Team 2002). We found the 410 GPCR candidate clones clustered into 213 TUs. Therefore, it is estimated that proportional coverage of GPCR transcripts in mouse is equivalent to 0.67% (410 divided by 60,770) of total sequences or 1.21% (213 divided by 17,594) of total protein-coding sequences. Although the gene number (213) by itself is far less than expected for the mouse genome (Mouse Genome Sequencing Consortium 2002), this is the first attempt at estimating the proportion of GPCR genes against the mouse transcriptome and is probably influenced by the initial filtering of the FANTOM2 cDNAs to try to remove redundancy (The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I and II Team 2002). Because GPCRs exert their function in a temporally and spatially specific manner and some members of subfamilies recognize the same ligand and act cooperatively, we believe it is very important to understand the transcriptome and/or proteome of GPCRs to comprehensively evaluate their biological significance.
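The two proportions quoted above can be reproduced with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable and function names are illustrative and are not part of the FANTOM2 resources.

```python
# Counts quoted in the text; names are illustrative only.
gpcr_clones = 410            # candidate GPCR cDNA clones
total_clones = 60_770        # all FANTOM2 sequences
gpcr_tus = 213               # GPCR transcriptional units
protein_coding_tus = 17_594  # protein-coding TUs in FANTOM2


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


clone_share = percent(gpcr_clones, total_clones)
tu_share = percent(gpcr_tus, protein_coding_tus)

print(f"GPCR share of all sequences:      {clone_share:.2f}%")
print(f"GPCR share of protein-coding TUs: {tu_share:.2f}%")
```

Note that the first figure is a fraction of clones and the second a fraction of clustered TUs, so the two percentages are not directly comparable.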

To estimate the coverage of FANTOM2 against a public database, we chose the MGI GPCR data set because of its detailed annotation descriptions and its nonredundancy among receptor genes, redundancy being a frequent problem in many GPCR databases. Out of the 213 TUs in FANTOM2, 165 were represented within the known 308 GPCRs in MGI (as of May 5, 2002; Table 1). The coverage was calculated as 53.6%. The remaining 48 TUs were new to MGI, and 34 of these were hypothesized as representing homologs or paralogs to known genes. MGI curators will use these relationships to coordinate appropriate official nomenclature for these genes during the MGI FANTOM2 data load (Baldarelli et al. 2003). The remaining 14 TUs represent novel GPCRs that have no counterparts in public databases (as of May 5, 2002). According to sequence similarity, the 14 genes represented by these TUs were classified into subgroups (Table 1). Eight genes belong to Family A, four to Family C, one to Family F, and one to vomeronasal receptor type 1. Among these, Family A and C receptors are represented in classification trees (Figs. 1, 2, and 4 below), with the aim of illustrating their relatedness to known GPCRs. Moreover, we found several identical genes that were registered in GenBank (as of September 20, 2002) while we were preparing this manuscript and updated this information to refine our data (as discussed below and shown in Table 2). In addition, 67 out of the 213 genes are orphan GPCRs for which endogenous ligands have not been identified and the physiological functions remain elusive (Table 1; Lee et al. 2001).
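The counts in this paragraph partition cleanly, which can be checked as follows. This is a minimal consistency sketch using only the numbers stated in the text; the dictionary keys and variable names are labels chosen for illustration.

```python
# Numbers taken from the text; names are illustrative only.
known_mgi_gpcrs = 308
fantom2_tus = 213
tus_in_mgi = 165

# TUs not yet represented in MGI.
tus_new_to_mgi = fantom2_tus - tus_in_mgi

# Of those, 34 were hypothesized to be homologs or paralogs of known genes.
related_to_known = 34
novel_tus = tus_new_to_mgi - related_to_known

# Family assignment of the novel TUs as stated in the text.
novel_by_family = {
    "Family A": 8,
    "Family C": 4,
    "Family F": 1,
    "Vomeronasal receptor type 1": 1,
}

mgi_coverage = 100.0 * tus_in_mgi / known_mgi_gpcrs
orphan_share = 100.0 * 67 / fantom2_tus  # orphan GPCRs among the 213 TUs

print(f"New to MGI: {tus_new_to_mgi}, novel: {novel_tus}")
print(f"MGI coverage: {mgi_coverage:.1f}%")
print(f"Orphan share of FANTOM2 GPCR TUs: {orphan_share:.1f}%")
```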



Figure 1 Classification tree (Family A—small molecule). A rooted tree was constructed for 83 GPCRs. GPCRs that have cognate ligands are distinguished in colored subgroups. Orphan GPCRs are shown in uncolored branches, and novel genes are indicated with black circles. Abbreviations are shown in Supplementary Information 1 (available online at www.genome.org).

 


Figure 2 Classification tree (Family A—peptide). A rooted tree was constructed for 82 GPCRs. GPCRs that have cognate ligands are distinguished in colored subgroups. Orphan GPCRs are shown in uncolored branches, and novel genes are indicated with black circles. Abbreviations are shown in Supplementary Information 2.

 


Figure 4 Classification tree (Family C). An unrooted tree was constructed for 11 GPCRs. The scale bar indicates a maximum likelihood branch length of 0.1 inferred substitutions per site. GPCRs that have cognate ligands are distinguished in colored subgroups. Orphan GPCRs are shown in uncolored branches, and novel genes are indicated with black circles. Abbreviations are shown in Supplementary Information 4.

 

Table 2. Novel GPCR cDNAs in FANTOM2 (as of May 5, 2002)

 

Classification Trees
Representative clones from the 213 GPCRs were classified into subgroups based on their sequence similarity. Because Family A GPCRs consist of many genes (165 genes; Table 1), we divided them into two subgroups. One comprises the GPCRs that recognize electromagnetic radiation and small molecules such as light (rhodopsin), odors (olfactory receptors), biogenic amines, nucleic acids, and bioactive lipids. The other comprises peptide receptors, including chemokine, chemoattractant, protease-activated, and hormone protein receptors. The data shown in Figures 1 and 2 were internally consistent, and the branching patterns agree well with the relatedness among known receptor genes (shaded in color). Moreover, there are a substantial number of orphan GPCRs (unshaded) distributed in each clade. Sequence alignments of orphan GPCRs with the use of such dendrograms could help identify the relevant subfamilies and point the way to identification of potential ligands and biological functions.

Family B (secretin-like) contains 24 genes, and more than half of them were revealed to be orphan receptors (Table 1). Family B receptors are characterized not only by the lack of the structural signature sequences present in the Family A GPCRs but also by the presence of a large N-terminal extracellular domain (exodomain; Laburthe et al. 1996). Because of the lack of similarity within the exodomains of each receptor, the relatedness among this group is hardly significant (Fig. 3). Additional analyses identifying new members of this subgroup and dissection of the receptor structure are needed to clarify the physiological relevance of this family.



Figure 3 Classification tree (Family B). An unrooted tree was constructed for 24 GPCRs. The scale bar indicates a maximum likelihood branch length of 0.1 inferred substitutions per site. GPCRs that have cognate ligands are distinguished in colored subgroups. Orphan GPCRs are shown in uncolored branches. Abbreviations are shown in Supplementary Information 3.

 

Likewise, Family C (mGluR/GABA-B/pheromone receptors) is characterized by a large N-terminal exodomain where the receptor can capture its cognate ligand. There are four novel genes found in the FANTOM2 data set (Fig. 4), two of which (clones AK083234 and AK030625) share little homology with other known GPCRs (Gustincich et al. 2003). They could therefore comprise a distinct gene family and recognize a novel ligand.

Novel GPCR cDNAs in FANTOM2 (as of May 5, 2002)
Family A
Olfactory Receptors
Clone AK017005 belongs to the olfactory receptor superfamily and is most similar to "olfactory receptor MOR184–6 [Mus musculus]" (accession no. AAL60754; Zhang and Firestein 2002). However, this transcript was cloned from testis, where olfactory receptors are not thought to function. The predicted product also appears to be too short to encode the primary structure of a putative seven-transmembrane receptor. As reported by Zhang and Firestein (2002), there are 1296 olfactory receptor genes defined in the mouse genome, of which ~1000 are functional and the rest are pseudogenes. Thus, the clone AK017005 transcript may represent only a nonfunctional pseudogene.

Similar to GPR34
Clone AK041317 shares a weak (25%) amino acid identity with GPR34 (Fig. 1; Marchese et al. 1999; Schoneberg et al. 1999), but no further similarity to any other known GPCR is detected. Although clone AK041317 and GPR34 are distinct from any other GPCRs, they are placed in a clade of the nucleic acid receptor family (Fig. 1). Because GPR34 is itself an orphan GPCR, the biological relevance of this clone remains uncertain.
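Identity percentages such as the 25% quoted here depend on how the alignment is scored. As a minimal sketch, assuming two sequences that have already been aligned to equal length with "-" marking gaps, percent identity can be computed over the columns where both sequences carry a residue. This is only one of several conventions (some tools divide by the full alignment length instead), and the text does not state which convention underlies the reported values. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap.

    Assumes both strings come from the same pairwise alignment,
    have equal length, and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns under this convention
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not taken from any real GPCR).
identity = percent_identity("MKTA-LV", "MRTAGLV")
print(f"{identity:.1f}% identity")
```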

Similar to Purinergic Receptor
Clone AK041740 is identical to "similar to purinergic receptor [Mus musculus]" (accession no. XP_142039, registered on May 20, 2002). It has prominent identities with "putative purinergic receptor P2Y10 [Homo sapiens]" (accession no. NP_055314, 72% identity; Ralevic and Burnstock 1998) and with "putative purinergic receptor FKSG79 [Homo sapiens]" (accession no. NP_115942, 50% identity). Less similar sequences are also retrieved, such as "similar to P2Y purinoceptor 9 (P2Y9/Purinergic receptor 9/G protein-coupled receptor GPR23/P2Y5-like receptor) [Homo sapiens]" (accession no. XP_018505, 33% identity), "purinergic receptor (family A group 5) [Homo sapiens]" (accession no. NP_005758, 32% identity), and "G protein-coupled receptor 17 [Homo sapiens]" (accession no. NP_005282, 31% identity). P2ry10, P2ry5, and P2ry9 genes were also identified in the FANTOM2 database, and all of these genes were clustered with clone AK041740 in the classification tree (Fig. 1).

Similar to C5L2 (GPR77)
Clone AK053187 is identical to "similar to C5a anaphylatoxin chemotactic receptor C5L2 [Mus musculus]" (accession no. XP_145404, registered on May 16, 2002) and shares 56% identity with human C5L2 protein. C5L2 (also termed GPR77) belongs to a subfamily of C5a, C3a, and formyl peptide receptors that are related to the chemoattractant receptor family, and clone AK053187 is situated within the chemoattractant receptor subgroup in the classification tree (Fig. 2). C5L2 has recently been shown to have a high binding affinity to C5a (Cain and Monk 2002). Although C5a is known as a potent chemoattractant and anaphylatoxin that acts on leukocytes and on many other cell types, more work is necessary to ascertain the relevance of this receptor to the in vivo chemotactic reaction.

Similar to GPR31
Clone AK036897 matches a partial sequence of the T complex responder 1 locus, which is described in UniGene Cluster Mm.132359 (the cluster name is "Mus musculus T complex responder 1 mRNA sequence"). Consistent with a previous report (Schimenti 1999), this locus contains an intronless open reading frame (ORF) of the "G protein coupled receptor [Mus musculus]" gene (accession no. AAF26668), and clone AK036897 has 83% identity with this gene. It also has 50% identity with human GPR31, indicating that this could be a murine paralog of GPR31. GPR31 is an orphan receptor that shares 25%–33% homology with members of the chemokine, nucleic acid, and somatostatin receptor gene families (Zingoni et al. 1997). Furthermore, the classification tree indicated that clone AK036897 may be related to protease-activated receptors (Fig. 2). Despite these observations, clone AK036897 is too short to encode a putative GPCR structure, indicating that this cDNA is likely a partial fragment.

Similar to Galanin Receptor Type 2
Clone AK048591 is identical to "similar to putative G-protein coupled receptor [Mus musculus]" (accession no. XP_140302, registered on May 17, 2002). It has a significant identity (78%) with the human sequence "similar to putative G-protein coupled receptor [Homo sapiens]" (accession no. XP_068829, registered on Aug. 1, 2002), implying that this clone may be an ortholog of the human gene. Although this gene is located in a hormone receptor subgroup in the classification tree (Fig. 2), there is a weak similarity (25%) between clone AK048591 and galanin receptor type 2 (accession no. AAC36589; Pang et al. 1998). Galanin is a ubiquitously expressed neuropeptide that exerts diverse modulatory functions in the central and peripheral nervous systems (Tatemoto et al. 1983; Bartfai et al. 1993). The presence of a structurally related peptide has also been recognized and shown to act on galanin receptors (Ohtaki et al. 1999). Thus, clone AK048591 could encode a novel type of receptor protein that interacts with yet unidentified galanin-related peptides.

Similar to Mesotocin Receptor
Clone AK047609 belongs to the arginine vasopressin receptor family (Fig. 2) and is closely related to "similar to mesotocin receptor (MTR) [Mus musculus]" (accession no. XP_138721, registered on May 17, 2002; chromosome 13). Gene mapping indicates that clone AK047609 is intronless and is localized on Chromosome 13. Its DNA sequence fully matches XP_138721, except for the gaps in the 5' terminus and an internal region of gene XP_138721 (data not shown). The difference in these particular regions of the ORF, in turn, results in translation of a shorter polypeptide. Kyte and Doolittle hydropathicity plots (Kyte and Doolittle 1982) predict that this product carries only five or six membrane-spanning domains, whereas clone AK047609 presumably contains a seven-transmembrane structure (data not shown). These observations suggest that clone AK047609 encodes a new member of the arginine vasopressin receptor family, whereas XP_138721 is probably not a GPCR and might be produced as a result of gene duplication or chromosomal remodeling. In addition, clone AK047609 has a prospective ortholog that shares 70% identity and is termed "seven transmembrane helix receptor [Homo sapiens]" (accession no. BAC05903, registered on July 23, 2002). Although both of these murine and human orthologs are weakly similar to the amphibian mesotocin receptor (accession no. Q90252, 26% identity; Akhundova et al. 1996), mesotocin itself has not yet been identified in mammals. Therefore, the receptors might bind to a novel peptide hormone that is partially similar to mesotocin or another member of the arginine vasopressin peptide family.
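A hydropathy analysis of the kind cited here can be sketched as a sliding-window average of the Kyte and Doolittle (1982) hydropathy index, followed by a count of stretches whose average stays above a cutoff. The window of 19 residues and the cutoff of 1.6 are the values commonly quoted for detecting membrane-spanning segments and are assumptions here, since the text does not give the parameters actually used. The function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full-length window along the sequence.

    Raises KeyError for non-standard residue codes such as 'X'.
    """
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def count_hydrophobic_segments(profile, threshold=1.6):
    """Count runs of consecutive windows whose mean exceeds the threshold."""
    segments = 0
    inside = False
    for value in profile:
        above = value > threshold
        if above and not inside:
            segments += 1  # a new hydrophobic stretch begins
        inside = above
    return segments


# Toy sequence: two leucine stretches separated by aspartate stretches.
toy = "D" * 25 + "L" * 25 + "D" * 25 + "L" * 25 + "D" * 25
profile = hydropathy_profile(toy)
print(count_hydrophobic_segments(profile), "candidate membrane-spanning segments")
```

A count of seven such stretches would be expected for a canonical GPCR, but this simple cutoff can merge closely spaced helices or miss weakly hydrophobic ones, so the result is only a first-pass indication.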

Similar to CG6111 Gene Product
Together with clone AK047609, clone AK033957 belongs to the arginine vasopressin receptor family (Fig. 2) and is highly similar to "similar to CG6111 gene product [Homo sapiens]" (accession no. XP_167325, registered on Aug. 1, 2002). Because the amino acid identity is 90%, this gene is likely the ortholog of the human gene. Moreover, there is a fly ortholog named "putative CCAP receptor [Drosophila melanogaster]" (accession no. AAN10041, registered on Sept. 16, 2002; Park et al. 2002) that has 38% identity with clone AK033957. Although this gene was originally considered as an orthologous gene of the vasopressin and oxytocin receptor subgroup (Broeck 2001; Hewes and Taghert 2001), it was recently shown to be activated by crustacean cardioactive peptide (CCAP; Park et al. 2002). CCAP was initially identified by its cardioacceleratory action on the heart of the shore crab, and its primary structure is strictly conserved across the arthropods (Veenstra 1989). Although a mammalian ortholog of CCAP has not yet been described, it is anticipated that a related peptide may be discovered as a cognate ligand for clone AK033957.

Family C
We found two novel and unique genes that belong to Family C GPCRs. One is clone AK083234, and the other is clone AK030625 (Fig. 4). Interestingly, they have a significant identity with each other (44%) but not with any known receptor of this family (Fig. 4), implying that they comprise a novel subgroup of Family C GPCRs (Gustincich et al. 2003).

Clone AK083234 is highly similar to "hypothetical protein XP_158147 [Mus musculus]" (accession no. XP_158147, registered on May 16, 2002) and "similar to agCP15215 [Homo sapiens]" (accession no. XP_168702, registered on Aug. 1, 2002). Although clone AK083234 and XP_158147 are identical from amino acid 1 to 300, they are mapped to different chromosomes (AK083234 is on chromosome 2; XP_158147 is on chromosome 1). This indicates that they were generated by gene duplication. In addition, gene XP_158147 lacks the seven-transmembrane region, which is essential to confer biological function to GPCRs. In contrast, there is an overall similarity between clone AK083234 and XP_168702, implying that they are orthologous.

Clone AK030625 is identical to "hypothetical protein XP_147621 [Mus musculus]" (accession no. XP_147621, registered on Nov. 19, 2002). Although this clone has a significant identity to clone AK083234 (44%), it does not contain a seven membrane-spanning segment and there is no polyadenylation signal in the 3' noncoding region. Therefore, this clone may be a truncated fragment of an unknown putative GPCR.

Family F
Clone AK089429 belongs to Family F GPCR and is identical to "similar to hypothetical protein FLJ12132 [Mus musculus]" (accession no. XP_144130, registered on May 16, 2002). The biological meaning of this gene product remains unclear.

Pheromone Receptor (Family C [V2R] and Other Group [V1R])
Clones AK030224 and AK029734 are placed in the Family C GPCR subfamily (Fig. 4), with significant identities (78% and 69%, respectively) to "putative pheromone receptor V2R2 [Mus musculus]" (accession no. AAC08413; Ryba and Tirindelli 1997). Predicted CDS lengths (AK030224 is 1425 bp and AK029734 is 1966 bp), however, are substantially shorter than that of the putative V2r2 transcript (ORF; 2739 bp). In addition, both of the predicted amino acid sequences lack the putative seven membrane-spanning segment. Such partial sequences were frequently observed in the FANTOM2 database despite the extensive effort to clone long mRNAs (The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I and II Team 2002). It is anticipated that further technical improvement can facilitate the full-length cloning of longer mRNAs (Carninci et al. 2002).

Clone AK089429 is identical to "vomeronasal 1 receptor, C21 [Mus musculus]" (accession no. NP_598937, registered on July 10, 2002), which was found in a survey aimed at identifying the vomeronasal type 1 receptor superfamily genes in the mouse genome (Rodriguez et al. 2002).

Tissue-Specific GPCRs
As is well known in rodents, odorant and pheromone receptor families consist of extremely large and diverse repertoires of receptors, their variants, and pseudogenes. The diversity of sensory receptors is directly related to the perceptual and behavioral abilities to detect and respond to an enormous variety of sensory stimuli. In the present study, however, we were able to find only three pheromone receptors and six odorant receptors. Many of them were cloned from testis, and some of them were from neonate cerebellum, eyeball, or skin. Although the FANTOM2 project rigorously collected cDNA libraries from various mouse tissues, neither vomeronasal organ nor olfactory epithelium, in which the pheromone or the odorant receptors are thought to be exclusively expressed, was selected as an RNA source. A larger variety of cDNA libraries needs to be produced to cover tissue-specific or infrequently generated transcripts. Nevertheless, it is intriguing that some of the odorant or pheromone receptor genes are expressed in testis and certain neonate tissues in which the odorant receptors appear to have no biological function. This observation is consistent with previous reports (Parmentier et al. 1992; Thomas et al. 1996; Tatsura et al. 2001) and raises the possibility that the sensory receptors could be involved not only in olfactory sensing, but also in reproduction or development. Moreover, it is well established that each olfactory neuron expresses only one odorant receptor gene (Buck 2000). Although the exact mechanism underlying this exclusivity of expression in olfactory neurons remains to be defined (Kratz et al. 2002), it would be interesting to determine if similar regulation occurs in these other tissues.

Splicing Variants
Many GPCR genes are known to be encoded by a single exon (Gentles and Karlin 1999), which facilitates their discovery from genomic sequence (Takeda et al. 2002). However, a large number of GPCRs are transcribed from multiple exons and consequently can give rise to alternatively spliced variants (Kilpatrick et al. 1999). In many cases, they are physiologically distinct with respect to gene distribution, ligand-binding affinity, signaling profile, receptor recycling, and so on (Kilpatrick et al. 1999). In addition, there are several reports linking splice variants with disease, although the mechanism responsible for the physiological abnormality remains uncertain (Kilpatrick et al. 1999). By mapping each cDNA sequence to the draft mouse genome (Mouse Genome Sequencing Consortium 2002), Zavolan et al. constructed a comprehensive database of probable splice variants (http://genomes.rockefeller.edu/MouSDB; Zavolan et al. 2003). By retrieving the 213 GPCR gene clusters from this database, we found 32 GPCR genes to be intronless and 180 to contain introns. Because of the lack of accurate sequence information for particular genomic sequences, one gene remained unclassified. Among the 180 GPCR genes that possess multiple exons, we found 52 splicing variant candidates (Table 3), and a couple of examples are discussed further in the next section.


Table 3. Analysis of Splicing Variants

 

Gpr83; MGI:95712
One notable example is Gpr83 (glucocorticoid-induced receptor, GIR; GPR72). Gpr83 was originally identified as a stress-responsive gene in T-lymphocytes induced by glucocorticoids and cAMP (Harrigan et al. 1989, 1991). The mouse Gpr83 gene consists of five exons, and its mRNA is highly expressed in mammalian brain and thymus, in which several splicing variants have also been described (Harrigan et al. 1989, 1991; De Moerlooze et al. 2000). According to the original description (Harrigan et al. 1989, 1991), the most abundant transcript in mouse tissues is called RP23 (Fig. 5), and it encodes a putative seven-transmembrane receptor. In contrast, the RP39 transcript undergoes exon skipping, resulting in the loss of a region that extends from the third extracellular loop to the third transmembrane region (Harrigan et al. 1991; De Moerlooze et al. 2000). This variant appears to be nonfunctional because it forms a six-transmembrane receptor with inverted receptor topology. Clones RP82 and RP105 contain an insertion in the second intracellular loop, which presumably alters coupling to trimeric G proteins. In the present study, we identified four distinct transcripts in the Gpr83 TU (Fig. 5). Clones 9530022I23 and C030041M14 are identical to RP23 and encode a 423-amino-acid protein that contains a putative seven-transmembrane structure. Clone A630019F13 corresponds to RP39, which results in an abnormal form of GPR83. Although this aberrant receptor seems to have no classical function, the transcript could play a role in regulating gene expression or translation, indirectly influencing GPCR function. Clone 5330401O04 represents a novel variant of Gpr83 mRNA that contains an insertion of unspliced intron sequence at its 5' end and fails to encode a putative GPCR-like structure.
This is one example of the many immature mRNA sequences described in FANTOM2, which include unspliced introns, frame shifts, or truncations and result primarily from technical problems in cloning very long transcripts (The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I and II Team 2002).



Figure 5 Predicted splicing variants of Gpr83. Schematic representation of the mouse GPR83 polypeptide and splicing alternatives generating the different variants. Chromosomal localization was obtained by genome mapping. NM_010287 is the Gpr83 gene registered in the public database, and blue bars represent each exon of Gpr83. Four RIKEN clones (9530022I23, C030041M14, A630019F13, and 5330401A04) were mapped against the Gpr83 gene; red bars represent the predicted coding region of each RIKEN clone. Edited mRNA and predicted ORF sequences are also illustrated. The seven transmembrane (7TM) region is shown as a black bar. Green bars show the predicted coding regions that do not match data in the public database. Variants RP23 and RP39 have been described previously (Harrigan et al. 1991).

 

Gpr37; MGI:1313297
We also identified probable variants of Gpr37 (Fig. 6). Although GPR37 was initially cloned from a human brain cDNA library based on its sequence similarity to endothelin receptor subtype B (ETB), and was hence also named ETB-like protein 1 (ETB-LP1; Zeng et al. 1997), no significant homology can be found with either ETB or other known GPCRs. Combined with the identification of a paralogous gene, termed ETB-like protein 2 (ETB-LP2), the Gpr37 group seems to comprise a distinct gene subgroup. The Gpr37 gene is highly expressed in central nervous system and testis with a variety of transcripts, as demonstrated by Northern blot analysis (Marazziti et al. 1998). The genomic structure of Gpr37 revealed the existence of two exons, but evidence for alternative splicing has yet to be provided. In this analysis we found three different transcript variants of the Gpr37 gene. Clone 6430580C01 (representative clone) is identical to the original Gpr37 cDNA and is derived from two exons (accession no. NM_010338; Fig. 6; Marazziti et al. 1998). In contrast, clone E130007J18 is predicted to lack the precedent region of exon 1, which might result in an N-terminally truncated form of GPR37 (Fig. 6). The predicted 5' untranslated region of this clone is identical to the corresponding ORF sequence of clone 6430580C01 (original form). In addition, the presence of an in-frame stop codon upstream from the putative initiation codon is not confirmed. These observations do not fulfill the criteria for a mature mRNA, indicating that this clone might be a truncated form arising from the technical limitations in cloning longer mRNAs. Clone A930017K23 appears to be alternatively spliced in the middle of exon 2, resulting in the insertion of additional coding sequence (Fig. 6). Because this novel splicing variant can form a five-transmembrane receptor, further studies must be performed to interpret its physiological function.
Interestingly, recent research hypothesized that GPR37 is involved in stress-induced nerve cell damage, which is in part mediated by the protein ubiquitination enzyme, Parkin (Imai et al. 2001). PARK2, the gene encoding Parkin, is one of the genes responsible for the occurrence of Parkinson's disease, and GPR37 can serve as one of the endogenous substrates of Parkin (Imai et al. 2001). The detailed biological function of GPR37 remains elusive, as its endogenous ligand has yet to be discovered. In addition, it is possible that the transcriptional regulation of GPR37 serves as a key event in disease occurrence.



Figure 6 Predicted variants of Gpr37. Schematic representation of the mouse GPR37 polypeptide and splicing alternatives that generate the different variants. Chromosomal localization was obtained by genome mapping. NM_010338 is the Gpr37 gene registered in the public database; blue bars represent each exon of Gpr37. Three RIKEN clones (6430580C01, E130007J18, and A930017K23) were mapped against the Gpr37 gene; red bars represent the predicted coding region of each RIKEN clone. Edited mRNA and predicted ORF sequences are also illustrated. The seven transmembrane (7TM) region is shown as a black bar. Green bars show the predicted coding regions that do not match data in the public database. Clone E130007J18 is predicted to lack the precedent region of exon 1, and A930017K23 is predicted to have a shorter exon 2.

 

In conclusion, we found 410 GPCR candidates in the 60,770-clone set generated by FANTOM2 and verified 213 TUs among them. In comparison to the human, the apparent coverage of the GPCR family in the FANTOM2 set remains limited. This reflects the difficulty of full-length cloning of very long GPCR transcripts and the lack of cDNA libraries from the tissues where a large number of GPCRs, such as olfactory and pheromone receptors, are expressed. Nevertheless, we successfully identified a significant set of GPCRs, with an emphasis on 14 novel genes and many possible splice variants.


    METHODS
 
Mining GPCR Candidate Sequences From the FANTOM2 Database
After the prediction of coding sequence (CDS; The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I and II Team 2002), the cDNA and predicted protein sequences were searched against publicly accessible sequence and protein domain databases, followed by automated assignment of a clone name and functional annotation using a controlled vocabulary. Clone sequences that had a high similarity to known genes in the Mouse Genome Informatics (MGI, http://www.informatics.jax.org/) and LocusLink/RefSeq (http://www.ncbi.nlm.nih.gov/LocusLink/; Pruitt and Maglott 2001) databases were assigned the official gene name and available Gene Ontology (GO; Ashburner et al. 2000) terms. Sequences that were identical to known mouse genes were assigned the official gene name and available GO terms. Taking advantage of the computational GO assignment, we retrieved probable GPCR genes by searching GO terms related to GPCR against the FANTOM2 database. Those that shared significant homology with known genes in other species such as human, rat, fly, or worm were classified into "homolog to" categories. The others, which had no evident homology to any other known gene, were classified into the "similar to," "weakly similar to," or "hypothetical protein" categories, indicating that they were likely novel mouse genes.
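
The GO-based retrieval step described above can be pictured with a minimal sketch. It is not the FANTOM2 annotation pipeline: the clone identifiers, their annotations, and the function name `select_gpcr_candidates` are hypothetical, and the choice of GO:0004930 (G protein-coupled receptor activity) as the only query term is an assumption made for illustration.

```python
# Query terms treated as "related to GPCR". Using only GO:0004930
# (G protein-coupled receptor activity) is an assumption for this sketch.
GPCR_GO_TERMS = {"GO:0004930"}


def select_gpcr_candidates(annotations, query_terms=GPCR_GO_TERMS):
    """Return sorted clone IDs annotated with at least one query GO term."""
    return sorted(
        clone_id
        for clone_id, terms in annotations.items()
        if terms & query_terms
    )


# Hypothetical clone -> GO term annotations, invented for illustration.
annotations = {
    "clone_A": {"GO:0004930", "GO:0016021"},
    "clone_B": {"GO:0005515"},
    "clone_C": {"GO:0004930"},
}

candidates = select_gpcr_candidates(annotations)
print(candidates)
```

Clones selected this way would still need the manual curation and homology-based categorization described in the text.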

Clustering of cDNA Clones Into TUs
The 60,770 FANTOM2 clone set was clustered using the ClusTrans method (The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I and II Team 2002). Briefly, pairwise comparisons and global alignment were performed for all cDNAs using the SSEARCH program distributed with FASTA (Smith and Waterman 1981; Pearson 1991) with the following parameters: match score, +1; mismatch score, -2; penalty for the first residue in a gap, -8; and penalty for additional residues in a gap, 0. The last option is critical for detecting long gaps in the alignment so that splicing variants can be clustered together. After the global alignment, cDNA clusters were defined based on percent sequence identity and match length. This method allowed the initial 60,770 clones to be divided into 33,409 candidate clusters. For clustering of the FANTOM2 data set with known genes from the public databases, a modified version of ClusTrans was used in which SSEARCH was replaced with BLAT (Kent 2002), as BLAT produces identical clusters with a great reduction in the time required for pairwise searches.
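
To illustrate why a zero gap-extension penalty matters, the following sketch scores an already aligned pair of sequences with the parameters quoted above. It only scores a given alignment; it does not perform the alignment itself and is not part of SSEARCH or ClusTrans. The function name `score_alignment` and the toy sequences are hypothetical.

```python
MATCH, MISMATCH = 1, -2          # scores quoted in the text
GAP_FIRST, GAP_EXTEND = -8, 0    # first gap residue, each additional residue


def score_alignment(row_a, row_b):
    """Score two equal-length alignment rows in which '-' marks a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    score = 0
    gap_side = None  # which row the current gap run is in: None, 'a' or 'b'
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot contain two gaps")
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            # Only the first residue of a gap run is penalized.
            score += GAP_EXTEND if gap_side == side else GAP_FIRST
            gap_side = side
        else:
            score += MATCH if x == y else MISMATCH
            gap_side = None
    return score


# A 1-residue gap and an 8-residue gap between the same flanking matches.
short_gap = score_alignment("ACGT" + "A" + "CGT", "ACGT" + "-" + "CGT")
long_gap = score_alignment("ACGT" + "A" * 8 + "CGT", "ACGT" + "-" * 8 + "CGT")
print(short_gap, long_gap)
```

Both toy alignments receive the same score, so a transcript that skips a whole exon is penalized no more than one with a single-base deletion, which is what lets splice variants fall into the same cluster.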

Genome Mapping
We mapped the cDNA sequences to the MGSCv3 assembly (ftp://wolfram.wi.mit.edu/pub/mouse_contigs/MGSC_V3) using BLAT (Kent 2002) with default parameters. This provided a basis for determining orthology between mouse and human genes in the annotation process based on knowledge of conserved linkage between these two species (Mural et al. 2002). Genome analysis also provided insight into the intron–exon structure of mouse genes. The MGSCv3 assembly is from a female mouse, whereas the FANTOM2 cDNA libraries are from both male and female and include many testis-expressed genes. For the alignment of these genes, we used the genomic sequences from the mouse Y-chromosome available in the public domain (~700 kb taken from ftp://ftp.ncbi.nih.gov/genbank/genomes/M_musculus/CHR_Y/) and the human Y-chromosome sequences (GoldenPath sequences; http://genome.cse.ucsc.edu/goldenPath/22Dec2001). The remaining unassigned sequences presumably represent mouse genomic regions awaiting accurate assembly.
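
As a rough illustration of how a cDNA-to-genome alignment reveals intron–exon structure, the sketch below derives candidate introns from the genomic start positions and sizes of aligned blocks, loosely modeled on the block columns of a BLAT alignment record. The function name, the coordinates, and the 30-bp minimum intron length are assumptions for this example, not values taken from the study.

```python
def introns_from_blocks(block_starts, block_sizes, min_intron=30):
    """Return (start, end) genomic gaps between consecutive aligned blocks.

    Coordinates are 0-based and half-open. Gaps shorter than min_intron
    (an assumed cutoff) are treated as alignment indels, not introns.
    """
    blocks = sorted(zip(block_starts, block_sizes))
    introns = []
    for (start, size), (next_start, _) in zip(blocks, blocks[1:]):
        gap_start, gap_end = start + size, next_start
        if gap_end - gap_start >= min_intron:
            introns.append((gap_start, gap_end))
    return introns


# Hypothetical alignments: one contiguous block, and three separated blocks.
single_exon = introns_from_blocks([1000], [1200])
multi_exon = introns_from_blocks([1000, 5000, 9000], [200, 150, 400])
print(single_exon, multi_exon)
```

A transcript with no reported gaps would be called intronless under this scheme, and one with at least one gap would be called multi-exon.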

Tree Building
To make classification trees, we retrieved the amino acid sequence of each GPCR and categorized the sequences into four subgroups based on similarity. Family A GPCRs (165 genes described) were divided into two independent groups for the purpose of making a simple tree. One is the subgroup of rhodopsin, odorant, biogenic amine, nucleic acid, and bioactive lipid receptors (Family A—small molecule group). The other is the subgroup of peptide receptors (Family A—peptide group). After completing the multiple alignments, we constructed neighbor-joining phylogenetic trees for each family using CLUSTAL W (Thompson et al. 1994). The dendrogram was subsequently drawn using the Treeviewer program.
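
Neighbor-joining works from a matrix of pairwise distances computed on the multiple alignment. The sketch below computes a simple uncorrected distance (the fraction of differing residues over ungapped columns) for a toy alignment. It is an illustrative assumption, not the exact distance that CLUSTAL W computes, and the sequences and function names are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns without a gap in either row."""
    compared = differing = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(aligned):
    """Return {(name_a, name_b): distance} for every pair of aligned rows."""
    return {
        (a, b): p_distance(aligned[a], aligned[b])
        for a, b in combinations(sorted(aligned), 2)
    }


# Toy aligned peptide fragments, invented for illustration.
aligned = {"r1": "MKTLLV", "r2": "MKTLIV", "r3": "MRT-IV"}
distances = distance_matrix(aligned)
print(distances)
```

A tree-building program would then repeatedly join the pair of sequences or nodes that minimizes the neighbor-joining criterion on such a matrix.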

Analysis of Alternatively Spliced Genes
Data were obtained from http://genomes.rockefeller.edu/MouSDB, and the detailed method is described elsewhere (Zavolan et al. 2003). Briefly, 60,770 RIKEN full-length cDNA sequences and 44,122 public mRNA sequences (from the mouse divisions of the RefSeq and Mammalian Gene Collection databases) were aligned to genomic loci of the mouse genome. cDNA sequences with at least 95% identity (or at most five errors) in each exon were selected, and these yielded 11,677 loci with multiple spliced transcripts. Among these sequences, the presence of cryptic exons and exons flanked by alternative donor/acceptor splice site(s) was determined. In this way, 4750 (41%) of the clusters were found to contain at least one variant transcript. Taking advantage of this database, we retrieved the corresponding cluster of GPCRs from each data category. Only one cluster could not be classified, owing to incomplete genome mapping.
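
The per-exon selection rule and the reported percentage can be sketched as follows. Reading "at least 95% identity (or at most five errors)" as a logical OR applied to each exon is an interpretation made for this example, and the function names and test exons are hypothetical.

```python
def exon_passes(exon_length, mismatches, min_identity=0.95, max_errors=5):
    """True if the exon has >= 95% identity OR no more than five errors."""
    if exon_length <= 0:
        raise ValueError("exon_length must be positive")
    identity = (exon_length - mismatches) / exon_length
    return identity >= min_identity or mismatches <= max_errors


def transcript_passes(exons):
    """A transcript is kept only if every (length, mismatches) exon passes."""
    return all(exon_passes(length, errors) for length, errors in exons)


# Hypothetical exons: a short exon rescued by the error-count clause,
# and a longer exon that fails both clauses.
kept = transcript_passes([(40, 4), (200, 8)])
rejected = transcript_passes([(40, 4), (100, 8)])

# Share of multi-transcript loci with at least one variant (figures from the text).
variant_percent = round(100 * 4750 / 11677)
print(kept, rejected, variant_percent)
```

The error-count clause matters mainly for short exons, where a handful of mismatches would otherwise push identity below 95%.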


    Acknowledgements
 
We are grateful to all our lab members, especially to R.M. Kedzierski for helpful discussions. We also thank M. Zavolan (The Rockefeller University) for providing the data for splicing variants, S. Gustincich (Harvard Medical School) for analyzing Family C GPCRs, and T. Takada (University of Texas Southwestern Medical Center at Dallas) for technical help. This work is supported by NIH Conte Center grant 31222. M.Y. is an Investigator of the Howard Hughes Medical Institute (HHMI). Y.K. is a research associate of HHMI and is supported by the Uehara Memorial Foundation. D.P.H. is supported by NIH/NICHD grant HD33745 to the Gene Expression Database Project and NIH/NHGRI grant HG002273 to the Gene Ontology Project. L.M.M. is supported by NIH/NHGRI grant HG00330 to the Mouse Genome Database Project.


    Footnotes
 
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1087603.

6 Corresponding author.
E-MAIL Yuka.Kawasawa{at}UTSouthwestern.edu; FAX (214) 648-5068.

5 Takahiro Arakawa, Piero Carninci, Jun Kawai, and Yoshihide Hayashizaki.

[Supplemental material is available online at www.genome.org.]


    REFERENCES
 

Akhundova, A., Getmanova, E., Gorbulev, V., Carnazzi, E., Eggena, P., and Fahrenholz, F. 1996. Cloning and functional characterization of the amphibian mesotocin receptor, a member of the oxytocin/vasopressin receptor superfamily. Eur. J. Biochem. 237:759-767.

Alcedo, J., Ayzenzon, M., Von Ohlen, T., Noll, M., and Hooper, J.E. 1996. The Drosophila smoothened gene encodes a seven-pass membrane protein, a putative receptor for the hedgehog signal. Cell 86:221-232.

Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., et al. 2000. Gene ontology: Tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25:25-29.

Baldarelli, R.M., Hill, D.P., Blake, J.A., Adachi, J., Furuno, M., Bradt, D., Corbani, L.E., Cousins, S., Frazer, K.S., Qi, D., et al. 2003. Connecting sequence and biology in the laboratory mouse. Genome Res. (this issue).

Bartfai, T., Hokfelt, T., and Langel, U. 1993. Galanin—A neuroendocrine peptide. Crit. Rev. Neurobiol. 7:229-274.

Bono, H., Kasukawa, T., Furuno, M., Hayashizaki, Y., and Okazaki, Y. 2002. FANTOM DB: Database of functional annotation of RIKEN mouse cDNA clones. Nucleic Acids Res. 30:116-118.

Broeck, J.V. 2001. Insect G protein-coupled receptors and signal transduction. Arch. Insect Biochem. Physiol. 48:1-12.

Buck, L.B. 2000. The molecular architecture of odor and pheromone sensing in mammals. Cell 100:611-618.

Cain, S.A. and Monk, P.N. 2002. The orphan receptor C5L2 has high affinity binding sites for complement fragments C5a and C5a des-Arg(74). J. Biol. Chem. 277:7165-7169.

Carninci, P., Shiraki, T., Mizuno, Y., Muramatsu, M., and Hayashizaki, Y. 2002. Extra-long first-strand cDNA synthesis. Biotechniques 32:984-985.

Conklin, D., Yee, D.P., Millar, R., Engelbrecht, J., and Vissing, H. 2000. Mining of assembled expressed sequence tag (EST) data for protein families: Application to the G protein-coupled receptor superfamily. Brief Bioinform. 1:93-99.

De Moerlooze, L., Williamson, J., Liners, F., Perret, J., and Parmentier, M. 2000. Cloning and chromosomal mapping of the mouse and human genes encoding the orphan glucocorticoid-induced receptor (G protein-coupled receptor 3). Cytogenet. Cell Genet. 90:146-150.

Dulac, C. and Axel, R. 1995. A novel family of genes encoding putative pheromone receptors in mammals. Cell 83:195-206.

The FANTOM Consortium and The RIKEN Genome Exploration Research Group Phase I and II Team. 2002. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420:563-573.

Gentles, A.J. and Karlin, S. 1999. Why are human G protein-coupled receptors predominantly intronless? Trends Genet. 15:47-49.

Gustincich, S., Batalov, S., Beisel, K.W., Yagi, K., Tominaga, N., Bono, H., Carninci, P., Fletcher, C.F., Grimmond, S., Hirokawa, N., et al. 2003. Analysis of the mouse transcriptome for genes involved in the function of the nervous system. Genome Res. (this issue).

Harrigan, M.T., Baughman, G., Campbell, N.F., and Bourgeois, S. 1989. Isolation and characterization of glucocorticoid- and cyclic AMP-induced genes in T lymphocytes. Mol. Cell. Biol. 9:3438-3446.

Harrigan, M.T., Campbell, N.F., and Bourgeois, S. 1991. Identification of a gene induced by glucocorticoids in murine T-cells: A potential G protein-coupled receptor. Mol. Endocrinol. 5:1331-1338.

Hewes, R.S. and Taghert, P.H. 2001. Neuropeptides and neuropeptide receptors in the Drosophila melanogaster genome. Genome Res. 11:1126-1142.

Horn, F., Weare, J., Beukers, M.W., Horsch, S., Bairoch, A., Chen, W., Edvardsen, O., Campagne, F., and Vriend, G. 1998. GPCRDB: An information system for G protein-coupled receptors. Nucleic Acids Res. 26:275-279.

Imai, Y., Soda, M., Inoue, H., Hattori, N., Mizuno, Y., and Takahashi, R. 2001. An unfolded putative transmembrane polypeptide, which can lead to endoplasmic reticulum stress, is a substrate of Parkin. Cell 105:891-902.

Josefsson, L.G. and Rask, L. 1997. Cloning of a putative G-protein-coupled receptor from Arabidopsis thaliana. Eur. J. Biochem. 249:415-420.

Kasukawa, T., Furuno, M., Nikaido, I., Bono, H., Hume, D.A., Bult, C., Hill, D.P., Baldarelli, R., Gough, J., Kanapin, A., et al. 2003. Development and evaluation of an automated annotation pipeline and cDNA annotation system. Genome Res. (this issue).

Kawai, J., Shinagawa, A., Shibata, K., Yoshino, M., Itoh, M., Ishii, Y., Arakawa, T., Hara, A., Fukunishi, Y., Konno, H., et al. 2001. Functional annotation of a full-length mouse cDNA collection. Nature 409:685-690.

Kent, W.J. 2002. BLAT—The BLAST-like alignment tool. Genome Res. 12:656-664.

Kilpatrick, G.J., Dautzenberg, F.M., Martin, G.R., and Eglen, R.M. 1999. 7TM receptors: The splicing on the cake. Trends Pharmacol. Sci. 20:294-301.

Kolakowski Jr., L.F. 1994. GCRDb: A G-protein-coupled receptor database. Receptors Channels 2:1-7.

Kratz, E., Dugas, J.C., and Ngai, J. 2002. Odorant receptor gene regulation: Implications from genomic organization. Trends Genet. 18:29-34.

Kyte, J. and Doolittle, R.F. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157:105-132.

Laburthe, M., Couvineau, A., Gaudin, P., Maoret, J.J., Rouyer-Fessard, C., and Nicole, P. 1996. Receptors for VIP, PACAP, secretin, GRF, glucagon, GLP-1, and other members of their new family of G protein-linked receptors: Structure–function relationship with special reference to the human VIP-1 receptor. Ann. NY Acad. Sci. 805:94-111.

Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.

Lee, D.K., George, S.R., Evans, J.F., Lynch, K.R., and O'Dowd, B.F. 2001. Orphan G protein-coupled receptors in the CNS. Curr. Opin. Pharmacol. 1:31-39.

Marazziti, D., Gallo, A., Golini, E., Matteoni, R., and Tocchini-Valentini, G.P. 1998. Molecular cloning and chromosomal localization of the mouse Gpr37 gene encoding an orphan G-protein-coupled peptide receptor expressed in brain and testis. Genomics 53:315-324.

Marchese, A., Sawzdargo, M., Nguyen, T., Cheng, R., Heng, H.H., Nowak, T., Im, D.S., Lynch, K.R., George, S.R., and O'Dowd, B.F. 1999. Discovery of three novel orphan G-protein-coupled receptors. Genomics 56:12-21.

Mouse Genome Sequencing Consortium. 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420:520-562.

Mural, R.J., Adams, M.D., Myers, E.W., Smith, H.O., Miklos, G.L., Wides, R., Halpern, A., Li, P.W., Sutton, G.G., Nadeau, J., et al. 2002. A comparison of whole-genome shotgun-derived mouse Chromosome 16 and the human genome. Science 296:1661-1671.

Ohtaki, T., Kumano, S., Ishibashi, Y., Ogi, K., Matsui, H., Harada, M., Kitada, C., Kurokawa, T., Onda, H., and Fujino, M. 1999. Isolation and cDNA cloning of a novel galanin-like peptide (GALP) from porcine hypothalamus. J. Biol. Chem. 274:37041-37045.