Vol. 9, Issue 12, 1204-1213, December 1999
LETTER
ABSTRACT
Gene duplication is believed to be an important evolutionary
mechanism for generating functional diversity within genomes. The
accumulated products of ancient duplication events can be readily
observed among the genes encoding voltage-dependent Ca2+ ion
channels. Ten paralogous genes have been identified that encode
isoforms of the α1 subunit, four that encode β subunits, and three that
encode α2δ subunits. Until recently, only a single gene encoding a
muscle-specific isoform of the Ca2+ channel γ subunit (CACNG1) was known.
Expression of a distantly related gene in the brain was subsequently
demonstrated upon isolation of the Cacng2 gene, which is
mutated in the mouse neurological mutant stargazer (stg). In
this study, we sought to identify additional genes that encoded γ
subunits. Because gene duplication often generates paralogs that remain
in close syntenic proximity (tandem duplication) or are copied onto
related daughter chromosomes (chromosome or whole-genome duplication),
we hypothesized that the known positions of CACNG1 and
CACNG2 could be used to predict the likely locations of
additional γ subunit genes. Low-stringency genomic sequence analysis of
targeted regions led to the identification of three novel Ca2+ channel
γ subunit genes, CACNG3,
CACNG4, and CACNG5, on chromosomes 16 and 17. These
results demonstrate the value of genome evolution models for the
identification of distantly related members of gene families.
[The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AF142618-AF142625 and AF148220.]
INTRODUCTION
Voltage-dependent Ca2+ channels couple
membrane depolarization in a wide variety of cellular
processes, including action potential generation, neurotransmitter and
hormone release, muscle contraction, neurite outgrowth, synaptogenesis,
Ca2+-dependent gene expression, synaptic plasticity, and cell
death. This broad range of biological activity is regulated by distinct channel subtypes whose biophysical properties are determined
predominantly by subunit isoform composition. Ca2+ channels
are believed to be heteromultimers of α1, β, α2δ, and γ subunits that
associate in a 1:1:1:1 stoichiometry (De Waard et al. 1996). To date,
10 genes have been identified and localized that encode isoforms of the
pore-forming α1 subunit (α1A-α1I, α1S; Chin et al. 1991; Powers et al.
1991; Drouet et al. 1993; Gregg et al. 1993a; Iles et al. 1993; Diriong
et al. 1995; Fisher et al. 1997; Cribbs et al. 1998; Perez-Reyes et al.
1998; Lee et al. 1999), 4 that encode β subunits (β1-4; Gregg et al.
1993b; Collin et al. 1994; Taviaux et al. 1997), 3 that encode α2δ
subunits (α2δ1-3; Powers et al. 1994; Klugbauer et al. 1999), and 2 that
encode γ subunits (γ1, γ2; Powers et al. 1993; Letts et al. 1998). All
but the two skeletal muscle isoforms, α1S and γ1, are expressed in the
central nervous system (CNS).
Until recently, only a single gene encoding a muscle-specific
Ca2+ channel γ subunit was known (CACNG1).
Subsequent isolation of the molecular defect in the mouse neurological
mutant stargazer (stg) identified a second γ subunit gene,
Cacng2, expressed exclusively in the CNS (Letts et al. 1998).
Expression of a single isoform in neurons distinguishes the γ
subunit from other Ca2+ channel subunits, which utilize
genetic heterogeneity as an important mechanism for generating
functional diversity in these cells. In addition, the expression of
significant levels of either CACNG1 or CACNG2 mRNA
has not been reported in tissues such as heart, kidney, and testis,
which express high levels of other Ca2+ channel subunits and
produce measurable Ca2+ currents (Jay et al. 1990; Letts et al. 1998).
We therefore hypothesized the existence of additional γ subunit genes.
The low level of amino acid identity between the γ1 and γ2 proteins
suggested that novel γ subunit paralogs might be difficult to identify
using gene isolation methods dependent only on nucleic acid
hybridization, such as low-stringency cross-hybridization to known
γ subunit cDNA
fragments or PCR amplification between conserved domains using
degenerate oligonucleotides. An alternative approach based on the use
of similarity search algorithms to screen genome-wide sequence
databases for homologous genes can be utilized, but this sometimes
produces large numbers of ambiguous identifications when low levels of
homology are predicted and low-stringency search criteria are employed.
We hypothesized that a modification of genome-wide database searching
might prove useful under these conditions. To test this, we applied a
search paradigm based on sequence similarity analyses but restricted to
small genomic regions predicted by gene duplication models as likely
locations of unidentified γ subunit genes.
The expansion of gene families through evolution is thought to rely on
two principal mechanisms of gene duplication (Ohno 1970; Nadeau and
Sankoff 1997). Tandem duplication generates paralogs that often remain
in close proximity on the same chromosome. Chromosome or whole-genome
duplication results in the simultaneous duplication of many genes,
which retain their initial order on paralogous daughter chromosomes.
Both models were used to predict the most likely locations of
additional γ subunit genes. We then performed an extensive
low-stringency comparative analysis of CACNG1 and CACNG2 cDNA and amino acid sequences to all available genomic sequences localized to the predicted regions. We report the
identification of three novel Ca2+ channel γ subunit
genes, CACNG3, CACNG4, and CACNG5, on
chromosomes 16 and 17. Phylogenetic analysis supports a complex model
of γ subunit gene family evolution requiring a minimum of two
ancient tandem duplications that preceded at least two chromosome
duplication events. The identification of expressed sequences in the
brain corresponding to CACNG3 and CACNG4 suggests
that the γ subunit, like the α1, β, and α2δ subunits, regulates
Ca2+ currents in the CNS from multiple genetic loci.
RESULTS
Three Novel Members of the Ca2+ Channel γ Subunit Gene Family
The Ca2+ channel γ subunit genes CACNG1 and
CACNG2 are located on chromosome bands 17q24 and 22q12-q13,
respectively (Iles et al. 1993; Powers et al. 1993; Letts et al. 1998).
We reasoned that any unidentified paralogous genes generated by tandem
duplication would most likely have remained close to CACNG1
and CACNG2 in these regions throughout evolution. Chromosome
band 16p11-p13 was the only additional genomic region we identified
that contained several genes with paralogs on 17q11-q25 and 22q11-q24
(Giles et al. 1998) and was therefore a good candidate location for
γ subunit genes created by ancient chromosome or whole-genome
duplications. A target sequence database was constructed that contained
only those human genomic sequences from the GenBank database that could be localized unambiguously to the paralogous chromosome bands 17q11-q25, 22q11-q24, and 16p11-p13 (Methods). The estimated number of nonredundant sequence residues contained in this target region database (~1.5 × 10^7) comprised <0.6% of the
total number of sequence residues contained in the concurrent release
of GenBank (release 110.0; 2.57 × 10^9).
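The size comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only and was not part of the original analysis. The names are ours, and the fold-reduction figure rests on the assumption that the number of chance matches expected at a fixed score threshold scales in proportion to the size of the database searched.

```python
# Residue counts quoted in the text (the target figure is an estimate).
TARGET_DB_RESIDUES = 1.5e7   # regionally restricted target database
GENBANK_RESIDUES = 2.57e9    # GenBank release 110.0


def fraction_of_genbank(target, total):
    """Fraction of the full database represented by the target subset."""
    return target / total


def fold_reduction(target, total):
    """How many times smaller the restricted search space is.

    Assumption: chance matches at a fixed score threshold scale
    linearly with the number of residues searched.
    """
    return total / target


fraction = fraction_of_genbank(TARGET_DB_RESIDUES, GENBANK_RESIDUES)
fold = fold_reduction(TARGET_DB_RESIDUES, GENBANK_RESIDUES)
print(f"target database = {fraction:.2%} of GenBank")
print(f"search space reduced about {fold:.0f}-fold")
```

Under these assumptions the target database is about 0.58% of GenBank, consistent with the <0.6% stated above, and corresponds to a search space roughly 171 times smaller.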
Low-stringency similarity searches of the target database with human
CACNG1 and mouse Cacng2 sequences identified exons of the genes CACNG1 (17q24) and CACNG2 (22q12-q13), as
expected. Several additional related sequences were identified that
were clearly distinct from CACNG1 and CACNG2 but were
organized into similar gene structures. One of these putative genes was
identical to a gene product located on chromosome band 16p12-p13.1
that had been predicted previously by automated gene identification programs associated with large-scale genome sequencing efforts (GenBank
accession no. AAC15246). Because this gene shared significant sequence
similarity with the Ca2+ channel γ1 and γ2 subunits, was organized into an intron-exon configuration identical to CACNG1 and CACNG2, and was
located within a paralogous chromosome region, we tentatively
designated it CACNG3 as a novel member of this gene family.
The remaining sequences with similarity to CACNG1 and
CACNG2 were all located on chromosome 17 and were derived from
two partially overlapping bacterial artificial chromosome (BAC)
clones that also contained the CACNG1 gene. These sequences
were organized into two putative genes that were nearly identical in
structure to CACNG1, CACNG2, and CACNG3 and
were designated CACNG4 and CACNG5 (Fig. 1).
The existing annotation of CACNG3 as an unknown gene product
within its larger genomic sequence database entry distinguished it from
CACNG4 and CACNG5 sequences, which were not annotated as potential genes. This difference reflects the fact that a subset of
sequencing centers do not routinely perform or report gene identification analysis of large genomic sequences using methods such
as XGRAIL, Genefinder, and Genscan. To determine if these analyses
would have predicted CACNG4 and CACNG5, we analyzed
the relevant genomic sequences (GenBank accession nos. AC005544 and
AC005988) with the Genscan program (Burge and Karlin 1997, 1998). Using
default parameters, Genscan predicted the structure of both
CACNG4 and CACNG5. The accuracy was generally
high, although variable among different exons. The borders of
exons 1, 2, and 3 were predicted exactly as shown in Figure 1, while
the Genscan-predicted end of exon 4 was 6 bp short for CACNG4
and 321 bp short for CACNG5. P values were >0.99 for each
of the predicted exons except exon 4 of CACNG4
(P = 0.128) and exon 1 of CACNG5
(P = 0.425). Genscan also predicted the promoter region of
both genes upstream of the first exon.
Identification of Expressed Sequences from CACNG3, CACNG4, and CACNG5
Although comparisons based on sequence, gene structure, open reading
frame, and chromosome location supported the inclusion of
CACNG3, CACNG4, and CACNG5 in the
Ca2+ channel
subunit gene family, we sought additional
evidence that these loci encoded functional genes rather than
pseudogenes. The identification of several expressed sequence tags
(ESTs) representing CACNG3 and CACNG4 was consistent
with transcription and splicing of these genes as predicted by the
genomic sequence motifs. ESTs corresponding to the 5' UTR (GenBank
accession nos. H38324 and T07086) and 3' UTR (H38292, T23680,
H04803, and H11477) of CACNG3 were identified by sequence
similarity searches of GenBank. Three of the four 3' ESTs
terminated in poly(A) sequences 31 bp downstream of a consensus
polyadenylation motif, ATTAAA. Three additional ESTs spanned the coding
region of CACNG3 (W29095, H11833, and H04905) and confirmed
the splicing of all three predicted introns. The mRNA source of the
ESTs was fetal or adult brain tissue (except for a single cDNA derived
from adult retina) and suggested that CACNG3 was expressed in
neurons or glia. CACNG4 was also represented in GenBank by
multiple human ESTs, corresponding to the 3' UTR (M78316 and
AI207906) and the protein coding region (AA970202, AI423159, and
AI146595). The sequence of one EST spanned exons 3 and 4 of
CACNG4 and confirmed that mRNA from this gene was also spliced
as predicted. The CACNG4 ESTs were derived from fetal brain,
glioblastoma, and oligodendroglioma cDNA libraries, suggesting that
this gene, like CACNG3, was also expressed in neurons or glia.
We did not identify any EST or cDNA sequences in GenBank or other
sequence databases corresponding to CACNG5. To determine if
this gene was expressed, we generated oligonucleotide primers
corresponding to exon 3 and exon 4 of the CACNG5 genomic
sequence and screened cDNA libraries by PCR. A single product of the
expected size was amplified from a human fetal kidney cDNA library.
Sequencing of this product demonstrated that it was identical to the
predicted spliced cDNA of CACNG5, and confirmed that this gene
was transcribed and the mRNA was processed (GenBank accession no.
AF148220).
We further sought to determine if sequence from the promoter regions of CACNG3, CACNG4, and CACNG5 contained consensus transcription factor binding motifs that might be useful for predicting tissue-specific expression patterns. However, although web-based promoter analysis programs (Methods) were successful in identifying numerous potential binding sites for various proteins, we did not identify any patterns that predicted the preferred transcription of these genes in specific tissues or in response to specific stimuli (data not shown).
Phylogenetic Relationships
Although the skeletal muscle γ1 and neuronal γ2 subunits exhibited low
amino acid identity (~25%), the predicted transmembrane topologies were
nearly indistinguishable (Letts et al. 1998), suggesting strong
selective constraints on this aspect of secondary structure. We
extended this analysis to include the γ3, γ4, and γ5 subunits.
Examination of all five isoforms predicted very similar transmembrane
topologies for γ1, γ2, γ3, γ4, and γ5 (Fig. 2). The
presence of greater amino acid identity within the putative transmembrane domains, as compared to other regions of the protein, was
consistent with selective conservation of sequence identity in these
regions (Fig. 1). Comparison of the hydrophobicity plots indicated that
γ2, γ3, and γ4 were more similar to each other in secondary structure
than to γ1 or γ5, and that γ5 was intermediate in structure between γ1
and the others. This was somewhat
unexpected, because CACNG2, CACNG3, and
CACNG4 are each located on different chromosomes (chromosomes
22, 16, and 17, respectively), and CACNG4 is located between
CACNG1 and CACNG5 on chromosome 17.
To clarify the evolutionary relationship among these genes, a
phylogenetic analysis of γ subunit amino acid sequences was conducted
under the assumption of maximum parsimony (Fig. 3A). The mouse protein
Claudin 4, which we determined to be distantly related to the Ca2+
channel γ subunits by comparative sequence analysis, was defined as the
outgroup. The
recently identified Claudin proteins comprise a family of
four-transmembrane-domain proteins believed to be important in the
formation of tight junctions (Morita et al. 1999). The topology of the
inferred tree was consistent with the hydrophobicity analysis and
strongly supported a close relationship between γ2 and γ3. The
remaining branchpoints were also inferred with
high confidence levels and indicated the branching of γ4, γ5, and γ1 in
reverse chronological order from the γ2-γ3 node. These
relationships were concordant with the levels of pairwise amino acid
identity among the proteins (Fig. 3B).
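The pairwise identity comparison referred to above can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the calculation used in the study (which relied on BLASTP and ALIGN; see Methods): the sequences are hypothetical toy strings, and identity is computed over columns in which both sequences carry a residue, which is only one of several conventions for the denominator.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def pairwise_identities(aligned):
    """Percent identity for every unordered pair in a {name: sequence} mapping."""
    names = sorted(aligned)
    return {
        (p, q): percent_identity(aligned[p], aligned[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# Hypothetical toy alignment; these are not real gamma subunit sequences.
TOY_ALIGNMENT = {"x": "ACDE-F", "y": "ACDQKF", "z": "AC-EKF"}

for pair, value in pairwise_identities(TOY_ALIGNMENT).items():
    print(pair, f"{value:.0f}%")
```

A matrix of such values is the kind of summary against which an inferred tree topology can be checked for concordance.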
Our data suggested a model of Ca2+ channel γ subunit gene
family evolution in which at least two ancient tandem gene
duplications preceded the chromosome duplication events that led to the
modern chromosome regions 17q11-q25, 22q11-q24, and 16p11-p13 (Fig. 4).
The phylogenetic clustering of γ2 (chromosome 22) with γ3 (chromosome
16), and their more distant relationship to γ4 (chromosome 17), could be
interpreted as evidence of more recent divergence between chromosomes
22 and 16. A logical extrapolation would be that other paralogous genes
on chromosomes 16 and 22 would also be more closely related to each
other, on average, than to any paralogs on chromosome 17. To
investigate this hypothesis further, we examined sequences immediately
surrounding the γ subunit genes on these three chromosomes. Several
additional genes were identified, but comparisons among these failed to
support a more recent divergence between chromosomes 16 and 22. In
fact, the presence of paralogous protein kinase C genes
(PRKCB1 and PRKCA) immediately telomeric of
CACNG3 and CACNG5, respectively, and the absence of a
PRKC paralog telomeric of CACNG2 on chromosome
22, supported a greater similarity between chromosomes 16 and 17 in
these regions (Fig. 5). To resolve this ambiguity, we
carried out a comprehensive comparison of paralogous genes located
on chromosome bands 17q11-q25, 22q11-q24, and 16p11-p13, including
some novel loci that were identified through analysis of our regionally
restricted target sequence database (Table 1). However, although the data demonstrated a clear relationship among all three chromosome regions, their ancestral relationships remained equivocal and additional studies will be needed to clarify this issue.
DISCUSSION
We applied two models of gene family expansion, tandem duplication
and chromosome duplication, to facilitate the identification of three
novel members of the Ca2+ channel γ subunit gene family:
CACNG3, CACNG4, and CACNG5. The aim of this
approach was to maximize the likelihood of correct gene identifications
by low-stringency similarity searches of localized DNA sequences and
reduce the large number of biologically irrelevant matches usually
generated by genome-wide database analysis. The amino acid identity
between the γ1 subunit and the γ3, γ4, and γ5 subunits was 22%-26%. This
low degree of similarity may explain why the CACNG3,
CACNG4, and CACNG5 genes were previously undetected
by standard whole-genome database searches, which often employ higher
stringency alignment parameters as the default criteria for defining
similarity. In contrast, the amino acid identity between the γ2 subunit
and the γ3, γ4, and γ5 subunits was variable, measuring 84%, 64%, and
32%, respectively. Recently, Black and Lennon (1999) also identified
the CACNG3 gene by computer similarity searches of genomic
sequence databases using the human CACNG2 sequence as the
query. The high degree of similarity between γ2 and γ3 (84%), and the
pre-existing GenBank annotation of γ3 as an unknown gene product, may
have facilitated identification of CACNG3 using standard search
parameters. The
fact that the CACNG4 and CACNG5 genes were not
detected by that approach, although they are both located within 100 kb
of CACNG1, demonstrates the value of gene duplication models
for predicting gene location and improving gene identification
efficiency. The additional observation that both CACNG4 and
CACNG5 would have been predicted by gene identification
software such as Genscan underscores the importance of improved
genomic sequence annotation.
Although our approach was successful in identifying CACNG3, CACNG4, and CACNG5, there are important limitations to its efficacy. Foremost, many paralogous members of gene families are not located in tandem or in duplicated chromosomal regions that exhibit conserved gene order with other family members. Some of these paralogs may have been generated by gene duplication mechanisms not considered in the model we employed. For example, genes duplicated by retrotransposition via an mRNA intermediate could theoretically insert anywhere in the genome with respect to the parental gene. In most cases, however, it seems probable that complex genomic rearrangements occurring over large time scales have obliterated the initial positional relationships among distantly related genes. Predictions of paralogous gene locations based on common duplication models will, by design, exclude such genes. This approach should therefore be considered primarily as a complement to other sequence-based gene identification techniques.
Role of γ Subunits in Ca2+ Channel Function
Two of the three genes identified in this study, CACNG3 and
CACNG4, were represented by several ESTs derived from brain
mRNA. We found expressed sequences corresponding to CACNG5 by
PCR amplification of a human fetal kidney cDNA library but did not
exclude the possibility of CACNG5 expression in the brain or
other tissues. It is worth noting that several tissues that express
multiple α1, β, and α2δ subunit isoforms, including testes, ovary, lung,
pancreas, spleen, liver, and kidney, do not express γ1 or γ2 (Biel et
al. 1990; Jay et al. 1990; Castellano and Perez-Reyes 1994; Yu 1995;
Williams et al. 1999). The γ3, γ4, and γ5 isoforms are possible
components of Ca2+ channels in these tissues. The coexpression of
multiple isoforms of α1, β, and α2δ subunits in
individual neurons is a valuable mechanism for generating functional
variability among Ca2+ channels in brain, and our data
suggest that the γ subunit could contribute to channel diversity in
a similar manner. For example, the mouse Cacng2 gene is widely
expressed in the brain (Letts et al. 1998) and may be coexpressed in
some regions with the mouse homologs of CACNG3 and
CACNG4. If this is confirmed, then it will be important to
examine the relative contributions of each γ subunit isoform to the
structure and function of distinct Ca2+ channel types in
vivo. A comprehensive comparative analysis of γ subunit gene
expression in the brain and other tissues will provide insight into the
physiological role of these isoforms. Confirmation will ultimately
require the demonstration that the predicted γ3, γ4, and γ5 proteins
can directly modulate
channel biophysical properties, or influence the stability or
subcellular localization of the channel complex.
Less is understood about the function of the Ca2+ channel γ subunit
than about α1, β, and α2δ. Coexpression of the cardiac α1 subunit (α1C)
with the skeletal muscle γ1
isoform was used to demonstrate that it shifted the inactivation curve
of the channel to negative potentials and accelerated current
inactivation without significantly affecting other voltage-dependent
properties (Singer et al. 1991; Eberst et al. 1997). Another study
indicated that the γ1 subunit did not have a significant
effect on α1C-mediated channel currents unless
coexpressed with a β subunit (Wei et al. 1991). Coexpression
analysis of the γ2 subunit, which is disrupted in the
mouse neurological mutant stg, showed that it increased the
steady-state inactivation of α1A-containing
Ca2+ channels (Letts et al. 1998). Sequence similarity to
γ1 and γ2 supports a prediction that the γ3, γ4, and γ5 proteins may
also regulate the inactivation properties of Ca2+ channels.
In general, the effects of γ subunit regulation on Ca2+
currents that have been described are small in magnitude (Walker and De
Waard 1998). However, if modulation of channel properties is dependent
on the coexpression of specific α1 and γ isoforms, the experimental
results described above may not accurately represent γ subunit function
in vivo. For example, the skeletal muscle
γ1 isoform is not expressed at high levels in heart with
the cardiac α1C isoform, and it is not known whether the
γ2 isoform actually associates with α1A in
the brain, although both are widely coexpressed (Tanaka et al. 1995;
Letts et al. 1998). Instead, it is possible that other γ subunit
isoforms associate preferentially with α1C and α1A in vivo. Functional
coexpression of γ3, γ4, and γ5 in combination with various α1, β, and
α2δ isoforms in vitro
may illustrate distinct regulatory functions for the γ subunit,
whereas coimmunoprecipitation analysis of different tissues or brain
regions will be helpful in determining which isoforms are associated
preferentially in vivo. It is also worth noting that Ca2+
channel γ subunits exhibit a low level of amino acid identity and
similar hydrophobicity profiles to several Claudin proteins, to the
lens intrinsic membrane protein MP20, and to peripheral myelin protein
PMP22 (data not shown). However, it is not known whether any functional
similarities exist among the members of this extended family of
four-transmembrane domain proteins.
The chromosome locations of the γ subunit genes, CACNG3,
CACNG4, and CACNG5, suggest they could be candidates
for involvement in hereditary disease. CACNG3 is located on
chromosome band 16p12-p13.1 in the vicinity of the ICCA locus
[infantile convulsions and paroxysmal choreoathetosis; Online Mendelian Inheritance
in Man (OMIM) no. 602066]. CACNG3 is expressed in the brain
and it is worth noting that mutation of the closely related
Cacng2 gene in the stg mouse results in epilepsy
and ataxia (Noebels et al. 1990; Letts et al. 1998). The mouse ortholog
of CACNG3 is predicted to map to chromosome 7 near
Szv2, a quantitative trait locus (QTL) influencing seizure
response to kainic acid (Ferraro et al. 1997). CACNG4 and
CACNG5 are located on chromosome band 17q24 in tight
physical linkage to CACNG1. A locus for neuralgic amyotrophy
with brachial predilection (NAPB; OMIM 162100) has been mapped
to 17q24-q25 and is characterized by severe pain, weakness, wasting,
depression of reflexes, and sensory loss (Jacob et al. 1961). However,
NAPB was localized close to marker D17S939
(Pellegrino et al. 1997), whereas CACNG1 (and CACNG4
and CACNG5 by association) was significantly more centromeric,
near D17S807, and is therefore an unlikely candidate for this
disorder. Comparison of conserved linkage groups suggests that mouse
Cacng1, Cacng4, and Cacng5 are probably
located near Pkca, which is on the consensus map of chromosome
11 at 68 cM [Mouse Genome Informatics (MGI) database]. Interestingly,
this position is near a second locus for seizure susceptibility,
Szs3, at 66 cM (Ferraro et al. 1997). The close association of
epilepsy and ataxia with mutations in other neuronal voltage-dependent Ca2+ channels (Burgess and Noebels 1999) suggests these are
potential candidate phenotypes for defects in the CACNG3,
CACNG4, or CACNG5 genes.
METHODS
Target Region Sequence Database Construction
A database of human genomic sequences derived from chromosome bands
17q11-q25, 22q11-q24, and 16p11-p13 (target regions) was constructed
using Microsoft Excel '97. Sequences were compiled from two sources.
First, sequences were drawn from the Human Genome Sequencing Index
(HGSI) database, which contains sequences from large clones (cosmids,
BACs, or PACs) and clone contigs that have
been localized unambiguously to specific genomic regions. Second,
additional sequences were obtained by screening GenBank for genomic sequences
identical to genes or cDNA that were localized previously to the target
regions, using the programs BLASTN or TBLASTN (release 2.0; Altschul et
al. 1997). Approximately 200-400 bp of nonrepetitive sequence from the
ends of each genomic sequence obtained in this way was used for
additional rounds of database screening and sequence contig extension,
terminating when no additional sequences were identified.
Electronic Database Information
Data presented are consistent with the following databases as of September 1999: GenBank (GB), http://www.ncbi.nlm.nih.gov/Web/Genbank/; GeneMap'98, http://www.ncbi.nlm.nih.gov/genemap/; Genestream, http://vega.crbm.cnrs-mop.fr/home.html; Genscan, http://gnomic.stanford.Edu/~chris/GENSCANW.html; HUGO/GDB Human Gene Nomenclature, http://www.gene.ucl.ac.uk/nomenclature/; LocusLink, http://www.ncbi.nlm.nih.gov/LocusLink/; Human Genome Sequencing Index (HGSI), http://www.ncbi.nlm.nih.gov/HUGO/; Mouse Genome Informatics (MGI), http://www.informatics.jax.org/; National Center for Biotechnology Information (NCBI), http://www.ncbi.nlm.nih.gov/; Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim/; Prosite, http://www.expasy.ch/prosite/.
Accession Numbers
Genes referred to in the text and figures are listed in alphabetical order, followed by genomic sequences listed by chromosome. Genes identified within larger genomic sequences are identified by base-pair (bp) position in the genomic sequence. All accession numbers refer to GenBank unless otherwise indicated: CACNG1 (AC005544), CACNG2 (Z83733, AL022313, AL031845), CACNG3 (AC004125), CACNG4 (AC005544, AC005988, AF142622, AF142623, AF142624, AF142625), CACNG5 (AC005988, AF142618, AF142619, AF142620, AF142621, AF148220), CBP-P22 (2695572), CSF2RB (OMIM 138981), EIF3-P66 (U54558), G6PDP1 (M12996; AC005988, bp 8473-9248), HSRASR (AL022729), HPSP1 (U65676; AL022313, bp 90054-88792), NCF4 (AH004909), ORFA05 (D29677), PRKCA (X52479), PRKCB1 (M13975), PVALB (OMIM 168890), TRX2 (U78678). Chromosome 16 genomic sequences: CIT987-SKA-345G4 (AC002302), CIT987-SKA-113A6 (AC002299), CIT987SK-625P11 (AC004125). Chromosome 17 genomic sequences: hCIT.187_K_10 (AC006263), hRPK.115_C_3 (AC006947), hRPK.74_H_8 (AC005918), hRPK.299_G_24 (AC005988), hRPK.349_A_8 (AC005544). Chromosome 22 genomic sequences: E132D12 (Z80897), 833B7 (AL008637), 24E5 (Z82185), 566H6 (AL031845), 1119A7 (AL022313), 126G10 (Z82184), 293L6 (Z82197; Z83733; Z83732), 4G12 (Z70289).
Low Stringency Similarity Searches
Low-stringency similarity searches for novel Ca2+ channel γ subunit genes were limited to sequences contained in the target region database. The BLASTN program (NCBI) was utilized for pairwise comparisons between large genomic sequences (10-150 kb) and cDNA sequences. Default filters used to mask sequences of low compositional complexity were turned off. The BLOSUM62 alignment scoring matrix was
used. Alignment parameters were adjusted to reduce stringency (default
value, value used): expectation value, e (10.0, 100.0); gap-opening
penalty, G (5, 3); gap-extension penalty, E (2, 1); mismatch penalty in
the blast portion of the run, q (-2, -1); word size (11, 7).
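The parameter adjustments listed above can be summarized programmatically. The following sketch only tabulates the (default, used) pairs transcribed from the text and reports the direction of each change; it does not invoke BLAST, and the dictionary keys and function name are our own. Penalties are recorded as magnitudes, which is an assumption made for ease of comparison.

```python
# (default value, value used) pairs transcribed from the Methods text.
# Penalties are stored as magnitudes (an assumption for easy comparison).
SEARCH_PARAMETERS = {
    "expectation value (e)": (10.0, 100.0),
    "gap-opening penalty (G)": (5, 3),
    "gap-extension penalty (E)": (2, 1),
    "mismatch penalty (q)": (2, 1),
    "word size": (11, 7),
}


def describe_changes(parameters):
    """Return one line per parameter stating how it was changed."""
    lines = []
    for name, (default, used) in parameters.items():
        direction = "raised" if used > default else "lowered"
        lines.append(f"{name}: {default} -> {used} ({direction})")
    return lines


for line in describe_changes(SEARCH_PARAMETERS):
    print(line)
```

Each change relaxes the search: a higher expectation cutoff reports weaker matches, smaller gap and mismatch penalties tolerate more divergence, and a shorter word size allows alignments to be seeded from shorter exact matches.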
Sequence Analysis of Promoter Regions
Analysis of promoter region sequence for transcription factor
binding sites utilized the web-based programs: PatSearch 1.1, utilizing
the TRANSFAC 3.4 and TRRD 3.5 databases (Heinemeyer et al. 1998, 1999);
MatInspector Version 2.2 (Quandt et al. 1995), utilizing the TRANSFAC
3.5 database; TFSEARCH (Yutaka Akiyama, http://www.rwcp.or.jp/papia/),
utilizing the TRANSFAC 3.3 database; and TESS (J. Schug and G. Christian Overton, http://www.cbil.upenn.edu/tess/), utilizing the
TRANSFAC 3.3 database.
Multiple Sequence Alignments and Phylogenetic Analysis
Protein sequences were aligned for phylogenetic analysis using the
ClustalX multiple alignment package (Thompson et al. 1997) with default
values. Alignments were carried out using full length sequences or only
conserved regions as indicated in the text. Pair-wise percent amino
acid identity was calculated following local alignment by BLASTP
(release 2.0) using the BLOSUM62 scoring matrix and gap
opening/extension penalties of 8 and 2, and following global alignment
with the ALIGN program (Genestream) using the codaa.mat scoring
matrix and gap opening/extension penalties of 12 and 2. The sequence
alignment shown in Figure 1 was manually edited for display but not for
phylogenetic analysis or amino acid identity calculations. Phylogenetic
relationships were inferred using the neighbor-joining method (Saitou
and Nei 1987) of the ClustalX multiple alignment package (Thompson et
al. 1997). The reliability of tree topology was evaluated using
bootstrap analysis (Felsenstein 1985) with 10,000 iterations to provide
confidence levels. Unrooted trees were plotted as rectangular
cladograms using the TreeView program (Page 1996).
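The principle of bootstrap analysis can be sketched as follows. This is a simplified illustration, not the ClustalX procedure: alignment columns are resampled with replacement, as in bootstrapping, but tree building is replaced by a much cruder stand-in (asking which pair of sequences is most similar in each replicate). The sequences, names, and functions are hypothetical.

```python
import random
from itertools import combinations


def bootstrap_alignment(alignment, rng):
    """Return one replicate: alignment columns resampled with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def identity(a, b):
    """Fraction of identical positions between two aligned sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closest_pair(alignment):
    """Stand-in for tree building: the most similar pair of sequences."""
    return max(combinations(sorted(alignment), 2),
               key=lambda p: identity(alignment[p[0]], alignment[p[1]]))


def bootstrap_support(alignment, pair, n_reps=1000, seed=0):
    """Fraction of replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    target = tuple(sorted(pair))
    hits = sum(closest_pair(bootstrap_alignment(alignment, rng)) == target
               for _ in range(n_reps))
    return hits / n_reps


# Hypothetical toy alignment (not real gamma subunit sequences).
TOY = {
    "seqA": "MKTLLVAGLC",
    "seqB": "MKTLIVAGLS",
    "seqC": "MRSAQVCEWF",
}
print(bootstrap_support(TOY, ("seqA", "seqB"), n_reps=200))
```

In a full analysis, a tree would be built from every replicate and the support for each branch reported as the fraction of replicate trees that contain it.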
Identification of CACNG5 cDNA
Oligonucleotide PCR primers predicted to amplify a 283-bp cDNA product were designed according to the sequence of the CACNG5 gene: HG5-F (exon 3): 5'-GATACTGGCCTTTGTCTCTGG-3'; HG5-R (exon 4): 5'-TTGTGGAATGTCCCTTCTCC-3'. A single product of ~283 bp was amplified from a PCR reaction containing 1 µl of phage suspension from a human fetal kidney cDNA library (Clontech, HL5004a) as template in a 50-µl reaction volume, 50 mM KCl, 10 mM Tris-HCl, 1.5 mM MgCl2, 0.1% Triton X-100, 25 mM dNTPs, 0.8 µM each oligonucleotide (HG5-F and HG5-R), and 1 Unit of Taq polymerase (Promega). The reaction was carried out using a PTC-100 thermocycler (MJ Research) with an initial denaturation of 94°C for 2 min, followed by 35 cycles of 94°C (30 sec), 55°C (60 sec), 72°C (60 sec), and a final extension of 72°C for 10 min. Following agarose gel electrophoresis, the PCR product was isolated using the QIAquick gel extraction kit (Qiagen) and sequenced by the Baylor College of Medicine DNA Sequencing Core Facility.
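Basic properties of the primers and cycling program described above can be computed with a short sketch. The primer sequences and cycling times are taken from the text; the function names are ours, and the melting temperature uses the Wallace rule, 2(A+T) + 4(G+C), which is a rough approximation for short oligonucleotides and was not necessarily the method used to design these primers. The time estimate ignores ramping between temperatures.

```python
PRIMERS = {
    "HG5-F": "GATACTGGCCTTTGTCTCTGG",  # exon 3
    "HG5-R": "TTGTGGAATGTCCCTTCTCC",   # exon 4
}


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq.upper())


def wallace_tm(seq):
    """Rough melting temperature (deg C) by the Wallace rule.

    Assumes the sequence contains only A, C, G, and T.
    """
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def cycling_minutes(initial_s, cycles, step_seconds, final_s):
    """Total programmed hold time in minutes, excluding ramp times."""
    return (initial_s + cycles * sum(step_seconds) + final_s) / 60


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt,", gc_count(seq), "G+C, Tm ~", wallace_tm(seq))

# 94 C for 2 min; 35 cycles of 30 s, 60 s, 60 s; final extension 10 min.
print(cycling_minutes(120, 35, (30, 60, 60), 600), "min")
```

By this estimate both primers melt at or above 60°C, a few degrees above the 55°C annealing step, and the programmed hold times total 99.5 min.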
ACKNOWLEDGMENTS
This research was supported by an American Epilepsy Society postdoctoral fellowship and a Methodist Hospital Foundation (Houston, TX) grant to D.L.B. and National Institutes of Health grant NS29709 to J.L.N. We thank T. Cormier in the laboratory of Dr. Huda Zoghbi for providing a sample of the Clontech fetal kidney cDNA library.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
FOOTNOTES
2 Corresponding author.
E-MAIL dburgess{at}bcm.tmc.edu; FAX (713) 798-7528.
REFERENCES
[…] 2 and 3 subunits: Neurologic implications. Mayo. Clin. Proc. 4: 357-361.
[…] subunits. Biochem. Soc. Trans. 22: 483-488.
[…] 1 subunit gene (CCHL1A2) maps to mouse chromosome 14 and human chromosome 3. Genomics 11: 914-919.
[…] 3 subunit. Eur. J. Biochem. 220: 257-262.
[…] 1H from human heart, a member of the T-type calcium channel gene family. Circ. Res. 83: 103-109.
[…] -1 subunit of the skeletal dihydropyridine receptor (Cchl1a3=mdg) maps to mouse chromosome 1 and human 1q32. Mamm. Genome 4: 499-503.
[…] subunit. Pflügers Arch. 433: 633-637.
[…] 1 subunit of the skeletal muscle DHP-sensitive Ca2+ channel (CACNL1A3) to chromosome 1q31-q32. Genomics 15: 107-112.
[…] subunit of the voltage-dependent calcium channel (CACNLB1) to chromosome 17 using somatic cell hybrids and linkage mapping. Genomics 15: 185-187.
[…] A database of membrane spanning protein segments. Biol. Chem. Hoppe-Seyler 374: 166.
[…] 2 subunit. J. Neurosci. 19: 684-691.
[…] 1 subunit of the cardiac DHP-sensitive Ca2+ channel (CCHL1A1) to chromosome 12p12-pter. Genomics 10: 835-839.
[…] subunit of the human skeletal muscle 1,4-dihydropyridine-sensitive Ca2+ channel (CACNLG), cDNA sequence, gene structure, and chromosomal location. J. Biol. Chem. 268: 9275-9279.
[…] 2/ subunit (CACNL2A) of the human skeletal muscle voltage-dependent Ca2+ channel to chromosome 7q21-q22 by somatic cell hybrid analysis. Genomics 19: 192-193.
[…] New fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res. 23: 4878-4884.