Published online before print
December 14, 2005, 10.1101/gr.4356206 Genome Res. 16:223-230, 2006 ©2006 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/06 $5.00
Letter
Functional genomics of membrane transporters in human populations
1 Department of Biopharmaceutical Sciences, San Francisco, California, 94143, USA
2 Center for Human Genetics, University of California San Francisco, San Francisco, California, 94143, USA
Although considerable progress has been made toward characterizing human DNA sequence variation, there remains a deficiency in information on human phenotypic variation at the single-gene level. We systematically analyzed the function of all protein-altering variants of eleven membrane transporters in heterologous expression systems. Coding-region variants were identified by screening DNA from a large sample (n = 247-276) of ethnically diverse subjects. In total, we functionally analyzed 88 protein-altering variants. Fourteen percent of the polymorphic variants (defined as variants with allele frequencies ≥1% in at least one major ethnic group) had no activity or significantly reduced function. Decreased-function variants had significantly lower allele frequencies and were more likely to alter evolutionarily conserved amino acid residues. However, variants at evolutionarily conserved positions with approximately normal activity in cellular assays also had significantly lower allele frequencies, suggesting that some variants with apparently normal activity in biochemical assays may influence occult functions or quantitative degrees of function that are important in human fitness but not measured in these assays. For example, eight (14%) of the 58 variants for which we had measured the transport of at least two substrates showed substrate-specific defects in transport. These variants and the reduced-function variants provide plausible candidates for disease susceptibility or variation in clinical drug response.
Since the completion of the human genome project, considerable progress has been made in characterizing the nature and degree of human DNA sequence variation (Cargill et al. 1999
Analysis of causal mutations in Mendelian diseases has demonstrated that the majority of these diseases are caused by rare nonsynonymous variants, specifically, amino acid substitutions (Risch 2000
One of the primary goals of current large-scale sequencing projects is to identify SNPs that may be used in candidate gene association studies. However, a major obstacle in designing association studies is choosing appropriate SNPs to genotype. One strategy is to choose SNPs that are expected a priori to affect protein function and are therefore more likely to be associated with an altered phenotype. A variety of algorithms and bioinformatics tools have been developed in recent years to predict the functional consequences of protein-altering variants (Ng et al. 2000
Solute Carrier (SLC) transporters maintain cellular and total body homeostasis by importing nutrients and exporting cellular waste products and toxic compounds. These transporters also play a critical role in drug response, serving as drug targets and facilitating drug absorption, metabolism, and elimination. The SLC superfamily comprises transporters from a wide range of functional classes, including neurotransmitter, nutrient, heavy metal, and xenobiotic transporters. Genetic defects in SLC transporters have been associated with a variety of Mendelian diseases, including metabolic disorders such as glucose-galactose malabsorption (Pascual et al. 2004
In this study, we characterized the function of protein-altering variants of 11 SLC transporters belonging to three different families: SLC22, SLC28, and SLC29. These transporters are present in a variety of epithelial tissues and have diverse biological roles. Although several of these transporters have specific functions that are important for normal human physiology, they are all capable of transporting xenobiotic small molecules (i.e., drugs), and were selected for screening as candidate genes to explain variability in drug response. In addition to identifying and functionally characterizing all naturally occurring protein-altering variants of these transporter genes, we attempted to identify characteristics of protein-altering variants that are predictive of alterations in protein function, both in biochemical assays and in vivo.
Fourteen percent of all protein-altering polymorphisms identified in 11 SLC transporters have decreased function in biochemical assays
We systematically analyzed the function of all protein-altering variants of 11 membrane transporters in the Solute Carrier families SLC22, SLC28, and SLC29 in heterologous expression systems. Coding-region variants were identified by screening many DNA samples (n = 247-276) from ethnically diverse human populations. The transporters are dispersed throughout the human genome on five chromosomes, although some pairs (OCT1-OCT2, OAT1-OAT3, OCTN1-OCTN2) are found in tandem at a single locus and presumably arose by gene duplication (Table 1). The amino acid diversity (πNS) of the nine transporters ranges from 0.11 × 10^-4 to 8.7 × 10^-4. Previous large-scale sequencing studies have found that the average amino acid diversity in the human genome is 2.0 × 10^-4 (Cargill et al. 1999
The distribution of uptake values for all variants analyzed is shown in Figure 2. The uptake values show a multimodal distribution, with breaks in the distribution at 25%-40%, 55%-60%, and 150%-175% of the activity of the control. Twenty-two (25%) of the 88 variants tested exhibited decreased transport function (defined as uptake <60% of control) (Table 2). Of the 88 protein-altering variants tested, 50 (57%) were polymorphic (defined as allele frequency ≥1% in at least one ethnic population), and seven (14%) of those 50 polymorphisms had decreased transport function. Interestingly, three variants appeared to be hyperfunctional, that is, had uptake values >150% of the control. These three variants shared the properties (discussed below) of the other variants with greater than 60% activity. Therefore, we used a bimodal retained-function vs. reduced-function model (as opposed to a trimodal normal-function vs. altered-function model) to analyze the data.
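The classification rules in this paragraph can be written out as a short Python sketch. It assumes uptake is expressed as a percentage of control and reads the polymorphism threshold as an allele frequency of at least 1% in any one population; the `Variant` class, its field names, and the example values are hypothetical and are not data from the study.

```python
from dataclasses import dataclass

# Cutoffs as described in the text: uptake below 60% of control is
# "decreased function"; an allele frequency of at least 1% in any one
# population is treated here as "polymorphic" (an assumed reading).
DECREASED_CUTOFF = 60.0
POLYMORPHIC_CUTOFF = 0.01


@dataclass
class Variant:
    name: str               # hypothetical label
    uptake_pct: float       # uptake as a percentage of control
    max_allele_freq: float  # highest allele frequency across populations


def is_decreased(v: Variant) -> bool:
    return v.uptake_pct < DECREASED_CUTOFF


def is_polymorphic(v: Variant) -> bool:
    return v.max_allele_freq >= POLYMORPHIC_CUTOFF


def summarize(variants):
    """Count variants in the bimodal retained vs. reduced function model."""
    polymorphic = [v for v in variants if is_polymorphic(v)]
    return {
        "n": len(variants),
        "n_decreased": sum(is_decreased(v) for v in variants),
        "n_polymorphic": len(polymorphic),
        "n_polymorphic_decreased": sum(is_decreased(v) for v in polymorphic),
    }


# Made-up example records, not values from the study.
example = [
    Variant("var_A", 5.0, 0.002),
    Variant("var_B", 95.0, 0.15),
    Variant("var_C", 40.0, 0.03),
    Variant("var_D", 160.0, 0.01),
]
summary = summarize(example)
```

Note that the apparently hyperfunctional record (`var_D`) falls in the retained-function class, mirroring the bimodal model adopted in the text.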
Variants with decreased function are more likely to alter evolutionarily conserved amino acid residues
To learn about the characteristics of variants that decrease function and thus aid in the development of prediction tools, we evaluated the amino acid substitutions in our data set using several measures (based on degree of chemical change, evolutionary conservation, and/or location in the protein) and examined relationships between the nature of the amino acid substitution and protein function. For these analyses, only amino acid substitutions were considered, due to problems inherent in quantifying the chemical change or evolutionary conservativeness of insertions, deletions, and nonsense mutations. Of the five frameshift and nonsense mutations in the dataset, all showed virtually no activity.
The amino acid substitution matrix of Grantham (1974
The amino acid residues found in the transmembrane regions of proteins are highly conserved throughout evolution, owing to unique physical constraints on membrane-spanning helices (Leabman et al. 2003
We have used two methods to evaluate evolutionary conservation of the variant sites in our dataset. In the first method, based on multiple sequence alignment with known vertebrate orthologs, each amino acid substitution was scored as either evolutionarily conserved (EC) or evolutionarily unconserved (EU). We observed that 12 of the 35 (34%) EC variants resulted in decreased function compared with only four of the 45 (9%) EU variants (
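The EC versus EU comparison above amounts to a 2×2 contingency table (12 of 35 EC variants and 4 of 45 EU variants with decreased function). The statistical test used in the study is not visible in this text, so the sketch below swaps in a two-sided Fisher's exact test as one standard choice for such a table; the function name and the variable names are illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table at least as unlikely as the
    # observed one (small tolerance guards against float ties).
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )


# Counts from the text: rows are EC and EU, columns are decreased and retained.
ec_decreased, ec_retained = 12, 35 - 12
eu_decreased, eu_retained = 4, 45 - 4

odds_ratio = (ec_decreased * eu_retained) / (ec_retained * eu_decreased)
p_value = fisher_exact_two_sided(
    ec_decreased, ec_retained, eu_decreased, eu_retained
)
```

The odds ratio here is simply the cross-product ratio of the table; the P value from this sketch should not be read as the value reported in the article.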
Selection acts on variants with decreased function in cellular assays, and on variants that retain function in cellular assays but alter evolutionarily conserved residues
We then plotted the allele frequency distributions of the EC variants that retained function and the EU variants that retained function (Fig. 3B). Interestingly, the EC variants that retained function had an allele frequency distribution that was skewed toward lower frequencies and was significantly different from that of the EU variants that retained function (Log-Rank, P = 0.02). The data suggest that variants that appear to retain function in biochemical assays, but alter evolutionarily conserved residues, may affect some function important in organism fitness that is not measured in these assays. For example, this function may be an entirely different (i.e., nontransport) function mediated by the same gene, or may simply be the transport activity with respect to substrates that were not studied. Figure 4 shows one variant of OAT3, OAT3-I305F, which retained activity toward one substrate, the peptic ulcer drug cimetidine, but had reduced activity toward the model substrate estrone sulfate, an endogenous steroid hormone.
Since the uptake of multiple substrates had been measured for variants of nine of the 11 transporters in our dataset, we calculated the fraction of variants that showed substrate-specific changes in uptake activity. Of the 58 variants for which multiple substrates had been assayed, eight (14%) showed substrate-specific differences (Table 3). The distribution of allele frequencies for those eight substrate-specificity variants was comparable to that of the entire dataset, and contained both rare (<1%) EC variants and common (>10%) EU variants. Notably, however, the allele frequency distribution of these specificity variants was significantly different from that of the reduced function variants, with the specificity variants having higher allele frequencies than variants that exhibited reduced activity toward the prototypical substrate (Log-Rank, P = 0.01).
Our study suggests that healthy human populations harbor a significant number of severely reduced function polymorphisms and rare variants. In a set of 88 protein-altering variants from 11 membrane transporter genes, we found that 14% of the polymorphic (allele frequency ≥1% in at least one ethnic population) variants had decreased transport function (see Fig. 2). We then examined the variants to identify any characteristics that could be used to predict a reduction in function. First, we found that mutations that alter more than a single amino acid (e.g., frameshift and nonsense mutations) all showed virtually complete loss of function. For the amino acid substitutions, we examined the magnitude of the chemical change, and found that there was a trend toward larger chemical changes in variants with decreased function compared with those that retained function. These data are consistent with Miller and Kumar, who demonstrated that amino acid substitutions associated with disease had higher Grantham values (larger chemical changes) than amino acid substitutions across species (Miller and Kumar 2001
We then assessed the ability of evolutionary conservation to predict effects on protein function. Previous studies have demonstrated that EC residues are under stronger purifying selection than EU residues, suggesting that variation at EC residues is more likely to affect protein function than variation at EU residues. For example, Miller and Kumar demonstrated that nonsynonymous variants associated with disease occur at EC sites more frequently than expected by chance (Miller and Kumar 2001
We found that our measure of protein function in cellular assays correlated significantly with allele frequency, a measure of effect on human fitness. That is, variants with grossly impaired function in biochemical assays were present at lower allele frequencies than variants that retained function, consistent with the idea that biochemical assays should be performed as a confirmatory measure for variants found to be associated with a disease phenotype. However, we found that alleles that altered evolutionarily conserved amino acid residues, but retained apparently normal function in biochemical assays, were also under negative or purifying selection. This finding suggests that even direct biochemical assay of variant protein function is not perfectly predictive of function in vivo, and that evolutionary conservation contains residual information independent of loss/retention of function in biochemical assays. An implication of this is that a negative finding in a biochemical assay of a disease-associated polymorphism is not necessarily evidence against a role of that variant in the phenotype of interest. This may be particularly important to the genetics of complex disease, in which the contribution of any individual risk-conferring polymorphism is expected to be very small, and thus may not be detectable in cellular assays.
Negative selection may act on variants that appear to "retain" function in cellular assays when those variants specifically alter occult functions of the protein that aren't measured in the assay or when small changes in protein function have large physiological consequences. For the membrane transport proteins in our study, possible occult functions include the transport of physiologically relevant substrates other than the model substrate. We examined this possibility by calculating the fraction of variants for which the transport of more than one substrate had been measured that showed substrate-specific changes in transport. Although relatively few variants (14%) showed substrate-specific changes in function, we likely underestimated the true fraction of these variants, since not all of the physiologically relevant substrates are known for each transporter and not all known substrates were tested. Variants with substrate-specific effects on function are probably not unique to membrane transporters, but common to all proteins that have multiple catalytic activities, multiple substrates, or multiple binding partners. This has important implications for pharmacogenetic association studies, since some of the protein variants that associate with variation in response to one drug may not associate with variation in the other drugs that interact with the same protein. Future biochemical assays of variant protein function should be interpreted with respect to how well the pertinent functions of the studied protein are known and how many of those functions are measured by the assay. Likewise, our best measure of evolutionary conservation (SIFT) failed to predict 25% of the reduced-function variants. Since measures of evolutionary conservation ignore species-specific physiology and are extremely sensitive to the availability of homologous sequence, they cannot substitute for direct measurement of protein function to predict and understand phenotypic diversity.
Early successes in pharmacogenetics (for example, the identification of the genetic determinants of polymorphism in debrisoquine metabolism [Gonzalez et al. 1988
As we and others have seen, the relative frequency of deleterious mutations (the allelic spectrum) varies from gene to gene, and it may be that for many genes, association with a particular phenotype cannot be explained by one or a small number of high-frequency variants, even when the effect of that gene is significant and the phenotype is relatively common. A well-studied example is the association between the MC4R gene and obesity, in which no single variant occurs at a sufficiently high frequency to establish a significant association, yet the sum of deleterious variants of the MC4R gene has been shown consistently to be higher in obese individuals than in non-obese controls (Vaisse et al. 1998
Variant identification
The coding regions (all exons and 50-100 bp of flanking intronic region per exon) of 11 membrane transporter genes (listed with GenBank accession numbers) [SLC22A1 (OCT1), U77086; SLC22A2 (OCT2), X98333; SLC22A4 (OCTN1), NM_003059; SLC22A5 (OCTN2), NM_003060; SLC22A6 (OAT1), AF097490; SLC22A8 (OAT3), NM_004254; SLC28A1 (CNT1), U62968; SLC28A2 (CNT2), U84392; SLC28A3 (CNT3), AF305210; SLC29A1 (ENT1), U81375; SLC29A2 (ENT2), AF029358] were screened for polymorphism by denaturing HPLC or by direct sequencing of a large number of DNA samples collected from ethnically diverse populations. Set I genes (SLC22A1, SLC22A2, SLC28A1, SLC28A2, SLC29A1, and SLC29A2) were screened using ethnically identified DNA samples (100 African-Americans and 100 European-Americans) from the Coriell Institute; Set II genes (SLC22A4, SLC22A5, SLC22A6, SLC22A8, and SLC28A3) were screened using a cohort of individuals (80 African-Americans, 80 European-Americans, 60 Asian-Americans, 50 Mexican-Americans, and six Pacific Islanders) from the San Francisco Bay Area enrolled in the SOPHIE project (Studies of Pharmacogenetics in Ethnically Diverse Populations). Nucleotide diversity (π), which is the average proportion of nucleotide differences between all possible pairs of sequences in the sample, was used to estimate nucleotide diversity at synonymous sites (πS) and amino acid-altering or nonsynonymous sites (πNS) (Tajima 1989
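The definition of nucleotide diversity given above (the average proportion of nucleotide differences between all possible pairs of sequences) can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned and of equal length, and it does not separate synonymous from nonsynonymous sites as the study did; the function name and the toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    # Proportion of mismatching positions for each unordered pair.
    pair_props = [
        sum(x != y for x, y in zip(s1, s2)) / length
        for s1, s2 in combinations(seqs, 2)
    ]
    return sum(pair_props) / len(pair_props)


# Toy alignment (made-up sequences, not study data).
toy = ["ACGTACGT", "ACGTACGA", "ACGTACGT", "ACCTACGA"]
pi = nucleotide_diversity(toy)
```

To estimate diversity at synonymous or nonsynonymous sites separately, the same pairwise average would be restricted to the corresponding class of sites, which this sketch leaves out.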
Functional characterization
Data Analysis
Each amino acid substitution was then evaluated for characteristics that might be expected to aid in prediction of functional activity, i.e., evolutionary conservation, degree of chemical change, and location in the protein (transmembrane domain vs. intracellular or extracellular loop regions). The degree of chemical change for each amino acid substitution was scored using the substitution matrix of Grantham (1974
Prediction of in vivo function
This work was supported by the National Institutes of Health (NIH) GM61390 and GM36780.
Article published online ahead of print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.4356206.
3 Corresponding author.
Badagnani, I., Chan, W., Castro, R.A., Brett, C.M., Huang, C.C., Stryke, D., Kawamoto, M., Johns, S.J., Ferrin, T.E., Carlson, E.J., et al. 2005. Functional analysis of genetic variants in the human concentrative nucleoside transporter 3 (CNT3; SLC28A3). Pharmacogenomics J. 5: 157-165.
Botstein, D. and Risch, N. 2003. Discovering genotypes underlying human phenotypes: Past successes for mendelian disease, future approaches for complex disease. Nat. Genet. 33: 228-237.
Cargill, M., Altshuler, D., Ireland, J., Sklar, P., Ardlie, K., Patil, N., Lane, C.R., Lim, E.P., Kalayanaraman, N., Nemesh, J., et al. 1999. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22: 231-238.
Chasman, D. and Adams, R.M. 2001. Predicting the functional consequences of non-synonymous single nucleotide polymorphisms: Structure-based assessment of amino acid variation. J. Mol. Biol. 307: 683-706.
Cohen, J.C., Kiss, R.S., Pertsemlidis, A., Marcel, Y.L., McPherson, R., and Hobbs, H.H. 2004. Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science 305: 869-872.
Cooper, G.M., Brudno, M., Green, E.D., Batzoglou, S., and Sidow, A. 2003. Quantitative estimates of sequence divergence for comparative analyses of mammalian genomes. Genome Res. 13: 813-820.
Fay, J.C., Wyckoff, G.J., and Wu, C.I. 2001. Positive and negative selection on the human genome. Genetics 158: 1227-1234.
Fujita, T., Brown, C., Carlson, E.J., Taylor, T., de la Cruz, M., Johns, S.J., Stryke, D., Kawamoto, M., Fujita, K., Castro, R., et al. 2005. Functional analysis of polymorphisms in the organic anion transporter, SLC22A6 (OAT1). Pharmacogenet. Genomics 15: 201-209.
Gonzalez, F.J., Skoda, R.C., Kimura, S., Umeno, M., Zanger, U.M., Nebert, D.W., Gelboin, H.V., Hardwick, J.P., and Meyer, U.A. 1988. Characterization of the common genetic defect in humans deficient in debrisoquine metabolism. Nature 331: 442-446.
Grantham, R. 1974. Amino acid difference formula to help explain protein evolution. Science 185: 862-864.
Gray, J.H., Mangravite, L.M., Owen, R.P., Urban, T.J., Chan, W., Carlson, E.J., Huang, C.C., Kawamoto, M., Johns, S.J., Stryke, D., et al. 2004. Functional and genetic diversity in the concentrative nucleoside transporter, CNT1, in human populations. Mol. Pharmacol. 65: 512-519.
Halushka, M.K., Fan, J.B., Bentley, K., Hsie, L., Shen, N., Weder, A., Cooper, R., Lipshutz, R., and Chakravarti, A. 1999. Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat. Genet. 22: 239-247.
Hartl, D.L. and Clark, A.G. 1997. Principles of population genetics. Sinauer Associates, Sunderland, MA.
Hirschhorn, J.N. and Altshuler, D. 2002. Once and again-issues surrounding replication in genetic association studies. J. Clin. Endocrinol. Metab. 87: 4438-4441.
Hirschhorn, J.N., Lohmueller, K., Byrne, E., and Hirschhorn, K. 2002. A comprehensive review of genetic association studies. Genet. Med. 4: 45-61.
Howard, H.C., Mount, D.B., Rochefort, D., Byun, N., Dupre, N., Lu, J., Fan, X., Song, L., Riviere, J.B., Prevost, C., et al. 2002. The K-Cl cotransporter KCC3 is mutant in a severe peripheral neuropathy associated with agenesis of the corpus callosum. Nat. Genet. 32: 384-392.
Leabman, M.K., Huang, C.C., Kawamoto, M., Johns, S.J., Stryke, D., Ferrin, T.E., DeYoung, J., Taylor, T., Clark, A.G., Herskowitz, I., et al. 2002. Polymorphisms in a human kidney xenobiotic transporter, OCT2, exhibit altered function. Pharmacogenetics 12: 395-405.
Leabman, M.K., Huang, C.C., DeYoung, J., Carlson, E.J., Taylor, T.R., de la Cruz, M., Johns, S.J., Stryke, D., Kawamoto, M., Urban, T.J., et al. 2003. Natural variation in human membrane transporter genes reveals evolutionary and functional constraints. Proc. Natl. Acad. Sci. 100: 5896-5901.
Lockridge, O. 1990. Genetic variants of human serum cholinesterase influence metabolism of the muscle relaxant succinylcholine. Pharmacol. Ther. 47: 35-60.
Miller, M.P. and Kumar, S. 2001. Understanding human disease mutations through the use of interspecific genetic variation. Hum. Mol. Genet. 10: 2319-2328.
Muller, T., Rahmann, S., and Rehmsmeier, M. 2001. Non-symmetric score matrices and the detection of homologous transmembrane proteins. Bioinformatics 17: S182-S189.
Ng, P.C. and Henikoff, S. 2001. Predicting deleterious amino acid substitutions. Genome Res. 11: 863-874.
Ng, P.C. and Henikoff, S. 2002. Accounting for human polymorphisms predicted to affect protein function. Genome Res. 12: 436-446.
Ng, P.C. and Henikoff, S. 2003. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 31: 3812-3814.
Ng, P.C., Henikoff, J.G., and Henikoff, S. 2000. PHAT: A transmembrane-specific substitution matrix. Predicted hydrophobic and transmembrane. Bioinformatics 16: 760-766.
Osato, D.H., Huang, C.C., Kawamoto, M., Johns, S.J., Stryke, D., Wang, J., Ferrin, T.E., Herskowitz, I., and Giacomini, K.M. 2003. Functional characterization in yeast of genetic variants in the human equilibrative nucleoside transporter, ENT1. Pharmacogenetics 13: 297-301.
Owen, R.P., Gray, J.H., Taylor, T.R., Carlson, E.J., Huang, C.C., Kawamoto, M., Johns, S.J., Stryke, D., Ferrin, T.E., and Giacomini, K.M. 2005. Genetic analysis and functional characterization of polymorphisms in the human concentrative nucleoside transporter, CNT2. Pharmacogenet. Genomics 15: 83-90.
Pascual, J.M., Wang, D., Lecumberri, B., Yang, H., Mao, X., Yang, R., and De Vivo, D.C. 2004. GLUT1 deficiency and other glucose transporter diseases. Eur. J. Endocrinol. 150: 627-633.
Patil, N., Berno, A.J., Hinds, D.A., Barrett, W.A., Doshi, J.M., Hacker, C.R., Kautzer, C.R., Lee, D.H., Marjoribanks, C., McDonough, D.P., et al. 2001. Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science 294: 1719-1723.
Ramensky, V., Bork, P., and Sunyaev, S. 2002. Human non-synonymous SNPs: Server and survey. Nucleic Acids Res. 30: 3894-3900.
Risch, N.J. 2000. Searching for genetic determinants in the new millennium. Nature 405: 847-856.
Risch, N., Burchard, E., Ziv, E., and Tang, H. 2002. Categorization of humans in biomedical research: Genes, race and disease. Genome Biol. 3: comment2007.
Shu, Y., Leabman, M.K., Feng, B., Mangravite, L.M., Huang, C.C., Stryke, D., Kawamoto, M., Johns, S.J., DeYoung, J., Carlson, E., et al. 2003. Evolutionary conservation predicts function of variants in the human organic cation transporter, OCT1. Proc. Natl. Acad. Sci. 100: 5902-5907.
Stephens, J.C., Schneider, J.A., Tanguay, D.A., Choi, J., Acharya, T., Stanley, S.E., Jiang, R., Messer, C.J., Chew, A., Han, J.H., et al. 2001. Haplotype variation and linkage disequilibrium in 313 human genes. Science 293: 489-493.
Sunyaev, S., Ramensky, V., and Bork, P. 2000. Towards a structural basis of human non-synonymous single nucleotide polymorphisms. Trends Genet. 16: 198-200.
Sunyaev, S., Ramensky, V., Koch, I., Lathe III, W., Kondrashov, A.S., and Bork, P. 2001. Prediction of deleterious human alleles. Hum. Mol. Genet. 10: 591-597.
Sunyaev, S., Kondrashov, F.A., Bork, P., and Ramensky, V. 2003. Impact of selection, mutation rate and genetic drift on human genetic variation. Hum. Mol. Genet. 12: 3325-3330.
Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585-595.
Vaisse, C., Clement, K., Guy-Grand, B., and Froguel, P. 1998. A frameshift mutation in human MC4R is associated with a dominant form of obesity. Nat. Genet. 20: 113-114.
Wang, Z. and Moult, J. 2001. SNPs, protein structure, and disease. Hum. Mutat. 17: 263-270.
Yeo, G.S., Farooqi, I.S., Aminian, S., Halsall, D.J., Stanhope, R.G., and O'Rahilly, S. 1998. A frameshift mutation in MC4R associated with dominantly inherited human obesity. Nat. Genet. 20: 111-112.
Received November 9, 2004; accepted in revised format October 4, 2005.