Published online before print
November 21, 2007, 10.1101/gr.6859308 Genome Res. 18:77-87, 2008 ©2008 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/08 $5.00 OPEN ACCESS ARTICLE
Letter
Features of 5'-splice-site efficiency derived from disease-causing mutations and comparative genomics
1 Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA; 2 Department of Genetics, Institute for Cancer Research, Rikshospitalet-Radiumhospitalet Medical Centre, Montebello 0310, Oslo, Norway; 3 Faculty of Medicine, University of Oslo, Norway; 4 Department of Human Genetics, Aarhus University, Aarhus 8000C, Denmark; 5 Aarhus University Hospital, Sygehus 8000N, Denmark
Many human diseases, including Fanconi anemia, hemophilia B, neurofibromatosis, and phenylketonuria, can be caused by 5'-splice-site (5'ss) mutations that are not predicted to disrupt splicing, according to position weight matrices. By using comparative genomics, we identify pairwise dependencies between 5'ss nucleotides as a conserved feature of the entire set of 5'ss. These dependencies are also conserved in human–mouse pairs of orthologous 5'ss. Many disease-associated 5'ss mutations disrupt these dependencies, as can some human SNPs that appear to alter splicing. The consistency of the evidence signifies the relevance of this approach and suggests that 5'ss SNPs play a role in complex diseases.
The sequenced genomes of a wide range of organisms allow global, comparative analyses of regulatory sequences. The genomic set of splice-site sequences corresponds to a large-scale splicing experiment performed by nature under evolutionary constraints. Here we focus on 5'-splice-site (5'ss) sequences of the U2-type GT-AG class, which comprise over 98% of all splice sites, and use disease-causing mutations, human single nucleotide polymorphisms (SNPs), and variations in natural splice sites in the genome (within and between species) to infer properties inherent to 5'ss, with important implications for human genetics.
Splice sites are conserved sequences at both ends of an intron that are recognized during the initial steps of splicing (Hastings and Krainer 2001
Even though the mammalian 5'ss consensus sequence (CAG|GTAAGT) is perfectly complementary to the 5' end of U1 snRNA, individual 5'ss exhibit considerable variation at different positions, indicating a tolerance for mismatches in U1 base pairing. The free energy of the 5'ss/U1 base pairing is not always a good predictor of 5'ss efficiency (Roca et al. 2005
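As a minimal illustration of the variation described above, the following Python sketch lists the positions at which a 9-nt 5'ss departs from the consensus CAG|GTAAGT. The function name and the position labels (–3 to –1, +1 to +6) are illustrative choices, not part of the original analysis. Because the consensus is perfectly complementary to the 5' end of U1 snRNA, each listed position corresponds to a potential mismatch in U1 base pairing; as noted above, such a count is not by itself a reliable predictor of 5'ss efficiency.

```python
# Consensus 5'ss, exon positions -3..-1 followed by intron positions +1..+6.
CONSENSUS = "CAGGTAAGT"
POSITIONS = [-3, -2, -1, 1, 2, 3, 4, 5, 6]


def nonconsensus_positions(site):
    """Return the 5'ss positions at which `site` differs from the consensus.

    `site` is a 9-nt string covering positions -3 to +6; RNA input (U) is
    converted to DNA letters (T) before comparison.
    """
    site = site.upper().replace("U", "T")
    if len(site) != len(CONSENSUS):
        raise ValueError("expected a 9-nt 5'ss covering positions -3 to +6")
    return [pos for pos, got, want in zip(POSITIONS, site, CONSENSUS) if got != want]


if __name__ == "__main__":
    # A hypothetical site with nonconsensus nucleotides at -3 and +3.
    print(nonconsensus_positions("AAGGTGAGT"))
```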
Mutations at 5'ss are frequent among mutations that cause human disease, from genetic disorders to cancer (Krawczak et al. 1992
We present a comprehensive analysis of the pairwise associations between nucleotides at different 5'ss positions, using the splice-site compilation from SpliceRack (Sheth et al. 2006
The primary determinant of the efficiency of a 5'ss is its match to the PWM (Zhuang and Weiner 1986
Two-nucleotide associations
The surprising observation is that the pattern of deviations of the actual counts from the expected counts is, to a great extent, conserved between species. This suggests that these patterns of deviation are a result of the mechanisms of 5'ss recognition by the splicing machinery (e.g., cf. upper and lower triangles in Fig. 2 as well as Supplemental Figs. S1, S2). The mouse and human patterns show remarkably close similarity. We show the variability for a few pairs within the human genome and across genomes in Table 1. There are some differences, but the overall patterns of deviation are clearly preserved and presumably reflect the pressures that arise from the conservation of the splicing machinery.
The depletion (maroon coloration) in the parts of the matrix that connect the exonic and intronic portions of the 5'ss is a striking feature in all species (Figs. 2, S1, S2; see associations between positions –2 or –1 and positions +3, +4, +5, or +6). This implies that having a nonconsensus nucleotide on the exonic side causes a depletion of nonconsensus nucleotides on the intronic side, and vice versa, which is consistent with the proposal of a seesaw linkage pattern (Burge and Karlin 1997
Some features in the pairwise association matrix might be the result of constraints other than the splicing mechanism. For example, the combinations –3T–2A and –2T–1A are severely depleted in the five species, probably because these combinations can be part of two of the three stop codons (TAG and TAA). The combination +5C+6C is enriched, probably reflecting the gradual conversion of U12-type GT-AG introns into U2-type GT-AG introns (Burge et al. 1998
Another possible explanation for some of the pairwise associations could lie in nucleotide biases in error-prone DNA repair. However, the dependence of these mechanisms on the neighboring nucleotides is not very pronounced (Krawczak et al. 1998
There are some species-specific features in the pairwise associations. For example, there are tighter associations (brighter blues) between the exonic nucleotides at positions –2 and –1 in mammals and A. thaliana than in the two invertebrates (Figs. 2, S1, S2). In contrast, D. melanogaster and C. elegans have more biases between intronic nucleotides (positions +3 to +6 show brighter blues).
The human genome is organized into isochores, i.e., regions with relatively stable GC content compared with the variation of the GC content across the genome (Costantini et al. 2006
Implications for the splicing machinery
After U1 snRNA is displaced from the 5'ss, U5 snRNA binds to the exonic 5'ss positions via a U-rich sequence in the invariant loop 1 (Newman and Norman 1992
We observed that the pair +3C+4T is enhanced in the five species. This association likely reflects base pairing to the U6 ACAGAG box (the nucleotides involved in base pairing to positions +3 and +4 are in italics) (Wassarman and Steitz 1992
There are a number of biased combinations that cannot be explained by base pairing to any of these snRNAs, such as an enhancement of +4C+5C in all species, or –2C–1T in the vertebrates. Most importantly, these tend to be less conserved than the previous associations. It is possible that these combinations are part of binding sites for proteins that influence 5'ss selection, such as U1-C, PRPF8, SR proteins, or hnRNP proteins (Mayeda and Krainer 1992
Scoring the pairwise associations within 5'ss
In Table 1 we show the variations for a few pairs across isochores within the human genome. This establishes that, overall, the compositional variations can change the ratios, but the trend remains the same; depleted (enhanced) pairs are depleted (enhanced) in all sets, irrespective of the origin of the 5'ss. Most patterns are maintained across species, with the CpG motifs being prominent exceptions. The justification for the scoring scheme lies in the conserved nature of these depletions and enhancements and in the fact that the scoring scheme is an indicator of the effects of disrupting the pairwise association patterns.
Disease-causing mutations at 5'ss
The reduction in PWM scores caused by many of the mutations cannot, by itself, explain the severe effects on splicing, because there are other pairs of functional 5'ss in the human genome that have the same nucleotide change. Indeed, we observed that the natural 5'ss tend to have better pairwise-association scores than the mutant 5'ss (Fig. 3).
We picked a well-studied set among disease-causing 5'ss mutations, i.e., a subclass that consists of A-to-G transitions at position +3, for further theoretical and experimental analyses. A (59%) and G (35%) are both conserved at position +3 (Sheth et al. 2006 ), a post-transcriptionally modified uridine isomer (Reddy et al. 1981
These observations can be explained by the dependencies between position +3 (A/G) and the nucleotides at positions +4 and +5 (Fig. 2). The association of +3A to nonconsensus nucleotides at +4 (C, G, T) and +5 (A, C, T) is blue (enhanced), whereas the association of +3G to nonconsensus nucleotides at +4 and +5 is maroon (depleted). An interesting prediction is that if both +4 and +5 are nonconsensus, then the splicing defect in the mutant can be fixed by converting either +4 or +5 independently to the consensus (see below).
Experimental tests of the associations
First, we found that the +3 A-to-G mutation in the ACADSB 5'ss severely reduces the 5'ss strength (Fig. 6). When the ACADSB wild-type 5'ss (+3A) or its +3 A-to-G mutant version (+3G) were tested in competition against the beta-globin cryptic 5'ss at –16 (Treisman et al. 1983
Second, we found that correcting positions +4 and/or +5 to match the consensus can alleviate the effects of the +3 A-to-G mutation, which is in concordance with our above-mentioned prediction derived from the pairwise associations. To test for the rescue of splicing by correcting these positions, we compared two ACADSB 5'ss with the same combination of nucleotides at positions +4, +5, and +6, but one having A at +3 and the other one G at +3 (Fig. 7). In general, we found that +3G 5'ss use was positively correlated with the number of consensus nucleotides at positions +4 to +6 (lanes 1–3, 5). However, the two reciprocal experiments shown in this figure do not match, in the sense that the activation of the +3G 5'ss is influenced by the relative position of the two competing 5'ss (lanes 1–6, cf. the +3G-spliced band between both panels). This difference is probably due to the influence of the sequences that flank the 5'ss at positions –16 and +1. Notwithstanding these positional effects, our conclusion that correction of positions +4 and/or +5 rescues splicing of the +3G 5'ss is still supported by the data.
Finally, the +3G 5'ss can be activated by correcting position +4 alone (Fig. 7, lane 3). This is not surprising, because a base pair at +4 with U1 might stabilize the wobble G-Ψ base pair at +3. Strikingly, correcting position +5 alone alleviated the severity of the A-to-G mutation at +3 (Fig. 7, lane 5). This indicates that nonadjacent interactions (such as those involving positions +3 and +5) at the 5'ss can affect splice-site efficiency.
Orthologous 5'ss between the mouse and human genomes
We found that 36.5% of these pairs have identical splice-site sequences and 36.4% show a single nucleotide change. The nucleotide change between orthologous pairs that differ at a single position does not usually disrupt the PWM score and the pairwise associations within 5'ss (Fig. 3). We also observed a compensatory trend: If one member of an orthologous 5'ss pair has a weaker PWM score, then its pairwise-association score tends to be higher (Fig. 8).
In the orthologous 5'ss pairs with two nucleotide changes (19.2%), certain coordinated changes are preferred over others. For example, +5C+6C changes to +5T+6G more often than expected (72 vs. 28 expected; both combinations are strongly enriched), whereas –1T+6T to –1C+6A is strongly suppressed (occurs once vs. 20 expected; the first combination is enhanced, whereas the second combination is strongly depleted). The probability of a single change, e.g., +5C to +5T, is calculated by counting the number of such transitions among the orthologous pairs in which at least one member has a C at +5. The expected number of coordinated pairwise changes is calculated by assuming independence between the transitions at the two positions in the 5'ss. Evidently, the pairwise-association patterns reflect constraints on the evolution of 5'ss due to the conserved splicing machinery, and these association patterns could hold clues to mechanistic features of the splicing machinery.
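The expected-count calculation described above can be sketched in Python as follows. This is a hedged reconstruction, not the code used for the analysis: the function names are hypothetical, changes are treated as undirected between the two species, and the denominator for the coordinated change (pairs in which at least one member carries both starting nucleotides) is an assumption, because the text does not specify it.

```python
def change_probability(pairs, pos, a, b):
    """Fraction of orthologous pairs that show an a<->b difference at `pos`,
    among pairs in which at least one member has `a` at `pos` (a != b)."""
    eligible = [(h, m) for h, m in pairs if a in (h[pos], m[pos])]
    if not eligible:
        return 0.0
    changed = sum(1 for h, m in eligible if {h[pos], m[pos]} == {a, b})
    return changed / len(eligible)


def coordinated_change(pairs, pos1, a1, b1, pos2, a2, b2):
    """Observed and expected counts of the coordinated change a1a2 <-> b1b2.

    The expectation assumes that the changes at the two positions are
    independent. The choice of eligible pairs is an assumption of this sketch.
    """
    def has_from(s):
        return s[pos1] == a1 and s[pos2] == a2

    def has_to(s):
        return s[pos1] == b1 and s[pos2] == b2

    eligible = [(h, m) for h, m in pairs if has_from(h) or has_from(m)]
    observed = sum(
        1 for h, m in eligible
        if (has_from(h) and has_to(m)) or (has_from(m) and has_to(h))
    )
    expected = (
        change_probability(pairs, pos1, a1, b1)
        * change_probability(pairs, pos2, a2, b2)
        * len(eligible)
    )
    return observed, expected


if __name__ == "__main__":
    # Toy data: two-letter strings stand in for positions +5 and +6 only.
    toy = [("CC", "TG")] * 3 + [("CC", "CC")] * 5 + [("CC", "TC"), ("CC", "CG")]
    print(coordinated_change(toy, 0, "C", "T", 1, "C", "G"))
```

In the toy data the coordinated change is observed more often than independence would predict, which is the kind of excess reported above for +5C+6C to +5T+6G.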
SNPs at 5'ss
We calculated the PWM scores and pairwise-association score differences for SNPs (Fig. 3C). The average PWM score for 5'ss with SNPs (5.92) is higher than that for simulated SNPs (5.54), which in turn is higher than the average score for disease-causing mutations (5.31). The PWM scores of orthologs that show one difference between human and mouse are higher on average than the SNP set (6.2 vs. 5.92), suggesting that strong 5'ss (PWM score >6.2) can tolerate such changes without affecting splicing. From Figure 3, we can see that the disruption of the associations by the SNPs is smaller than the disruption by the simulated SNPs, which in turn is smaller than the disruption by the 5'ss disease-causing mutations, suggesting selective pressure on SNPs to preserve the associations. For instance, Figure 4 shows that SNPs with alleles A and G at position +3 tend to have consensus nucleotides at positions +4 and +5, as expected from the previous discussion on +3 A-to-G disease mutations.
By using these findings (see Fig. 3), we were able to establish criteria to identify SNPs that might disrupt splicing. We used the average score of the disease set to identify low PWM scores (
Disruptive 5'ss SNP predictions and evidence for their effect on splicing
We used EST and mRNA alignments to the genome to establish disruption caused by SNPs at 5'ss, using the UCSC browser (Kent et al. 2002
In agreement with our expectations, we found that the sets LHH (10 disruptive SNPs out of 35), LHL (eight disruptive SNPs out of 35), and LLH (12 disruptive SNPs out of 43) are more disruptive than the control sets HLL (five disruptive SNPs out of 58) and LLL (19 disruptive SNPs out of 111). By using bootstrapping to estimate the variance of these numbers, we found that the LLL and LLH sets are significantly different from each other (P-value of 0.0003 using an unpaired t-test). The stronger disruption shown by the LLH set compared with the LLL set confirms the predictive nature of the association scores. A sampling of SNPs from these sets is given in Table 2.
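The bootstrap step can be illustrated with the counts given above for the LLL and LLH sets. The sketch below resamples each set with replacement and reports the mean and standard deviation of the disruptive fraction. The number of resamples, the seeds, and the function name are arbitrary choices made here, and the sketch does not attempt to reproduce the reported t-test or its P-value.

```python
import random
import statistics


def bootstrap_fraction(disruptive, total, n_resamples=2000, seed=0):
    """Bootstrap distribution of the disruptive fraction in one SNP set.

    Each resample draws `total` SNPs with replacement from the observed set,
    in which `disruptive` SNPs are labelled 1 and the remainder 0.
    """
    rng = random.Random(seed)
    labels = [1] * disruptive + [0] * (total - disruptive)
    return [sum(rng.choices(labels, k=total)) / total for _ in range(n_resamples)]


if __name__ == "__main__":
    lll = bootstrap_fraction(19, 111, seed=1)  # LLL: 19 disruptive out of 111
    llh = bootstrap_fraction(12, 43, seed=2)   # LLH: 12 disruptive out of 43
    for name, dist in (("LLL", lll), ("LLH", llh)):
        print(name, round(statistics.mean(dist), 3), round(statistics.stdev(dist), 3))
```

The spread of each bootstrap distribution gives the variance estimate on which a comparison of the two sets can then be based.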
This analysis is confounded by the fact that the ESTs might not have sampled the SNP alleles of interest, due to the genotype of the source. The intronic SNPs are also not likely to be sampled in the ESTs. In addition, some of the alternative spliced products might be degraded by NMD, due to the presence of a premature stop codon and thus would not be observed (Lejeune and Maquat 2005
We expected the genotypes to show a preference for alleles of SNPs at 5'ss that do not disrupt splicing, but we did not detect a strong signal in the distribution of genotypes. The effect of SNPs at 5'ss might be alleviated by the sequence context. For example, in the ACADM (also called MCAD) gene (Nielsen et al. 2007
We have shown that the effects of disease-causing mutations are often a result of the disruption of conserved patterns in associations between nucleotides at different positions within the 5'ss, and SNPs that disrupt these pairwise associations tend to affect splicing. A majority of SNPs respect the associations, as expected from SNP neutrality. In addition, we have shown that orthologous 5'ss mouse–human pairs show changes that likewise respect the associations, which suggests the existence of selective pressure to maintain them. A set of simulated SNPs is more disruptive of associations than neutral SNPs but less disruptive than the disease-causing mutations. This is expected from the lack of selection pressure on the simulated SNPs.
The conservation of the associations is indicative of selective pressures reflecting functional features of the splicing machinery, and allows inferences to be made about the underlying mechanisms. The pairwise associations confirm many known effects, but also suggest new areas for exploration. We found pairwise associations probably related to U1 base pairing, such as specific patterns involving consensus and nonconsensus nucleotides across the exon–intron boundary. We found one association probably related to base pairing to U6, but none related to U5. The long-range associations are likely related to protein–RNA interactions, and further experiments should shed light on them. Similar studies could be carried out using 3'ss. However, the longer span of this motif (Sheth et al. 2006
By using the data from disease-causing mutations in 5'ss, differences between orthologous pairs of mouse and human 5'ss, and genomic data for five species, we were able to generate criteria for prediction of splicing-disruptive SNPs at 5'ss. We have shown that circumstantial evidence from ESTs provides support for these predictions and encourages further experiments to study their effects in vivo.
The disruptive SNPs may provide insights into genetic networks. If a proven disruptive SNP is in Hardy-Weinberg equilibrium, it suggests that the genetic network is immune to the change, and this can be a starting point for investigating the reasons for this robustness of the network. Alternatively, such SNPs might be implicated in complex diseases, wherein their effects are apparent only under certain genetic and environmental conditions. For example, there has been work on SNPs affecting the p53 pathway, such as a SNP in the MDM2 gene that alters a transcription-factor binding site and hence the levels of p53 (Bond and Levine 2007
This study illustrates the power of the convergence of different data sets for obtaining insights into mechanisms of gene expression and for understanding the neutral and disease-causing nucleotide changes found in human populations.
Scoring splice sites with PWM
PWMs reflect the frequencies of the 4 nt (A, C, G, and T) at each position of the splice site. PWMs can be used to score a site by converting the frequencies into a log-odds score (log of the ratio of the actual frequency and the expected frequency) (Sheth et al. 2006
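A minimal Python sketch of log-odds PWM scoring is given here for illustration. It is not the SpliceRack implementation: the uniform background frequency of 0.25, the pseudocount, the base-2 logarithm, and the function names are all assumptions of this sketch, so the resulting scores are not directly comparable to the values reported in this paper.

```python
import math
from collections import Counter

NUCLEOTIDES = "ACGT"


def build_pwm(sites, background=None, pseudocount=1.0):
    """Build a log-odds PWM from aligned splice sites of equal length.

    Each column maps a nucleotide to log2(observed frequency / background
    frequency). The background defaults to 0.25 for every nucleotide.
    """
    background = background or {n: 0.25 for n in NUCLEOTIDES}
    total = len(sites) + pseudocount * len(NUCLEOTIDES)
    pwm = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        pwm.append({
            n: math.log2(((counts[n] + pseudocount) / total) / background[n])
            for n in NUCLEOTIDES
        })
    return pwm


def pwm_score(pwm, site):
    """Sum the per-position log-odds values for one site."""
    return sum(column[n] for column, n in zip(pwm, site))


if __name__ == "__main__":
    # Hypothetical toy alignment; a real PWM would use the genomic 5'ss set.
    toy_sites = ["CAGGTAAGT", "CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA"]
    pwm = build_pwm(toy_sites)
    print(pwm_score(pwm, "CAGGTAAGT"), pwm_score(pwm, "AAGGTGAGT"))
```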
Scoring associations within splice sites
In order to avoid small statistical fluctuations affecting the scores (e.g., an actual occurrence of three versus an expected number of 45 would give the relevant pair a strong negative score), we excluded those pairs that are expected less than 50 times in the genome. This does not affect the scoring of most splice sites, as these pairs are by definition rare.
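The exclusion rule can be made concrete with the following Python sketch, which compares observed nucleotide-pair counts with the counts expected if the two positions were independent and skips pairs expected fewer than 50 times. The exact form of the score is not given in the text above, so the log2 ratio, the add-one smoothing, the summation over pairs, and all names used here are assumptions.

```python
import math
from collections import Counter
from itertools import combinations

MIN_EXPECTED = 50  # pairs expected fewer times than this are excluded


def association_table(sites, min_expected=MIN_EXPECTED):
    """Map (i, a, j, b) to log2 of observed over expected pair counts.

    The expected count assumes independence between positions i and j.
    Add-one smoothing (an assumption of this sketch) keeps the logarithm
    finite when a retained pair is never observed.
    """
    n = len(sites)
    length = len(sites[0])
    single = [Counter(site[i] for site in sites) for i in range(length)]
    table = {}
    for i, j in combinations(range(length), 2):
        observed = Counter((site[i], site[j]) for site in sites)
        for a, count_a in single[i].items():
            for b, count_b in single[j].items():
                expected = count_a * count_b / n
                if expected < min_expected:
                    continue  # too rare to score reliably
                table[(i, a, j, b)] = math.log2(
                    (observed[(a, b)] + 1) / (expected + 1)
                )
    return table


def association_score(table, site):
    """Sum the retained pair scores for one site; excluded pairs add zero."""
    return sum(
        table.get((i, site[i], j, site[j]), 0.0)
        for i, j in combinations(range(len(site)), 2)
    )


if __name__ == "__main__":
    # Toy data: two perfectly associated positions.
    toy = ["AA"] * 100 + ["CC"] * 100
    table = association_table(toy)
    print(association_score(table, "AA"), association_score(table, "AC"))
```

Positive values mark enhanced pairs and negative values mark depleted pairs, so a site composed of enhanced pairs receives a higher score than one that combines depleted pairs.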
Collection of data sets
In vitro splicing
The beta-globin wild-type 5'ss and the cryptic 5'ss at position –16 were replaced by various mutant ACADSB 5'ss by PCR mutagenesis. We inactivated a second cryptic 5'ss in beta-globin that is located 38 nt upstream of the authentic 5'ss (Treisman et al. 1983
EST analysis
Jeremiah Faith and Susan Janicki gave insightful comments on the manuscript. The anonymous reviewers gave useful criticisms that helped improve the paper. A.J.O. and R.S. thank the DART Neurogenomics Alliance for support. X.R. and A.R.K. acknowledge support from NIH grant CA13106. A.R.R. acknowledges support from the Department of Biotechnology (India) grant BT/IN/BTOA/03/2005.
6 Present address: IASRI, New Delhi 110012, India.
E-mail sachidan{at}cshl.edu; fax (516) 367-8389. [Supplemental material is available online at www.genome.org.] Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6859308
Bond, G.L. and Levine, A.J. 2007. A single nucleotide polymorphism in the p53 pathway interacts with gender, environmental stresses and tumor genetics to influence cancer in humans. Oncogene 26: 1317–1323.
Brow, D.A. 2002. Allosteric cascade of spliceosome activation. Annu. Rev. Genet. 36: 333–360.
Brunak, S., Engelbrecht, J., and Knudsen, S. 1991. Prediction of human mRNA donor and acceptor sites from the DNA sequence. J. Mol. Biol. 220: 49–65.
Buratti, E., Baralle, M., Conti, L.D., Baralle, D., Romano, M., Ayala, Y.M., and Baralle, F.E. 2004. hnRNP H binding at the 5' splice site correlates with the pathological effect of two intronic mutations in the NF-1 and TSHb genes. Nucleic Acids Res. 32: 4224–4236.
Buratti, E., Chivers, M., Kralovicova, J., Romano, M., Baralle, M., Krainer, A.R., and Vorechovsky, I. 2007. Aberrant 5' splice sites in human disease genes: Mutation pattern, nucleotide structure and comparison of computational tools that predict their utilization. Nucleic Acids Res. 35: 4250–4263.
Burge, C. and Karlin, S. 1997. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268: 78–94.
Burge, C.B., Padgett, R.A., and Sharp, P.A. 1998. Evolutionary fates and origins of U12-type introns. Mol. Cell 2: 773–785.
Cáceres, J.F., Stamm, S., Helfman, D.M., and Krainer, A.R. 1994. Regulation of alternative splicing in vivo by overexpression of antagonistic splicing factors. Science 265: 1706–1709.
Carmel, I., Tal, S., Vig, I., and Ast, G. 2004. Comparative analysis detects dependencies among the 5' splice-site positions. RNA 10: 828–840.
Cartegni, L., Chew, S., and Krainer, A.R. 2002. Listening to silence and understanding nonsense: Exonic mutations that affect splicing. Nat. Rev. Genet. 3: 285–298.
Costantini, M., Clay, O., Auletta, F., and Bernardi, G. 2006. An isochore map of human chromosomes. Genome Res. 16: 536–541.
Crotti, L.B., Bacikova, D., and Horowitz, D.S. 2007. The PRP18 protein stabilizes the interaction of both exons with the U5 snRNA during the second step of pre-mRNA splicing. Genes & Dev. 21: 1204–1216.
Du, H. and Rosbash, M. 2002. The U1 snRNP protein U1C recognizes the 5' splice site in the absence of base pairing. Nature 419: 86–90.
ElSharawy, A., Manaster, C., Teuber, M., Rosenstiel, P., Kwiatkowski, R., Huse, K., Platzer, M., Becker, A., Nurnberg, P., Schreiber, S., et al. 2006. SNPSplicer: systematic analysis of SNP-dependent splicing in genotyped cDNAs. Hum. Mutat. 27: 1129–1134.
Field, L.L., Bonnevie-Nielsen, V., Pociot, F., Lu, S., Nielsen, T.B., and Beck-Nielsen, H. 2005. OAS1 splice site polymorphism controlling antiviral enzyme activity influences susceptibility to type 1 diabetes. Diabetes 54: 1588–1591.
Hastings, M.L. and Krainer, A.R. 2001. Pre-mRNA splicing in the new millennium. Curr. Opin. Cell Biol. 13: 302–309.
Jurica, M.S. and Moore, M.J. 2003. Pre-mRNA splicing: Awash in a sea of proteins. Mol. Cell 12: 5–14.
Kandels-Lewis, S. and Séraphin, B. 1993. Involvement of U6 snRNA in 5' splice site selection. Science 262: 2035–2039.
Kawase, T., Akatsuka, Y., Torikai, H., Morishima, S., Oka, A., Tsujimura, A., Miyazaki, M., Tsujimura, K., Miyamura, K., Ogawa, S., et al. 2007. Alternative splicing due to an intronic SNP in HMSD generates a novel minor histocompatibility antigen. Blood 110: 1055–1063.
Kent, W.J., Sugnet, C.W., Furey, T.S., Roskin, K.M., Pringle, T.H., Zahler, A.M., and Haussler, D. 2002. The Human Genome Browser at UCSC. Genome Res. 12: 996–1006.
Kim, E., Magen, A., and Ast, G. 2007. Different levels of alternative splicing among eukaryotes. Nucleic Acids Res. 35: 125–131.
Krawczak, M., Reiss, J., and Cooper, D.N. 1992. The mutational spectrum of single base-pair substitutions in mRNA splice junctions of human genes: Causes and consequences. Hum. Genet. 90: 41–54.
Krawczak, M., Ball, E.V., and Cooper, D.N. 1998. Neighboring-nucleotide effects on the rates of germ-line single-base-pair substitution in human genes. Am. J. Hum. Genet. 63: 474–488.
Krawczak, M., Thomas, N.S., Hundrieser, B., Mort, M., Wittig, M., Hampe, J., and Cooper, D.N. 2007. Single base-pair substitutions in exon-intron junctions of human genes: Nature, distribution, and consequences for mRNA splicing. Hum. Mutat. 28: 150–158.
Ladd, A.N. and Cooper, T.A. 2002. Finding signals that regulate alternative splicing in the post-genomic era. Genome Biol. 3: 1–16. doi: 10.1186/gb-2002-3-11-reviews0008.
Lejeune, F. and Maquat, L.E. 2005. Mechanistic links between nonsense-mediated mRNA decay and pre-mRNA splicing in mammalian cells. Curr. Opin. Cell Biol. 17: 309–315.
Lesser, C.F. and Guthrie, C. 1993. Mutations in U6 snRNA that alter splice site specificity: Implications for the active site. Science 262: 1982–1988.
Madsen, P.P., Kibaek, M., Roca, X., Sachidanandam, R., Krainer, A.R., Christensen, E., Steiner, R.D., Gibson, K.M., Corydon, T.J., Knudsen, I., et al. 2006. Short/branched-chain acyl-CoA dehydrogenase deficiency due to an IVS3+3A>G mutation that causes exon skipping. Hum. Genet. 118: 680–690.
Maroney, P.A., Romfo, C.M., and Nilsen, T.W. 2000. Functional recognition of 5' splice site by U4/U6.U5 tri-snRNP defines a novel ATP-dependent step in early spliceosome assembly. Mol. Cell 6: 317–328.
Mayeda, A. and Krainer, A.R. 1992. Regulation of alternative pre-mRNA splicing by hnRNP A1 and splicing factor SF2. Cell 68: 365–375.
Mayeda, A. and Krainer, A.R. 1999a. Mammalian in vitro splicing assays. Methods Mol. Biol. 118: 315–321.
Mayeda, A. and Krainer, A.R. 1999b. Preparation of HeLa cell nuclear and cytosolic S100 extracts for in vitro splicing. Methods Mol. Biol. 118: 309–314.
Nakai, K. and Sakamoto, H. 1994. Construction of a novel database containing aberrant splicing mutations of mammalian genes. Gene 141: 171–177.
Newman, A.J. and Norman, C. 1992. U5 snRNA interacts with exon sequences at 5' and 3' splice sites. Cell 68: 743–754.
Nielsen, K.B., Sorensen, S., Cartegni, L., Corydon, T.J., Doktor, T.K., Schroeder, L.D., Reinert, L.S., Elpeleg, O., Krainer, A.R., Gregersen, N., et al. 2007. Seemingly neutral polymorphic variants may confer immunity to splicing-inactivating mutations: A synonymous SNP in exon 5 of MCAD protects from deleterious mutations in a flanking exonic splicing enhancer. Am. J. Hum. Genet. 80: 416–432.
Ohno, K., Brengman, J.M., Felice, K.J., Cornblath, D.R., and Engel, A.G. 1999. Congenital end-plate acetylcholinesterase deficiency caused by a nonsense mutation and an A
O'Keefe, R.T., Norman, C., and Newman, A.J. 1996. The invariant U5 snRNA loop 1 sequence is dispensable for the first catalytic step of pre-mRNA splicing in yeast. Cell 86: 679–689.
Reddy, R., Henning, D., and Busch, H. 1981. Pseudouridine residues in the 5'-terminus of uridine-rich nuclear RNA I (U1 RNA). Biochem. Biophys. Res. Commun. 98: 1076–1078.
Roca, X., Sachidanandam, R., and Krainer, A.R. 2003. Intrinsic differences between authentic and cryptic 5' splice sites. Nucleic Acids Res. 31: 6321–6333.
Roca, X., Sachidanandam, R., and Krainer, A.R. 2005. Determinants of the inherent strength of human 5' splice sites. RNA 11: 683–698.
Senapathy, P., Shapiro, M.B., and Harris, N.L. 1990. Splice junctions, branch point sites, and exons: Sequence statistics, identification, and applications to genome project. Methods Enzymol. 183: 252–278.
Séraphin, B., Kretzner, L., and Rosbash, M. 1988. A U1 snRNA:pre-mRNA base pairing interaction is required early in yeast spliceosome assembly but does not uniquely define the 5' cleavage site. EMBO J. 7: 2533–2538.
Shapiro, M.B. and Senapathy, P. 1987. RNA splice junctions of different classes of eukaryotes: Sequence statistics and functional implications in gene expression. Nucleic Acids Res. 15: 7155–7174.
Sheth, N., Roca, X., Hastings, M.L., Roeder, T., Krainer, A.R., and Sachidanandam, R. 2006. Comprehensive splice-site analysis using comparative genomics. Nucleic Acids Res. 34: 3955–3967.
Shinmura, K., Tao, H., Yamada, H., Kataoka, H., Sanjar, R., Wang, J., Yoshimura, K., and Sugimura, H. 2004. Splice-site genetic polymorphism of the human kallikrein 12 (KLK12) gene correlates with no substantial expression of KLK12 protein having serine protease activity. Hum. Mutat. 24: 273–274.
Siliciano, P.G. and Guthrie, C. 1988. 5' splice site selection in yeast: Genetic alterations in base pairing with U1 reveal additional requirements. Genes & Dev. 2: 1258–1267.
Skarratt, K.K., Fuller, S.J., Sluyter, R., Dao-Ung, L.P., Gu, B.J., and Wiley, J.S. 2005. A 5' intronic splice site polymorphism leads to a null allele of the P2X7 gene in 1-2% of the Caucasian population. FEBS Lett. 579: 2675–2678.
Smigielski, E.M., Sirotkin, K., Ward, M., and Sherry, S.T. 2000. dbSNP: A database of single nucleotide polymorphisms. Nucleic Acids Res. 28: 352–355.
Stenson, P., Ball, E.V., Mort, M., Phillips, A.D., Shiel, J.A., Thomas, N.S., Abeysinghe, S., Krawczak, M., and Cooper, D.N. 2003. Human Gene Mutation Database (HGMD): 2003 update. Hum. Mutat. 21: 577–581.
Teraoka, S.N., Telatar, M., Becker-Catania, S., Liang, T., Onengut, S., Tolun, A., Chessa, L., Sanal, O., Bernatowska, E., Gatti, R.A., et al. 1999. Splicing defects in the ataxia-telangiectasia gene, ATM: Underlying mutations and consequences. Am. J. Hum. Genet. 64: 1617–1631.
Treisman, R., Orkin, S.H., and Maniatis, T. 1983. Specific transcription and RNA splicing defects in five cloned β-thalassaemia genes. Nature 302: 591–596.
Tweedie, S., Charlton, J., Clark, V., and Bird, A. 1997. Methylation of genomes and genes at the invertebrate-vertebrate boundary. Mol. Cell. Biol. 17: 1469–1475.
Wassarman, D.A. and Steitz, J.A. 1992. Interactions of small nuclear RNAs with precursor messenger RNA during in vitro splicing. Science 257: 1918–1925.
Yeo, G. and Burge, C.B. 2004. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J. Comput. Biol. 11: 377–394.