Genome Res. 14:2102-2110, 2004
©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00
Methods
High-Throughput Expression of C. elegans Proteins
1 Center for Biophysical Sciences and Engineering, Southeast Collaboratory for Structural Genomics, University of Alabama at Birmingham, Birmingham, Alabama 35294, USA
2 Center for Cancer Systems Biology and Department of Cancer Biology, Dana-Farber Cancer Institute, and Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
Proteome-scale studies of protein three-dimensional structures should provide valuable information for both investigating basic biology and developing therapeutics. Critical for these endeavors is the expression of recombinant proteins. We selected Caenorhabditis elegans as our model organism in a structural proteomics initiative because of the high quality of its genome sequence and the availability of its ORFeome, protein-encoding open reading frames (ORFs), in a flexible recombinational cloning format. We developed a robotic pipeline for recombinant protein expression, applying the Gateway cloning/expression technology and utilizing a stepwise automation strategy on an integrated robotic platform. Using the pipeline, we have carried out heterologous protein expression experiments on 10,167 ORFs of C. elegans. With one expression vector and one Escherichia coli strain, protein expression was observed for 4854 ORFs, and 1536 were soluble. Bioinformatics analysis of the data indicates that protein hydrophobicity is a key determining factor for an ORF to yield a soluble expression product. This protein expression effort has investigated the largest number of genes in any organism to date. The pipeline described here is applicable to high-throughput expression of recombinant proteins for other species, both prokaryotic and eukaryotic, provided that ORFeome resources become available.
The nematode Caenorhabditis elegans is one of the best-studied multicellular model organisms (Wood 1988
Proteome scale studies of protein structure, function, and interactions have become a new paradigm for both investigating basic biology and developing therapeutics, as exemplified by the worldwide structural genomics initiatives and numerous proteomics projects (Burley et al. 1999
Under the NIH-NIGMS-sponsored Protein Structure Initiative, we selected C. elegans as our model genome to systematically express its proteins and solve their three-dimensional structures by x-ray crystallography and NMR. This effort is facilitated by the C. elegans ORFeome project, which aims to clone all predicted protein-encoding ORFs as Gateway Entry clones and, in turn, enables a high-throughput (HTP) approach to recombinant protein expression (Reboul et al. 2003
It is generally recognized that the production of proteins in soluble form, sufficient (milligram) quantity, and homogeneity for structural analyses is the most prodigious part of a structural genomics project (Stevens and Wilson 2001
For a genome-scale undertaking, target prioritization (target selection) becomes an issue that is dictated by a multitude of considerations, namely, the diversity and significance of biological functions, interests of discovery science, therapeutic development, and the practical perspective of whether the ORF is experimentally tractable for structure determination by x-ray crystallography and NMR. For structural genomics, priority is given to novel proteins, that is, proteins without a reliable structural homolog in the Protein Data Bank (PDB; Berman et al. 2000
For HTP protein expression, we developed a robotic pipeline to systematically process all cloned ORFs from the C. elegans ORFeome (Reboul et al. 2003
An Integrated Robotic Pipeline We developed a robotic pipeline on the basis of an approach of step-wise automation on an integrated robotic platform that is versatile for handling multistep processes for HTP recombinant protein expression. We optimized and miniaturized the basic protocols and developed a novel transformation platform, including a robotic device for heat-shock in 96-well plate, as well as an ElectroTip for automated electroporation (Finley et al. 2004
Using our HTP pipeline, we are able to process 384 unique C. elegans ORFs every 3 wk, starting with the ORFeome collection of Entry clones (Reboul et al. 2003 The pipeline developed using C. elegans ORFs is applicable to HTP recombinant protein expression for other species, provided that ORFeome resources become available. We have tested the pipeline on 350 genes from a bacterial genome arrayed on four 96-well plates. Small-scale protein expression was verified by both ELISA and SDS-PAGE, and the results are >98% consistent between the two methods (C.-H. Luan, S.H. Qiu, R.J. Gray, P.S. Horanyi, Z.-J. Liu, J. Zhou, M. Luo, and B.C. Wang, unpubl.).
Statistics of Protein Expression of C. elegans
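The headline counts reported in the abstract (10,167 ORFs tested, 4854 with observed expression, 1536 soluble) can be converted into rates with a few lines of arithmetic. The sketch below is illustrative only; the constant and function names are hypothetical and are not part of the authors' software.

```python
# Counts taken from the abstract of this paper.
ORFS_TESTED = 10_167
ORFS_EXPRESSED = 4_854
ORFS_SOLUBLE = 1_536


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Fraction of all tested ORFs with any observed expression.
expression_rate = percent(ORFS_EXPRESSED, ORFS_TESTED)
# Fraction of all tested ORFs that gave soluble protein.
soluble_rate = percent(ORFS_SOLUBLE, ORFS_TESTED)
# Fraction of the expressed ORFs that were also soluble.
soluble_among_expressed = percent(ORFS_SOLUBLE, ORFS_EXPRESSED)

print(f"expressed / tested:  {expression_rate}%")
print(f"soluble / tested:    {soluble_rate}%")
print(f"soluble / expressed: {soluble_among_expressed}%")
```

The first figure reproduces the 47.7% expression rate quoted in the text; the other two follow from the same counts.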
Reproducibility of Protein Expression in 96-Well Format Because of the cost and effort involved in recombinant protein expression using our approach, it is not practical to repeat the experiments routinely on each plate to confirm the expression and solubility of all tested proteins. Therefore, it is important to have confidence in the reproducibility of the screening results. The data in Figure 1 demonstrate the reproducibility of our multistep protein-expression process. These data are from two experiments of small-scale expression, purification, and solubility profiling for Plate 19, using the same expression clones and experimental conditions. In the first experiment, nine ORF clones expressed soluble protein in E. coli. In the second experiment, the same nine ORFs displayed soluble expression in almost the same order of solubility ranking, with one exception. As shown in Figure 1, 19-A7 was ranked as the third most soluble ORF in the first experiment, whereas it was the ninth most soluble in the repeat experiment. This means that if only the second experiment had been performed, 19-A7 might not have been identified as a candidate for further effort. Thirty-two proteins were expressed in the first experiment and 32 in the second; 31 were common to both. Each experiment missed one protein that showed mid-level expression in the other experiment. Given that each experiment involves multiple steps, these data demonstrate reliable reproducibility in classifying candidates on the basis of protein expression and solubility profiling for the multi-ORF, multistep experiments, and are certainly satisfactory for a high-throughput screening approach.
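One way to put a number on the agreement between the two Plate 19 runs is to compare the sets of expressed proteins. The sketch below uses only the counts given in the text (32 expressed in each run, 31 shared); the Jaccard index is our illustrative choice of metric, not one the authors report, and the function name is hypothetical.

```python
def overlap_summary(n_first: int, n_second: int, n_common: int) -> dict:
    """Summarize agreement between two runs from their hit counts.

    n_first / n_second: proteins expressed in each run.
    n_common: proteins expressed in both runs.
    """
    union = n_first + n_second - n_common  # proteins seen in at least one run
    return {
        "union": union,
        "jaccard": n_common / union,             # shared hits / all hits
        "recovered_first": n_common / n_first,   # share of run-1 hits repeated
        "recovered_second": n_common / n_second, # share of run-2 hits repeated
    }


# Counts reported for the two Plate 19 experiments.
summary = overlap_summary(n_first=32, n_second=32, n_common=31)
for key, value in summary.items():
    print(key, round(value, 3) if isinstance(value, float) else value)
```

With these counts, 33 distinct proteins were seen across the two runs, and each run recovered 31 of the other's 32 hits.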
Temperature Dependence of Total Expression and Soluble Protein Expression The temperature dependence of protein expression and solubility was optimized as described (Finley et al. 2004
Expression Vector Engineering The expression vector, pDEST17.1 (Invitrogen, http://www.invitrogen.com), was less favorable for these studies, because the His6-tag on the expressed proteins is not amenable to proteolytic cleavage. It is desirable to have the option of removing the tag for protein crystallization. In addition, the yield of soluble proteins obtained with pDEST17.1 is relatively low. To overcome these problems, we engineered two vectors using two pET vectors (Novagen, http://www.novagen.com), each encoding a thrombin cleavage site within the peptide sequence LVPRGS. We generated Gateway-compatible versions of these two pET vectors as follows: pET15G, which encodes the 21 amino acid sequence MGSSHHHHHHSSGLVPRGSQS in the pET15b backbone vector, and pET21G, which uses the pET21b backbone with the 29 amino acid sequence MASMTGGQQMGSSHHHHHSSGLVPRGSQS. For the pET21G vector, an additional fusion tag, T7 tag with epitope sequence MASMTGGQQMG, is included in the N-terminal sequence upstream of the His6-tag and thrombin cleavage site. The T7 tag promotes protein expression, as observed in a number of experiments in our study (data not shown). Figure 3 compares the protein expression results obtained using pDEST17.1 and pET15G for ORFs in three 96-well plates. The total number of bacterial transformants with heterologous protein expression, as well as the number with soluble expression, increased in all cases when using pET15G. The same soluble proteins obtained by using pDEST17.1 were also obtained when pET15G was used. To improve soluble protein production, we examined three bacterial strains, BL21(DE3), BL21-AI, and BL21(DE3)pLysS (Invitrogen), in combination with the three destination vectors, pDEST17.1, pET15G, and pET21G. The most successful combination uses either the pET15G or pET21G vector and the E. coli strain BL21-AI, as suggested by the 0.6-mL scale expression data in Table 2, whereas pDEST17.1 is most effective when coupled with BL21(DE3)pLysS.
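The two N-terminal tag sequences quoted above can be checked programmatically, which is a convenient sanity test when handling engineered vectors. The sketch below takes the sequences exactly as printed in this section (written without whitespace) and relies on the known thrombin specificity of cleaving between Arg and Gly in LVPR|GS; the function and field names are hypothetical.

```python
import re

THROMBIN_SITE = "LVPRGS"   # thrombin cleaves between R and G: LVPR|GS
T7_TAG = "MASMTGGQQMG"     # T7 epitope tag quoted in the text

# Tag sequences as printed in this section, whitespace removed.
TAGS = {
    "pET15G": "MGSSHHHHHHSSGLVPRGSQS",
    "pET21G": "MASMTGGQQMGSSHHHHHSSGLVPRGSQS",
}


def describe_tag(seq: str) -> dict:
    """Report length, His-run length, and post-cleavage residues of a tag."""
    site_start = seq.find(THROMBIN_SITE)
    if site_start < 0:
        raise ValueError("no thrombin site found")
    # Longest uninterrupted run of histidines in the tag.
    his_run = max((len(m.group()) for m in re.finditer(r"H+", seq)), default=0)
    # Residues left on the target protein after thrombin cuts after LVPR.
    leftover = seq[site_start + 4:]
    return {
        "length": len(seq),
        "his_run": his_run,
        "has_t7_tag": seq.startswith(T7_TAG),
        "leftover_after_cleavage": leftover,
    }


report = {name: describe_tag(seq) for name, seq in TAGS.items()}
for name, info in report.items():
    print(name, info)
```

Both lengths match the 21 and 29 residues stated in the text, and cleavage would leave the four residues GSQS on the target protein. Note that the pET21G sequence as printed contains a run of five histidines rather than six; this section alone does not show whether that reflects the construct or the typesetting.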
Scale-Up of the Small-Scale Expression Before attempting large-scale production (six 1-L cultures), we perform 1-L scale expression on the soluble candidates identified in the small-scale screening. Figure 4 exemplifies the correspondence between the soluble-protein results of the 0.6-mL and 1-L expressions. The soluble protein bands in the SDS-polyacrylamide gels are consistent with the ELISA data from small-scale screening. Approximately 85% of the proteins in 1-L expression have the correct molecular size. Most of the rest either have a molecular weight lower than the theoretical value or show multiple bands. These discrepancies are presumably due to incorrect gene annotation, truncation in expression, or degradation after expression.
For large-scale expression, we have used both the conventional LB (Luria Broth) medium and the medium developed by F.W. Studier at Brookhaven National Laboratory, in which auto-induction of expression occurs close to saturation of bacterial growth. There is no need to monitor culture densities or add inducer at the proper time. In our experiments, the auto-induction medium led to higher levels of protein expression than LB medium in 80% of cases, depending on the specific protein. Multistep purification is performed using Ni-affinity, gel-filtration, and ion-exchange chromatography, taking into account the size and charge of the individual proteins.
Bioinformatics Analysis
One of the missions of structural genomics is to prioritize those sequences with a protein family (Pfam) domain whose structure is unknown. We analyzed our expression results with respect to Pfam domains found in each ORF tested. A summary of these results is shown in Table 4. The Pfam domains associated with the highest percentage of soluble expressions (RRM_1, Motile_Sperm, Helicase_C, adh_short, Histone) all have well-known structural domains. Conversely, the targets more interesting to structural genomics (the 7TM chemoreceptor, the Neurotransmitter-gated ion-channel, and the Collagen triple helix repeat) prove difficult to express. Our data are consistent with previous observations for bacterial expression of eukaryotic proteins (Braun et al. 2002
GRAVY and Expression Our bioinformatics analysis of the expression data for the 10,167 ORFs suggests that overall hydrophobicity is the most important factor for an ORF to yield a soluble expression product. This provides significant experimental validation for using hydrophobicity in bioinformatics analysis. This conclusion is illustrated by 0.6-mL scale expression data for 87 genes in one plate, which was not included in the data set used in the analysis. Figure 6 lists the genes by GRAVY value, annotated with the presence of a signal peptide and transmembrane helices, and color coded with expression data.
No expression was observed for the 19 genes with the highest GRAVY values. The more negative the GRAVY value becomes, the more likely it is that an ORF exhibits soluble protein expression. Nine of the 11 soluble proteins, including the three with the highest total protein yield, have GRAVY values less than -0.4, the average GRAVY value for the C. elegans genome. Although protein expression was observed for the majority of the genes in that range, they are not all soluble. Even for those with the lowest GRAVY, solubility varies. In summary, the observation is that low GRAVY implies expressibility. Soluble expression, however, depends on other factors and cannot be accurately predicted by bioinformatics methods alone. Thus, for the foreseeable future, empirical screening appears to be the only reliable way to identify those ORFs that can be expressed in a soluble form.
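GRAVY (grand average of hydropathy) is the mean Kyte-Doolittle hydropathy value over all residues of a sequence (Kyte and Doolittle 1982). The paper lists the ExPASy ProtParam tool as its source of GRAVY values; the standalone function below is only an illustrative sketch of the same calculation, and the names `gravy` and `below_genome_average` are hypothetical. The -0.4 cutoff is the C. elegans genome average quoted in the text.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

GENOME_AVERAGE_GRAVY = -0.4  # C. elegans average quoted in the text


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy of a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def below_genome_average(sequence: str) -> bool:
    """True if the sequence is more hydrophilic than the genome average."""
    return gravy(sequence) < GENOME_AVERAGE_GRAVY


# A hydrophobic stretch scores high; a charged, polar stretch scores low.
print(gravy("IIII"))
print(below_genome_average("KRDE"))
```

As the text stresses, a GRAVY value below the cutoff only makes soluble expression more likely; it does not replace empirical screening.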
In the genome-wide protein expression effort, we have analyzed 10,167 ORFs, comprising nearly half of the predicted C. elegans ORFeome. By comparison, Zhu and colleagues expressed 5800 yeast genes, accounting for 93% of the genome (Zhu et al. 2001 1000 (Christendat et al. 2000
Our current HTP protein expression pipeline uses a single expression vector, pET15G with a His6-tag, in combination with one E. coli host strain. Protein expression was observed on 47.7% of the 10,167 ORFs studied, by comparison to the success rates of 50% in a study of 65 genes (Reboul et al. 2003
A number of factors contribute to whether or not any given gene expresses soluble protein in an E. coli-based heterologous system. The first is the biological properties of the target gene. The bioinformatics analysis of our expression data indicates that the most significant trend was that a homologous structure deposited in the PDB implied solubility. However, sequences with a PDB homolog are the lowest priority for structural genomics efforts. We had prioritized our plates by giving a higher priority to the plates that contain fewer ORFs having PDB homologs. Therefore, the half of the genome for which we report data herein is enriched for those ORFs encoding fewer PDB homologs. This may affect the soluble rate in a negative way. On the other hand, our current approach is not directed at expressing folded membrane proteins, which constitute 30% of a typical proteome (Christendat 2000; Heinemann 2002 A second factor is protein expression conditions. To achieve a proteome-scale analysis, expression and solubility data were obtained using the same conditions for all genes. Generally, expression solubility can be improved by optimization of expression conditions for each clone. HTP operation in 96-well format precludes such individual ORF-based optimization.
The third limiting factor results from the current single-ORF approach, in which each ORF is placed in one well and each protein expression construct contains only one ORF of interest. As previously observed (Adams et al. 2003 The fourth factor is simply that eukaryotic proteins are being expressed in a prokaryotic host. We are expressing eukaryotic proteins that are known to undergo posttranslational modification in their native environment, whereas E. coli lacks posttranslational modification and other properties of the eukaryotic system. One of the consequences is that eukaryotic multidomain proteins cannot be expressed in E. coli in soluble form.
The above limitations also point to avenues for future improvements. There are various approaches for improving soluble expression that are feasible in an HTP environment. For example, the use of multiple fusion tags has been demonstrated in several studies to improve the overall soluble expression rate. Using four fusion tags, 128 human proteins were expressed in E. coli with a combined soluble expression of 83% (Braun et al. 2002 E. coli is notorious for driving overexpressed foreign proteins into inclusion bodies, thereby rendering them insoluble. Many proteins in our experiments have high expression yields but low solubility. Refolding, after the inclusion bodies have been treated to solubilize the proteins, is a practical approach.
We are exploring a number of these approaches, namely, potential new fusion tags, coexpression of potential partner ORFs, eukaryotic expression systems, and HTP refolding to increase soluble protein production. Our work reported with the E. coli expression system, however, demonstrates the ability to develop automation strategies and methods for HTP recombinant protein expression. The pipeline thus developed can be readily adapted to expression using different vector and/or host combinations, a eukaryotic expression system, or coexpression of two or more ORFs simultaneously. In addition, the availability of an evolving ORFeome resource (Lamesch et al. 2004 HTP methods are clearly an important tool for structural proteomics. Our integrated robotic pipeline streamlines the complex experimental procedures and makes it possible to carry out protein expression for thousands of genes in a timely and reproducible manner. This effort, the largest recombinant protein expression study of a single organism to date at 10,167 genes, has yielded a significant number of novel targets for structural characterization. Furthermore, the effort has given scientific insights into using bacterial hosts to express eukaryotic proteins, and has provided enough data to critically consider the often anecdotal results of recombinant protein expression.
There are three aspects in developing a robotic pipeline for HTP recombinant protein expression, that is, cloning of expression constructs and optimization of the related basic molecular biology protocols, miniaturization of the protocols, and automation of bench processes.
Our basic molecular biology approach is based on the Gateway cloning and expression technology (Hartley et al. 2000
The focus of our experiments was to adapt the Gateway technology to our general pipeline by selecting and engineering compatible vectors and host cell lines for DNA plasmid miniprep and protein expression, and by converting and developing protocols that are amenable to automation. The E. coli expression system based on bacteriophage T7 RNA polymerase developed by Studier (Studier et al. 1990
Miniaturization of Basic Protocols
Automation of Miniaturized Protocols Our integrated robotic platform is centered on the Beckman/Sagian core system, including Biomek FX and Biomek 2000 liquid handlers (Beckman-Coulter), a DNA Engine Tetrad Cycler (MJ Research), an ELX-405UV plate washer (Bio-Tek), a SpectraMax UV/Vis plate reader (Molecular Devices), a Polarstar fluorescence plate reader (BMG Lab-Technologies), four temperature-controlled shaker-incubators, a centrifuge with microplate carrier, and a BioRobot 9600 (QIAGEN, http://www1.qiagen.com). The integrated robotic system is supported by its system software and in-house programs for specific applications. An automated method for DNA plasmid miniprep was developed on the Biomek FX configured with a vacuum manifold, a plate shaker, and a 96-well pipetting tool, along with a labware gripping tool for plate movement. A semiautomated method was developed on the BioRobot 9600, which has a dynamically controlled vacuum manifold but requires hands-on operation to move plates. A manual vacuum device by Eppendorf with four filter plate positions was also used for plasmid miniprep.
To automate the bacterial transformation using the standard heat-shock method, which requires strict time and temperature control, we created a novel heat-shock station and software control for the Beckman core robotic system, whereby four complete 96-well plates may be transformed in an automatic fashion in Small-scale protein purification was automated on a Biomek FX robot with two vacuum manifolds. The pellets of bacterial cell cultures were harvested and lysed by lysozyme using robot manipulations, followed by centrifugation to separate the supernatant from pellets, and then Ni affinity purification using Ni-NTA (QIAGEN) beads in 96-well filter plates. The purification was coupled to an automated ELISA in 96-well format for protein expression analysis. The current manual centrifugation can be integrated into the automated process by purchasing a robotic-compatible centrifuge.
Streamlined Process
Construction of the Expression Clones Our starting material for the HTP protein expression was the ORFeome collection of full-length C. elegans ORFs (Reboul et al. 2003
The products of the LR reactions were transformed into competent E. coli DH5
Protein Expression and Solubility Profiling For any proteome-scale protein expression in E. coli, the percentage of soluble proteins found by using a single host and expression vector combination usually is <25%. We carried out protein expression in two steps, a small-scale screen of 0.6 mL in 96-well plates at two temperatures, typically 18 and 37°C, to profile the protein expression level and identify soluble candidates for scale-up expression, which is at 1-L or 6-L scales for protein production.
Small-scale protein expression was carried out in 2-mL 96-well block plates. After bacterial growth, cell pellets were lysed by freezing overnight at -80°C and then thawed at room temperature for 15 min before adding 500 µL lysis buffer (50 mM NaH2PO4, 300 mM NaCl, 10 mM imidazole, 1 mg/mL lysozyme at pH 8.0). After mixing, cell lysis continued by shaking for Solubility profiling uses a fully automated ELISA based on a new profiling concept (C.-H. Luan, S.H. Qiu, R.J. Gray, B.J. Finley, and M. Luo, in prep.). The multi-data-set ELISA was analyzed by using in-house software to score solubility and to assist in determining optimal conditions for large-scale expression. The analysis also provides information to assist refolding decisions for those ORFs not yielding enough soluble protein. The method has a success rate of >95% in expression of C. elegans proteins in E. coli, as judged by 1-L scale-up expression of the soluble candidates identified in the small-scale screen. ELISA, however, does not show whether an expressed protein has the correct molecular size. Therefore, orthogonal methods, such as SDS-PAGE and DNA or protein sequencing, were used in a 1-L confirmation stage for the soluble candidates identified in the small-scale screen. In the multi-data-set ELISA, each bacterial culture plate was separated into four plates for analysis: one for supernatant without purification, one for supernatant with purification, one for pellet without purification, and one for pellet with purification. The corroboration of results across plates reduces the error in detection by ELISA. Each gene plate was expressed at two temperatures, 37°C and 18°C. Thus, each gene was associated with eight ELISA data sets, effectively increasing the accuracy of the solubility profiling.
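The eight ELISA data sets per gene follow from crossing the two cell fractions, the two purification states, and the two expression temperatures described above. The sketch below simply enumerates that design; it does not reproduce the authors' in-house scoring software, and all names in it are hypothetical.

```python
from itertools import product

FRACTIONS = ("supernatant", "pellet")     # soluble vs insoluble fraction
TREATMENTS = ("unpurified", "purified")   # without / with Ni purification
TEMPERATURES_C = (18, 37)                 # expression temperatures in the text


def elisa_data_sets() -> list:
    """List every (temperature, fraction, treatment) combination per gene."""
    return [
        {"temperature_c": temp, "fraction": fraction, "treatment": treatment}
        for temp, fraction, treatment in product(
            TEMPERATURES_C, FRACTIONS, TREATMENTS
        )
    ]


data_sets = elisa_data_sets()
for entry in data_sets:
    print(entry)
print(len(data_sets), "data sets per gene")
```

Each combination corresponds to one of the four analysis plates at one of the two temperatures, giving 2 x 2 x 2 = 8 readings per gene.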
Because of the concern that scale-up from a 0.6-mL to a 1-L format is not always successful, the soluble candidates from the small-scale screen were expressed in 1-L cultures for confirmation and as a pilot study for large-scale purification. The 1-L expression was performed using two liter flasks in a temperature-controlled incubation shaker. To accommodate the individuality of each protein within a high-throughput operation, a small number of purification conditions were designed and used to select the optimal purification condition for large-scale production.
Scale-Up Protein Expression and Purification After profiling expression level, solubility, and optimal expression conditions for each protein, individual clones were selected for production in six 1-L cultures. The soluble proteins were purified by use of the standard protocols with affinity, ion-exchange, and size exclusion chromatography to obtain homogeneous protein preparations. The purified proteins were then concentrated and used in crystallization trials. Insoluble proteins were subjected to in vitro refolding, if necessary.
We acknowledge funding from NIH (NIGMS 1P50-GM62407), usage of a robotic system purchased by funds from NSF (EPSCOR), and partial support to C.-H. Luan from NASA's cooperative agreement (NCC8-126) to the Center for Biophysical Sciences and Engineering. We thank ResGen for providing a number of the Entry clones during the initial phase of this work.
3 Corresponding authors. E-MAIL luanch{at}uab.edu; FAX (205) 934-7341. E-MAIL mingluo{at}uab.edu; FAX (205) 975-9578. Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.2520504.
Adams, M.W.W., Dailey, H.A., DeLucas, L.J., Luo, M., Prestegard, J.H., Rose, J.P., and Wang, B.C. 2003. The Southeast collaboratory for structural genomics: A high-throughput gene to structure factory. Acc. Chem. Res. 36: 191-198.
Bairoch, A., Bucher, P., and Hofmann, K. 1997. The PROSITE database, its status in 1997. Nucleic Acids Res. 25: 217-221.
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. 2000. The Protein Data Bank. Nucleic Acids Res. 28: 235-242.
Braun, P. and LaBaer, J. 2003. High throughput protein production for functional proteomics. Trends Biotechnol. 21: 383-388.
Braun, P., Hu, Y., Shen, B., Halleck, A., Koundinya, M., Harlow, E., and LaBaer, J. 2002. Proteome-scale purification of human proteins from bacteria. Proc. Natl. Acad. Sci. 99: 2654-2659.
Brenner, S. 1974. The genetics of Caenorhabditis elegans. Genetics 77: 71-94.
Burley, S.K., Almo, S.C., Bonanno, J.B., Capel, M., Chance, M.R., Gaasterland, T., Lin, D., Sali, A., Studier, F.W., and Swaminathan, S. 1999. Structural genomics: Beyond the human genome project. Nat. Genet. 23: 151-157.
The C. elegans Sequencing Consortium. 1998. Genome sequence of the nematode C. elegans: A platform for investigating biology. Science 282: 2012-2018.
Chambers, S.P. 2002. High-throughput protein expression for the post-genomic era. Drug Discov. Today 7: 759-765.
Chance, M.R., Bresnick, A.R., Burley, S.K., Jiang, J.-S., Lima, C.D., Sali, A., Almo, S.C., Bonanno, J.B., Buglino, J.A., Boulton, S., et al. 2002. Structural genomics: A pipeline for providing structures for the biologist. Protein Sci. 11: 723-738.
Christendat, D., Yee, A., Dharamsi, A., Kluger, Y., Savchenko, A., Cort, J.R., Booth, V., Mackereth, C.D., Saridakis, V., Ekiel, I., et al. 2000. Structural proteomics of an archaeon. Nat. Struct. Biol. 7: 903-909.
Ellis, H.M. and Horvitz, H.R. 1986. Genetic control of programmed cell death in the nematode C. elegans. Cell 44: 817-829.
Finley, B.J., Qiu, S.H., Luan, C.-H., and Luo, M. 2004. Structural genomics for Caenorhabditis elegans: High throughput protein expression analysis. Protein Expr. Purif. 34: 49-55.
Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R.D., and Bairoch, A. 2003. ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res. 31: 3784-3788.
Hartley, J.L., Temple, G.F., and Brasch, M.A. 2000. DNA cloning using in vitro site-specific recombination. Genome Res. 10: 1788-1795.
Heinemann, U. 2002. Establishing a structural genomics platform: The Berlin-based Protein Structure Factory. Gene Funct. Disease 3: 25-32.
Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E.L.L. 2001. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 305: 567-580.
Kyte, J. and Doolittle, R.F. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105-132.
Lamesch, P., Milstein, S., Hao, T., Rosenberg, J., Li, N., Sequerra, R., Bosak, S., Doucette-Stamm, L., Vandenhaute, J., Hill, D.E., et al. 2004. C. elegans ORFeome version 3.1: Increasing the coverage of ORFeome resources with improved gene predictions. Genome Res. (this issue).
Lesley, S.A., Kuhn, P., Godzik, A., Deacon, A.M., Mathews, I., Kreusch, A., Spraggon, G., Klock, H.E., McMullan, D., Shin, T., et al. 2002. Structural genomics of the Thermotoga maritima proteome implemented in a high-throughput structure determination pipeline. Proc. Natl. Acad. Sci. 99: 11664-11669.
Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10: 1-6.
Norvell, J.C. and Zapp-Machalek, A. 2000. Structural genomics programs at the US National Institute of General Medical Sciences. Nat. Struct. Biol. 7: 931.
Reboul, J., Vaglio, P., Rual, J.-F., Lamesch, P., Martinez, M., Armstrong, C.M., Li, S., Jacotot, L., Bertin, N., Janky, R., et al. 2003. C. elegans ORFeome version 1.1: Experimental verification of the genome annotation and resource for proteome-scale protein expression. Nat. Genet. 34: 35-41.
Shih, Y.-P., Kung, W.-M., Chen, J.-C., Yeh, C.-H., Wang, A. H.-J., and Wang, T.-F. 2002. High-throughput screening of soluble recombinant proteins. Protein Sci. 11: 1714-1719.
Stevens, R.C. and Wilson, I.A. 2001. Industrializing structural biology. Science 293: 519-520.
Studier, F.W., Rosenberg, A.H., Dunn, J.J., and Dubendorff, J.W. 1990. Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol. 185: 60-89.
Walhout, A.J., Temple, G.F., Brasch, M.A., Hartley, J.L., Lorson, M.A., van den Heuvel, S., and Vidal, M. 2000. GATEWAY recombinational cloning: Application to the cloning of large numbers of open reading frames or ORFeomes. Methods Enzymol. 328: 575-592.
Wood, W.B. 1988. The nematode Caenorhabditis elegans. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
Zhu, H., Bilgin, M., Bangham, R., Hall, D., Casamayor, A., Bertone, P., Lan, N., Jansen, R., Bidlingmaier, S., Houfek, T., et al. 2001. Global analysis of protein activities using proteome chips. Science 293: 2101-2105.
http://sgce.cbse.uab.edu; Structural Genomics of C. elegans.
http://sgi.com; MineSet decision tree software.
http://www.invitrogen.com; Gateway Cloning and Expression Technologies, Invitrogen.
http://www1.qiagen.com; Qiagen.
http://ww1.novagen.com; Novagen.
http://gce.cbse.uab.edu; SGCE.
http://www.cbs.dtu.dk/services/SignalP/; Signal Peptides.
http://www.cbs.dtu.dk/services/TMHMM-2.0/; Transmembrane Helices.
http://ca.expasy.org/tools/protparam.html; EXPASY (for MW, pI, GRAVY).
Received February 26, 2004; accepted in revised format August 16, 2004.