Genome Res. 14:1447-1461, 2004 ©2004 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/04 $5.00
The Complete Genome and Proteome of Mycoplasma mobile
1 Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts 02138, USA
2 Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
3 The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02141, USA
4 The Rowland Institute at Harvard, Cambridge, Massachusetts 02141, USA
Although often considered "minimal" organisms, mycoplasmas show a wide range of diversity with respect to host environment, phenotypic traits, and pathogenicity. Here we report the complete genomic sequence and proteogenomic map for the piscine mycoplasma Mycoplasma mobile, noted for its robust gliding motility. For the first time, proteomic data are used in the primary annotation of a new genome, providing validation of expression for many of the predicted proteins. Several novel features were discovered including a long repeating unit of DNA of 2435 bp present in five complete copies that are shown to code for nearly identical yet uniquely expressed proteins. M. mobile has among the lowest DNA GC contents (24.9%) and most reduced set of tRNAs of any organism yet reported (28). Numerous instances of tandem duplication as well as lateral gene transfer are evident in the genome. The multiple available complete genome sequences for other motile and immotile mycoplasmas enabled us to use comparative genomic and phylogenetic methods to suggest several candidate genes that might be involved in motility. The results of these analyses leave open the possibility that gliding motility might have arisen independently more than once in the mycoplasma lineage.
Often considered the smallest independently self-replicating organisms, the mycoplasmas are wall-less bacteria characterized by small physical dimensions and genome sizes (Razin et al. 1998
Mycoplasma mobile is one of the flask-shaped mycoplasmas (
We also sought to use a new combined genomic/proteomic approach to genome sequencing and annotation called proteogenomic mapping (Jaffe et al. 2004a
Primary Genome and Proteome Features
Our assembly of the complete genome (Fig. 1) consists of a single circular chromosome of 777,079 bp with a GC content of 24.9 mole%, low even among the GC-poor mycoplasmas (range 24%–40%). We have proposed 635 protein coding sequences (CDSs), of which 557 (88%) have been validated as expressed proteins by proteogenomic mapping. Initially, 612 genes were computationally predicted, but four of these were manually removed and one other gene was added prior to incorporation of proteomic evidence based on homology searches. Subsequently, 26 additional genes were added through proteomics to reach the total of 635. Of these proteins, 463 could be placed into clusters of orthologous groups (COGs) via homology search (Tatusov et al. 2000 180° away (with respect to the circular chromosome) from the conserved 16S–23S rDNA, suggesting a major genome rearrangement may have taken place about this axis (see Fig. 1). We have found evidence for 28 tRNA genes, the fewest of any organism yet reported (see Supplemental tables of RNA and Codon Usage; Chambaud et al. 2001
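The gene-count bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the variable names are ours and purely illustrative.

```python
# Gene-count bookkeeping using only the figures quoted in the text.
computational_calls = 612   # initial computational predictions
removed_manually = 4        # removed before proteomic evidence was incorporated
added_manually = 1          # added before proteomic evidence was incorporated
added_by_proteomics = 26    # found only through proteogenomic mapping

total_cds = (computational_calls - removed_manually
             + added_manually + added_by_proteomics)

validated = 557             # CDSs with proteomic evidence
in_cogs = 463               # CDSs placed into COGs

fraction_validated = validated / total_cds
fraction_in_cogs = in_cogs / total_cds

print(total_cds)
print(f"{fraction_validated:.1%} validated, {fraction_in_cogs:.1%} placed in COGs")
```

The total reproduces the 635 CDSs reported, and the validated fraction rounds to the 88% quoted above.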
We have assigned the origin of replication based on homology to other mycoplasma genomes (Cordova et al. 2002 dnaA dnaN is present as is observed in M. pulmonis and several other mycoplasmas. In this case there also is a copy of the presumed major surface antigen in between rpmH and dnaA (see below). In following the convention of M. pulmonis, we have numbered as base 1 the first nucleotide after the stop codon of the gene immediately preceding dnaA (Chambaud et al. 2001
Because we were able to collect and analyze proteomic data concurrently with genome sequencing, we have good evidence for many of the proteins that we predict for this genome (Fig. 2; Supplemental Gene Table). As always with a novel genome, there are a fair number of "unknown" proteins that have no homologs or have dubious functionality based on homology searches, but now with proteomics we have verified that many of them are expressed. Hence, we have adopted a controlled vocabulary in annotating these "unknowns." Predicted proteins for which there are no proteomic data are annotated as "hypothetical," or as "conserved hypothetical" if there is supporting evidence of homology in other species. Proteins for which we do have proteomic evidence but little functional information are usually annotated as "expressed protein of unknown function." In our annotation scheme, the word "putative" refers only to the presence of a particular functionality, and never to whether the protein exists. Because GenBank does not yet have a facility for integrating proteomic data into genome annotation, we have provided a summary of proteomic evidence for each protein in the "note" field of the GenBank annotation, where we specify the number of unique peptides detected for the protein and the percent coverage by amino acid sequence for the protein. Full proteomic coverage can be viewed interactively at http://www.broad.mit.edu/annotation/microbes/mycoplasma/.
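As an illustration of the two quantities recorded in the "note" field, the following sketch computes the number of unique peptides and the percent of residues covered for one protein. It is a minimal sketch, not the pipeline used for the annotation: the function names, the 1-based inclusive peptide coordinates, and the wording of the summary string are all assumptions made for this example.

```python
def percent_coverage(protein_length, peptides):
    """Percent of residues covered by at least one peptide.

    peptides: iterable of (start, end) residue positions, 1-based inclusive
    (an assumed convention for this example).
    """
    covered = 0
    last_end = 0  # highest residue position already counted
    for start, end in sorted(peptides):
        start = max(start, last_end + 1)  # skip residues already counted
        end = min(end, protein_length)
        if end >= start:
            covered += end - start + 1
            last_end = end
    return 100.0 * covered / protein_length


def note_summary(protein_length, peptides):
    """Build a hypothetical summary string; the format is illustrative only."""
    unique = set(peptides)
    coverage = percent_coverage(protein_length, unique)
    return f"{len(unique)} unique peptides; {coverage:.0f}% sequence coverage"


# Toy data: three peptides on a 100-residue protein, two of them overlapping.
print(note_summary(100, [(1, 10), (5, 20), (50, 60)]))
```

Overlapping peptides are merged before counting, so a residue seen in several peptides contributes to coverage only once.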
General features of the proteome closely mirrored those in M. pneumoniae (Jaffe et al. 2004a
We detected 94% of well-annotated proteins but only 76% of poorly annotated proteins. In fact, the proposed gene products not detected by proteomics are functionally enriched for the "unknown function" category of annotation (55% of all undetected ORFs belong to COG category S, p-value = 1 × 10^-9 for functional enrichment using the hypergeometric distribution method; Sokal and Rohlf 1995 Several other "named" genes were not detected by proteomics. As with M. pneumoniae, the Holliday-junction helicases ruvA and ruvB were not detected. One copy of RNAse H II was not detected, but a second copy was, thus its functionality is clearly present. As well, M. mobile's copy of ribosomal protein L36 (rpmJ) was not detected, but this particular gene has a suspicious phylogeny despite its conserved gene order with M. pulmonis and M. pneumoniae (see below). The rpmJ protein was detected in M. pneumoniae, but this protein is extremely small and missing it could have simply been an unlucky happenstance. Curiously, ATP synthase subunit C (atpE) was not detected although its presence in vivo is assumed. Twenty-six proteins that were not computationally predicted were detected by proteomics (identification methods for each gene call can be seen in the Supplemental Table of Computational and Proteomic Predictions). There was no detectable size or functional category bias to these proteins, but eight of the 26 were <100 amino acids. Thirty-nine additional modifications were made to the start codons of computationally predicted genes based on proteomic evidence. This represents a net 10% modification to genome annotation based on incorporation of proteomics into gene modeling, and the additional 26 proteins represent 2.1% of the total genome coding capacity. Of the 608 genes originally predicted by computational methods, 531 were supported by proteomic evidence (84%). Of the 11,825 positionally unique peptides used to construct the proteogenomic map for M. mobile, 96.7% were mapped onto computationally predicted genes whereas the remainder provided the basis for the additional 26 ORFs. No detected peptides conflicted with a computationally predicted gene in terms of translational frame, however, and thus no computationally predicted genes were removed based on proteomics.
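The functional-enrichment test mentioned above is a one-sided hypergeometric tail probability. The sketch below shows how such a p-value can be computed with the standard library alone. The counts in the usage example are partly hypothetical: 635 total CDSs and 78 undetected CDSs (635 minus 557) follow from the text, 43 is roughly 55% of 78, but the genome-wide number of category S genes (150 here) is not given in this section and is an assumption for illustration only.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric variable X.

    population: total number of genes
    successes:  genes in the category of interest
    draws:      genes in the tested subset (e.g., undetected genes)
    observed:   genes in the subset that belong to the category
    """
    upper = min(draws, successes)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Illustrative call; the value 150 is a hypothetical category size.
p = hypergeom_enrichment_p(population=635, successes=150, draws=78, observed=43)
print(f"p = {p:.2e}")
```

Exact integer binomials avoid the rounding problems that can arise when very small tail probabilities are summed in floating point.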
We searched for possible phosphorylation (on serine, threonine, or tyrosine), methylation (on lysine, aspartic acid, or glutamic acid), and acetylation (on lysine) of proteins by varying our SEQUEST search strategy accordingly. We did not detect any convincing posttranslational modifications by proteomics, although the data may be reanalyzed more thoroughly at a later date. This is in contrast to M. pneumoniae in which we detected phosphorylation of the HPr phosphocarrier protein using similar methods (Jaffe et al. 2004a
Functional Analysis and Metabolism
The genome encodes numerous transporters with a wide range of substrates and possesses some additional transporters of unknown specificity. Both ATP-binding cassette (ABC) and phosphoenolpyruvate-dependent phosphotransferase system (PTS) transporters are represented. It appears that M. mobile should be able to transport and metabolize glucose, sucrose, fructose, maltose/maltodextrin, xylose, and trehalose as energy sources. This has been shown experimentally for glucose and sucrose (Jaffe et al. 2004b
Fermentation of sugars appears to be the only method of ATP production in M. mobile. A complete glycolysis pathway is present that terminates in the formation of lactate. Most of the nonoxidative branch of the pentose phosphate pathway is present except for the reaction normally carried out by transaldolase. This is similar to the situation in M. mycoides and other mycoplasmas, and presumably this function is carried out by an as-yet-unrecognized protein (Pollack 2002
Other ionic homeostases are achieved through a variety of transporters. Phosphate, formate/nitrite, and cobalt transporters are specifically observed, as well as the Na+/Ca2+ antiporter ecm27 and the K+/Na+ symporter ktrAB. We also note the presence of mscL, the large conductance mechanosensitive channel for response to osmotic stress (Sukharev et al. 1994
As with other mycoplasmas, de novo amino acid and nucleotide synthesis is lacking in M. mobile (Pollack 2002
A variety of DNA restriction/modification system enzymes are detected in M. mobile. This may account for the organism's recalcitrance to genetic manipulation via transposon mutagenesis (J. Jaffe, unpubl.; M. Miyata, pers. comm.), even though transposon mutagenesis is a popular technique in other mycoplasma species. A similar obstacle was overcome in Mycoplasma arthritidis by using an appropriate DNA modification enzyme to circumvent that species' endogenous restriction enzymes (Voelker and Dybvig 1996
Long Repeat Region
The meaning of these repeated sequences is unclear. Because of the orientation and the site of the truncation of the abortive repeat, it appears that the repeats are generated from 5' to 3' on the plus strand, possibly by polymerase slippage during replication or by replication stalling with subsequent upstream repriming by the leading strand. Yet the proteins that these sequences encode are transcribed from the minus strand of the genome. At 20 mole%, this region of the genome is also even lower in GC-content than the rest of the genome. Southern blot experiments (data not shown) indicate that the number of repeats is stable over several passages in culture in a population. Only a single 14.9-kb band was visualized when a BbvI restriction digest designed to cut around the repeat region was probed with a sequence targeted to all of the repeat regions. Coincidentally, these experiments allowed us to fix the number of full repeats in the genome at five and complete its assembly. This finding indicates that either homologous recombination that would result in "looping out" of the repeats occurs at only a low level in M. mobile or that this region is somehow protected from that type of recombination. M. mobile should be capable of homologous recombination, as it possesses and expresses the recA gene (Cassuto et al. 1980
Putative Major Surface Antigen
Many mycoplasmas have a major antigenic protein of which there are multiple copies with variable sequences present in the genome, one of which is expressed at a given time. For example, M. pulmonis has the vsa proteins and M. pneumoniae has the P1 protein present many times each in the genome (Himmelreich et al. 1996
Other Genes Present in Multiple Copies
Several other duplications of note are present in the M. mobile genome. There are three copies of manB scattered throughout the genome as opposed to one in M. pulmonis. There are also multiple copies of ATP synthase α and β subunits located outside the main ATP synthase cluster. This feature is shared with M. pulmonis and other mycoplasmas. Again, our proteomics data show that each copy of ATP synthase α and β is uniquely expressed, based on detected variations at the coding level of the multiple paralogs. The genome segment containing MMOB0970 and MMOB0980 appears to have been duplicated at MMOB5690 and MMOB5700. As well, the tandem duplication of MMOB0190 and MMOB0200 is repeated at MMOB6040 and MMOB6050, a strange example of a nontandem duplication of a tandem duplication.
Potential Motility Genes
There are now several motile and immotile mycoplasmas that have had their genomes sequenced, and we attempted a comparative genomic analysis to search for potential motility proteins. We searched for several patterns that might indicate a gene's relevance for motility. According to the literature, M. pneumoniae, M. pulmonis, M. genitalium, and Mycoplasma gallisepticum all possess the ability to locomote, although the latter appears to do so extremely slowly (Kirchhoff 1992 We then reanalyzed the data under the premise that ancestral motility genes might be present throughout the mycoplasmas, but mutation might have specifically inactivated them in the immotile ones. Because we hold M. mobile as the paradigmatic motile mycoplasma, we attempted to look for M. mobile genes that were phylogenetically clustered with genes from the other motile mycoplasmas and more divergent from their orthologs in the immotile species. We used a rudimentary form of nearest-neighbor clustering termed the "Group Phylogenetic Bias" (see Methods for details). In this case, the two groups considered are the Motile Group and the Immotile Group, as defined above. We used the BLAST E-value as a surrogate for phylogenetic distance of each M. mobile gene to orthologs in the other mycoplasma species discussed above (i.e., a lower E-value implies a closer phylogenetic relationship). Table 4 shows the high-scoring Motile Group Phylogenetic Bias genes. Several interesting candidates are identified by this method.
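The idea of using BLAST E-values as a surrogate for phylogenetic distance can be sketched as follows. This is not the Group Phylogenetic Bias formula from the Methods, which is not reproduced in this section; it is one plausible scoring under stated assumptions. Here each E-value is converted to a closeness score (-log10 E), and a gene's bias is the mean closeness to Motile Group orthologs minus the mean closeness to Immotile Group orthologs. The function names, the floor value, the group memberships passed in, and the E-values are all hypothetical.

```python
import math


def closeness(evalue, floor=1e-180):
    """Convert a BLAST E-value to a closeness score; lower E means closer.

    The floor guards against E-values reported as 0.0 (assumed cutoff).
    """
    return -math.log10(max(evalue, floor))


def group_bias(evalues, motile, immotile):
    """Mean closeness to motile-group orthologs minus mean closeness to
    immotile-group orthologs. Returns None if either group has no ortholog.

    evalues: dict mapping species name -> E-value of the best hit for one
             M. mobile gene; species without an ortholog are simply absent.
    """
    m = [closeness(evalues[s]) for s in motile if s in evalues]
    i = [closeness(evalues[s]) for s in immotile if s in evalues]
    if not m or not i:
        return None
    return sum(m) / len(m) - sum(i) / len(i)


# Hypothetical E-values for a single M. mobile gene.
hits = {"M. pulmonis": 1e-60, "M. pneumoniae": 1e-40, "M. penetrans": 1e-20}
score = group_bias(
    hits,
    motile={"M. pulmonis", "M. pneumoniae"},
    immotile={"M. penetrans"},
)
print(score)
```

A positive score indicates a gene that is closer to its orthologs in the motile species than to those in the immotile species; ranking genes by this score gives a candidate list analogous in spirit to Table 4.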
The second highest-scoring gene is secDF, part of the protein secretion apparatus. This is notable because it is likely to be localized to the membrane in the cell and, in combination with secA and secY, hydrolyzes ATP to translocate proteins across the membrane. One hypothesis might be that M. mobile has a specialized sec system that effects motility rather than secretes proteins. However, the only other member of the Motile Group to have secDF is M. pulmonis, which means that a specialized sec system would not be a universal effector of motility in the mycoplasmas. Those in the M. pneumoniae branch would require a separate mechanism. A specialized sec system might also indicate that protein extrusion plays a role in motility, analogous to the slime extrusion hypothesis for some cyanobacteria (Hoiczyk and Baumeister 1998 Another interesting candidate is MMOB2040, which is annotated as an unspecified permease of the major facilitator superfamily. Again, this would be a membrane protein that would have the capability of ATP hydrolysis when coupled to other permease subunits. In contrast to secDF, this protein is found in the motile mycoplasmas M. genitalium, M. pneumoniae, and M. pulmonis and the immotile species M. penetrans and M. mycoides. Because its permease specificity is ambiguous based on homology to other transporters, one can again imagine that its activity might be altered to provide a motility function.
Yet another noteworthy gene is the mgpA gene. This gene was originally identified as a major antigen and cytadhesin in M. genitalium, but every sequenced mycoplasma genome so far has revealed an ortholog (Hu et al. 1987 We attempted one other approach to search for motility genes in the mycoplasmas, which we term the "required core set" hypothesis. If the mechanism of motility is the same among the motile mycoplasmas, then they should all contain a core set of genes that potentiates motility. The immotile mycoplasmas might have one or more members of the core set, but not all of them. There are 52 genes shared by all the motile mycoplasmas where at least one of the immotile species lacks it. This list was then reduced by eliminating any genes that had an obvious functional assignment that seemed unlikely to contribute to motility. The remaining set of 12 genes is shown in Table 5, and fulfills the requirement that each of the immotile mycoplasmas could have one or more but not the complete set of these genes. It might be relatively easy to test the hypothesis that any one of these genes is required for motility by obtaining a suitable deletion in any of the mycoplasmas with amenable genetic tools, or conversely by adding back the putatively required components to any one of the immotile mycoplasmas by similar means. We note that another mgpA homolog (different from the one suggested by the previous analysis) is suggested by this method, further implicating it in motility.
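The "required core set" filter is essentially a pair of set operations, sketched below on toy data. The genome contents and gene labels are invented for illustration; only the two logical conditions come from the text: a candidate must be present in every motile genome and absent from at least one immotile genome, and no immotile genome may contain the complete final set.

```python
def required_core_candidates(motile_genomes, immotile_genomes):
    """Genes shared by all motile genomes that at least one immotile genome lacks.

    Both arguments map a species name to the set of gene families it contains.
    """
    shared = set.intersection(*motile_genomes.values())
    return {
        gene
        for gene in shared
        if any(gene not in genes for genes in immotile_genomes.values())
    }


def is_valid_core_set(core, immotile_genomes):
    """True if no immotile genome contains the complete core set."""
    return all(not core <= genes for genes in immotile_genomes.values())


# Toy gene-family contents (hypothetical labels).
motile = {"motile_1": {"a", "b", "c", "d"}, "motile_2": {"a", "b", "c", "e"}}
immotile = {"immotile_1": {"a", "b"}, "immotile_2": {"a", "c", "e"}}

candidates = required_core_candidates(motile, immotile)
print(sorted(candidates), is_valid_core_set(candidates, immotile))
```

In the analysis described above, the manual step of discarding genes with an obvious unrelated function would sit between these two functions, reducing the 52 candidates to the 12 genes in Table 5.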
Lateral Gene Transfer and Atypical Gene Phylogeny
M. mobile contains 35 genes that have homologs outside of the mycoplasmas but none within them (see Table 6 and Supplemental Table of Ambiguous Gene Phylogenies). These may be genes remaining from the last common Gram-positive ancestor (LCA) of the mycoplasmas before the onset of reductive evolution, or they may be genes inserted into M. mobile by lateral transfer. Of the 35 genes, 22 have a clear Gram-positive ancestry, and therefore they may be vestiges of the LCA. However, for two of them, bioF (MMOB5780) and wcaG (MMOB5790), sequence similarity to the closest Gram-positive is too good to represent a divergent copy of the gene after reductive evolution. These genes have 84% conserved residues (69% identity) over their entire sequence length to their ortholog in Enterococcus faecium (average hits to E. faecium are 45% ± 16% conserved, 28% ± 12% identical), and are adjacent in genomic sequence just as they are presumed to be in E. faecium (GenBank accessions gi 22992057 and gi 22992058). Given the promiscuous conjugative nature of the Enterococci, it is likely that these genes have been transferred to M. mobile via a direct encounter between the two organisms (Ruffin et al. 2000
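To make the "too good" argument concrete, the similarity of bioF and wcaG to their E. faecium orthologs can be expressed as a number of standard deviations above the average hit. This is a back-of-the-envelope sketch: it assumes the "±" values quoted above are standard deviations, which the text does not state explicitly, and it is not a test performed in the paper.

```python
def z_score(value, mean, sd):
    """Number of standard deviations by which value exceeds mean."""
    return (value - mean) / sd


# Figures quoted in the text; interpreting "±" as a standard deviation
# is an assumption made for this example.
conserved_z = z_score(84, mean=45, sd=16)
identity_z = z_score(69, mean=28, sd=12)

print(f"conserved residues: {conserved_z:.2f} SD above the average hit")
print(f"identical residues: {identity_z:.2f} SD above the average hit")
```

Under that assumption, both measures sit more than two standard deviations above the typical M. mobile hit to E. faecium, which is the quantitative sense in which these two genes stand out.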
Another convincing example of lateral gene transfer is the gene cluster consisting of MMOB0940 and MMOB0950. Both of these genes have orthologs in Helicobacter pylori, a member of the ε-proteobacteria, although they do not appear as adjacent genes in H. pylori. They may be derived from an as-yet-unsequenced Helicobacter species in which this gene order is present. As well, the latter gene is annotated as a DNA or RNA methylase whose closest BLAST hit is to a virus of Chlorella, a green alga. The homology to a viral gene may suggest a mechanism of transfer from one species to another. Other candidates for lateral gene transfer and genes with atypical phylogenies are shown in Table 6 and in the Supplemental Ambiguous Gene Phylogeny Table. One particularly interesting instance is a gene (MMOB5030) whose closest homolog is found in a Pirellula species, a marine bacterium. This may reflect an environmental opportunity of M. mobile to acquire new genes based on its natural aquatic habitat. The closest homolog of another gene (MMOB5990) is found in an archaeon, Ferroplasma acidarmanus.
The mycoplasmas are the most deeply sequenced genus to date. Here, we add not only the genome but also the proteome of M. mobile. When sequencing a new genome, many new and unexpected coding sequences are discovered. It is useful to be able to attribute these to real protein products at the outset of annotation rather than afterward. We discovered 26 genes only through proteomics and not through other gene-calling methods. Some of these genes were small, which suggests a reason that they may have been missed by gene-calling programs. One example is an expressed open reading frame (ORF) of 40 amino acids located between the genes atpG and atpA in the ATP synthase operon. In light of our proteomic results, we have used a controlled vocabulary and added notes to our GenBank annotation to indicate which coding sequences are validated as expressed at the protein level. M. mobile now stands as the organism with the greatest degree of proteomic coverage (88% of all predicted genes, 40% average sequence coverage of those detected) of any organism to date. This represents a new standard in genome sequencing and annotation and, we hope, will encourage future microbial genome consortia to include proteomics as an integral part of their efforts.
The importance of proteomics is also demonstrated in the detection of the multiple unique isoforms of the lrp proteins. Even with coding identities >90%, proteomic techniques can make useful distinctions when the data sets are sufficiently comprehensive. Proteomics also helped to establish that multiple isoforms of the putative major surface antigen are simultaneously expressed in culture, a key difference from other mycoplasma species studied so far. We were disappointed not to detect any posttranslational modifications of proteins, but one advantage of proteomic data sets is that they can be augmented by exploring different environmental conditions for cell growth or targeted experiments designed to capture various protein classes at a later date (Ficarro et al. 2002 We hoped that the abundance of genomic sequences for both motile and immotile mycoplasmas would shed light on the protein components involved in motility in the mycoplasmas. However, no obvious pattern of genes was detected that separated the motile and the immotile groups. We used two strategies to suggest candidate genes that may be involved in motility. First, we looked for a phylogenetic bias toward the M. mobile ortholog of a gene in the motile mycoplasmas and against it in the immotile species (the Group Phylogenetic Bias). Second, we looked for a minimal set of genes that were all present in the motile mycoplasmas but had one or more members missing in each of the immotile mycoplasmas (the Required Core Set). Using these methods, we have suggested several genes that might be involved in motility in the mycoplasmas. We hope that these predictions can be tested experimentally in the future. To that end, genomic sequencing has now revealed several DNA restriction/modification systems that might aid in the development of genetics tools for M. mobile, and has suggested that Enterococcus might make a good conjugation partner for M. mobile.
However, given the difficulty thus far of identifying motility genes by comparative genomic methods, we must raise the question of whether gliding motility was a common feature in an ancestral mycoplasma or if it arose more than once in the mycoplasma phylogeny. The motile mycoplasmas fall distinctly in two branches of the phylogeny (Fig. 4; Maniloff 2002
Although the search for motility genes is ongoing, we did discover a surprising and novel proteogenomic element in M. mobile that is unprecedented in other bacteria. The long repeat region initially complicated the assembly of this genomic sequence, and we were surprised that the repeats, in fact, code for highly similar yet uniquely expressed proteins. Deciphering the function of these repeats and their proteins should prove an interesting challenge for the future. This discovery underscores the reality that there are no "simple genomes" and demonstrates the value of continuing to sequence the genomes of even the smallest organisms.
Growth Conditions, DNA Isolation, and Protein Isolation
M. mobile 163K (ATCC 43663) was kindly provided by Makoto Miyata of Osaka City University, who also shared his expertise in its cultivation. Cultures of M. mobile were grown in 150-mL plastic flasks without shaking in Aluotto medium at 22°C (Aluotto et al. 1970
DNA Sequencing, Assembly, and Finishing
Sequence reads were derived from both ends of the inserts to generate paired-end reads as previously described (Lander et al. 2001
The optimal assembly was generated with a total of 16,376 reads derived from the 2-kb, 4-kb, 6-kb, 8-kb, and 10-kb inserts and yielded 12-fold sequence coverage of the M. mobile genome with a PHRED quality score of
Proteogenomic Mapping
Annotation and Analysis
tRNA genes were detected by the tRNAscan-SE program (Lowe and Eddy 1997
BIOPERL was used in parsing of homology search results and generation of some figures for this paper (Stajich et al. 2002
Phylogenetic-Distance Search for Motility Genes
We thank Makoto Miyata for his gift of M. mobile and for sharing his expertise in cultivation of this organism and Jeremy Zucker for assistance in our METACYC analysis. We gratefully acknowledge Eric S. Lander for his critical support of this project. We also thank all members of the genome sequencing platform at the Broad Institute. Funding of this work was provided by support from the Broad Institute and by grants to H.C.B. (NIH Grant # AI16478), and G.M.C. (DOE GTL). There are no known competing financial interests involved in this work. The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.2674004.
5 Corresponding author.
[Supplemental material is available online at www.genome.org. The sequence data from this study have been submitted to GenBank under accession no. AE017308. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: M. Miyata, T. Knight, and N. Stange-Thomann.]
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.
Aluotto, B.B., Wittler, R.G., Williams, C.O., and Faber, J.E. 1970. Standardized bacteriologic techniques for the characterization of mycoplasma species. Intl. J. Syst. Bacteriol. 20: 35–58.
Barre, A., de Daruvar, A., and Blanchard, A. 2004. MolliGen, a database dedicated to the comparative genomics of Mollicutes. Nucleic Acids Res. 32 Database issue: D307–D310.
Batzoglou, S., Jaffe, D.B., Stanley, K., Butler, J., Gnerre, S., Mauceli, E., Berger, B., Mesirov, J.P., and Lander, E.S. 2002. ARACHNE: A whole-genome shotgun assembler. Genome Res. 12: 177–189.
Bautsch, W. 1988. Rapid physical mapping of the Mycoplasma mobile genome by two-dimensional field inversion gel electrophoresis techniques. Nucleic Acids Res. 16: 11461–11467.
Bhugra, B., Voelker, L.L., Zou, N., Yu, H., and Dybvig, K. 1995. Mechanism of antigenic variation in Mycoplasma pulmonis: Interwoven, site-specific DNA inversions. Mol. Microbiol. 18: 703–714.
Bredt, W. 1968. Growth morphology of Mycoplasma pneumoniae strain FH on glass surface. Proc. Soc. Exp. Biol. Med. 128: 338–340.
Bredt, W. and Radestock, U. 1977. Gliding motility of Mycoplasma pulmonis. J. Bacteriol. 130: 937–938.
Brown, J.W. 1999. The Ribonuclease P Database. Nucleic Acids Res. 27: 314.
Cassuto, E., West, S.C., Mursalim, J., Conlon, S., and Howard-Flanders, P. 1980. Initiation of genetic recombination: Homologous pairing between duplex DNA molecules promoted by recA protein. Proc. Natl. Acad. Sci. 77: 3962–3966.
Chambaud, I., Heilig, R., Ferris, S., Barbe, V., Samson, D., Galisson, F., Moszer, I., Dybvig, K., Wroblewski, H., Viari, A., et al. 2001. The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res. 29: 2145–2153.
Cordova, C.M., Lartigue, C., Sirand-Pugnet, P., Renaudin, J., Cunha, R.A., and Blanchard, A. 2002. Identification of the origin of replication of the Mycoplasma pulmonis chromosome and its use in oriC replicative plasmids. J. Bacteriol. 184: 5426–5435.
Dandekar, T., Snel, B., Schmidt, S., Lathe, W., Suyama, M., Huynen, M., and Bork, P. 2002. Comparative genome analysis of Mollicutes. In Molecular biology and pathogenicity of mycoplasmas (eds. S. Razin and R. Herrmann), pp. 255–278. Kluwer Academic/Plenum, New York.
Delcher, A.L., Harmon, D., Kasif, S., White, O., and Salzberg, S.L. 1999. Improved microbial gene identification with GLIMMER. Nucleic Acids Res. 27: 4636–4641.
Ficarro, S.B., McCleland, M.L., Stukenberg, P.T., Burke, D.J., Ross, M.M., Shabanowitz, J., Hunt, D.F., and White, F.M. 2002. Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 20: 301–305.
Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., Bult, C.J., Kerlavage, A.R., Sutton, G., Kelley, J.M., et al. 1995. The minimal gene complement of Mycoplasma genitalium. Science 270: 397–403.
Galagan, J.E., Nusbaum, C., Roy, A., Endrizzi, M.G., Macdonald, P., FitzHugh, W., Calvo, S., Engels, R., Smirnov, S., Atnoor, D., et al. 2002. The genome of M. acetivorans reveals extensive metabolic and physiological diversity. Genome Res. 12: 532–542.
Galagan, J.E., Calvo, S.E., Borkovich, K.A., Selker, E.U., Read, N.D., Jaffe, D., FitzHugh, W., Ma, L.J., Smirnov, S., Purcell, S., et al. 2003. The genome sequence of the filamentous fungus Neurospora crassa. Nature 422: 859–868.
Glass, J.I., Lefkowitz, E.J., Glass, J.S., Heiner, C.R., Chen, E.Y., and Cassell, G.H. 2000. The complete sequence of the mucosal pathogen Ureaplasma urealyticum. Nature 407: 757–762.