Published online before print
January 25, 2007, 10.1101/gr.5969107 Genome Res. 17:377-386, 2007 ©2007 by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/07 $5.00 OPEN ACCESS ARTICLE
Resource
MEGAN analysis of metagenomic data
1 Center for Bioinformatics, Tübingen University, Sand 14, 72076 Tübingen, Germany; 2 Center for Comparative Genomics and Bioinformatics, Center for Infectious Disease Dynamics, Penn State University, University Park, Pennsylvania 16802, USA
Metagenomics is the study of the genomic content of a sample of organisms obtained from a common habitat using targeted or random sequencing. Goals include understanding the extent and role of microbial diversity. The taxonomical content of such a sample is usually estimated by comparison against sequence databases of known sequences. Most published studies use the analysis of paired-end reads, complete sequences of environmental fosmid and BAC clones, or environmental assemblies. Emerging sequencing-by-synthesis technologies with very high throughput are paving the way to low-cost random "shotgun" approaches. This paper introduces MEGAN, a new computer program that allows laptop analysis of large metagenomic data sets. In a preprocessing step, the set of DNA sequences is compared against databases of known sequences using BLAST or another comparison tool. MEGAN is then used to compute and explore the taxonomical content of the data set, employing the NCBI taxonomy to summarize and order the results. A simple lowest common ancestor algorithm assigns reads to taxa such that the taxonomical level of the assigned taxon reflects the level of conservation of the sequence. The software allows large data sets to be dissected without the need for assembly or the targeting of specific phylogenetic markers. It provides graphical and statistical output for comparing different data sets. The approach is applied to several data sets, including the Sargasso Sea data set, a recently published metagenomic data set sampled from a mammoth bone, and several complete microbial genomes. Also, simulations that evaluate the performance of the approach for different read lengths are presented.
The genomic revolution of the early 1990s targeted the study of individual genomes of microorganisms, plants, and animals. While this type of analysis has almost become routine, the genomic analysis of complex mixtures of organisms remains challenging. Metagenomics has been defined as "the genomic analysis of microorganisms by direct extraction and cloning of DNA from an assemblage of microorganisms" (Handelsman 2004).
Early metagenomics projects (Béja et al. 2000
These projects all use "Sanger sequencing," based on cloning, fluorescent dideoxynucleotides, and capillary electrophoresis (Meldrum 2000a).
In this study, we present a new approach to the initial analysis of a metagenomic data set that avoids the problems associated with environmental assemblies or the use of a limited number of phylogenetic markers. Our strategy can be applied to DNA reads collected within the framework of any metagenomics project, regardless of the sequencing technology used, and thus provides an easily deployable alternative to other types of analysis. We provide a new computer program called MEGAN (Metagenome Analyzer) that allows analysis of large data sets by a single scientist. In a pre-processing step, the set of DNA reads (or contigs) is compared against databases of known sequences using a comparison tool such as BLAST (see Fig. 1). MEGAN is then used to estimate and interactively explore the taxonomical content of the data set, using the NCBI taxonomy to summarize and order the results. The program uses a simple algorithm that assigns each read to the lowest common ancestor (LCA) of the set of taxa that it hit in the comparison (see Fig. 2). As a result, species-specific sequences are assigned to taxa near the leaves of the NCBI tree, whereas widely conserved sequences are assigned to high-order taxa closer to the root.
We first illustrate this approach by applying it to a subset of the Sargasso Sea data set (Venter et al. 2004) and then to 300,000 reads obtained from a sample of mammoth bone (Poinar et al. 2006).
Ease of use is a main design criterion of MEGAN. An analysis is initiated by simply opening the output file of any member of the BLAST family of programs, or of some other sequence comparison tool, and is then performed interactively via a graphical user interface. The program was carefully engineered to run quickly and responsively on a laptop, even when processing large data sets. For maximum portability, the program is written in Java, and installers for Linux/Unix, MacOS, and Windows are freely available to the academic community from http://www-ab.informatik.uni-tuebingen.de/software/megan.
The MEGAN processing pipeline
Figure 1 illustrates a typical processing pipeline in which MEGAN is used to perform the initial analysis of a metagenomic sample. First, reads are collected from the sample using any random shotgun protocol. Second, a sequence comparison of all reads against one or more databases of known sequences is performed, using BLAST or a similar comparison tool. Third, MEGAN processes the results of the comparison to collect all hits of reads against known sequences and assigns a taxon ID to each sequence based on the NCBI taxonomy. This produces a MEGAN file that contains all information needed for analyzing and generating graphical and statistical output. Fourth, the user interacts with the program to run the lowest common ancestor (LCA) algorithm (see Fig. 2), to analyze the data, to inspect the assignment of individual reads to taxa based on their hits, and to produce summaries of the results at different levels of the NCBI taxonomy (see Figs. 3 and 5–8 below).
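The third step, collecting the hits of each read, can be sketched as follows. This is only an illustration and not MEGAN's own parser: it assumes that the comparison results were exported in BLAST's 12-column tabular format (query ID in the first column, subject ID in the second, bit score in the twelfth), and the names `Hit` and `collect_hits` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    subject_id: str   # identifier of the database sequence that was hit
    bit_score: float  # bit score of the alignment


def collect_hits(lines: Iterable[str]) -> Dict[str, List[Hit]]:
    """Group tabular BLAST records by read (query) identifier."""
    hits: Dict[str, List[Hit]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a standard 12-column record
        read_id, subject_id, bit_score = fields[0], fields[1], float(fields[11])
        hits[read_id].append(Hit(subject_id, bit_score))
    return dict(hits)
```

Mapping each subject ID to an NCBI taxon ID would be a separate lookup step, which is not shown here.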
As different metagenomics projects need to use different alignment tools and databases, we have designed MEGAN in such a way that gives users unrestricted choice in this matter. In our studies, we used BLAST comparisons (Altschul et al. 1990).
Although well established and trivial to carry out, sequence comparison is the main computational bottleneck in metagenomic analysis and will become increasingly critical as the size of data sets and databases continues to grow. There is a tradeoff to be considered: Whole-genome approaches are easier to execute and potentially provide better taxonomical resolution than projects that target specific phylogenetic markers, but the additional computational burden can be immense.
Re-analysis of the Sargasso Sea data set
The Venter et al. (2004)
Figure 3 demonstrates that MEGAN can easily detect a sampling bias between Sample 1 and pooled Samples 2–4, despite the fact that only a small fraction (20,000 reads, 1% of the total data set) was analyzed. This discrepancy, referred to as "microheterogeneity" by Venter et al. (2004)
To describe our process in more detail, we first downloaded the complete set of Sargasso Sea Samples 1–4 from DDBJ/EMBL/GenBank (accession no. AACY0100000). We then selected the first 10,000 reads from Sample 1 and randomly selected a pooled set of 10,000 reads from Samples 2–4. On both data sets, we ran a BLASTX comparison against the NCBI-NR database, using default parameters. For the Sample 1 data set, only 1% of the reads had no hits (13) or remained unassigned (1051). Similarly, for the Samples 2–4 data set, <3% of the reads had no hits (69) or remained unassigned (2778).
We performed a MEGAN analysis of both data sets using a bit-score threshold of 100 (min-score filter; see Methods for more details on these parameters) and retaining only those hits whose bit scores lie within 5% of the best score (top-percent filter). In addition, all isolated assignments (that is, taxa that were hit by only one read) were discarded (min-support filter). For Sample 1,
The analysis of the 16 taxonomic groups performed in Venter et al. (2004)
Analysis of the mammoth data set
To identify those reads that come from the mammoth genome, we performed BLASTZ (Schwartz et al. 2003
To determine the distribution of environmental sequences in the sample, we first used BLASTX to compare all reads against the NCBI-NR ("non-redundant") protein database (Benson et al. 2006).
Here we provide details of the MEGAN analysis, using a bit-score threshold of 30 and discarding any isolated assignments, that is, any taxon that has only a single read assigned to it. The LCA algorithm assigned 50,093 reads to taxa, and 2086 remained unassigned either because the bit-score of their matches fell below the threshold or because they gave rise to an isolated hit.
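The filters used in these analyses (min-score, top-percent, and min-support) can be sketched as follows. This is a minimal illustration under stated assumptions, not MEGAN's implementation: the function names are hypothetical, hits are represented as (taxon, bit score) pairs, thresholds are treated as inclusive, and reads whose taxon fails the min-support filter are simply marked as unassigned.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

ScoredHit = Tuple[str, float]  # (taxon name, bit score)


def filter_hits(hits: List[ScoredHit], min_score: float,
                top_percent: float) -> List[ScoredHit]:
    """Apply the min-score filter, then the top-percent filter."""
    # min-score: drop alignments whose bit score is below the threshold
    kept = [h for h in hits if h[1] >= min_score]
    if not kept:
        return []
    # top-percent: keep hits whose score lies within top_percent % of the best
    best = max(score for _, score in kept)
    cutoff = best * (1.0 - top_percent / 100.0)
    return [h for h in kept if h[1] >= cutoff]


def apply_min_support(assignments: Dict[str, Optional[str]],
                      min_support: int) -> Dict[str, Optional[str]]:
    """Unassign reads whose taxon received fewer than min_support reads."""
    counts = Counter(t for t in assignments.values() if t is not None)
    return {
        read: (taxon if taxon is not None and counts[taxon] >= min_support
               else None)
        for read, taxon in assignments.items()
    }
```

With `min_support=2`, a taxon that received only one read is discarded, which corresponds to removing isolated assignments.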
A total of 19,841 reads were assigned to Eukaryota, of which 7969 were assigned to Gnathostomata (jawed vertebrates) and thus presumably derive from mammoth sequences. Furthermore, a total of 16,972 reads were assigned to Bacteria, 761 to Archaea, and 152 to Viruses. These numbers are marginally lower than those reported in Poinar et al. (2006).
Figures 5 and 6 demonstrate the ability of MEGAN to summarize results at different levels of the NCBI taxonomy. A distinctive feature of the program is that such summaries are computed dynamically, on the fly, as the user changes parameters of the LCA algorithm or expands or collapses parts of the taxonomy. The relative abundance of reads at a certain node or leaf is indicated visually by the size of the circle representing the node, or by numerical labels. The cladograms produced by MEGAN can be considered "species profiles" and can be produced as tables, for example, for side-by-side comparisons of series of samples (see Fig. 4).
Species identification from short reads
Several companies are developing new sequencing technologies that promise to produce high-throughput sequencing at substantially reduced cost, albeit with reads as short as 35 bp. The average length of reads produced using current Roche GS20 sequencing technology, introduced last year (Margulies et al. 2005), is about 100 bp, and reads obtainable by current Sanger sequencing are about 800 bp in length (Franca et al. 2002).
A simple approach to addressing this is to collect a set of reads from a known genome, to process the data as a metagenomic data set (as described above), and then to evaluate the accuracy of the assignments. For this purpose, the genome sequences of the two organisms E. coli K12 and B. bacteriovorus HD100 were used. We chose E. coli as it is used as a cloning host in most clone-based sequencing projects and is thus likely to occur in several different database sequences by mistake. The second test organism, B. bacteriovorus, is very distinctive in its sequence from other Proteobacteria and has no close relatives that are currently represented in the sequence databases. Its metagenomic analysis should therefore result in a much better signal/noise ratio than for E. coli.
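The read-collection step of such a test can be sketched as follows. This is a simplified illustration, not the procedure used in the study: the function name `sample_reads` is hypothetical, the genome is treated as a single linear string, only one strand is sampled, and no sequencing errors are modeled.

```python
import random
from typing import List


def sample_reads(genome: str, read_length: int, n_reads: int,
                 seed: int = 0) -> List[str]:
    """Draw n_reads substrings of fixed length from random genome positions."""
    if read_length > len(genome):
        raise ValueError("read_length exceeds genome length")
    rng = random.Random(seed)  # seeded so that a test set is reproducible
    last_start = len(genome) - read_length
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [genome[s:s + read_length] for s in starts]
```

Calling the function with different values of `read_length` (for example 35, 100, and 800) would produce test sets for comparing assignment accuracy across read lengths.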
We show the results of simulation studies for the two genomes in Tables 1 (E. coli) (Blattner et al. 1997
Using Roche GS20 sequencing technology, we sequenced a test set of 2000 reads of length 100 bp from random positions in the E. coli K12 genome. Figure 7 shows the details of a MEGAN analysis of these data, which is based on a BLASTX comparison of the reads against the NCBI-NR database, using the same parameters as above. Of the 2000 reads, 25% (432) have no hits, and 110 reads are not assigned. Of the remaining 1458 reads, 75% (1052) are assigned to Enterobacteriaceae, and are thus correctly assigned down to the taxonomic level of family. All other reads, except two, are assigned to super-taxa, thus producing correct, if increasingly weak, predictions.
The two false-positive assignments to Haemophilus somnus appear to be due to false entries in the NCBI-NR database: the two database sequences are labeled "hypothetical proteins"; however, one is identical to the 16S rRNA sequence in E. coli, and the other is identical to the 23S rRNA sequence in E. coli.
In a second experiment, we considered 2000 reads of length
In Figure 8B, we show a similar MEGAN analysis obtained when using a copy of the NCBI-NR database from which all sequences representing the B. bacteriovorus HD100 genome have been removed. This mimics the case in which reads are obtained from a genome that is not yet represented in the database. Of the 2000 reads, 65% (1361) have no hits, and 13% (253) are not assigned. A small number of false positives occur up to the level of Bacteria.
While these two experiments conducted with organisms of known phylogenetic distance demonstrate the robustness of the LCA algorithm, its performance on unknown, more distantly related sequences can only be estimated. Given the logical structure of the LCA algorithm, however, we predict a low rate of false-positive assignments at the price of producing fairly large numbers of unspecific assignments or no hits. Independent of MEGAN's design, the outcome of each analysis will be biased by the content of the database used and will only improve as sequence databases become more complete. In addition to the generation of more sequence data, new algorithms will be required to structure databases of environmental content, as currently the taxon frequencies of unknown organisms cannot be assessed.
Species and strain identification through species-specific genes
Early metagenomic studies resorted to screening of environmental libraries for the presence of known phylogenetic markers and subsequent sequencing of clones of interest (Béja et al. 2000).
The problem of species identification in a mixture of organisms has been addressed using proven phylogenetic markers, such as the ribosomal genes (16S, 18S, and 23S rRNA) or coding sequences of genes involved in the transcription or translation machinery of the cell (e.g., recA/radA, hsp70, EF-Tu, Ef-G, rpoB). By definition, such markers are based on slow-evolving genes and aim at distinguishing between species at large evolutionary distances, and are thus unsuitable for resolving closely related organisms. MEGAN deviates from the analytical pattern of previous metagenomic analysis pipelines and builds on the statistical power of comparing random sequence intervals with unspecified phylogenetic properties against databases of known sequences. This study demonstrates that even given the current incomplete and biased state of the DNA, protein, and environmental databases, a meaningful categorization of random reads is possible as a useful first phylogenetic analysis of metagenomic data. The ability to identify species depends, of course, on the presence or absence of closely related sequences in the databases, as demonstrated in Figure 8. Removal of the source genome B. bacteriovorus HD100 from the database results in a threefold increase of completely unassigned reads, while producing only a small number of false-positive identifications above the level of Proteobacteria. This underlines the fact that MEGAN takes a conservative approach to taxon identification. Lack of data may result in severe under-prediction or large numbers of unassigned reads, but will not result in a significant amount of over-prediction.
Laptop analysis
MEGAN can be used to analyze DNA reads collected within the framework of any metagenomics project, regardless of the sequencing technology used. In a pre-processing step, the set of DNA reads (or contigs) is compared against databases of known sequences using BLAST or other comparison tools. This computationally demanding task will usually be performed on a high-performance computer cluster. Once completed, the resulting files can be downloaded onto a laptop or workstation and then interactively analyzed using MEGAN.
Assuming that the reads are randomly selected from the metagenomic sample, MEGAN analysis can be viewed as a statistical approach with several attractive features. Because the reads are independently sampled from random regions of the genomes that can have very different levels of conservation, this type of analysis will show better resolution at all levels of the taxonomy, and particularly at the species and strain level, than an analysis based on a small set of phylogenetic markers, as their rate of evolution is slower than average. Because the analysis does not require an assembly of the reads into contigs, all problems associated with assembling data from a mixture of potentially very similar genomes are avoided. The software is easy to deploy as it operates on data produced by existing and widely available bioinformatics software tools for alignments (such as BLAST, BLASTZ, and other comparison tools) and publicly accessible data resources (sequence databases and the NCBI taxonomy). As sequence comparisons are computationally intensive and time-consuming, they should be performed only once with sufficiently relaxed alignment parameters. MEGAN provides filters to adjust the level of stringency later to an appropriate level.
An investigator can perform a detailed analysis of a large metagenomic data set and manually inspect the correctness of each classification without needing to rerun the sequence comparison at various cutoff levels.
Intrinsic biases of current metagenomic analysis
The third component is the taxonomical classification of species used. Our approach is based on the NCBI taxonomic system, which is maintained and updated by a team of taxonomy experts, who incorporate both sequence-based and non-sequence-based taxonomic information. However, MEGAN allows for the integration of other taxonomic systems as well.
Current issues and future extensions
The current LCA assignment algorithm bases its decision solely on the presence or absence of hits between reads and taxa. We are currently contemplating a more sophisticated approach that will not only take the presence or absence of hits into account, but will also make use of the quality of the matches and the levels of similarity that are typical for given genes in a given clade of sequences.
It is intriguing to see how robust and correct the taxonomical assignments based on local alignments performed with either BLASTN or BLASTX can be. While these tools create alignments of variable length from sequence intervals of unspecified phylogenetic relevance, potential problems are overcome by the power of statistics. By default, MEGAN requires that at least two reads are assigned to a taxon before that taxon is deemed to be present, and this helps to prevent false positives. Moreover, by design, short, highly conserved domains will lead to an unspecific assignment, rather than to a false one.
The analysis of any metagenomic data set will produce a significant set of sequences that cannot be assigned to any known taxon, and the question arises of how to estimate the number of unknown species. In our experience (data not shown), anywhere between 10% and 90% of all reads may fail to produce any hits when compared with BLASTX against NCBI-NR. To estimate how many of these reads actually come from unknown species, one must take into account that most known species are only partially represented in current databases. If, for example, only 10% of the genome of a species is present in the databases, then for every correctly identified read, there will be as many as nine that do not produce a hit. As there is insufficient information on the size of genomes to make such estimations in a precise way, such calculations have not yet been implemented in MEGAN.
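The example above can be written out as a short calculation. The symbol f is introduced here for illustration only: it denotes the fraction of a species' genome that is represented in the database, under the simplifying assumptions that reads are sampled uniformly from the genome and that a read produces a hit exactly when it falls within a represented region.

```latex
\[
\frac{\text{expected reads without a hit}}{\text{expected reads with a hit}}
  = \frac{1-f}{f},
\qquad
f = 0.1 \;\Rightarrow\; \frac{1-0.1}{0.1} = 9 .
\]
```

Under these assumptions the ratio is an upper bound in practice ("as many as nine"), since reads from unrepresented regions may still hit homologous sequences of other species.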
Can short sequence intervals identify a species?
While new developments in sequencing technology will continue to impact metagenomic projects in terms of cost and throughput, we believe that MEGAN analysis will remain a valuable tool for analyzing the new data and will help scientists to dissect the sequence information of their environmental samples.
Sequence comparisons
In our studies, we performed sequence comparisons against the NCBI-NR database of nonredundant protein sequences using BLASTX with the default settings, against the NCBI-NT database of nucleotide sequences using BLASTN with the default settings, and against whole-genome sequences obtained from dog, elephant, and human using BLASTZ. Sequence comparison is a computationally challenging task that is likely to grow even more demanding as databases continue to grow and larger metagenome data sets are analyzed. For example, comparing the mammoth data set against NCBI-NR took almost 180 h real time on a cluster of 64 CPUs. We estimate that performing the same computation on the 1.6 million reads of the complete Sargasso Sea data set would require 1000 h real time on our system.
Analysis using MEGAN
The program assigns reads to taxa using the LCA algorithm and then displays the induced taxonomy. Nodes in the taxonomy can be collapsed or expanded to produce summaries at different levels of the taxonomy. Additionally, the program provides a search tool to search for specific taxa, and an Inspector tool to view individual BLAST matches (see Fig. 9).
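The core of the LCA assignment can be sketched as follows. This is a minimal illustration of the lowest-common-ancestor idea, not MEGAN's code: the taxonomy is represented as a hypothetical child-to-parent dictionary, the function names are invented for this sketch, and `TOY_PARENT` is a deliberately simplified toy tree rather than the NCBI taxonomy.

```python
from typing import Dict, Iterable, List, Optional

# Toy taxonomy (child -> parent); simplified for illustration only.
TOY_PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Archaea": "root",
    "Proteobacteria": "Bacteria",
    "Enterobacteriaceae": "Proteobacteria",
    "Escherichia coli": "Enterobacteriaceae",
    "Haemophilus somnus": "Proteobacteria",
}


def path_to_root(taxon: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the root down to the given taxon."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]  # root first


def lowest_common_ancestor(taxa: Iterable[str],
                           parent: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the deepest taxon shared by the root paths of all given taxa."""
    paths = [path_to_root(t, parent) for t in set(taxa)]
    if not paths:
        return None  # a read without hits stays unassigned
    lca = None
    # Walk down all paths in parallel until they diverge.
    for level in zip(*paths):
        if all(node == level[0] for node in level):
            lca = level[0]
        else:
            break
    return lca
```

A read that hits only one species is assigned to that species, whereas a read whose hits span distant taxa is pushed toward the root, which matches the behavior described for species-specific versus widely conserved sequences.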
The approach uses several thresholds. First, the min-score filter sets a threshold for the score that an alignment must achieve to be considered in the calculations. For reads of length
The result of the LCA algorithm is presented to the user as the partial taxonomy T that is induced by the set of taxa that have been identified (see Fig. 5). The program allows the user to explore the results at many different taxonomical levels, by providing methods for collapsing and expanding different parts of the taxonomy T. Each node in T represents a taxon t and can be queried to determine which reads have been assigned directly to t, and how many reads have been assigned to taxa below t. Additionally, the program allows the user to view the sequence alignments upon which specific assignments are based (see Fig. 9).
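The two counts attached to each node of T (reads assigned directly to t, and reads assigned to taxa below t) can be sketched as follows. This is an illustrative sketch only: the function name `summarize` and the child-to-parent dictionary representation of the taxonomy are assumptions made here, not part of MEGAN.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def summarize(assigned_taxa: Iterable[str],
              parent: Dict[str, Optional[str]]) -> Dict[str, Tuple[int, int]]:
    """Map each node of the induced taxonomy to (direct, below) read counts.

    assigned_taxa holds one taxon name per assigned read.
    """
    direct = Counter(assigned_taxa)
    below: Counter = Counter()
    for taxon, n_reads in direct.items():
        # Every ancestor of the taxon gains n_reads in its subtree count.
        node = parent[taxon]
        while node is not None:
            below[node] += n_reads
            node = parent[node]
    nodes = set(direct) | set(below)  # assigned taxa plus all their ancestors
    return {t: (direct.get(t, 0), below.get(t, 0)) for t in nodes}
```

Collapsing a node in a display would then amount to showing the sum of its two counts at that node and hiding its descendants.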
D.H. thanks the DFG for funding and Ramona Schmid and Mike Steel for helpful discussions. S.S. thanks The Gordon and Betty Moore Foundation for supporting a part of this project. S.S. and D.H. thank Webb Miller and Francesca Chiaromonte for stimulating discussions and comments on the computational approach.
3 Corresponding authors. E-mail huson{at}informatik.uni-tuebingen.de; fax 49-7071-295148.
E-mail scs{at}bx.psu.edu; fax (814) 863-6699. [MEGAN is freely available at http://www-ab.informatik.uni-tuebingen.de/software/megan.] Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5969107
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403–410.
Béja, O., Aravind, L., Koonin, E.V., Suzuki, M.T., Hadd, A., Nguyen, L.P., Jovanovich, S.B., Gates, C.M., Feldman, R.A., and Spudich, J.L., et al. 2000. Bacterial rhodopsin: Evidence for a new type of phototrophy in the sea. Science 289: 1902–1906.
Béja, O., Spudich, E.N., Spudich, J.L., Leclerc, M., and DeLong, E.F. 2001. Proteorhodopsin phototrophy in the ocean. Nature 411: 786–789.
Benson, D., Karsch-Mizrachi, I., Lipman, D., Ostell, J., and Wheeler, D. 2006. GenBank. Nucleic Acids Res. 34: D16–D20.
Blattner, F.R., Plunkett III, G., Bloch, C.A., Perna, N.T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J.D., Rode, C.K., and Mayhew, G.F., et al. 1997. The complete genome sequence of Escherichia coli K-12. Science 277: 1453–1474.
DeLong, E.F. 2005. Microbial community genomics in the ocean. Nat. Rev. Microbiol. 3: 459–469.
DeLong, E.F., Preston, C.M., Mincer, T., Rich, V., Hallam, S.J., Frigaard, N.-U., Martinez, A., Sullivan, M.B., Edwards, R., and Rodriguez Brito, B., et al. 2006. Community genomics among stratified microbial assemblages in the ocean's interior. Science 27: 496–503.
Franca, L.T., Carrilho, E., and Kist, T.B. 2002. A review of DNA sequencing techniques. Q. Rev. Biophys. 35: 169–200.
Hallam, S.J., Putnam, N., Preston, C., Detter, J., and Rokhsar, D. 2004. Reverse methanogenesis: Testing the hypothesis with environmental genomics. Science 305: 1457–1462.
Handelsman, J. 2004. Metagenomics: Application of genomics to uncultured microorganisms. Microbiol. Mol. Biol. Rev. 68: 669–685.
Hicks, C.L., Kinoshita, R., and Ladds, P.W. 2000. Pathology of melioidosis in captive marine mammals. Aust. Vet. J. 78: 193–195.
Margulies, M., Egholm, M., Altman, W., Attiya, S., Bader, J., Bemben, L., Berka, J., Braverman, M., Chen, Y.-J., and Chen, Z., et al. 2005. Genome sequencing in micro-fabricated high-density picolitre reactors. Nature 437: 376–380.
Martiny, J.B., Bohannan, B.J., Brown, J.H., Colwell, R.K., Fuhrman, J.A., Green, J.L., Horner-Devine, M.C., Kane, M., Krumins, J.A., and Kuske, C.R., et al. 2006. Microbial biogeography: Putting microorganisms on the map. Nat. Rev. Microbiol. 4: 102–112.
Meldrum, D. 2000a. Automation for genomics, part one: Preparation for sequencing. Genome Res. 10: 1081–1092.
Meldrum, D. 2000b. Automation for genomics, part two: Sequencers, microarrays, and future trends. Genome Res. 10: 1288–1303.
Nealson, K.H. and Scott, J. 2003. The prokaryotes: An evolving electronic resource for the microbiological community (ed. E.A. Dworkin). Springer-Verlag, New York.
Poinar, H.N., Schwarz, C., Qi, J., Shapiro, B., MacPhee, R.D.E., Buigues, B., Tikhonov, A., Huson, D., Tomsho, L.P., and Auch, A., et al. 2006. Metagenomics to paleogenomics: Large-scale sequencing of mammoth DNA. Science 331: 392–394.
Quaiser, A., Ochsenreiter, T., Lanz, C., Schuster, S.C., Treusch, A.H., Eck, J., and Schleper, C. 2003. Acidobacteria form a coherent but highly diverse group within the bacterial domain: Evidence from environmental genomics. Mol. Microbiol. 50: 563–575.
Rendulic, S., Jagtap, P., Rosinus, A., Eppinger, M., Baar, C., Lanz, C., Keller, H., Lambert, C., Evans, K.J., and Goesmann, A., et al. 2004. A predator unmasked: Life cycle of Bdellovibrio bacteriovorus from a genomic perspective. Science 303: 689–692.
Rondon, M.R., August, P.R., Bettermann, A.D., Brady, S.F., Grossman, T.H., Liles, M.R., Loiacono, K.A., Lynch, B.A., MacNeil, I.A., and Minor, C., et al. 2000. Cloning the soil metagenome: A strategy for accessing the genetic and functional diversity of uncultured microorganisms. Appl. Environ. Microbiol. 66: 2541–2547.
Schwartz, S., Kent, W., Smit, A., Zhang, Z., Baertsch, R., Hardison, R.C., Haussler, D., and Miller, W. 2003. Human–mouse alignments with BLASTZ. Genome Res. 13: 103–107.
Steele, H. and Streit, W. 2005. Metagenomics: Advances in ecology and biotechnology. FEMS Microbiol. Lett. 247: 105–111.
Treusch, A.H., Kletzin, A., Raddatz, G., Ochsenreiter, T., Quaiser, A., Meurer, G., Schuster, S.C., and Schleper, C. 2004. Characterization of large-insert DNA libraries from soil for environmental genomic studies of Archaea. Environ. Microbiol. 6: 970–980.
Tringe, S.G., von Mering, C., Kobayashi, A., Salamov, A.A., Chen, K., Chang, H.W., Podar, M., Short, J.M., Mathur, E.J., and Detter, J.C., et al. 2005. Comparative metagenomics of microbial communities. Science 308: 554–557.
Tyson, G.W., Chapman, J., Hugenholtz, P., Allen, E.E., Raml, R.J., Richardson, P.M., Solovyev, V.V., Rubin, E.M., Rokhsar, D.S., and Banfield, J.F., et al. 2004. Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428: 37–43.
Venter, J.C., Remington, K., Heidelberg, J.F., Halpern, A.L., Rusch, D., Eisen, J.A., Wu, D., Paulsen, I., Nelson, K.E., and Nelson, W., et al. 2004. Environmental genome shotgun sequencing of the Sargasso sea. Science 304: 66–74.
Zhang, K., Martiny, A.C., Reppas, N.B., Barry, K.W., Malek, J., Chisholm, S.W., and Church, G.M. 2006. Sequencing genomes from single cells via polymerase clones. Nat. Biotechnol. 24: 680–686.
Received September 19, 2006; accepted in revised form December 19, 2006.