MEGAN

From Wikipedia, the free encyclopedia

Developer(s): Daniel Huson et al.
Stable release: 6.25.10 / 2024
Repository: github.com/husonlab/megan-ce
Written in: Java
Operating system: Windows, Unix, Linux, macOS
Platform: Java
Type: Bioinformatics
License: Free open source "community edition", commercial "Ultimate edition" licensed by Computomics
Website: uni-tuebingen.de/fakultaeten/mathematisch-naturwissenschaftliche-fakultaet/fachbereiche/informatik/lehrstuehle/algorithms-in-bioinformatics/software/megan6/

MEGAN ("MEtaGenome ANalyzer") is a computer program that allows optimized analysis of large metagenomic datasets.[1][2]

Metagenomics is the analysis of the genomic sequences from a usually uncultured environmental sample. One of its long-term goals is to inventory and measure the extent and role of microbial biodiversity in the ecosystem, based on discoveries that the diversity of microbial organisms and viral agents in the environment is far greater than previously estimated.[3] MEGAN is an example of a tool that allows the investigation of very large datasets from environmental samples (using shotgun sequencing techniques in particular). It is designed to sample and investigate the unknown biodiversity of environmental samples for which more precise techniques, which rely on smaller, better-known samples, cannot be used.

Fragments of DNA from a metagenomics sample, such as ocean water or soil, are compared against databases of known DNA sequences, such as GenBank at NCBI, using BLAST or another sequence comparison tool. MEGAN is then used to analyze the resulting matches and assign each fragment to a taxon.[4] The program was used to investigate the DNA of a woolly mammoth recovered from the Siberian permafrost[5] and the Sargasso Sea dataset.[6]

Introduction

Metagenomics is the study of genomic content of samples from the same habitat, aimed at determining the role and extent of species diversity. Both targeted and random sequencing are commonly used, with comparisons made against sequence databases.[1] Recent developments in sequencing technology have led to an increase in the number of metagenomics samples. MEGAN is a tool for analyzing metagenomics data. The first version of MEGAN was released in 2007,[1] and the most recent version is MEGAN6.[7] While the initial version could analyze the taxonomic content of a single dataset, later versions can handle multiple datasets and include new features such as querying different databases and employing updated algorithms.

MEGAN Pipeline

A MEGAN analysis starts with collecting reads from any shotgun sequencing platform. The reads are then compared with sequence databases using BLAST or a similar tool. MEGAN next assigns a taxon ID to each match based on the NCBI taxonomy and creates a MEGAN file that contains the information needed for statistical and graphical analysis. Lastly, the lowest common ancestor (LCA) algorithm can be run to inspect assignments, analyze data, and create summaries at different levels of the NCBI taxonomy. For each read, the LCA algorithm identifies the lowest taxon in the taxonomy that is an ancestor of all the species the read matched.[1][2]
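
The LCA step can be illustrated with a minimal Python sketch. It is not MEGAN's implementation: the taxonomy below is a small hypothetical tree rather than the NCBI taxonomy, the read names and hits are invented, and the function names (`lineage`, `lowest_common_ancestor`, `assign_reads`) are assumptions made for illustration. The sketch also ignores any filtering of matches by alignment score.

```python
from collections import Counter

# Hypothetical toy taxonomy as child -> parent links (not real NCBI data).
PARENT = {
    "Escherichia coli": "Escherichia",
    "Escherichia fergusonii": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "Bacteria",
    "Bacillus subtilis": "Bacillus",
    "Bacillus": "Bacteria",
    "Bacteria": "root",
}


def lineage(taxon):
    """Return the path from the root of the tree down to the given taxon."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxa):
    """Return the deepest taxon shared by the lineages of all given taxa."""
    lineages = [lineage(t) for t in taxa]
    lca = None
    # Walk down from the root and stop at the first level where lineages differ.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


def assign_reads(hits_by_read):
    """Map each read to the LCA of the taxa it matched; skip reads with no hits."""
    return {
        read: lowest_common_ancestor(taxa)
        for read, taxa in hits_by_read.items()
        if taxa
    }


# Invented example: taxa hit by each read in a sequence comparison.
hits = {
    "read1": ["Escherichia coli", "Escherichia fergusonii"],
    "read2": ["Escherichia coli", "Salmonella enterica"],
    "read3": ["Escherichia coli", "Bacillus subtilis"],
    "read4": ["Bacillus subtilis"],
}

assignments = assign_reads(hits)
summary = Counter(assignments.values())  # number of reads placed on each taxon
```

In this sketch a read that matches only one species stays at that species, while a read that matches species from different families is pushed up to a higher node such as "Bacteria". Reads with ambiguous matches therefore end up at less specific levels of the taxonomy.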

References

  1. ^ a b c d Huson, Daniel H.; A. Auch; Ji Qi; S. C. Schuster (2007). "MEGAN Analysis of Metagenomic Data". Genome Research. 17 (3): 377–386. doi:10.1101/gr.5969107. PMC 1800929. PMID 17255551. Retrieved April 3, 2008.
  2. ^ a b Huson, Daniel H; S. Mitra; N. Weber; H. Ruscheweyh; Stephan C. Schuster (2011). "Integrative analysis of environmental sequences using MEGAN4". Genome Research. 21 (9): 1552–1560. doi:10.1101/gr.120618.111. PMC 3166839. PMID 21690186.
  3. ^ Nee, S. (2004). "More than meets the eye". Nature. 429 (6994): 804–805. Bibcode:2004Natur.429..804N. doi:10.1038/429804a. PMID 15215837. S2CID 1699973.
  4. ^ Frias-Lopez, Jorge; Yanmei Shi; Gene W. Tyson; Maureen L. Coleman; Stephan C. Schuster; Sallie W. Chisholm; and Edward F. DeLong (March 11, 2008). "Microbial community gene expression in ocean surface waters" (PDF). PNAS. 105 (10): 3805–3810. doi:10.1073/pnas.0708897105. PMC 2268829. PMID 18316740. Retrieved April 3, 2008.
  5. ^ Poinar, Hendrik N.; Carsten Schwarz; Ji Qi; Beth Shapiro; Ross D. E. MacPhee; Bernard Buigues; Alexei Tikhonov; Daniel Huson; Lynn P. Tomsho; Alexander Auch; Markus Rampp; Webb Miller; Stephan C. Schuster (2006). "Metagenomics to Paleogenomics: Large-Scale Sequencing of Mammoth DNA". Science. 311 (5759): 392–394. doi:10.1126/science.1123360. PMID 16368896. Retrieved April 3, 2008.
  6. ^ Venter JC, Remington K, Heidelberg JF, Halpern AL, Rusch D, Eisen JA, Wu D, Paulsen I, Nelson KE, Nelson W, Fouts DE, Levy S, Knap AH, Lomas MW, Nealson K, White O, Peterson J, Hoffman J, Parsons R, Baden-Tillson H, Pfannkoch C, Rogers YH, Smith HO (April 2004). "Environmental Genome Shotgun Sequencing of the Sargasso Sea". Science. 304 (5667): 66–74. Bibcode:2004Sci...304...66V. CiteSeerX 10.1.1.124.1840. doi:10.1126/science.1093857. PMID 15001713. S2CID 1454587.
  7. ^ "MEGAN6 — Algorithms in Bioinformatics". uni-tuebingen.de. Retrieved December 21, 2020. Huson, Daniel H; S. Beier; I. Flade; A. Gorska; M. El-Hadidi; H. Ruscheweyh; R. Tappu (2016). "MEGAN Community Edition - Interactive exploration and analysis of large-scale microbiome sequencing data". PLOS Computational Biology. 12 (6): e1004957. Bibcode:2016PLSCB..12E4957H. doi:10.1371/journal.pcbi.1004957. PMC 4915700. PMID 27327495.