MEGAHIT
MEGAHIT is a lightweight next-generation sequencing assembler. It is designed for metagenomes and also handles generic single-genome and single-cell assembly.
Availability
| Cluster | Module/Version |
|---|---|
| BOSE | megahit/1.2.9 |
| BGSC | N/A |
Note: You can simply use module load megahit to activate the most recently installed version of this software.
Arguments / Options
The table below highlights selected arguments for the megahit command. See the online man page for the full list.
| Option | Description |
|---|---|
| --help | Print the help message |
| -1 | Comma-separated list of fasta/q paired-end #1 files (paired with -2 option) |
| -2 | Comma-separated list of fasta/q paired-end #2 files (paired with -1 option) |
| --12 | Comma-separated list of combined/interleaved fasta/q paired-end files |
| --read | Comma-separated list of fasta/q single-end files |
| --continue | Continue running MEGAHIT from previous checkpoint |
| --test | Run MEGAHIT on a toy test dataset |
| -t | Number of CPU threads |
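
To illustrate the comma-separated list options above, the following sketch builds a paired-end command for two libraries. The file names, the output directory name, and the join_by_comma helper are hypothetical and not part of MEGAHIT. The sketch also assumes that the files passed to -1 and -2 are listed in matching order.

```shell
# Sketch only: join_by_comma and all file names below are hypothetical.

# Join all arguments into one comma-separated string (runs in a subshell,
# so the IFS change does not leak into the rest of the script).
join_by_comma() (
    IFS=','
    echo "$*"
)

# Forward and reverse read files, listed in the same order.
R1_LIST="$(join_by_comma sampleA_R1.fastq.gz sampleB_R1.fastq.gz)"
R2_LIST="$(join_by_comma sampleA_R2.fastq.gz sampleB_R2.fastq.gz)"
OUT_DIR="paired_assembly"

# Show the command that would be run.
echo "megahit -1 $R1_LIST -2 $R2_LIST -o $OUT_DIR"

# Run the assembly only if megahit is on the PATH and the first input exists.
if command -v megahit >/dev/null 2>&1 && [ -e sampleA_R1.fastq.gz ]; then
    megahit -1 "$R1_LIST" -2 "$R2_LIST" -o "$OUT_DIR"
fi
```

Interleaved files would go to --12 and single-end files to --read in the same comma-separated form.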
Sample Slurm Script
submit.sh
#!/bin/bash
# -- SLURM SETTINGS -- #
# [..] other settings here [..]
# The following settings are for the overall request to Slurm
#SBATCH --ntasks-per-node=32 # How many CPU cores do you want to request
#SBATCH --nodes=1 # How many nodes do you want to request
# -- SCRIPT COMMANDS -- #
# Load the needed modules
module load megahit/1.2.9 # Load MEGAHIT
# Assemble the single-end reads; MEGAHIT stops with an error if output_dir/ already exists
megahit --read input.fastq -o output_dir/
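
One possible variation on the command section of the script above, offered as a sketch rather than a site-recommended setup. It passes the requested core count to -t and resumes with --continue when the output directory is left over from an earlier, interrupted job. It assumes Slurm exports SLURM_NTASKS_PER_NODE, which happens only when --ntasks-per-node is set. The pick_mode helper and the variable names are illustrative.

```shell
# Sketch only: pick_mode, INPUT, and OUT_DIR are illustrative names.

# Print "continue" if the given output directory already exists, else "fresh".
pick_mode() {
    if [ -d "$1" ]; then
        echo "continue"
    else
        echo "fresh"
    fi
}

INPUT="input.fastq"
OUT_DIR="output_dir"

# Use the core count requested from Slurm; fall back to 1 outside a job.
THREADS="${SLURM_NTASKS_PER_NODE:-1}"
MODE="$(pick_mode "$OUT_DIR")"
echo "mode=$MODE threads=$THREADS"

# Call megahit only when it is on the PATH (for example after module load).
if command -v megahit >/dev/null 2>&1; then
    if [ "$MODE" = "continue" ]; then
        # Resume from the last checkpoint stored in the output directory.
        megahit --continue -o "$OUT_DIR"
    elif [ -e "$INPUT" ]; then
        # Fresh run with an explicit thread count.
        megahit --read "$INPUT" -t "$THREADS" -o "$OUT_DIR"
    fi
fi
```

Checking only for the directory is a simplification; a directory created by something other than an earlier MEGAHIT run would not contain a usable checkpoint.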
Real Example
Has your research group used MEGAHIT in a project? Contact the HPC Team and we'd be glad to feature your work.
Citation
Please include the following citation in your papers to support continued development of MEGAHIT.
Li, D., Liu, C-M., Luo, R., Sadakane, K., and Lam, T-W., (2015) MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics, doi: 10.1093/bioinformatics/btv033 [PMID: 25609793].

Li, D., Luo, R., Liu, C.M., Leung, C.M., Ting, H.F., Sadakane, K., Yamashita, H. and Lam, T.W., 2016. MEGAHIT v1.0: A Fast and Scalable Metagenome Assembler driven by Advanced Methodologies and Community Practices. Methods.