
MEGAHIT

MEGAHIT is a lightweight next-generation sequencing (NGS) assembler optimized for metagenomes. It also handles generic single-genome and single-cell assembly.

Availability

Cluster   Module/Version
BOSE      megahit/1.2.9
BGSC      N/A

Note: You can simply use module load megahit to activate the most recently installed version of this software.

Arguments / Options

The list below highlights selected arguments for the megahit command. See the online man page for the full list.

Option       Description
--help       Print the help message
-1           Comma-separated list of fasta/q paired-end #1 files (paired with the -2 option)
-2           Comma-separated list of fasta/q paired-end #2 files (paired with the -1 option)
--12         Comma-separated list of combined/interleaved fasta/q paired-end files
--read       Comma-separated list of fasta/q single-end files
--continue   Continue running MEGAHIT from the previous checkpoint
--test       Run MEGAHIT on a toy test dataset
-t           Number of CPU threads to use
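
Because -1 and -2 each take a comma-separated list, assembling several paired-end samples together means joining the file names first. The following is a minimal sketch of one way to do that in bash. The file names, the thread count, the output directory name, and the join_by_comma helper are all hypothetical and only serve as illustration. The script prints the command rather than running it, so it can be reviewed first.

```shell
#!/bin/bash
# Hypothetical input files; replace these with your own reads.
r1_files=(sampleA_R1.fastq sampleB_R1.fastq)
r2_files=(sampleA_R2.fastq sampleB_R2.fastq)

# Hypothetical helper: join all arguments with commas, the form -1 and -2 expect.
join_by_comma() {
    local IFS=,
    echo "$*"
}

# Build the command as an array so each argument stays a single word.
# The thread count (8) and output directory name are placeholders.
cmd=(megahit
     -1 "$(join_by_comma "${r1_files[@]}")"
     -2 "$(join_by_comma "${r2_files[@]}")"
     -t 8
     -o paired_assembly)

# Print the command for review. To execute it, run: "${cmd[@]}"
echo "${cmd[*]}"
```

The files in -1 and -2 should be listed in the same order so that each #1 file lines up with its #2 mate.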

Sample Slurm Script

submit.sh
#!/bin/bash
# -- SLURM SETTINGS -- #
# [..] other settings here [..]

# The following settings are for the overall request to Slurm
#SBATCH --ntasks-per-node=32     # How many CPU cores do you want to request
#SBATCH --nodes=1                # How many nodes do you want to request

# -- SCRIPT COMMANDS -- #

# Load the needed modules
module load megahit/1.2.9    # Load MEGAHIT

# Run MEGAHIT on single-end reads; -t matches the 32 cores requested above
megahit --read input.fastq -t 32 -o output_dir/
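
If a job is interrupted (for example, by a time limit), the --continue option listed above can pick up from the last checkpoint. The sketch below shows one possible way to choose between a fresh run and a resumed run inside a job script. It makes several assumptions: the build_cmd helper is hypothetical, the thread count is read from the SLURM_NTASKS_PER_NODE variable that Slurm sets when --ntasks-per-node is requested, and --continue with only -o is assumed to be enough to resume from the settings saved in the output directory. The script prints the command instead of running it.

```shell
#!/bin/bash
# Hypothetical helper: fill the global array "cmd" with a resume command if
# the output directory already exists, or a fresh-run command otherwise.
build_cmd() {
    local out_dir=$1 threads=$2
    if [ -d "$out_dir" ]; then
        # Assumption: a previous run left its checkpoint in this directory.
        cmd=(megahit --continue -o "$out_dir")
    else
        cmd=(megahit --read input.fastq -t "$threads" -o "$out_dir")
    fi
}

# Use the core count requested from Slurm, falling back to 1 outside a job.
build_cmd "output_dir" "${SLURM_NTASKS_PER_NODE:-1}"

# Print the command for review. To execute it, run: "${cmd[@]}"
echo "${cmd[*]}"
```

Checking for the directory is a deliberately simple test. An existing directory from an unrelated run would also trigger the resume branch, so choose a unique output directory for each assembly.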

Real Example

Has your research group used MEGAHIT in a project? Contact the HPC Team and we'd be glad to feature your work.

Citation

Please include the following citations in your papers to support continued development of MEGAHIT.

Li, D., Liu, C.-M., Luo, R., Sadakane, K., and Lam, T.-W. (2015) MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics, doi: 10.1093/bioinformatics/btv033 [PMID: 25609793].
Li, D., Luo, R., Liu, C.-M., Leung, C.-M., Ting, H.-F., Sadakane, K., Yamashita, H., and Lam, T.-W. (2016) MEGAHIT v1.0: A Fast and Scalable Metagenome Assembler driven by Advanced Methodologies and Community Practices. Methods.

Resources