MUSCLE
Overview
MUSCLE (MUltiple Sequence Comparison by Log-Expectation) is a tool for
multiple sequence alignment of protein (amino acid) or nucleotide sequences supplied in FASTA format.
Availability
| Cluster | Module/Version |
|---|---|
| BOSE | muscle/5.3 |
| BGSC | Not Available |
Note: You can simply use module load muscle to activate the most recently installed version of this software.
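For reproducible jobs, it can help to load the exact version from the table instead of the default. The following is a minimal sketch, assuming the cluster provides the usual `module` command; `MUSCLE_MODULE` is a hypothetical variable name used only for illustration, and the block does nothing harmful when `module` is not available.

```shell
#!/bin/bash
# Pin the version listed in the Availability table (BOSE: muscle/5.3).
MUSCLE_MODULE="muscle/5.3"   # hypothetical variable name, for illustration

if command -v module >/dev/null 2>&1; then
    module avail muscle 2>&1        # list the installed MUSCLE versions
    module load "$MUSCLE_MODULE"    # load the pinned version
    module list 2>&1                # confirm which modules are loaded
else
    echo "module command not found; run this on the cluster" >&2
fi
```

Pinning the version means a later upgrade of the default module will not silently change the results of an existing workflow.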
Arguments / Options
The table below highlights selected arguments for the muscle command. See the online MUSCLE manual for the full list.
| Option | Description |
|---|---|
| -align | Aligns FASTA input |
| -super5 | Aligns sequences using the Super5 algorithm |
| -efastats | Reports miscellaneous information about the multiple sequence alignments (MSAs) stored in ensemble FASTA (EFA) format |
| -fa2efa | Creates one ensemble FASTA (EFA) file from one or more FASTA files |
| -disperse | Calculates the dispersion of an ensemble |
| -letterconf | Calculates letter confidence values. Outputs FASTA format, with each letter replaced by its confidence (0-9) |
| -maxcc | Extracts the MSA with the highest column confidence from an ensemble |
| -resample | Creates a resampled ensemble from an existing ensemble (usually a diversified ensemble) |
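Several of the options above are meant to be chained: build an ensemble, measure how much its alignments disagree, then pick one alignment from it. The sketch below shows one possible sequence. It is an illustration under assumptions, not a recommended pipeline: the `-stratified` flag comes from the MUSCLE v5 manual rather than from this page, the file names are hypothetical, and `run` and `ensemble_workflow` are helper names invented for this example. By default the commands are only printed; set `DRY_RUN=0` to execute them.

```shell
#!/bin/bash
# Sketch of an ensemble workflow built from the options in the table above.
# Commands are printed only, unless DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

ensemble_workflow() {
    local input="$1"     # unaligned FASTA file (hypothetical name)
    local prefix="$2"    # prefix for the output files (hypothetical)

    # 1. Generate an ensemble of alternative alignments in EFA format.
    #    ASSUMPTION: -stratified is taken from the MUSCLE v5 manual.
    run muscle -align "$input" -stratified -output "${prefix}.efa"

    # 2. Report the dispersion of the ensemble.
    run muscle -disperse "${prefix}.efa"

    # 3. Extract the alignment with the highest column confidence.
    run muscle -maxcc "${prefix}.efa" -output "${prefix}.maxcc.afa"
}

ensemble_workflow seqs.fa seqs
```

Keeping the dry-run default makes it easy to review the exact command lines before spending cluster time on them.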
Sample Slurm Script
#!/bin/bash
# -- SLURM SETTINGS -- #
# [..] other settings here [..]
# The following settings are for the overall request to Slurm
#SBATCH --ntasks-per-node=32 # How many CPU cores do you want to request
#SBATCH --nodes=1 # How many nodes do you want to request
# -- SCRIPT COMMANDS -- #
# Load the needed modules
module load muscle # Load MUSCLE
muscle -align seqs.fa -output aln.afa # Run MUSCLE
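The script above requests 32 cores, so it may be worth telling MUSCLE how many threads to use and checking the result afterwards. The following sketch does both, under stated assumptions: `-threads` is taken from the MUSCLE v5 manual and should be confirmed for the installed version, `SLURM_NTASKS_PER_NODE` is the variable Slurm sets when `--ntasks-per-node` is given, and `check_alignment` is a hypothetical helper written for this example. The alignment step is skipped when `muscle` or `seqs.fa` is missing.

```shell
#!/bin/bash
# Use the core count granted by Slurm; fall back to 1 outside a job.
THREADS="${SLURM_NTASKS_PER_NODE:-1}"

# Return 0 if every record in an aligned FASTA file has the same length
# (gaps included), which is a basic property of a valid alignment.
check_alignment() {
    awk '
        /^>/ { if (name != "") lens[len]++; name = $0; len = 0; next }
             { gsub(/[ \t\r]/, ""); len += length($0) }
        END  { if (name != "") lens[len]++
               n = 0; for (l in lens) n++
               exit (n == 1 ? 0 : 1) }
    ' "$1"
}

if command -v muscle >/dev/null 2>&1 && [ -f seqs.fa ]; then
    # ASSUMPTION: -threads is the thread-count option in MUSCLE v5.
    muscle -align seqs.fa -output aln.afa -threads "$THREADS"
    check_alignment aln.afa && echo "aln.afa: all rows have equal length"
else
    echo "muscle or seqs.fa not found; skipping alignment" >&2
fi
```

The length check is only a sanity test: it catches truncated or corrupted output, but says nothing about the biological quality of the alignment.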
Real Example
Has your research group used MUSCLE in a project? Contact the HPC Team and we'd be glad to feature your work.
Citation
Please include the following citation in your papers to support continued development of MUSCLE.
Edgar, RC (2021), MUSCLE v5 enables improved estimates of phylogenetic tree confidence by ensemble bootstrapping, bioRxiv 2021.06.20.449169. https://doi.org/10.1101/2021.06.20.449169.