Submit a Job
To run jobs and coordinate the machines they run on, we use the Slurm job scheduling system. Slurm takes a list of your requirements (specified in an sbatch .sh script), compares it against the current resource availability on the cluster, and runs the job when a matching node is available.
Slurm Commands
- myjobs - View your current pending and running jobs (custom to UWEC)
- squeue - View all pending and running jobs on the cluster
- savail - View current node usage and availability (custom to UWEC)
- sbatch - Submit job script
- scancel - Cancel running job
Example Slurm Script
To get you started, below is a sample Slurm script you can use as a base for your own work. Modify the options as necessary and run it using "sbatch run-script.sh".
#!/bin/bash
# ---- SLURM SETTINGS ---- #
# -- Job Specific -- #
#SBATCH --job-name="My Job" # What is your job called?
#SBATCH --output=output.txt # Output file - Use %j to inject job id, like output-%j.txt
#SBATCH --error=error.txt # Error file - Use %j to inject job id, like error-%j.txt
#SBATCH --partition=week # Which group of nodes do you want to use? Use "GPU" for graphics card support
#SBATCH --time=7-00:00:00 # What is the maximum time the job should be allowed to run? DD-HH:MM:SS
# -- Resource Requirements -- #
#SBATCH --mem=5G # How much memory do you need?
#SBATCH --ntasks-per-node=4 # How many CPU cores do you want to use per node (max 64)?
#SBATCH --nodes=1 # How many nodes do you need to use at once?
##SBATCH --gpus=1 # How many GPUs do you need (max 3)? Remove first "#" to enable.
# -- Email Support -- #
#SBATCH --mail-type=END # What notifications should be emailed about? (Options: NONE, ALL, BEGIN, END, FAIL, REQUEUE)
# ---- YOUR SCRIPT ---- #
module load python-libs/3 # Load the Python library / software we want to use
python my-script.py
Resource Requirements
When deciding your requirements, consult the BOSE and BGSC pages for their individual hardware specifications, such as the max number of CPU cores and memory available on each machine.
The exact resources you need depend on the program you use, and it may take some trial and error to find the 'sweet spot' between speed and how soon your job is able to run on the cluster.
Not sure where to start? You can contact the HPC Team and we'd be glad to work with you to provide some guidance.
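One way to approach the trial and error described above is to look at what a finished job actually used and size the next request from that. The sketch below assumes job accounting is enabled so that sacct can report peak memory (MaxRSS) in kilobytes. The helper name suggest_mem_mb and the 20% headroom are illustrative assumptions, not a cluster recommendation.

```shell
#!/bin/bash
# Hypothetical helper: given a peak memory figure in kilobytes (for example
# a MaxRSS value such as "3500000K"), suggest a --mem value in megabytes
# with 20% headroom. The 20% margin is an arbitrary assumption.
suggest_mem_mb() {
    local max_rss="${1%K}"    # strip a trailing "K" if present
    # Add 20%, then round up to the next whole megabyte.
    echo $(( (max_rss * 12 / 10 + 1023) / 1024 ))
}

# On the cluster, peak usage of a finished job could be looked up with:
#   sacct -j <jobid> --format=JobID,Elapsed,MaxRSS,ReqMem,State
# Here a sample value stands in for real accounting output.
echo "#SBATCH --mem=$(suggest_mem_mb 3500000K)M"
```

The printed line can be pasted into the Resource Requirements part of your script in place of the original --mem setting.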
Script Submission
Submit your .sh file with the sbatch command, for example "sbatch run-script.sh". The job is assigned a job number, which is printed upon submission.
To verify that your job is running, use the "myjobs" command, which lists all of your current pending and running jobs.
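As a rough sketch of the submission step, the snippet below pulls the job number out of the message sbatch prints. It assumes the standard "Submitted batch job <id>" wording. The helper name parse_job_id is hypothetical, and the sample message uses a made-up job number in place of a real submission.

```shell
#!/bin/bash
# Hypothetical helper: extract the job number from sbatch's confirmation
# message, assumed to look like "Submitted batch job 12345".
parse_job_id() {
    local message="$1"
    # The job number is the last space-separated field of the message.
    echo "${message##* }"
}

# On the cluster you would capture the real message:
#   message=$(sbatch run-script.sh)
# Here a sample message stands in so the snippet runs anywhere.
message="Submitted batch job 12345"
job_id=$(parse_job_id "$message")
echo "Job number: $job_id"
```

If you would rather avoid parsing, sbatch also accepts --parsable, which prints only the job number (followed by a cluster name on multi-cluster setups).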
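Since myjobs is custom to UWEC, it may help to know a stock Slurm way to get a similar view: squeue filtered to your own username. The wrapper below is only a sketch. The function name list_my_jobs is hypothetical, and the guard exists so the snippet fails gracefully on a machine without Slurm installed.

```shell
#!/bin/bash
# List only your own pending and running jobs using stock Slurm.
# "squeue -u <user>" is the standard per-user filter.
list_my_jobs() {
    if command -v squeue >/dev/null 2>&1; then
        squeue -u "$(id -un)"
    else
        # Not on a Slurm machine (for example, your own laptop).
        echo "squeue not found; run this on the cluster" >&2
        return 1
    fi
}

list_my_jobs || echo "Could not list jobs"
```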
Email Notifications
When you use the "#SBATCH --mail-type=X" option in your Slurm script, you can be alerted to when the job starts, finishes, or fails so you don't have to keep checking in on the job. You can choose multiple email notifications by separating them with a comma, or by using "ALL" to receive all emails.
Options
- NONE - Don't send any emails
- BEGIN - Send an email when your job starts
- END - Send an email when your job finishes
- FAIL - Send an email when your job fails
- ALL - Send an email when your job starts, finishes, or fails
By default, emails are sent to your
View official docs for mail-type and mail-user
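As an illustration of combining notification types, the header lines below request an email when the job starts and when it fails, and use --mail-user to direct those emails to a specific address. The address shown is a placeholder, not a real default.

```shell
#SBATCH --mail-type=BEGIN,FAIL         # Email when the job starts or fails
#SBATCH --mail-user=you@example.com    # Hypothetical address; replace with your own
```

These lines belong with the other #SBATCH settings at the top of your script, before any commands.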