
Slurm

Description

Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.

As a cluster workload manager, Slurm has three key functions:

  • It allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work.
  • It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes.
  • It arbitrates contention for resources by managing a queue of pending work.

Usage

Slurm base commands

Slurm uses a series of base commands to execute programs and monitor the submission queues for workload tracking. Below is a list of basic commands used.

| Command | Action | Example |
|---|---|---|
| sinfo | Displays status and info of open nodes | sinfo |
| sbatch | Submits the file to the submission pool | sbatch -options file |
| sacct | Displays accounting data for all jobs | sacct -options |
| squeue | Displays information about jobs in the queue | squeue -options |
| scancel | Cancels a submitted Slurm job | scancel -options jobID |
| savail | View the available resources on all of the nodes (custom to UWEC) | savail |
| myjobs | View your current jobs in the queue (custom to UWEC) | myjobs |
| myjoblogin | Log into a compute node under your job's existing allocation (custom to UWEC) | myjoblogin jobID or myjoblogin jobID nodeName |
| myaccounts | View all your accounts and your default account (custom to UWEC) | myaccounts |
| myqos | View Quality of Service groups and QoS limits associated with your accounts (custom to UWEC) | myqos |
| sinteract | Interactively run commands on a compute node. Defaults to one CPU core for eight hours. (custom to UWEC) | sinteract --ntasks=4 --mem=10G |
| sdevelop | Interactively run commands on our development node. Defaults to 16 CPU cores for eight hours. (custom to UWEC) | sdevelop --ntasks=16 --mem=10G |
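
As a rough sketch of how the standard commands fit together in a session, the snippet below prints each command and runs it only if it is installed, so it is safe to try on a machine without Slurm. The `run` helper is hypothetical (it is not part of Slurm or the UWEC tools); `-u`, `-X`, and `--format` are standard squeue and sacct options.

```shell
#!/bin/bash
# Hypothetical helper: echo a command, then run it only if it is installed.
run() {
    echo "+ $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

me="${USER:-$(id -un)}"   # current username

run sinfo                                            # node and partition status
run squeue -u "$me"                                  # only your jobs in the queue
run sacct -X --format=JobID,JobName,State,Elapsed    # job allocations only, without job steps
```

scancel is left out on purpose, since it needs a real job ID from one of your own jobs.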

Script syntax

Slurm submissions require submission scripts. The general format for a Slurm submission script is:

#!/bin/bash
#
#SBATCH <OPTION>
#SBATCH <OPTION>
.
.
.
#SBATCH <OPTION>

<COMMAND TO BE EXECUTED TO RUN THE DESIRED PROGRAM>

Script SBATCH Flags

The #SBATCH directives are used to specify different options unique to the submitted job's needs.

| Command | Action | Syntax Example |
|---|---|---|
| #SBATCH --partition | Specifies the partition to use | #SBATCH --partition=yourPartition |
| #SBATCH --time | Sets the maximum runtime limit for the job | #SBATCH --time=dd-hh:mm:ss |
| #SBATCH --nodes | Sets the number of requested nodes | #SBATCH --nodes=numberOfNodes |
| #SBATCH --ntasks-per-node | Specifies the number of processors to use per node | #SBATCH --ntasks-per-node=coresPerNode |
| #SBATCH --mem | Sets the memory limit (in MB). Do not combine with --mem-per-cpu | #SBATCH --mem=memoryLimit |
| #SBATCH --gpus | Specifies the number of GPUs requested (BOSE only; required for GPU use) | #SBATCH --gpus=numberOfGPUs |
| #SBATCH --job-name | Sets the name of the job during runtime | #SBATCH --job-name="YourJobName" |
| #SBATCH --output | Sets the name of the output file | #SBATCH --output=outputFileName |
| #SBATCH --error | Sets the name of the error file | #SBATCH --error=errorFileName |
| #SBATCH --exclude | Excludes nodes by node name (comma-delimited) | #SBATCH --exclude=nodeA,nodeB,nodeC |
| #SBATCH --nodelist | Uses specific nodes (comma-delimited) | #SBATCH --nodelist=nodeA,nodeB,nodeC |
| #SBATCH --mail-user | Sets the email address for notifications. Defaults to your UWEC email address | #SBATCH --mail-user=user@email.mail |
| #SBATCH --mail-type | Sets when the user receives an email (Options: NONE, ALL, BEGIN, END, FAIL, REQUEUE) | #SBATCH --mail-type=ALL |
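
Because --time is written as dd-hh:mm:ss, converting a limit to seconds can be a quick sanity check before submitting. The helper below is a minimal sketch: `slurm_time_to_seconds` is a hypothetical name (not a Slurm command), and it handles only the dd-hh:mm:ss form shown above.

```shell
#!/bin/bash
# Hypothetical helper: convert a dd-hh:mm:ss time limit into seconds.
slurm_time_to_seconds() {
    printf '%s\n' "$1" | awk -F'[-:]' '
        # Reject anything that is not in dd-hh:mm:ss form.
        $0 !~ /^[0-9]+-[0-9]+:[0-9]+:[0-9]+$/ { exit 1 }
        { s = (($1 * 24 + $2) * 60 + $3) * 60 + $4; print s }'
}

slurm_time_to_seconds "0-00:00:30"   # prints 30
slurm_time_to_seconds "7-00:00:00"   # prints 604800 (seven days)
```

The function returns a non-zero status for input it does not recognize, so it can be used in an `if` before calling sbatch.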

Temporary Overrides

Besides including these options in your Slurm script, you can also set them on the command line when you submit your job.

When set this way, they apply only to that single job and override what is specified in your script.

sbatch --ntasks=32 --mem=20G my-script.sh

The above command submits a job that temporarily requests 32 cores and 20G of memory rather than what is listed inside my-script.sh.
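
One place where command-line overrides can be handy is submitting the same script several times with different resource requests. The sketch below only prints the sbatch commands it would run (a dry run); `preview_submissions` is a hypothetical helper name, and the memory sizes are arbitrary examples. Removing the `echo` would submit the jobs for real.

```shell
#!/bin/bash
# Dry run: print one sbatch command per memory size instead of submitting.
preview_submissions() {
    script="$1"
    shift
    for mem in "$@"; do
        # Command-line options take precedence over #SBATCH lines in the script.
        echo sbatch --mem="$mem" --job-name="mem-$mem" "$script"
    done
}

preview_submissions my-script.sh 10G 20G 40G
```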

Example Script

Below is an example Slurm script named submit.sh that prints "Hello from <hostname>", where <hostname> is the name of the compute node that runs the job.

submit.sh
#!/bin/bash

#SBATCH --partition=week              #Partition to submit to
#SBATCH --time=0-00:00:30             #Time limit for this job (30 seconds)
#SBATCH --nodes=1                     #Number of nodes for this job. Use multiple nodes only for MPI jobs.
#SBATCH --ntasks-per-node=1           #Number of CPUs per node. Cannot exceed the number of CPUs on the node.
#SBATCH --mem=512                     #Total memory for this job (in MB)
#SBATCH --job-name="Slurm Sample"     #Name of this job in the work queue
#SBATCH --output=ssample.out          #Output file name
#SBATCH --error=ssample.err           #Error file name
#SBATCH --mail-type=END               #Email notification type (BEGIN, END, FAIL, ALL). For multiple types, use a comma-separated list, e.g. END,FAIL.

# Job Commands Below
echo "Hello from $(hostname)"

Example process for script submission

Create the script.sh file using a text editor, ensuring the file follows the format guidelines noted above.

Submit script.sh with the following command. The job is assigned a job number, which is printed upon submission:

sbatch script.sh

As the job runs, Slurm writes to any output files you specified and sends any email notifications you requested. To check the progress of a submitted job, enter the following command:

sacct -j yourJobID
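
If you want to script these two steps, the job number can be pulled out of sbatch's confirmation message, which normally reads "Submitted batch job <ID>". This is a minimal sketch that assumes that message format; `extract_job_id` is a hypothetical helper, and the real sbatch and sacct calls run only where Slurm is installed and script.sh exists.

```shell
#!/bin/bash
# Hypothetical helper: pull the job ID out of sbatch's confirmation line.
extract_job_id() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ { print $4 }'
}

extract_job_id "Submitted batch job 12345"    # prints 12345

# Only attempt a real submission where Slurm and the script are present.
if command -v sbatch >/dev/null 2>&1 && [ -f script.sh ]; then
    job_id=$(extract_job_id "$(sbatch script.sh)")
    sacct -j "$job_id" --format=JobID,JobName,State,Elapsed,ExitCode
fi
```

sbatch also accepts a --parsable option that prints the job ID without the surrounding text, which may be simpler than parsing the message.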

Instructional Videos

If you are still unsure how to use Slurm on our cluster to submit jobs, Slurm's website has several instructional videos that show you the basics.

Please see the link here: Training Videos


Common Errors / Statuses

Below is a list of common errors and statuses that we have seen cause a job to be stuck pending in the queue or to fail. These are typically shown when running squeue or myjobs.
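
To see these reasons for your own pending jobs, squeue can be asked to print only the reason column. The sketch below counts how many pending jobs share each reason; `count_reasons` is a hypothetical helper, the sample input is made up for illustration, and the squeue call runs only where Slurm is installed.

```shell
#!/bin/bash
# Hypothetical helper: count identical lines on stdin, most frequent first.
count_reasons() {
    awk '{ n[$0]++ } END { for (r in n) print n[r], r }' | sort -rn
}

# Made-up sample input standing in for real squeue output.
printf '%s\n' Priority Resources Priority | count_reasons

if command -v squeue >/dev/null 2>&1; then
    # -h: no header, -t PENDING: pending jobs only, -o "%r": reason column only.
    squeue -h -u "${USER:-$(id -un)}" -t PENDING -o "%r" | count_reasons
fi
```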

(QOSMaxJobsPerUserLimit)

QOSMaxJobsPerUserLimit: Quality of Service: You have reached the maximum number of jobs you can run at one time; the job will start once your other jobs finish. Some groups or classes may also have their own custom restrictions applied. Run myqos to see a list of settings applied to your user/group.

See Also: User Policy - Fair Use

(QOSMaxGRESPerUser)

QOSMaxGRESPerUser: Quality of Service: You have reached the maximum number of GPUs you can use at one time; the job will start once your other jobs finish. Some groups or classes may also have their own custom restrictions applied. Run myqos to see a list of settings applied to your user/group.

See Also: User Policy - Fair Use

(MaxNodePerAccount)

MaxNodePerAccount: Quality of Service: Your group has reached the maximum number of nodes it can use at one time. Certain limited partitions, such as highmemory or GPU, may have their own limits. Run myqos to see a list of settings applied to your user/group.

(Resources)

Resources: Your job cannot currently be accommodated on any node because of the resources available. This may be due to your selected partition or nodes, and the job will have to wait until another job completes. You can see the full list of available resources on all the nodes by using the savail command.

(Priority)

Priority: If multiple jobs are pending in the queue (squeue), your job may have a lower priority or may have been submitted after the others. It may have to wait until other jobs in the queue complete before it is next in line.

CONFIGURING (CF)

CONFIGURING (CF): When your job is marked as configuring, this means that the node you are going to use was in its power saving mode and is in the process of booting up. Usually after a few minutes your job will automatically start running.
