Slurm
Description
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
As a cluster workload manager, Slurm has three key functions:
- It allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work.
- It provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes.
- It arbitrates contention for resources by managing a queue of pending work.
Usage
Slurm base commands
Slurm provides a set of base commands for submitting programs and monitoring the job queues. The most commonly used commands are listed below.
Command | Action | Example |
---|---|---|
sinfo | Displays status and information for available nodes | sinfo |
sbatch | Submits a job script to the queue | sbatch -options file |
sacct | Displays accounting data for all jobs | sacct -options |
squeue | Displays information about jobs in the queue | squeue -options |
scancel | Cancels a submitted Slurm job | scancel -options jobID |
savail | Shows the available resources on all of the nodes (custom to UWEC) | savail |
myjobs | Shows your current jobs in the queue (custom to UWEC) | myjobs |
myjoblogin | Logs into a compute node under your job's existing allocation (custom to UWEC) | myjoblogin jobID or myjoblogin jobID nodeName |
myaccounts | Shows all your accounts and your default account (custom to UWEC) | myaccounts |
myqos | Shows the Quality of Service (QoS) groups and QoS limits associated with your accounts (custom to UWEC) | myqos |
sinteract | Interactively runs commands on a compute node. Defaults to one CPU core for eight hours. (custom to UWEC) | sinteract --ntasks=4 --mem=10G |
sdevelop | Interactively runs commands on our development node. Defaults to 16 CPU cores for eight hours. (custom to UWEC) | sdevelop --ntasks=16 --mem=10G |
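As a rough illustration of how the standard commands in the table fit together, the sketch below checks the cluster, lists your own jobs, and looks up accounting data for one job. The job ID 12345 is hypothetical, and the guard only exists so the snippet does nothing harmful on a machine without Slurm installed.

```shell
#!/bin/bash
# Hypothetical job ID for illustration; use the number sbatch printed for your job.
job_id=12345

if command -v squeue >/dev/null 2>&1; then
    sinfo                     # node and partition status
    squeue -u "$USER"         # only your jobs in the queue
    sacct -j "$job_id" --format=JobID,JobName,State,Elapsed   # accounting for one job
    # scancel "$job_id"       # uncomment to cancel that job
else
    echo "Slurm commands not found; run these on a cluster login node."
fi
```

The UWEC-specific commands (savail, myjobs, myqos, and so on) are left out here because their options are not described beyond the examples in the table.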
Script syntax
Slurm submissions require submission scripts. The general format for a Slurm submission script is:
#!/bin/bash
#
#SBATCH <OPTION>
#SBATCH <OPTION>
.
.
.
#SBATCH <OPTION>
<COMMAND TO BE EXECUTED TO RUN THE DESIRED PROGRAM>
Script SBATCH Flags
The #SBATCH directives are used to specify different options unique to the submitted job's needs.
Command | Action | Syntax Example |
---|---|---|
#SBATCH --partition | Specifies partition to use | #SBATCH --partition=yourPartition |
#SBATCH --time | Sets maximum runtime limit for job | #SBATCH --time=dd-hh:mm:ss |
#SBATCH --nodes | Sets the number of requested nodes. | #SBATCH --nodes=numberOfNodes |
#SBATCH --ntasks-per-node | Specifies the number of tasks per node (one CPU core per task by default) | #SBATCH --ntasks-per-node=coresPerNode |
#SBATCH --mem | Sets the memory limit per node (in MB by default). Do not combine with --mem-per-cpu | #SBATCH --mem=memoryLimit |
#SBATCH --gpus | Specifies the number of GPUs requested (BOSE-only and required for GPU use) | #SBATCH --gpus=numberOfGPUs |
#SBATCH --job-name | Sets the name of the job during runtime. | #SBATCH --job-name="YourJobName" |
#SBATCH --output | Sets the name of the output file | #SBATCH --output=outputFileName |
#SBATCH --error | Sets the name of the error file. | #SBATCH --error=errorFileName |
#SBATCH --exclude | Exclude nodes by node name. These are comma delimited | #SBATCH --exclude=nodeA,nodeB,nodeC |
#SBATCH --nodelist | Use specific nodes. These are comma delimited | #SBATCH --nodelist=nodeA,nodeB,nodeC |
#SBATCH --mail-user | Sets the email address that receives notifications. Defaults to your UWEC email address. | #SBATCH --mail-user=user@email.mail |
#SBATCH --mail-type | Sets when the user receives an email (Options: NONE, ALL, BEGIN, END, FAIL, REQUEUE) | #SBATCH --mail-type=ALL |
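The --mem and --time values are the ones most often mistyped, so the sketch below computes them from plainer inputs and prints the resulting directives. The variable names and the sample values (4 GB, 1 day 12 hours) are only for illustration. Slurm also accepts a unit suffix such as --mem=10G, as in the sinteract example above, which avoids the conversion entirely.

```shell
#!/bin/bash
# Illustrative inputs (not recommendations): 4 GB of memory, 1 day 12 hours.
mem_gb=4
days=1
hours=12

# --mem without a suffix is read as megabytes.
mem_mb=$((mem_gb * 1024))

# --time uses the dd-hh:mm:ss layout.
time_limit=$(printf '%d-%02d:%02d:%02d' "$days" "$hours" 0 0)

echo "#SBATCH --mem=${mem_mb}"
echo "#SBATCH --time=${time_limit}"
```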
Temporary Overrides
Besides including these options in your Slurm script, you can also pass them on the sbatch command line when you submit your job.
When set this way, they take effect only for that single job and override what's specified in your script.
For example, submitting my-script.sh with command-line options requesting 32 cores and 20G of memory temporarily uses those values rather than what's listed inside my-script.sh.
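A minimal sketch of such an override is shown below. It assumes --ntasks is the option used to request the 32 cores; your script may use --ntasks-per-node or another option instead. The helper name submit_with_overrides is hypothetical, and it only prints the command when sbatch or the script is not present, so it is safe to try anywhere.

```shell
#!/bin/bash
# Hypothetical helper: submit a script with one-off resource overrides.
# Options on the sbatch command line take precedence over the matching
# #SBATCH lines inside the script, for this submission only.
submit_with_overrides() {
    script="$1"
    if command -v sbatch >/dev/null 2>&1 && [ -f "$script" ]; then
        # Assumption: --ntasks is how the 32 cores are requested.
        sbatch --ntasks=32 --mem=20G "$script"
    else
        echo "Would run: sbatch --ntasks=32 --mem=20G $script"
    fi
}

submit_with_overrides my-script.sh
```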
Example Script
Below is an example Slurm script named hello.sh that prints "Hello from (your computer host name)".
#!/bin/bash
#SBATCH --partition=week #Partition to submit to
#SBATCH --time=0-00:00:30 #Time limit for this job
#SBATCH --nodes=1 #Number of nodes for this job. Use multiple nodes only for MPI jobs.
#SBATCH --ntasks-per-node=1 #Number of CPUs. Cannot be greater than number of CPUs on the node.
#SBATCH --mem=512 #Total memory for this job
#SBATCH --job-name="Slurm Sample" #Name of this job in work queue
#SBATCH --output=ssample.out #Output file name
#SBATCH --error=ssample.err #Error file name
#SBATCH --mail-type=END #Email notification type (BEGIN, END, FAIL, ALL). For multiple types, use a comma-separated list, e.g. END,FAIL.
# Job Commands Below
echo "Hello from $(hostname)"
Example process for script submission
Create the script.sh file using a text editor, ensuring the file follows the format guidelines noted above.
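If you prefer to stay in the shell, a here-document can stand in for the text editor. The sketch below writes a script.sh that mirrors the hello.sh example (the partition name week and the file names come from that example) and then checks it for shell syntax errors without running it.

```shell
#!/bin/bash
# Write a minimal job script to script.sh. The quoted 'EOF' keeps
# $(hostname) from being expanded until the job actually runs.
cat > script.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=week
#SBATCH --time=0-00:00:30
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=512
#SBATCH --job-name="Slurm Sample"
#SBATCH --output=ssample.out
#SBATCH --error=ssample.err

echo "Hello from $(hostname)"
EOF

# Parse the script for syntax errors without executing it.
bash -n script.sh && echo "script.sh: syntax OK"
```

Note that bash -n only checks shell syntax; it does not validate the #SBATCH options themselves, which are checked when the job is submitted.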
Submit script.sh with sbatch (for example, sbatch script.sh). The job is assigned a job number, which is printed upon submission.
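The job number is useful later for scancel, sacct, or myjoblogin, so it can be worth capturing. On success sbatch prints a line of the form "Submitted batch job" followed by the number; the sketch below extracts the number from a sample line. The value 12345 is made up, and on the cluster you would capture the real output instead.

```shell
#!/bin/bash
# Sample of what sbatch prints on success; 12345 is a made-up job number.
submit_output="Submitted batch job 12345"
# On the cluster, capture the real output instead:
#   submit_output=$(sbatch script.sh)

# Keep only the last word of the line, which is the job number.
job_id="${submit_output##* }"
echo "Job number: $job_id"
```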
When your job completes, it writes any output files you specified and sends any email notifications you requested. To check the progress of a submitted job, run squeue or myjobs.
Instructional Videos
If you are still uncertain on how to use Slurm on our cluster to submit jobs, Slurm's website has several instructional videos that show you the basics.
Please see the link here: Training Videos
Common Errors / Statuses
Below is a list of common errors and statuses that we've seen cause a job to be stuck pending in the queue or to fail. These are typically shown when running squeue or myjobs.
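(QOSMaxGRESPerUser)
Your job requests more generic resources (such as GPUs) than your QoS allows per user (see User Policy - Fair Use). Run myqos to see a list of settings applied to your user/group.
(MaxNodePerAccount)
Your account has reached the maximum number of nodes it is allowed to use at once. Run myqos to see a list of settings applied to your user/group.
(Resources)
The resources your job requested are not currently available. You can check what is available with the savail command.
(Priority)
If other jobs are waiting in the queue (see squeue), your job may have a lower priority or was submitted after others. Your job may have to wait until other jobs in the queue are completed before yours is next in line.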
CONFIGURING (CF)
When your job is marked as configuring, the node you are going to use was in its power saving mode and is in the process of booting up. Your job will usually start running automatically after a few minutes.
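The statuses above can be summarized as a small lookup. The function below is a hypothetical helper, not a cluster command: it takes the reason or state code as it appears in squeue output (without the parentheses) and prints the next step suggested in this section.

```shell
#!/bin/bash
# Hypothetical helper: map a squeue reason or state code to a next step.
explain_reason() {
    case "$1" in
        QOSMaxGRESPerUser|MaxNodePerAccount)
            echo "Limit reached: run myqos to see the limits on your user/group." ;;
        Resources)
            echo "Waiting for resources: run savail to see what is available." ;;
        Priority)
            echo "Other jobs are ahead in the queue: check squeue and wait." ;;
        CF|CONFIGURING)
            echo "Node is booting from power saving mode: wait a few minutes." ;;
        *)
            echo "Not covered in this list: $1" ;;
    esac
}

explain_reason Resources
```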