
Slurm Job Array Examples

Example Job Array

We will take this Slurm script and modify it to run as a job array. Each task in the array runs the same script and prints 'Hello!' along with its job array task ID. We do this by passing the task ID to the script as a command line argument; the script reads the argument and prints 'Hello!' and the task ID it received.

Let's modify our basic script to look like:

hello-parallel.R
# read command line arguments into a character vector called args
args = commandArgs(trailingOnly=TRUE)

# print task number
print(paste0('Hello! I am task number: ', args[1]))
hello-parallel.py
# import sys to access command line arguments
import sys

# print task number
print('Hello! I am task number:', sys.argv[1])
hello-parallel.jl
# print the command line argument (the job array index: 1, 2, ...)
# command line arguments arrive as an array of strings;
# ARGS[1] is the first argument after the script name
println("Hello! I am task number: " * ARGS[1])
hello_parallel.m
function hello_parallel(i)
    fprintf('Hello! %d\n', i);
end

Then we modify the Slurm file to look like below:

hello-parallel.slurm
#!/bin/bash

# Example of running R script with a job array

#SBATCH -J hello
#SBATCH -p normal
#SBATCH --array=1-10                    # how many tasks in the array
#SBATCH -c 1                            # one CPU core per task
#SBATCH -t 10:00
#SBATCH -o hello-%j-%a.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_email@stanford.edu

# Load software
module load R

# Run R script with a command line argument
Rscript hello-parallel.R $SLURM_ARRAY_TASK_ID    
hello-parallel.slurm
#!/bin/bash

# Example of running python script with a job array

#SBATCH -J hello
#SBATCH -p normal
#SBATCH --array=1-10                    # how many tasks in the array
#SBATCH -c 1                            # one CPU core per task
#SBATCH -t 10:00
#SBATCH -o hello-%j-%a.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_email@stanford.edu

# Run python script with a command line argument
srun python3 hello-parallel.py $SLURM_ARRAY_TASK_ID
hello-parallel.slurm
#!/bin/bash

# Example of running Julia script with a job array

#SBATCH -J hello
#SBATCH -p normal
#SBATCH --array=1-10                    # how many tasks in the array
#SBATCH -c 1                            # one CPU core per task
#SBATCH -t 10:00
#SBATCH -o hello-%j-%a.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_email@stanford.edu

# Load software
module load julia

# Run Julia with a command line arg being an index from 1 to 10
srun julia hello-parallel.jl $SLURM_ARRAY_TASK_ID
hello-parallel.slurm
#!/bin/bash

# Example of running MATLAB script with a job array

#SBATCH -J hello
#SBATCH -p normal
#SBATCH --array=1-10                    # how many tasks in the array
#SBATCH -c 1                            # one CPU core per task
#SBATCH -t 10:00
#SBATCH -o hello-%j-%a.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_email@stanford.edu

# Load software
module load matlab

# Run Matlab with a command line arg being an index from 1 to 10
matlab -batch "hello_parallel($SLURM_ARRAY_TASK_ID);"

Note that in this case, we used the Slurm option #SBATCH --array=1-10 to run ten independent tasks in parallel. The maximum job array size on the Yens is 512. Each task writes a unique log file, hello-[jobID]-[taskID].out, so we can inspect those logs to see whether any of the tasks failed.
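Once the array finishes, each task's state and exit code can also be checked with sacct. A quick sketch (123456 is a placeholder job ID; substitute the ID that sbatch prints):

```shell
# Show the state and exit code of every task in the array
# (123456 is a placeholder job ID -- use the one sbatch printed)
sacct -j 123456 --format=JobID,JobName,State,ExitCode

# Inspect one task's log file (matches the -o hello-%j-%a.out pattern)
cat hello-123456-1.out
```

Any task reported as FAILED, or with a non-zero ExitCode, is a candidate for rerunning.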

Submit Job Array to Scheduler

We can now submit our hello-parallel.slurm script to the Slurm scheduler to run the job array. The scheduler launches all ten tasks (some may wait in the queue while others start right away). To submit, run:

Terminal Command
sbatch hello-parallel.slurm

Monitor your jobs with watch squeue -u USER, where USER is your SUNet ID. Check whether any job array tasks failed, then rerun only those by setting --array= to the failed indices.
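Options passed on the sbatch command line override the matching #SBATCH directives in the script, so failed indices can be rerun without editing the file. For example, if tasks 3 and 7 failed (hypothetical indices), the resubmission would look like:

```shell
# Rerun only the failed tasks; --array on the command line
# overrides the #SBATCH --array=1-10 line in the script
sbatch --array=3,7 hello-parallel.slurm
```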