SLURM Job Array Gurobi Example

Launching Job Arrays on Yen-slurm using a Scheduler

You should read the scheduler overview page before reading this topic guide. This guide assumes you know the Slurm scheduler basics and have already submitted a batch job on Yen-slurm.

The Yen-slurm cluster comprises 5 shared compute nodes that use Slurm to schedule jobs and manage a queue of resources (when there are more requests than resources available). It is a batch submission environment like the Sherlock cluster.

Job Array

A Slurm job array is a way to launch multiple jobs in parallel. One common use case is varying the input parameters to your executable (a Python, Julia, or R script). Instead of manually changing the input parameters and rerunning the script multiple times, we can do this in one go with a Slurm job array.

Gurobi Example

We will work with the following Python script, modified from the Gurobi documentation. The example requires Gurobi 9.0.

This script formulates and solves a simple MIP model using the Gurobi 9.0 matrix API: maximize x[0] + x[1] + 2*x[2] subject to x[0] + 2*x[1] + 3*x[2] <= 4 + a and x[0] + x[1] >= 1, where all variables are binary. The coefficient a on the right-hand side is what we will vary in the sensitivity analysis.

Save this Python script to a new file:

import numpy as np
import scipy.sparse as sp
import gurobipy as gp
import sys
from threadpoolctl import threadpool_limits
from gurobipy import GRB

# Limits the number of cores for numpy BLAS
threadpool_limits(limits = 1, user_api = 'blas')

# Set the total number of threads for Gurobi to 1
__gurobi_threads = 1


try:
    # Define the coefficients to run sensitivity analysis
    capacity_coefficients = np.linspace(1, 10, num=32)

    # Assign a based on the command line input - default is 0
    if len(sys.argv) > 1 and int(sys.argv[1]) <= 31:
        a = capacity_coefficients[int(sys.argv[1])]
    else:
        a = 0

    # Create a new model
    m = gp.Model("matrix1")

    # Set the total number of Gurobi threads for model "m"
    m.Params.threads = __gurobi_threads

    # Create variables
    x = m.addMVar(shape=3, vtype=GRB.BINARY, name="x")

    # Set objective
    obj = np.array([1.0, 1.0, 2.0])
    m.setObjective(obj @ x, GRB.MAXIMIZE)

    # Build (sparse) constraint matrix
    data = np.array([1.0, 2.0, 3.0, -1.0, -1.0])
    row = np.array([0, 0, 0, 1, 1])
    col = np.array([0, 1, 2, 0, 1])

    A = sp.csr_matrix((data, (row, col)), shape=(2, 3))

    # Build rhs vector
    rhs = np.array([4.0 + a, -1.0])

    # Add constraints
    m.addConstr(A @ x <= rhs, name="c")

    # Optimize model
    m.optimize()

    print(f"Solved LP with a = {a}")
    print(f"Optimal Solution: {x.X}")
    print(f"Obj: {m.objVal}")

except gp.GurobiError as e:
    print(f"Error code {str(e.errno)}: {str(e)}")

except AttributeError:
    print("Encountered an attribute error")

This Python script can be run directly with python3 and no command line argument (a is then set to 0). However, we will run it via the Slurm scheduler on the Yen-slurm cluster.

Gurobi Example Slurm Script

Here is an example Slurm script that loads the gurobipy3 module and runs the Python script once.


#!/bin/bash

# Example of running a single Gurobi run for sensitivity analysis

#SBATCH -J gurobi
#SBATCH -p normal
#SBATCH -c 1                              # one core per task
#SBATCH -t 1:00:00
##SBATCH --mem=1gb
#SBATCH -o gurobi-%j.out
#SBATCH --mail-type=ALL

# Load software
module load gurobipy3

# Run the Python script with no command line arg: a = 0 in the script
# (replace your_script.py with the file name you saved above)
python3 your_script.py

For this script to run, you need to load the gurobipy3 module and make sure all of the Python packages it imports are installed. For example, run

$ pip3 install threadpoolctl

to install the threadpoolctl package.
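As a quick sanity check before submitting (a hypothetical helper, not part of the original example), you can verify from Python which of the required packages are importable:

```python
import importlib.util

# Check which of the packages the Gurobi example imports are available.
# find_spec returns None when a package is not installed, so this runs
# cleanly even before everything is set up.
for pkg in ("numpy", "scipy", "gurobipy", "threadpoolctl"):
    status = "found" if importlib.util.find_spec(pkg) else "MISSING"
    print(f"{pkg}: {status}")
```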

Now we will take this Slurm job script and modify it to run as a job array. Each task in the array runs the same Python script with a unique integer input for the sensitivity analysis.

We are going to pass an index as a command line argument to the Python script, which sets a to the corresponding element of the capacity coefficient array. For example, if we run the script with the argument 5, it assigns to a the element at index 5 of the user-defined capacity coefficient array.
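Concretely, the mapping from a task index to a coefficient looks like this (the values follow directly from np.linspace(1, 10, num=32) in the script above):

```python
import numpy as np

# 32 evenly spaced coefficients from 1 to 10, matching the script above
capacity_coefficients = np.linspace(1, 10, num=32)

# Task index 5 selects the element at index 5 (zero-based):
# 1 + 5 * (10 - 1) / 31, roughly 2.4516
a = capacity_coefficients[5]
print(a)
```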

We also want to limit both numpy and Gurobi to a single thread, since we will be launching one task per CPU core. These lines in the script achieve that:

# Limits the number of cores for numpy BLAS
threadpool_limits(limits = 1, user_api = 'blas')
# Set the total number of threads for Gurobi to 1
__gurobi_threads = 1
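If threadpoolctl is not available, a common alternative (shown here as a sketch, not part of the original example) is to cap BLAS threads through environment variables before numpy is imported:

```python
import os

# These variables must be set before numpy (and its BLAS backend) is
# imported; each one covers a different BLAS implementation.
os.environ["OMP_NUM_THREADS"] = "1"        # OpenMP
os.environ["OPENBLAS_NUM_THREADS"] = "1"   # OpenBLAS
os.environ["MKL_NUM_THREADS"] = "1"        # Intel MKL

import numpy as np  # now uses at most one BLAS thread
```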

Now our Slurm script should look like the one below (save it as sensitivity_analysis.slurm, or a similar name, in your project space):


#!/bin/bash

# Example of running a job array to run the Gurobi Python script for sensitivity analysis.

#SBATCH --array=0-31              # there is a max array size - 512 tasks
#SBATCH -J gurobi
#SBATCH -p normal
#SBATCH -c 1                              # one core per task
#SBATCH -t 1:00:00
##SBATCH --mem=1gb
#SBATCH -o gurobi-%A-%a.out
#SBATCH --mail-type=ALL

# Load software
module load gurobipy3

# Run the Python script with a command line arg being an index from 0 to 31
# (replace your_script.py with the file name you saved above)
python3 your_script.py $SLURM_ARRAY_TASK_ID

Note that in this case, we specified the Slurm option #SBATCH --array=0-31 to run 32 independent tasks in parallel. The maximum job array index on Yen-slurm is 511 (--array=0-511). All tasks are launched as independent jobs, and at most 200 of a user's jobs can be running concurrently. Each task generates a unique log file gurobi-%A-%a.out, where %A is the job ID and %a is the task ID (from 0 to 31).

Submit Job Array to Scheduler

We can now submit our sensitivity_analysis.slurm script to the scheduler to run the job array on the Yen-slurm cluster. It will launch all 32 tasks at once (some may run right away while others wait in the queue). To submit, run:

$ sbatch sensitivity_analysis.slurm

Monitor your jobs with watch squeue -u USER, where USER is your SUNet ID. If any job array tasks fail, rerun only those by setting --array= to the failed indices (for example, --array=1,7,23).
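One way to collect the failed indices (a hypothetical helper that assumes a missing gurobi-%A-%a.out file means the task did not finish) is:

```python
import glob
import re

# Collect the task indices (0-31) that produced an output file
expected = set(range(32))
done = set()
for path in glob.glob("gurobi-*-*.out"):
    match = re.match(r"gurobi-\d+-(\d+)\.out$", path)
    if match:
        done.add(int(match.group(1)))

# Print a comma-separated list suitable for pasting into --array=
failed = sorted(expected - done)
print(",".join(str(i) for i in failed))
```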