Yen Slurm
You may be used to using a job scheduler on other Stanford compute resources (e.g., Sherlock) or on servers at other institutions. However, the Yen servers have traditionally run without a scheduler to make them more accessible and intuitive for our users. The ability to log onto a machine with considerably more resources than your laptop and immediately start running scripts, as if it were still your laptop, has been very popular with our users. This is the case on yen1, yen2, yen3, yen4, and yen5.
Note
Take a look here to see more details about the Yen hardware.
The downside of this system is that resources can be eaten up rather quickly by users and you may find a particular server to be "full". To combat this, we have implemented the Slurm scheduler on our yen-slurm servers. For users familiar with scheduler systems, this should be a seamless transition. For those unfamiliar, this page will help you learn how to get started.
Schedule Jobs on the Yens
Tip
Watch this Hub How-To presentation on using the Slurm scheduler to run parallel jobs on yen-slurm.
yen-slurm is a computing cluster offered by the Stanford Graduate School of Business. It is designed to give researchers the ability to run computations that require a large amount of resources without leaving the environment and filesystem of the interactive Yens.
The yen-slurm cluster has 10 nodes (including 3 GPU nodes) with a total of 2,112 available CPU cores, 10.25 TB of memory, and 12 NVIDIA GPUs.
What Is A Scheduler?
The yen-slurm cluster is accessed through the Slurm Workload Manager. Researchers submit jobs to the cluster, asking for a certain amount of resources (CPU, memory, and time). Slurm then manages the queue of jobs based on what resources are available. In general, jobs that request fewer resources will start faster than jobs requesting more resources.
Why Use A Scheduler?
A job scheduler has many advantages over the directly shared environment of the interactive Yens:
- Run jobs with a guaranteed amount of resources (CPU, memory, time)
- Set up multiple jobs to run automatically
- Run jobs that exceed the community guidelines for the interactive nodes
- Follow the gold standard for using high-performance computing resources around the world
How Do I Use The Scheduler?
First, you should make sure your process can run on the interactive Yen command line. We've written a guide on migrating a process from JupyterHub to yen-slurm. Virtual Environments will be your friend here.
Once your process is capable of running on the interactive Yen command line, you will need to create a Slurm script. This script has two major components:
- Metadata around your job, and the resources you are requesting
- The commands necessary to run your process
Here's an example of a submission Slurm script, my_submission_script.slurm:
#!/bin/bash
#SBATCH -J yahtzee
#SBATCH -o rollcount.csv
#SBATCH -c 1
#SBATCH -t 10:00
#SBATCH --mem=100G
python3 yahtzee.py 100000
The important arguments here are the resources you request:

- `#SBATCH -c` is the number of CPUs
- `#SBATCH -t` is the amount of time for your job
- `#SBATCH --mem` is the total amount of memory
Once your Slurm script is written, you can submit it to the server by running sbatch my_submission_script.slurm.
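The contents of yahtzee.py itself are not shown here. Purely as a hypothetical illustration, a script matching this interface (a roll-count argument, CSV-style output on stdout, which the `-o` directive redirects into rollcount.csv) might look like:

```python
import random
import sys

def roll_counts(n_rolls: int, seed: int = 0) -> dict:
    """Roll five dice n_rolls times and count how often all five match (a Yahtzee)."""
    rng = random.Random(seed)
    yahtzees = 0
    for _ in range(n_rolls):
        dice = [rng.randint(1, 6) for _ in range(5)]
        if len(set(dice)) == 1:
            yahtzees += 1
    return {"rolls": n_rolls, "yahtzees": yahtzees}

if __name__ == "__main__":
    # Slurm runs this as: python3 yahtzee.py 100000
    n = int(sys.argv[1]) if len(sys.argv) > 1 and sys.argv[1].isdigit() else 1000
    result = roll_counts(n)
    # stdout is captured by Slurm's -o directive, landing in rollcount.csv
    print(f"{result['rolls']},{result['yahtzees']}")
```

The function name, seed, and output format are all assumptions; the actual yahtzee.py used in the example is not part of this page.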
How Do I View My Job Status?
You can look at the current job queue by running squeue:
USER@yen4:~$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1043 normal a_job user1 PD 0:00 1 (Resources)
1042 normal job_2 user2 R 1:29:53 1 yen11
1041 normal bash user3 R 3:17:08 1 yen11
Jobs with state (ST) R are running, and PD are pending. Your job will run based on this queue.
Submit Your First Job To Run On Yen Slurm
Example Script

We start with a one-line "hello world" script in each supported language:

- `hello.R`, which can be run with `Rscript hello.R`
- `hello.py`, which can be run with `python hello.py`
- `hello.jl`, which can be run with `julia hello.jl`
- `hello.m` (MATLAB), `hello.sas` (SAS), and `hello.do` (Stata)

However, we will run the script via the Slurm scheduler on the yen-slurm cluster.
For each language there is a corresponding `hello.slurm` submission script that requests resources and invokes the interpreter for that language.
Then run it by submitting the job to the Slurm scheduler with:
sbatch hello.slurm
You should see a similar output:
Submitted batch job 44097
Monitor your job:
squeue
Best Practices
Using Python Virtual Environment In Slurm Scripts
We can also use a Python virtual environment created with venv, instead of the system's python3, when running scripts via Slurm.
For example, let's say you've created a virtual Python environment using the process described on this page that is located in your home directory at /zfs/home/users/SUNetID/venv/. You can modify your Slurm script to use this venv environment:
#!/bin/bash
# Example of running python script
#SBATCH -J my-job
#SBATCH -p normal,dev
#SBATCH -c 1 # CPU cores (up to 256 on normal partition)
#SBATCH -t 5:00
#SBATCH -o output-%j.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_email@stanford.edu
# Activate venv
source /zfs/home/users/SUNetID/venv/bin/activate
# Run python script
python myscript.py
In the above Slurm script, we first activate the venv environment and then execute the Python script using the python in the active environment. You can create your own venv environment and activate it within your Slurm script in the same manner.
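If you have not yet created the venv, a typical interactive sequence on the Yens looks like this (the environment path and package names are examples):

```shell
# Create a virtual environment under your home directory (path is an example)
python3 -m venv "$HOME/venv"

# Activate it and install whatever packages your job needs
source "$HOME/venv/bin/activate"
pip install numpy pandas

# Deactivate when done experimenting interactively
deactivate
```

Once the environment exists, the `source .../bin/activate` line in your Slurm script picks it up exactly as shown above.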
Use All Of The Resources You Request
The Slurm scheduler keeps track of both the resources you request and the resources you actually use. Frequent under-utilization of CPU and memory will lower your future job priority, so you should be confident that your job will use all of the resources you request. It is recommended to first run your job on the interactive Yens and monitor its resource usage, so you can make an educated estimate of what to request.
Restructure Your Job Into Small Tasks
Small jobs start faster than big jobs, and they likely finish faster too. If your job repeats the same process many times (e.g., OCRing many PDFs), it will benefit you to set it up as many small jobs. Check this page on Slurm job arrays for an example of how to set up this paradigm.
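A job array version of such a workload can be sketched as follows (the script name, array range, and time limit are placeholders):

```shell
#!/bin/bash
# Illustrative Slurm job array: one small task per input file
#SBATCH -J ocr-array
#SBATCH -p normal
#SBATCH -c 1
#SBATCH -t 30:00
#SBATCH -o ocr-%A_%a.out
#SBATCH --array=1-100

# Each array task receives its own SLURM_ARRAY_TASK_ID (1..100 here),
# which selects the one input file that task should process
python3 ocr_one_pdf.py "input_${SLURM_ARRAY_TASK_ID}.pdf"
```

Because each task asks for only 1 CPU and 30 minutes, the scheduler can slot tasks into small gaps in the cluster, rather than waiting for one large block of resources.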
Tips And Tricks
Current Partitions And Their Limits
Run sinfo -o "%P %D %N" in a terminal to see available partitions:
USER@yen4:~$ sinfo -o "%P %D %N"
PARTITION NODES NODELIST
normal* 7 yen[10-11,15-19]
dev 7 yen[10-11,15-19]
long 7 yen[10-11,15-19]
gpu 3 yen-gpu[1-3]
The first column, PARTITION, lists all available partitions. Partitions are logical subdivisions of the yen-slurm cluster. The * denotes the default partition.
The four partitions have the following limits:
| Partition | CPU Limit Per User | Memory Limit (MB) | Memory Limit (GB) | Time Limit (default) |
|---|---|---|---|---|
| normal | 512 | 3072000 | 3000 | 2 days (2 hours) |
| long | 50 | 3072000 | 3000 | 7 days (2 hours) |
| dev | 2 | 48000 | 46 | 2 hours (1 hour) |
| gpu | 64 | 256000 | 250 | 1 day (2 hours) |
The default unit for memory allocation in Slurm is Megabytes (MB). You can request memory using either of the following flags:
- `--mem=<size>`: requests the total amount of memory per node.
- `--mem-per-cpu=<size>`: requests the amount of memory per allocated CPU (Slurm multiplies this by the number of CPUs you request).
Note
Memory values must be whole integers; Slurm does not accept decimals. Keep in mind that 3000G is slightly less than 3 TB (3072 GB), so you can't request --mem=3T.
Valid units:
- `M` or `MB` = Megabytes (default)
- `G` or `GB` = Gigabytes (1G = 1024M)
- `T` or `TB` = Terabytes (1T = 1024G)
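For example, the following pairs of directives request the same 32 GB total for a 4-CPU job (the values are illustrative):

```shell
# Option 1: total memory for the job
#SBATCH -c 4
#SBATCH --mem=32G

# Option 2: the same request expressed per CPU (4 CPUs x 8 GB = 32 GB)
#SBATCH -c 4
#SBATCH --mem-per-cpu=8G
```

`--mem-per-cpu` is convenient when you scale the CPU count up or down, since the total memory request scales automatically with it.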
Memory and Core Matching
- If you request `--mem=3000G`, your job must run on yen10, the only node with that much RAM; it has 128 cores.
- Jobs with `-c 512` will be placed on the large-core nodes, but those have at most 1.5 TB of RAM, so they cannot handle 3 TB memory jobs.
- A request for either `--mem=3000G` or `-c 512` requires a full node, so the job will wait in the queue until that node is completely free, which can take a very long time depending on the other jobs in the queue at the moment.
You can see the node's memory (mem value) and cores (CPUTot value) with:
scontrol show nodes
You can submit to the dev partition by specifying:
#SBATCH --partition=dev
Or with a shorthand:
#SBATCH -p dev
If you don’t specify the partition in the submission script, the job is queued in the normal partition. To request a particular partition, for example, long, specify #SBATCH -p long in the Slurm submission script. You can specify more than one partition if the job can be run on multiple partitions (i.e. #SBATCH -p normal,dev).
To see more details about each of the partition limits, run:
sacctmgr show qos [partition]
The partition argument is optional and filters the output to that partition only.
The output table has columns such as MaxTRESPU, which lists the maximum number of CPUs a user can request; MaxJobsPU, which lists the maximum number of jobs that can be running for a user; and MaxSubmitPU, which lists the number of jobs a user can submit to the partition queue.
How Do I Check How Busy Yen Slurm Is?
You can pass format options to the sinfo command as follows:
USER@yen4:~$ sinfo --format="%m | %C"
MEMORY | CPUS(A/I/O/T)
257366+ | 1040/1072/0/2112
where MEMORY reports the minimum memory of a yen-slurm cluster node in megabytes (256 GB here) and CPUS(A/I/O/T) prints the number of CPUs that are allocated / idle / other / total.
For example, 1040/1072/0/2112 means 1,040 CPUs are allocated and 1,072 are idle (free) out of 2,112 CPUs total.
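The four numbers in that field always satisfy allocated + idle + other = total, which makes a small parser a handy sanity check (the function name is our own, not part of Slurm):

```python
def parse_cpu_states(field: str) -> dict:
    """Parse Slurm's CPUS(A/I/O/T) field, e.g. '1040/1072/0/2112'."""
    alloc, idle, other, total = (int(x) for x in field.split("/"))
    # Slurm's bookkeeping guarantees the four states partition the CPU count
    assert alloc + idle + other == total, "A + I + O should equal T"
    return {"allocated": alloc, "idle": idle, "other": other, "total": total}

states = parse_cpu_states("1040/1072/0/2112")
print(states["allocated"])  # 1040
```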
You can also run checkyens and look at the last line for a summary of all pending and running jobs on yen-slurm.
USER@yen4:~$ checkyens
Enter checkyens to get the current server resource loads. Updated every minute.
yen1 : 2 Users | CPU [####### 36%] | Memory [#### 22%] | updated 2024-10-22-00:16:01
yen2 : 3 Users | CPU [ 1%] | Memory [######## 42%] | updated 2024-10-22-00:16:00
yen3 : 6 Users | CPU [ 1%] | Memory [### 15%] | updated 2024-10-22-00:16:00
yen4 : 2 Users | CPU [ 0%] | Memory [### 16%] | updated 2024-10-22-00:16:00
yen5 : 0 Users | CPU [#### 21%] | Memory [#### 21%] | updated 2024-10-22-00:16:04
yen-slurm : 386 jobs, 7 pending | 834 CPUs allocated (53%) | 5142G Memory Allocated (50%) | updated 2024-10-22-00:16:02
When Will My Job Start?
You can ask the scheduler using squeue --start, and look at the START_TIME column.
USER@yen4:~$ squeue --start
JOBID PARTITION NAME USER ST START_TIME NODES SCHEDNODES NODELIST(REASON)
112 normal yahtzeem astorer PD 2020-03-05T14:17:40 1 yen11 (Resources)
113 normal yahtzeem astorer PD 2020-03-05T14:27:00 1 yen11 (Priority)
114 normal yahtzeem astorer PD 2020-03-05T14:37:00 1 yen11 (Priority)
115 normal yahtzeem astorer PD 2020-03-05T14:47:00 1 yen11 (Priority)
116 normal yahtzeem astorer PD 2020-03-05T14:57:00 1 yen11 (Priority)
117 normal yahtzeem astorer PD 2020-03-05T15:07:00 1 yen11 (Priority)
Note
The start times in the squeue --start output tend to be conservative, since they assume every job in the queue runs for its full requested duration. Thus, it is likely that the job you just submitted will start sooner than the predicted time.
How Do I Cancel My Job On Yen Slurm?
The scancel JOBID command will cancel your job. You can find the unique numeric JOBID of your job with squeue. You can also cancel all of your running and pending jobs with scancel -u USERNAME where USERNAME is your username.
How Do I Constrain My Job To Specific Nodes?
Certain nodes may have particular features that your job requires, such as a GPU. These features can be viewed as follows:
USER@yen4:~$ sinfo -o "%20N %5c %5m %64f %10G"
NODELIST CPUS MEMOR AVAIL_FEATURES GRES
yen[10-11,15-19] 128+ 10315 (null) (null)
yen-gpu1 64 25736 GPU_BRAND:NVIDIA,GPU_UARCH:AMPERE,GPU_MODEL:A30,GPU_MEMORY:24GiB gpu:4
yen-gpu[2-3] 64 25736 GPU_BRAND:NVIDIA,GPU_UARCH:AMPERE,GPU_MODEL:A40,GPU_MEMORY:48GiB gpu:4
For example, to ensure that your job will run on a node that has an NVIDIA Ampere A30 GPU, you can include the -C/--constraint option to the sbatch command or in an sbatch script. Here is a trivial example command that demonstrates this:
sbatch -C "GPU_MODEL:A30" -G 1 -p gpu --wrap "nvidia-smi"
At present, only GPU-specific features exist, but additional node features may be added over time.
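Putting the pieces together, a batch script that pins a job to the A40 nodes might look like this (the job name, time limit, and command are illustrative):

```shell
#!/bin/bash
# Illustrative GPU job constrained to the A40 nodes (yen-gpu2 and yen-gpu3)
#SBATCH -J gpu-check
#SBATCH -p gpu
#SBATCH -G 1
#SBATCH -C "GPU_MODEL:A40"
#SBATCH -t 10:00
#SBATCH -o gpu-check-%j.out

# Print the GPU(s) visible to this job
nvidia-smi
```

Here `-G 1` requests one GPU, and the `-C` constraint restricts scheduling to nodes whose AVAIL_FEATURES include `GPU_MODEL:A40`.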