Scheduler

The yen-slurm is a new computing cluster offered by the Stanford Graduate School of Business. It is designed to give researchers the ability to run computations that require a large amount of resources without leaving the environment and filesystem of the interactive Yens.

Current yen-slurm cluster configuration

The yen-slurm cluster has 5 nodes with 480 total number of available CPU cores and 8.5 TB of memory.

What is a scheduler?

The yen-slurm cluster can be accessed by the Slurm Workload Manager. Researchers can submit jobs to the cluster, asking for a certain amount of resources (CPU, Memory, and Time). Slurm will then manage the queue of jobs based on what resources are available. In general, those who request less resources will see their jobs start faster than jobs requesting more resources.

Why use a scheduler?

A job scheduler has many advantages over the directly shared environment of the yens:

  • Run jobs with a guaranteed amount of resources (CPU, Memory, Time)
  • Setup multiple jobs to run automatically
  • Run jobs that exceed the community guidelines on the interactive nodes
  • Gold standard for using high-performance computing resources around the world

How do I use the scheduler?

First, you should make sure your process can run on the interactive Yen command line. We’ve written a guide on migrating a process from JupyterHub to yen-slurm. Virtual Environments will be your friend here.

Once your process is capable of running on the interactive Yen command line, you will need to create an slurm script. This script has two major components:

  • Metadata around your job, and the resources you are requesting
  • The commands necessary to run your process

Here’s an example of a submission slurm script, my_submission_script.slurm:

#!/bin/bash

#SBATCH -J yahtzee
#SBATCH -o rollcount.csv
#SBATCH -c 1
#SBATCH -t 10:00
#SBATCH --mem=100G

python3 yahtzee.py 100000

The important arguments here are that you request:

  • SBATCH -c is the number of CPUs
  • SBATCH -t is the amount of time for your job
  • SBATCH --mem is the amount of total memory

Once your slurm script is written, you can submit it to the server by running sbatch my_submission_script.slurm.

OK - my job is submitted - now what?

You can look at the current job queue by running squeue:

USER@yen4:~$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
              1043    normal    a_job    user1 PD       0:00      1 (Resources)
              1042    normal    job_2    user2  R    1:29:53      1 yen10
              1041    normal     bash    user3  R    3:17:08      1 yen10

Jobs with state (ST) R are running, and PD are pending. Your job will run based on this queue.

Best Practices

Use all of the resources you request

The Slurm Scheduler keeps track of the resources you request, and the resources you use. Frequent under-utilization of CPU and Memory will affect your future job priority. You should be confident that your job will use all of the resources you request. It’s recommended that you run your job on the interactive Yens, and monitor resource usage to make an educated guess on resource usage.

Restructure your job into small tasks

Small jobs start faster than big jobs. Small jobs likely finish faster too. If your job requires doing the same process many times (i.e. OCR’ing many PDFs), it will benefit you to setup your job as many small jobs.

Tips and Tricks

Current Partitions and their limits

Run sinfo command to see available partitions:

$ sinfo

You should see the following output:

USER@yen4:~$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
normal*      up 2-00:00:00      2   idle yen[10-14]
dev          up    2:00:00      2   idle yen[10-14]
long         up 7-00:00:00      2   idle yen[10-14]

The first column PARTITION lists all available partitions. Partitions are the logical subdivision of the yen-slurm cluster. The * denotes the default partition.

The three partitions have the following limits:

Partition CPU Limit Per User Memory Limit Max Memory Per CPU (default) Time Limit (default)
normal 256 3 TB 24 GB (4 GB) 2 days (2 hours)
dev 2 48 GB 24 GB (4 GB) 2 hours (1 hour)
long 32 768 GB 24 GB (4 GB) 7 days (2 hours)

You can submit to the dev partition by specifying:

#SBATCH --partition=dev

Or with a shorthand:

#SBATCH -p dev

If you don’t specify the partition in the submission script, the job is queued in the normal partition. To request a particular partition, for example, long, specify #SBATCH -p long in the slurm submission script. You can specify more than one partition if the job can be run on multiple partitions (i.e. #SBATCH -p normal,dev).

How do I check how busy the machines are?

You can pass format options to the sinfo command as follows:

USER@yen4:~$ sinfo --format="%m | %C"
MEMORY | CPUS(A/I/O/T)
1031693+ | 0/480/0/480

where MEMORY outputs the minimum size of memory of the yen-slurm cluster node in megabytes (1 TB) and CPUS(A/I/O/T) prints the number of CPU’s that are allocated / idle / other / total. For example, if you see 86/394/0/480 that means 86 CPU’s are allocated, 394 are idle (free) out of 480 CPU’s total.

You can also run checkyens and look at the last line for summary of all pending and running jobs on yen-slurm.

USER@yen4:~$ checkyens
Enter checkyens to get the current server resource loads. Updated every minute.
yen1 :  5 Users | CPU [                     3%] | Memory [#                    8%] | updated 2023-01-13-09:44:01
yen2 :  2 Users | CPU [                     0%] | Memory [                     0%] | updated 2023-01-13-09:44:02
yen3 :  2 Users | CPU [                     0%] | Memory [##                  14%] | updated 2023-01-13-09:44:01
yen4 :  3 Users | CPU [#####               29%] | Memory [###                 15%] | updated 2023-01-13-09:44:01
yen5 :  1 Users | CPU [##                  10%] | Memory [######              31%] | updated 2023-01-13-09:44:01
yen-slurm : 12 jobs, 0 pending | 86 CPUs allocated (21%) | 300G Memory Allocated (5%) | updated 2023-01-13-09:44:02

When will my job start?

You can ask the scheduler using squeue --start, and look at the START_TIME column.

USER@yen4:~$ squeue --start

JOBID PARTITION     NAME     USER ST          START_TIME  NODES SCHEDNODES           NODELIST(REASON)
112    normal yahtzeem  astorer PD 2020-03-05T14:17:40      1 yen10                (Resources)
113    normal yahtzeem  astorer PD 2020-03-05T14:27:00      1 yen10                (Priority)
114    normal yahtzeem  astorer PD 2020-03-05T14:37:00      1 yen10                (Priority)
115    normal yahtzeem  astorer PD 2020-03-05T14:47:00      1 yen10                (Priority)
116    normal yahtzeem  astorer PD 2020-03-05T14:57:00      1 yen10                (Priority)
117    normal yahtzeem  astorer PD 2020-03-05T15:07:00      1 yen10                (Priority)

How do I cancel my job on Yen-Slurm?

The scancel JOBID command will cancel your job. You can find the unique numeric JOBID of your job with squeue. You can also cancel all of your running and pending jobs with scancel -u USERNAME where USERNAME is your username.