2. Overview of the Yen Computing Infrastructure
When you access the yen cluster, you are directed to your home directory on one of the interactive yen nodes (yen1, yen2, yen3, yen4, or yen5). From there, you can manage your files, prepare them for submission, submit a job that will execute on the yen-slurm cluster, view the status of pending jobs, and so on.
Once you submit your job, it is queued along with the jobs submitted by other yen users. A scheduler automatically forms the queue and allocates common resources such as cores and memory on a fair-share basis. When the resources your job requires (CPU cores and/or memory) become available, the batch scheduler executes it. Our job scheduler is Slurm, which is also used by Sherlock HPC and many other supercomputers at academic institutions and national labs.
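As a rough illustration of the submit-and-wait workflow described above, the Python sketch below assembles a minimal Slurm batch script as text. The job name, resource values, and the `my_script.py` file are hypothetical placeholders. The sketch uses only standard `sbatch` options (`--job-name`, `--cpus-per-task`, `--mem`, `--time`, `--output`); the settings appropriate for yen-slurm may differ.

```python
def make_batch_script(job_name, cores, mem_gb, time_limit, command):
    """Return the text of a minimal Slurm batch script.

    All argument values are illustrative; nothing here is specific to yen-slurm.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cores}",     # CPU cores requested
        f"#SBATCH --mem={mem_gb}G",             # memory requested, in GB
        f"#SBATCH --time={time_limit}",         # wall-clock limit, HH:MM:SS
        f"#SBATCH --output={job_name}-%j.out",  # %j expands to the job ID
        "",
        command,                                # the work the job performs
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: a one-core, 4 GB, ten-minute job.
script = make_batch_script("hello", 1, 4, "00:10:00", "python3 my_script.py")
print(script)
```

Saved to a file such as `hello.slurm` (a hypothetical name), the script would be queued with `sbatch hello.slurm` from an interactive yen node, and `squeue -u $USER` would list your pending and running jobs.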
Yen-Slurm has seven nodes: yen10, yen11, yen12, yen13, yen14, yen15, and yen-gpu1, each with multiple CPUs (processors) containing multiple cores. Schematically, we can draw the yen10 node as:
where yen10 has 3 TB of RAM and 4 CPUs, each with 32 cores, for a total of 128 cores. Similarly, the yen11 and yen15 nodes each have 2 CPUs with 128 cores apiece, for a total of 256 CPU cores and 1 TB of RAM per node; the yen12, yen13, and yen14 nodes each have 32 cores and 1.5 TB of RAM; and the yen-gpu1 node has 64 cores and 256 GB of RAM. Together, the seven nodes of the yen-slurm cluster have 800 CPU cores and over 9.5 TB of RAM.
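The cluster totals can be checked by summing the per-node figures quoted above. The short Python sketch below does that arithmetic; the dictionary layout is only an illustration, and the 256 GB of yen-gpu1 is converted to 0.25 TB on the assumption that 1 TB = 1024 GB.

```python
# Per-node figures as stated in the text: (CPU cores, RAM in TB).
nodes = {
    "yen10": (128, 3.0),
    "yen11": (256, 1.0),
    "yen12": (32, 1.5),
    "yen13": (32, 1.5),
    "yen14": (32, 1.5),
    "yen15": (256, 1.0),
    "yen-gpu1": (64, 0.25),  # 256 GB, assuming 1 TB = 1024 GB
}

total_cores = sum(cores for cores, _ in nodes.values())
total_ram_tb = sum(ram for _, ram in nodes.values())

print(total_cores)   # 800
print(total_ram_tb)  # 9.75
```

The sum gives 800 cores and 9.75 TB, consistent with the "over 9.5 TB" stated above.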