12. Monitoring Usage
Monitoring Your Resource Footprint
Certain parts of the GSB research computing infrastructure provide isolated cloud resources (like CloudForest, where there is generally only one user per system) or are environments already managed by a scheduler (like Sherlock). In these cases it is not necessary for individuals to monitor resource usage themselves.
However, when working on systems like the yens where resources like CPU, RAM, and disk space are shared among many researchers, it is important that all users be mindful of how their work impacts the larger community.
Use the htop command to monitor CPU/RAM, and the gsbquota command to monitor disk quota.
CPU & RAM
Per our Community Guidelines, CPU usage should always be limited to 12 CPU cores/threads per user at any one time on yen1-4, and to 48 CPU cores on yen5. Some software (R and RStudio, for example) defaults to claiming all available cores unless told to do otherwise. These defaults should always be overridden when running R code on the yens. Similarly, when working with multiprocessing code in languages like Python, take care to ensure your code does not claim every core it sees. Please refer to our parallel processing Topic Guides for information about how to limit resource consumption when using common packages.
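For instance, here is a minimal sketch of explicitly capping thread counts at the top of an R session; it assumes the data.table and RhpcBLASctl packages are installed, which may not be true in every environment:
# A minimal sketch: explicitly cap implicit parallelism instead of relying on package defaults.
# Assumes data.table and RhpcBLASctl are installed (not guaranteed on every system).
library(data.table)
library(RhpcBLASctl)
n_cores <- 4                   # stays well under the 12-core limit on yen1-4
setDTthreads(n_cores)          # data.table otherwise uses multiple threads by default
blas_set_num_threads(n_cores)  # cap multithreaded BLAS routines
omp_set_num_threads(n_cores)   # cap OpenMP-based code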
One easy method of getting a quick snapshot of your CPU and memory usage is the htop command line tool. Running htop shows usage graphs and a process list that is sortable by user, top CPU, top RAM, and other metrics. Please use this tool liberally to monitor your resource usage, especially if you are running multiprocessing code on shared systems for the first time.
The htop console looks like this:
The userload command will list the total amount of resources all of your tasks are consuming.
$ userload
Disk
Unlike personal home directories, which have a 50 GB quota, faculty project directories on the yens/ZFS are currently uncapped. Disk storage is a finite resource, however, so to allow us to continue to provide uncapped project space, please always be aware of your disk footprint. This includes compressing files when you are able and removing intermediate and/or temp files whenever possible. See the yen file storage page for more information about file storage options.
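For example, here is a minimal sketch of writing results in compressed form from R rather than as plain text; the object and file names are hypothetical:
# A minimal sketch: save results in compressed formats to reduce your disk footprint.
# The object and file names below are hypothetical.
results <- data.frame(x = rnorm(1000), y = rnorm(1000))
saveRDS(results, "results.rds", compress = "xz")                 # compressed binary R object
write.csv(results, gzfile("results.csv.gz"), row.names = FALSE)  # gzip-compressed CSV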
Disk quotas on all yen servers can be reviewed with the gsbquota command. It produces output like this:
nrapstin@yen1:~$ gsbquota
/home/users/nrapstin: currently using 39% (20G) of 50G available
You can also check the size of your project space by passing its full path to the gsbquota command:
nrapstin@yen1:~$ gsbquota /zfs/projects/students/<my-project-dir>/
/zfs/projects/students/<my-project-dir>/: currently using 39% (78G) of 200G available
Example
We are going to continue with the same R example, experimenting with running it on multiple cores while monitoring our resource consumption.
# Run bootstrap computations on swiss data set
# Plot histogram of R^2 values and compute C.I. for R^2
# Modified: 2021-09-01
library(foreach)
library(doParallel)
library(datasets)
options(warn=-1)
# set the number of cores here
ncore <- 1
print(paste('running on', ncore, 'cores'))
# register parallel backend to limit threads to the value specified in ncore variable
registerDoParallel(ncore)
# Swiss data: Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.
# head(swiss)
# Fertility Agriculture Examination Education Catholic Infant.Mortality
#Courtelary 80.2 17.0 15 12 9.96 22.2
#Delemont 83.1 45.1 6 9 84.84 22.2
#Franches-Mnt 92.5 39.7 5 5 93.40 20.2
#Moutier 85.8 36.5 12 7 33.77 20.3
#Neuveville 76.9 43.5 17 15 5.16 20.6
#Porrentruy 76.1 35.3 9 7 90.57 26.6
# dim(swiss)
# [1] 47 6
# number of bootstrap computations
trials <- 50000
# time the for loop
system.time({
  boot <- foreach(icount(trials), .combine = rbind) %dopar% {
    # resample with replacement for one bootstrap computation
    ind <- sample(x = 47, size = 10, replace = TRUE)
    # build a linear model
    fit <- lm(swiss[ind, "Fertility"] ~ data.matrix(swiss[ind, 2:6]))
    summary(fit)$r.square
  }
})
# Plot histogram of R^2 values from bootstrap
hist(boot[, 1], xlab="r squared", main="Histogram of r squared")
# Compute 90% Confidence Interval for R^2
print('90% C.I. for R^2:')
quantile(boot[, 1], c(0.05,0.95))
To monitor the resource usage while running a program, we will need a second terminal window that is connected to the same yen server.
Check what yen you are connected to in the first terminal:
$ hostname
Then ssh to the same yen in the second terminal window. For example, if I am on yen1, I would open a new terminal window and ssh to the yen1 server so I can monitor my resources when I start running the R program on yen1.
$ ssh yen1.stanford.edu
Once you have two terminal windows connected to the same yen, run the swiss-parallel-bootstrap.R program after loading the R module in one of the terminals:
$ ml R/4.2.1
$ Rscript swiss-parallel-bootstrap.R
Once the program is running, monitor your usage with the htop command in the second window:
$ htop -u <SUNetID>
where the -u flag filters the running processes to your user.
While the program is running you should see only one R process running because we specified 1 core in our R program.
Let’s modify the number of cores to 8:
# Run bootstrap computations on swiss data set
# Plot histogram of R^2 values and compute C.I. for R^2
# Modified: 2021-09-01
library(foreach)
library(doParallel)
library(datasets)
options(warn=-1)
# set the number of cores here
ncore <- 8
print(paste('running on', ncore, 'cores'))
# register parallel backend to limit threads to the value specified in ncore variable
registerDoParallel(ncore)
# Swiss data: Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.
# head(swiss)
# Fertility Agriculture Examination Education Catholic Infant.Mortality
#Courtelary 80.2 17.0 15 12 9.96 22.2
#Delemont 83.1 45.1 6 9 84.84 22.2
#Franches-Mnt 92.5 39.7 5 5 93.40 20.2
#Moutier 85.8 36.5 12 7 33.77 20.3
#Neuveville 76.9 43.5 17 15 5.16 20.6
#Porrentruy 76.1 35.3 9 7 90.57 26.6
# dim(swiss)
# [1] 47 6
# number of bootstrap computations
trials <- 50000
# time the for loop
system.time({
  boot <- foreach(icount(trials), .combine = rbind) %dopar% {
    # resample with replacement for one bootstrap computation
    ind <- sample(x = 47, size = 10, replace = TRUE)
    # build a linear model
    fit <- lm(swiss[ind, "Fertility"] ~ data.matrix(swiss[ind, 2:6]))
    summary(fit)$r.square
  }
})
# Plot histogram of R^2 values from bootstrap
hist(boot[, 1], xlab="r squared", main="Histogram of r squared")
# Compute 90% Confidence Interval for R^2
print('90% C.I. for R^2:')
quantile(boot[, 1], c(0.05,0.95))
Then rerun:
$ Rscript swiss-parallel-bootstrap.R
You should see:
Loading required package: iterators
Loading required package: parallel
[1] "running on 8 cores"
user system elapsed
50.551 0.517 10.142
[1] "90% C.I. for R^2:"
5% 95%
0.6593025 0.9892563
While the program is running (it will finish faster since we are using 8 cores instead of 1), you should see 8 R processes in the htop output because we specified 8 cores in our R program. Note that in the system.time() output, the user time (CPU time accumulated across all of the worker processes) is much larger than the elapsed wall-clock time, which is exactly what you expect when the work is spread over multiple cores.
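If you want to confirm from within R how many workers are registered, a quick check (assuming the parallel backend has already been registered with registerDoParallel() as in the script above) is:
# Optional sanity check: how many parallel workers does foreach see?
library(foreach)
getDoParWorkers()  # should return 8 after registerDoParallel(8)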
The last modification we are going to make is to pass the number of cores as a command-line argument to our R script.
Save the following to a new script called swiss-par-command-line-args.R.
#!/usr/bin/env Rscript
############################################
# This script accepts a user specified argument to set the number of cores to run on
# Run from the command line:
#
# Rscript swiss-par-command-line-args.R 4
#
# this will execute on 4 cores
###########################################
# accept command line arguments and save them in a list called args
args = commandArgs(trailingOnly=TRUE)
library(foreach)
library(doParallel)
library(datasets)
options(warn=-1)
# set the number of cores here from the command line. Avoid using detectCores() function.
ncore <- as.integer(args[1])
print(paste('running on', ncore, 'cores'))
# register parallel backend to limit threads to the value specified in ncore variable
registerDoParallel(ncore)
# Swiss data: Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.
# head(swiss)
# Fertility Agriculture Examination Education Catholic Infant.Mortality
#Courtelary 80.2 17.0 15 12 9.96 22.2
#Delemont 83.1 45.1 6 9 84.84 22.2
#Franches-Mnt 92.5 39.7 5 5 93.40 20.2
#Moutier 85.8 36.5 12 7 33.77 20.3
#Neuveville 76.9 43.5 17 15 5.16 20.6
#Porrentruy 76.1 35.3 9 7 90.57 26.6
# dim(swiss)
# [1] 47 6
# number of bootstrap computations
trials <- 50000
# time the for loop
system.time({
  boot <- foreach(icount(trials), .combine = rbind) %dopar% {
    # resample with replacement for one bootstrap computation
    ind <- sample(x = 47, size = 10, replace = TRUE)
    # build a linear model
    fit <- lm(swiss[ind, "Fertility"] ~ data.matrix(swiss[ind, 2:6]))
    summary(fit)$r.square
  }
})
# Plot histogram of R^2 values from bootstrap
hist(boot[, 1], xlab="r squared", main="Histogram of r squared")
# Compute 90% Confidence Interval for R^2
print('90% C.I. for R^2:')
quantile(boot[, 1], c(0.05,0.95))
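As written, the script assumes the first command line argument is present and is a valid integer. If you would like it to fall back to a single core when no argument is given, one possible guard (a sketch, not part of the original script) is to replace the ncore line with:
# Hypothetical guard: default to 1 core when no command line argument is supplied
ncore <- if (length(args) >= 1) as.integer(args[1]) else 1L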
Now we can run this script with a varying number of cores. We will still limit ourselves to 12 cores on yen1-4 and to 48 cores on yen5, per the Community Guidelines.
For example, to run with 4 cores:
$ Rscript swiss-par-command-line-args.R 4
You should see:
Loading required package: iterators
Loading required package: parallel
[1] "running on 4 cores"
user system elapsed
49.547 0.375 16.040
[1] "90% C.I. for R^2:"
5% 95%
0.6574049 0.9891781
Monitor your CPU usage in the other terminal window with htop while the program is running (try userload as well).