9. Run Interactive Jobs

Running R software without a graphical interface

R itself is installed system-wide, but each user manages their own R packages, which by default live in the user's home directory. Each R version also has its own library, separate from the other versions. For example, the packages you install under R 3.6 sit side by side with the packages you install under R 4.0. When you move to a newer R version, you will need to install your packages again for that version. Once a package is installed, however, you can use it in your scripts without reinstalling it every time you log in. On the yens, system admins install newer R versions for the users; let DARC know if you need a newer version of software than is currently available on the system.
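To see which library a given R version will use, you can inspect the library search path from inside R. The following is a minimal sketch using only base R; the exact paths printed depend on the system and on the R module that is loaded.

```r
# Print the running R version, e.g. 4.2.1
cat("R version:", paste(R.version$major, R.version$minor, sep = "."), "\n")

# Library search path for this session: a personal library (if it exists)
# is listed before the system library
lib_paths <- .libPaths()
print(lib_paths)

# Location R would use for the personal library of this R version
# (the directory may not exist until the first package is installed)
user_lib <- Sys.getenv("R_LIBS_USER")
cat("personal library:", user_lib, "\n")
```

Running this under two different R modules should show a different personal library path for each version.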

Installing R packages

Load the R module with the version that you want (R 3.6.1 is the current default).

For example, let’s use the newest R version available on the yens:

$ ml R/4.2.1

Start an interactive R console by typing R.

You should see:

R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


Let’s install two multiprocessing packages on the yens that we will use for the R example.

> install.packages('foreach')

If this is your first time installing an R package for this R version on the Yens, you will be asked to create a personal library (because users do not have write permission to the system R library):

Warning in install.packages("foreach") :
  'lib = "/software/free/R/R-4.2.1/lib/R/library"' is not writable
Would you like to use a personal library instead? (yes/No/cancel) yes

Would you like to create a personal library
to install packages into? (yes/No/cancel) yes

Answer yes to both questions to create a personal library in your home directory. The library path is ~/R/x86_64-pc-linux-gnu-library/4.2, where all of your packages for this R version will be installed. Once the library is created, subsequent packages are installed there automatically.

Pick any mirror in the US.

Installing package into /home/users/nrapstin/R/x86_64-pc-linux-gnu-library/4.2
(as lib is unspecified)
--- Please select a CRAN mirror for use in this session ---
Secure CRAN mirrors

 1: 0-Cloud [https]
 2: Australia (Canberra) [https]
74: USA (IA) [https]
75: USA (MI) [https]
76: USA (OH) [https]
77: USA (OR) [https]
78: USA (TN) [https]
79: USA (TX 1) [https]
80: Uruguay [https]
81: (other mirrors)

Selection: 77
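The prompts above (personal library, mirror selection) can also be avoided by passing the library and the mirror explicitly. The sketch below is one possible way to do this, not the required procedure on the yens. The helper name install_missing is hypothetical, and it assumes that the R_LIBS_USER environment variable points at your personal library and that the 0-Cloud mirror (https://cloud.r-project.org) is acceptable.

```r
# Hypothetical helper: install only the packages that are not yet available,
# without any interactive prompts.
install_missing <- function(pkgs,
                            lib = path.expand(Sys.getenv("R_LIBS_USER")),
                            repos = "https://cloud.r-project.org") {
  # Create the personal library if it does not exist yet
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  # Put the personal library first on the search path
  .libPaths(c(lib, .libPaths()))

  # Find the packages that cannot be loaded from any library
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!is_installed]

  if (length(missing_pkgs) > 0) {
    install.packages(missing_pkgs, lib = lib, repos = repos)
  }
  # Return the names of the packages that had to be installed
  invisible(missing_pkgs)
}

# Example call (downloads from CRAN if the packages are missing):
# install_missing(c("foreach", "doParallel"))
```

Because the helper skips packages that are already installed, it is safe to call it more than once.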

If the package is successfully installed, you should see:

* DONE (foreach)

The downloaded source packages are in

Then install the doParallel package:

> install.packages("doParallel")

When the package is done installing, you will see:

* DONE (doParallel)

The downloaded source packages are in

Run R code interactively

Once we have loaded the R module, launched R and installed the R packages that we are going to use, we are ready to run our code. You can run R code interactively, line by line, by copying and pasting it into the R console. For example,

> print('Hello!')
[1] "Hello!"

The advantage of the interactive console is that results are printed to the screen immediately, which makes it very useful when you are developing or debugging code. The disadvantage is that the session is not saved: if you close the terminal window or lose your connection, you will need to reload the R module, restart R and paste all of the commands again. So, use this method when you need an interactive development or debugging environment. Another disadvantage is that if you did not log in with a graphical interface (X11 forwarding), you will not be able to display plots from the interactive console. If you need plots and graphs, either use the -Y flag when connecting to the Yens or use JupyterHub.
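Even without X11 forwarding, a plot can still be written to a file and viewed later (for example, after copying it to your local machine). The following is a minimal sketch using the base R pdf() device and the built-in swiss data set; the output file name is a placeholder chosen for illustration.

```r
# Write the plot to a PDF file instead of opening a window on screen.
# The file name is hypothetical; tempdir() is used here so the example
# does not clutter your working directory.
out_file <- file.path(tempdir(), "fertility-hist.pdf")

pdf(out_file, width = 6, height = 4)   # open the file-based graphics device
hist(swiss$Fertility, xlab = "Fertility", main = "Histogram of Fertility")
invisible(dev.off())                   # close the device so the file is written

cat("plot written to", out_file, "\n")
```

The same pattern works in a script run from the command line.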

We can then quit R without saving the workspace image:

> q()
Save workspace image? [y/n/c]: n

Run R code on the command line

If you want to simply run the script, you can do so from the command line.

We are going to run the same code that we ran on our local machine, swiss-parallel-bootstrap.R.

Let’s update the script for the yens. Edit the script on JupyterHub in the Text Editor. Instead of using the detectCores() function, we will hard code the number of cores for the script to use in this line of the R script:

ncore <- 1 

Thus, the swiss-parallel-bootstrap.R script on the yens should look like:

# Run bootstrap computations on swiss data set
# Plot histogram of R^2 values and compute C.I. for R^2
# Modified: 2021-09-01
library(foreach)
library(doParallel)

# set the number of cores here
ncore <- 1

print(paste('running on', ncore, 'cores'))

# register parallel backend to limit threads to the value specified in ncore variable
registerDoParallel(cores = ncore)

# Swiss data: Standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.
# head(swiss)
#             Fertility Agriculture Examination Education Catholic Infant.Mortality
#Courtelary        80.2        17.0          15        12     9.96             22.2
#Delemont          83.1        45.1           6         9    84.84             22.2
#Franches-Mnt      92.5        39.7           5         5    93.40             20.2
#Moutier           85.8        36.5          12         7    33.77             20.3
#Neuveville        76.9        43.5          17        15     5.16             20.6
#Porrentruy        76.1        35.3           9         7    90.57             26.6

# dim(swiss)
# [1] 47  6

# number of bootstrap computations
trials <- 50000

# time the for loop
system.time({
    boot <- foreach(icount(trials), .combine=rbind) %dopar% {

        # resample with replacement for one bootstrap computation
        ind <- sample(x = 47, size = 10, replace = TRUE)

        # build a linear model
        fit <- lm(swiss[ind, "Fertility"] ~ data.matrix(swiss[ind, 2:6]))

        # return the R^2 value of this fit
        summary(fit)$r.squared
    }
})

# Plot histogram of R^2 values from bootstrap
hist(boot[, 1], xlab="r squared", main="Histogram of r squared")

# Compute 90% Confidence Interval for R^2
print('90% C.I. for R^2:')
quantile(boot[, 1], c(0.05,0.95))
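Before running all 50000 trials, it can help to check the bootstrap logic on a small number of trials. The following is a minimal sequential sketch that uses only base R (no foreach or doParallel); the seed and the reduced number of trials are arbitrary choices made for illustration, so the interval it prints will differ from the one produced by the full script.

```r
# Small sequential version of the bootstrap, for a quick sanity check
set.seed(1)      # arbitrary seed so the check is reproducible
trials <- 200    # far fewer trials than the full script

r2 <- vapply(seq_len(trials), function(i) {
  # resample 10 of the 47 provinces with replacement
  ind <- sample(x = nrow(swiss), size = 10, replace = TRUE)

  # fit the same linear model as in the parallel script
  fit <- lm(swiss[ind, "Fertility"] ~ data.matrix(swiss[ind, 2:6]))

  # keep only the R^2 value
  summary(fit)$r.squared
}, numeric(1))

# 90% interval from the 5% and 95% quantiles of the bootstrap R^2 values
ci <- quantile(r2, c(0.05, 0.95))
print(ci)
```

If this small version runs cleanly, the parallel script differs only in how the loop is executed.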

After loading the R module, we can run this script with the Rscript command on the command line:

$ Rscript swiss-parallel-bootstrap.R
Loading required package: iterators
Loading required package: parallel
[1] "running on 1 cores"
   user  system elapsed
 47.231   0.038  47.280
[1] "90% C.I. for R^2:"
       5%       95%
0.6558154 0.9894657

Again, the script keeps running only as long as the session is active (the terminal stays open and you do not lose your connection).
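As an alternative to editing the hard-coded ncore value each time, the number of cores could be read from the command line. This is only a sketch of one option and is not part of the original script; the helper name parse_ncore is hypothetical, and it relies on the base R function commandArgs().

```r
# Hypothetical helper: take the number of cores from the first command-line
# argument, falling back to a default of 1 when no argument is given.
parse_ncore <- function(args = commandArgs(trailingOnly = TRUE), default = 1L) {
  if (length(args) == 0) {
    return(default)
  }
  n <- suppressWarnings(as.integer(args[1]))
  if (is.na(n) || n < 1L) {
    stop("number of cores must be a positive integer")
  }
  n
}

ncore <- parse_ncore()
print(paste('running on', ncore, 'cores'))
```

With this in place of the ncore <- 1 line, the script could be started as Rscript swiss-parallel-bootstrap.R 4 to request four cores, while Rscript swiss-parallel-bootstrap.R would still run on one core.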