R

R is a versatile, open-source software environment for statistical computing and graphics. It runs on a wide range of UNIX platforms, Windows, and macOS.

Usage

To list all available R versions, use the module spider command:

$ module spider R
---------------------------------------------------------------
  R:
---------------------------------------------------------------
Versions:
        R/3.6.2-foss-2019b
        R/4.0.0-foss-2020a
        R/4.0.4-foss-2020b
        R/4.2.1-foss-2022a
---------------------------------------------------------------
For detailed information about a specific "R" module 
(including how to load the modules) use the module's full name.
For example:

$  module spider R/4.2.1-foss-2022a
----------------------------------------------------------------

After selecting the most suitable version, load the corresponding module into your environment:

$ module load R/4.2.1-foss-2022a

The command R --version returns the version of R you have loaded:

$ R --version
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.
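
The version check above can also be done inside a script, for example to stop early when the wrong module is loaded. The following is a minimal sketch: the helper name r_version_from_banner and the expected value are illustrative assumptions, and it relies only on the banner format shown above.

```shell
#!/bin/bash
# Extract the version number from the first line of `R --version` output.
# The banner line has the form:
#   R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
# r_version_from_banner is a hypothetical helper name, not part of R.
r_version_from_banner() {
    head -n 1 | awk '{print $3}'
}

expected="4.2.1"   # assumption: the version you intended to load

if command -v R >/dev/null 2>&1; then
    loaded="$(R --version | r_version_from_banner)"
    if [ "$loaded" = "$expected" ]; then
        echo "R $loaded is loaded"
    else
        echo "Expected R $expected but found R $loaded" >&2
    fi
else
    echo "R is not on PATH; load the module first" >&2
fi
```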

Software version

The following R versions are available on the different clusters:

R/3.4.0-intel-2017a-X11-20170314
R/3.4.3-foss-2017b-X11-20171023
R/3.5.1-foss-2018b
R/3.6.2-foss-2019b
R/3.6.2-intel-2019a
R/3.6.2-foss-2019b
R/4.0.0-foss-2020a
R/4.0.4-foss-2020b
R/4.2.1-foss-2022a
R/4.2.1-foss-2022a
R/4.2.2-foss-2022b
R/4.3.2-gfbf-2023a

Running an R batch script on the command line

There are multiple ways to launch an R script on the command line:

  1. Rscript yourRscript.R
  2. R CMD BATCH yourRscript.R
  3. R < yourRscript.R --no-save

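Rscript is an alternative front end for use in shell scripts and other scripting applications. It is the most common way to run R scripts and writes its output to standard output. The second approach is similar to Rscript, but it writes the output to a file, by default yourRscript.Rout. The third approach feeds the script to R through standard input; because the session is then non-interactive, R requires an option such as --no-save.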
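
To make the difference in output handling concrete, here is a small sketch of the three invocations with their output captured. The log file names rscript.log and stdin.log are illustrative choices, and the R commands only run if Rscript and the script file are actually present.

```shell
#!/bin/bash
# Script name used throughout this page.
script="yourRscript.R"

# R CMD BATCH picks its default output file by stripping a trailing .R
# from the input name and appending .Rout.
batch_out="${script%.R}.Rout"
echo "R CMD BATCH would write to: $batch_out"

if command -v Rscript >/dev/null 2>&1 && [ -f "$script" ]; then
    # 1. Rscript prints to the terminal; redirect it to keep a log.
    Rscript "$script" > rscript.log 2>&1

    # 2. R CMD BATCH creates $batch_out by itself.
    R CMD BATCH "$script"

    # 3. Reading the script from standard input; --no-save tells R not to
    #    save the workspace at the end of the non-interactive session.
    R --no-save < "$script" > stdin.log 2>&1
fi
```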

Running an R batch script on the cluster

The previous section showed how to launch an R script on the command line. To defer execution and run the script in batch mode, you can use the following batch script as a template:

#!/bin/bash
#SBATCH --qos=regular
#SBATCH --job-name=R_JOB
#SBATCH --mem=200gb
#SBATCH --cpus-per-task=4
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err

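# Copy the contents of the submit directory to the job's scratch directory
export LSCRATCH_DIR="/lscratch/$USER/jobs/$SLURM_JOB_ID"
mkdir -p "$LSCRATCH_DIR"
cd "$SLURM_SUBMIT_DIR" || exit 1
cp -r * "$LSCRATCH_DIR"
cd "$LSCRATCH_DIR" || exit 1

module load R/4.2.1-foss-2022a

# Match the thread count to the CPUs requested above
export OMP_NUM_THREADS="$SLURM_CPUS_PER_TASK"

# Send both standard output and standard error to OUTPUT_FILE
Rscript yourRscript.R > OUTPUT_FILE 2>&1

# Copy everything back to the submit directory and clean up
export RESULTS_DIR="$SLURM_SUBMIT_DIR/RESULTS"
mkdir -p "$RESULTS_DIR"
cp -r * "$RESULTS_DIR"
rm -rf "$LSCRATCH_DIR"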

Submit the job using the sbatch command:

$ sbatch batch_script
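
After submission it helps to know which log files to expect. In the template, %x expands to the job name and %j to the job ID. The sketch below captures the job ID at submission time and prints the matching file names; log_names is a hypothetical helper, and the Slurm commands only run where sbatch and the batch script are available.

```shell
#!/bin/bash
# %x is the job name and %j the job ID, so the template's
# --output=%x-%j.out and --error=%x-%j.err produce these names.
job_name="R_JOB"   # matches --job-name in the template

# Hypothetical helper: print the log file names for a given job ID.
log_names() {
    local job_id="$1"
    printf '%s-%s.out\n%s-%s.err\n' "$job_name" "$job_id" "$job_name" "$job_id"
}

if command -v sbatch >/dev/null 2>&1 && [ -f batch_script ]; then
    # --parsable makes sbatch print only the job ID, followed by
    # ";cluster" when a cluster name is present; keep the first field.
    job_id="$(sbatch --parsable batch_script | cut -d ';' -f 1)"
    squeue -j "$job_id"    # show the job's current state in the queue
    log_names "$job_id"    # files to look for once the job starts
fi
```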

How to install R packages?

Library Path Management

DIPC staff generally install R packages on the system upon request. However, users can also install packages locally if they need them only once or if the package is highly specialized.

By default, R searches a set of paths when performing actions involving libraries. Functions such as install.packages() use the first path in that set unless told otherwise. When that path is a system directory that you cannot write to, the installation fails with messages like this:

mkdir: cannot create directory '/scicomp/easybuild/CentOS/7.3.1611/Haswell/software/R/3.4.3-foss-2017b-X11-20171023/lib/R/site-library/00LOCK': Permission denied
ERROR: failed to lock directory '/scicomp/easybuild/CentOS/7.3.1611/Haswell/software/R/3.4.3-foss-2017b-X11-20171023/lib/R/site-library' for modifying

Temporarily changing the library path

You can modify R's library path on a one-time basis by specifying the lib= argument to install.packages. For example, if there is a directory called Rlibs in your home directory, the command:

> install.packages('caTools',lib='~/Rlibs')

will install the specified package in your local directory. To load it, use the lib.loc= argument of the library() function:

> library('caTools',lib.loc='~/Rlibs')

One issue with this approach is that if a locally installed package itself calls library(), for example to load a dependency, that call will not search your local library automatically.
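
The same one-time installation can be scripted from the shell, which is convenient on a cluster. This is a sketch under a few assumptions: the RLIBS_DIR variable name is illustrative, ~/Rlibs is the local library from the example above, and the CRAN mirror URL is one possible choice. install.packages() expects the target directory to exist and be writable, so it is created first.

```shell
#!/bin/bash
# Assumption: ~/Rlibs is the local library, as in the example above.
# RLIBS_DIR is an illustrative variable name, not one that R reads itself.
export RLIBS_DIR="$HOME/Rlibs"
mkdir -p "$RLIBS_DIR"    # the target directory must exist before installing

if command -v Rscript >/dev/null 2>&1; then
    # In a non-interactive session R cannot prompt for a CRAN mirror,
    # so repos= is given explicitly.
    Rscript -e 'install.packages("caTools", lib = Sys.getenv("RLIBS_DIR"), repos = "https://cloud.r-project.org")'

    # Load the package back from the same directory to confirm the install.
    Rscript -e 'library("caTools", lib.loc = Sys.getenv("RLIBS_DIR"))'
fi
```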

Changing the library path for a session

The .libPaths() function accepts a character vector naming the libraries to be used as a search path. Note that directories you added earlier are not retained automatically (only R's own system libraries are always kept), so pass the existing paths along with the new one. A call like:

> .libPaths(c('~/Rlibs',.libPaths()))

will put your local directory at the beginning of the search path, provided the directory already exists (.libPaths() silently ignores directories that do not exist). install.packages() will then place packages there by default, and library() will find packages in your local directory without additional arguments.

Permanently changing the library path

The environment variable R_LIBS is set by the script that invokes R and can be overridden in a shell startup file (for example .bash_profile or .bashrc) to customize your library path. To see its current value, run the following in R:

> Sys.getenv('R_LIBS')

You can then copy this path, modify it, and set the R_LIBS environment variable to that value in the shell.
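
As an illustration, the lines below could go in a shell startup file such as .bashrc. They are a sketch, assuming ~/Rlibs is your local library; the local_lib variable name is illustrative, and on the cluster a directory under /scratch may be the better choice. R_LIBS is a colon-separated list, so the existing value is kept after the new directory.

```shell
#!/bin/bash
# Assumption: ~/Rlibs is the local library used in the examples above.
local_lib="$HOME/Rlibs"
mkdir -p "$local_lib"    # R ignores R_LIBS entries that do not exist

# Prepend the local library; keep any existing value after a colon.
# ${R_LIBS:+:$R_LIBS} expands to nothing when R_LIBS is unset or empty.
export R_LIBS="$local_lib${R_LIBS:+:$R_LIBS}"

echo "R_LIBS is now: $R_LIBS"
```

If the R module sets R_LIBS itself, placing these lines after the module load command keeps both values.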

Warning

It is recommended to install packages in your /scratch directory so that they are accessible from the compute nodes.