Version 5 (modified by cmaggio, 11 years ago)

--

Running R on Cypress

R Modules

As of August 18th, 2015, one version of R is installed on Cypress, available as the module:

  • R/3.1.2

Installing Packages

Running R Interactively

Start an interactive session using idev

[tulaneID@cypress1 pp-1.6.4]$ idev 
Requesting 1 node(s)  task(s) to normal queue of defq partition
1 task(s)/node, 20 cpu(s)/task, 2 MIC device(s)/node
Time: 0 (hr) 60 (min).
Submitted batch job 52311
Seems your requst is pending.
JOBID=52311 begin on cypress01-035
--> Creating interactive terminal session (login) on node cypress01-035.
--> You have 0 (hr) 60 (min).
[tulaneID@cypress01-035 pp-1.6.4]$ 

Load the R module

[tulaneID@cypress01-035 pp-1.6.4]$ module load R/3.1.2
[tulaneID@cypress01-035 pp-1.6.4]$ module list
Currently Loaded Modulefiles:
  1) git/2.4.1           3) idev                5) R/3.1.2
  2) slurm/14.03.0       4) bbcp/amd64_rhel60

Run R in the command line window

[tulaneID@cypress01-035 pp-1.6.4]$ R

R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 

Running an R Script in Batch Mode

You can also submit your R job to the batch nodes (compute nodes) on Cypress. Inside your SLURM script, include a command to load the desired R module. Then invoke the Rscript command on your R script.
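The steps above can be sketched as a short shell snippet that writes a submission script to disk. This is a minimal sketch, not the site's official template: the file names Rsubmission.srun and myRscript.R, the job name, and the time limit are hypothetical placeholders, and only the module name R/3.1.2 comes from this page.

```shell
# Write a minimal SLURM submission script for a serial R job.
# File names, job name, and time limit are hypothetical placeholders.
cat > Rsubmission.srun <<'EOF'
#!/bin/bash
#SBATCH --job-name=R_serial    # name shown in the queue (placeholder)
#SBATCH --nodes=1              # run on a single node
#SBATCH --ntasks-per-node=1    # a serial R script needs only one task
#SBATCH --time=00:10:00        # wall-clock limit hh:mm:ss (placeholder)

module load R/3.1.2            # the R module listed on this page
Rscript myRscript.R            # run the R script non-interactively
EOF

# The script is only written here; submit it from a Cypress login node.
echo "Submit with: sbatch Rsubmission.srun"
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the script while it is being written.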

Running a Parallel R Job

Starting with version 2.14.0, R has offered direct support for parallel computation through the "parallel" package. We present two examples of running a parallel job in batch mode. They differ in how they communicate the number of cores reserved by SLURM to R. Both are based on code from "Getting Started with doParallel and foreach" by Steve Weston and Rich Calaway, as modified by the University of Chicago Research Computing Center.

In the first example, we use the built-in R function Sys.getenv() to read the relevant SLURM environment variable from the operating system.
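As an illustration of this first approach, the snippet below writes a small R script that reads the core count from the environment. It is a sketch under stated assumptions rather than the original example: the file name doParallelEnv.R and the toy workload are hypothetical, the variable SLURM_NTASKS_PER_NODE is assumed to be the one of interest (SLURM sets it only when --ntasks-per-node is requested), and the doParallel package is assumed to be installed.

```shell
# Write an R script that takes its core count from the SLURM environment.
# The file name and the toy workload are hypothetical.
cat > doParallelEnv.R <<'EOF'
library(doParallel)  # also loads foreach, iterators, and parallel

# Assumption: SLURM_NTASKS_PER_NODE holds the number of cores to use.
# Fall back to 1 when the variable is unset (e.g. outside a SLURM job).
cores <- as.integer(Sys.getenv("SLURM_NTASKS_PER_NODE", unset = "1"))
registerDoParallel(cores = cores)

# Toy workload: time a parallel loop over 1000 square roots.
elapsed <- system.time({
  r <- foreach(i = 1:1000, .combine = c) %dopar% sqrt(i)
})[3]

print(getDoParWorkers())  # number of registered parallel workers
print(elapsed)            # elapsed wall-clock time in seconds
EOF

echo "Wrote doParallelEnv.R"
```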

This script obtains the number of tasks per node set in our SLURM script and passes that value to the registerDoParallel() function. To implement this we need only set the correct parameters in our SLURM script. Suppose we want to use 16 cores; the submission script then requests 16 tasks per node.
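A submission script for the 16-core case might look like the sketch below. The file names RsubmissionEnv.srun and doParallelEnv.R, the job name, and the time limit are hypothetical; the sbatch option --ntasks-per-node is what causes SLURM to export SLURM_NTASKS_PER_NODE into the job environment, which is the variable this sketch assumes the R script reads.

```shell
# Write a SLURM script that reserves 16 tasks on one node.
# File names, job name, and time limit are hypothetical placeholders.
cat > RsubmissionEnv.srun <<'EOF'
#!/bin/bash
#SBATCH --job-name=R_par_env   # placeholder job name
#SBATCH --nodes=1              # all workers on a single node
#SBATCH --ntasks-per-node=16   # exported as SLURM_NTASKS_PER_NODE
#SBATCH --time=00:10:00        # wall-clock limit hh:mm:ss (placeholder)

module load R/3.1.2
Rscript doParallelEnv.R        # the R script reads the variable itself
EOF

echo "Submit with: sbatch RsubmissionEnv.srun"
```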

The disadvantage of this approach is that it is system specific. If we move our code to a machine that uses PBS-Torque as its resource manager (sphynx, for example), we have to change our source code. A more portable alternative uses command line arguments to pass the value of the environment variable into the script.
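The portable variant can be sketched as follows. The snippet writes an R script that takes the core count as its first command line argument instead of naming any scheduler variable. The file name doParallelArgs.R, the usage check, and the toy workload are hypothetical additions for illustration, and doParallel is assumed to be installed.

```shell
# Write an R script that takes its core count from the command line.
# The file name and the toy workload are hypothetical.
cat > doParallelArgs.R <<'EOF'
library(doParallel)  # also loads foreach, iterators, and parallel

# commandArgs(TRUE) returns only the arguments given after the script name.
args <- commandArgs(TRUE)
cores <- as.integer(args[1])
if (is.na(cores)) stop("usage: Rscript doParallelArgs.R <cores>")
registerDoParallel(cores = cores)

# Toy workload: time a parallel loop over 1000 square roots.
elapsed <- system.time({
  r <- foreach(i = 1:1000, .combine = c) %dopar% sqrt(i)
})[3]

print(getDoParWorkers())  # number of registered parallel workers
print(elapsed)            # elapsed wall-clock time in seconds
EOF

echo "Wrote doParallelArgs.R"
```

Because the R source no longer mentions SLURM, the same file can be run by hand (for example, Rscript doParallelArgs.R 4) or under a different scheduler.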

Note the use of args <- commandArgs(TRUE) and of as.integer(args[1]). These let us pass a value on the command line when we call the script, and the number of cores is set to that value. Using the same basic submission script as before, we need only pass the value of the correct SLURM environment variable to the script at runtime.
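A sketch of such a submission script is shown below. The name RsubmissionWargs.srun matches the transcript on this page, while doParallelArgs.R, the job name, and the time limit are hypothetical, and SLURM_NTASKS_PER_NODE is assumed to be the variable that carries the core count.

```shell
# Write a SLURM script that hands the core count to R as an argument.
# doParallelArgs.R, the job name, and the time limit are hypothetical.
cat > RsubmissionWargs.srun <<'EOF'
#!/bin/bash
#SBATCH --job-name=R_par_args  # placeholder job name
#SBATCH --nodes=1              # all workers on a single node
#SBATCH --ntasks-per-node=16   # exported as SLURM_NTASKS_PER_NODE
#SBATCH --time=00:10:00        # wall-clock limit hh:mm:ss (placeholder)

module load R/3.1.2
# The shell expands the variable when the job runs; R sees a plain number.
Rscript doParallelArgs.R "$SLURM_NTASKS_PER_NODE"
EOF

echo "Submit with: sbatch RsubmissionWargs.srun"
```

With this layout, moving to another resource manager means editing only the submission script; the R source stays unchanged.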

Note that since we did not specify an output file, the output will be written to slurm-<JobNumber>.out. For example:

[tulaneID@cypress1 ~]$ sbatch RsubmissionWargs.srun 
Submitted batch job 52481
[tulaneID@cypress1 ~]$ cat slurm-52481.out 
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
[1] "16"
elapsed 
  3.282 
[tulaneID@cypress1 ~]$ 

Next Section: Running Python on Cypress
