Installing R Packages on Cypress
If the R packages you need are not yet installed in your desired version of R on Cypress, you have several alternatives, described below, for where to install them. The locations include your home directory or your lustre sub-directory, and the method varies depending on your desired level of reproducibility.
To capture logging output, including error messages, during installation, see Capturing Logging Output From R Package Installation below.
For more information on how the R startup process works, see https://cran.r-project.org/web/packages/startup/vignettes/startup-intro.html
Alternative 1 - default to home sub-directory
From your R session, you may choose to have R install its packages into a sub-directory under your home directory. By default R will offer to create such a sub-directory, whose name corresponds to the R version of your current R session, and install your packages there.
> R.version.string
[1] "R version 3.4.1 (2017-06-30)"
> install.packages("copula")
Installing package into ‘/share/apps/spark/spark-2.0.0-bin-hadoop2.6/R/lib’
(as ‘lib’ is unspecified)
Warning in install.packages("copula") :
  'lib = "/share/apps/spark/spark-2.0.0-bin-hadoop2.6/R/lib"' is not writable
Would you like to use a personal library instead? (y/n) y
Would you like to create a personal library
~/R/x86_64-pc-linux-gnu-library/3.4
to install packages into? (y/n) y
--- Please select a CRAN mirror for use in this session ---
PuTTY X11 proxy: unable to connect to forwarded X server: Network error: Connection refused
HTTPS CRAN mirror
 1: 0-Cloud [https]
 2: Algeria [https]
...
79: Vietnam [https]
80: (HTTP mirrors)
Selection: 77
...
Note that the above example was performed without X11 forwarding, so R prompted at the command line for a CRAN mirror site. At that prompt, enter the number corresponding to the desired mirror site, e.g. 77.
Alternative 2 - specify your lustre sub-directory via exported environment variable
Alternatively, if you prefer to use your lustre sub-directory rather than your home directory, you may do so by exporting an environment variable as in the following. The environment variable R_LIBS_USER points to the desired location of the user package library.
First, create a directory and export the environment variable.
mkdir -p /lustre/project/<your-group-name>/R/Library
export R_LIBS_USER=/lustre/project/<your-group-name>/R/Library
Then run R and install a package. Note that we can use the R function .libPaths() as confirmation of the user library location.
> .libPaths()
[1] "/lustre/project/<your-group-name>/R/Library"
[2] "/share/apps/spark/spark-2.0.0-bin-hadoop2.6/R/lib"
[3] "/share/apps/R/3.4.1-intel/lib64/R/library"
> install.packages("copula")
Installing package into ‘/lustre/project/<your-group-name>/R/Library’
(as ‘lib’ is unspecified)
...
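Before starting R, it can help to confirm that the directory exists and is writable, since otherwise the exported R_LIBS_USER value may not appear in the output of .libPaths(). The following is a minimal sketch of the two commands above wrapped in a function. The function name setup_r_lib is hypothetical and not part of R or Cypress.

```shell
# Hypothetical helper: create a library directory, check that it is
# writable, and export R_LIBS_USER for the current shell session.
setup_r_lib() {
    lib_dir="$1"
    # Create the directory and any missing parents.
    mkdir -p "$lib_dir" || return 1
    # Refuse to export a location that R could not install into.
    if [ ! -w "$lib_dir" ]; then
        echo "not writable: $lib_dir" >&2
        return 1
    fi
    export R_LIBS_USER="$lib_dir"
}
```

For example, run setup_r_lib "/lustre/project/<your-group-name>/R/Library" (quoted, with your own group name substituted), then start R from the same shell and check .libPaths().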
Alternative 3 - specify lustre sub-directory via environment file
Alternatively, you may set the same environment variable in a local file, ~/.Renviron, rather than exporting it from the shell, as in the following.
First, create a directory as above.
mkdir -p /lustre/project/<your-group-name>/R/Library
Then set R_LIBS_USER in the file ~/.Renviron to tell R the default location.
Note however that setting or unsetting the environment variable R_LIBS_USER in the file ~/.Renviron will override any previously exported value of that same environment variable!
echo 'R_LIBS_USER="/lustre/project/<your-group-name>/R/Library"' > ~/.Renviron
Or use a text editor to create or edit the file ~/.Renviron so that it includes the following line.
R_LIBS_USER="/lustre/project/<your-group-name>/R/Library"
Then run R and install a package. Note again the use of R function .libPaths() as confirmation of the user library location.
> .libPaths()
[1] "/lustre/project/<your-group-name>/R/Library"
[2] "/share/apps/spark/spark-2.0.0-bin-hadoop2.6/R/lib"
[3] "/share/apps/R/3.4.1-intel/lib64/R/library"
> install.packages("copula")
Installing package into ‘/lustre/project/<your-group-name>/R/Library’
(as ‘lib’ is unspecified)
...
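Because the echo command above redirects with >, it replaces any existing ~/.Renviron, including unrelated settings. If your file already holds other entries, a sketch along the following lines updates only the R_LIBS_USER line. The function name set_renviron_lib is hypothetical, and the sketch assumes the setting starts at the beginning of a line.

```shell
# Hypothetical helper: set R_LIBS_USER in an Renviron file while
# keeping every other line of the file unchanged.
set_renviron_lib() {
    renviron="$1"
    lib_dir="$2"
    # Make sure the file exists so grep has something to read.
    touch "$renviron"
    # Copy every line except an existing R_LIBS_USER setting.
    # grep exits non-zero when no lines remain, which is fine here.
    grep -v '^R_LIBS_USER=' "$renviron" > "$renviron.tmp" || true
    # Append the new setting and replace the original file.
    printf 'R_LIBS_USER="%s"\n' "$lib_dir" >> "$renviron.tmp"
    mv "$renviron.tmp" "$renviron"
}
```

For example: set_renviron_lib ~/.Renviron "/lustre/project/<your-group-name>/R/Library", with your own group name substituted.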
Alternative 4 - specify lustre sub-directory via R profile file
Similarly, you may make the library sub-directory depend on the R major.minor version by setting it in the R profile file, as in the following.
Edit the file ~/.Rprofile as follows.
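# Build the full version string, e.g. "3.4.1" (R.version$minor holds "4.1").
majorMinorPatch <- paste(R.version[c("major", "minor")], collapse=".")
# Drop the patch level, e.g. "3.4.1" becomes "3.4".
majorMinor <- gsub("(.*)\\..*", "\\1", majorMinorPatch)
#print(paste0("majorMinor=", majorMinor))
myLibPath <- paste0("/lustre/project/<your-group-name>/R/Library/", majorMinor)
# Create the directory, including any missing parent directories.
dir.create(myLibPath, showWarnings = FALSE, recursive = TRUE)
#print(paste0("myLibPath=", myLibPath))
# Put the version-specific library first in the library search path.
newLibPaths <- c(myLibPath, .libPaths())
.libPaths(newLibPaths)
Note that setting the R library trees directly via the R function .libPaths() in the file ~/.Rprofile can either override or add to the library paths derived from any previously set value of R_LIBS_USER!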
Then run R and install a package. Note again the use of R function .libPaths() as confirmation of the user library location.
> .libPaths()
[1] "/lustre/project/<your-group-name>/R/Library/3.4"
[2] "/share/apps/spark/spark-2.0.0-bin-hadoop2.6/R/lib"
[3] "/share/apps/R/3.4.1-intel/lib64/R/library"
> install.packages("copula")
Installing package into ‘/lustre/project/<your-group-name>/R/Library/3.4’
(as ‘lib’ is unspecified)
...
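The version-specific directory can also be created ahead of time from the shell, using the same major.minor rule as the profile code. The following is a minimal sketch, assuming you supply the R version string yourself (for example 3.4.1, as reported by R.version.string). The function name make_versioned_lib is hypothetical.

```shell
# Hypothetical helper: create <base>/<major.minor> for a given R version
# string and print the resulting path.
make_versioned_lib() {
    base_dir="$1"
    r_version="$2"    # full version string, e.g. 3.4.1
    # Keep only the first two dot-separated fields, e.g. 3.4.1 -> 3.4.
    major_minor=$(echo "$r_version" | cut -d. -f1,2)
    mkdir -p "$base_dir/$major_minor" || return 1
    echo "$base_dir/$major_minor"
}
```

For example: make_versioned_lib "/lustre/project/<your-group-name>/R/Library" 3.4.1, with your own group name substituted.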
Alternative 5 - specify lustre sub-directory via R code
As yet another alternative, you can accomplish the above entirely in your R code, as follows. First, create a directory as before.
mkdir -p /lustre/project/<your-group-name>/R/Library
Then run R and install a package, but note that you must also specify the location from which to load the package in the ensuing call to the R function library().
> myLib <- "/lustre/project/<your-group-name>/R/Library"
> install.packages("copula", lib=myLib)
...
> library(copula, lib.loc=myLib)
Capturing Logging Output From R Package Installation
When you install an R package, the logging output can include multiple screens of information before finally ending with a simple, brief indication of success or failure such as the following.
installation of package 'RSQLite' had non-zero exit status
As a result, any helpful diagnostic information can be easily lost - including one or more critical error messages that you may not notice as they scroll quickly and entirely out of view and out of your terminal window buffer.
To avoid this loss of diagnostic information, the R function install.packages() provides an option keep_outputs=T (or keep_outputs=TRUE).
With the keep_outputs=T option, the logging output is captured in files, one file per attempted R package, which you can inspect later for possible error messages, as in the following.
> install.packages("RSQLite", keep_outputs=T) # captures log output in a file RSQLite.out
Then from the BASH command line you can search for occurrences of the string error: via either the less or grep BASH commands. (See Linux Commands.)
For example, in the following, multiple error messages captured in the file RSQLite.out indicate that the boost C++ libraries are unexpectedly missing. They can be provided by loading the appropriate module, boost/1.76.0, on Cypress. (See Module Command.)
[tulaneid@cypress2 ~]$ grep -i error: RSQLite.out
vendor/boost/preprocessor/list/fold_left.hpp(341): catastrophic error: cannot open source file "boost/preprocessor/list/detail/edg/fold_left.hpp"
...
vendor/boost/preprocessor/list/fold_left.hpp(341): catastrophic error: cannot open source file "boost/preprocessor/list/detail/edg/fold_left.hpp"
ERROR: compilation failed for package ‘RSQLite’
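When several packages are installed in one call, each leaves its own .out file, so it can be convenient to check them all at once. The following is a minimal sketch that applies the same case-insensitive grep to every .out file in a directory. The function name list_failed_logs is hypothetical.

```shell
# Hypothetical helper: print the name of each *.out log in a directory
# (default: the current directory) that contains the string "error:".
list_failed_logs() {
    log_dir="${1:-.}"
    # -l prints only the names of matching files;
    # -i ignores case, so "ERROR:" is matched as well.
    grep -il 'error:' "$log_dir"/*.out 2>/dev/null
}
```

Each file name printed can then be inspected with less or grep as described above. The function prints nothing when no log contains the string.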
The following excerpt is taken from the output of help("install.packages") in an R session.
keep_outputs: a logical: if true, keep the outputs from installing source packages in the current working directory, with the names of the output files the package names with ‘.out’ appended. Alternatively, a character string giving the directory in which to save the outputs. Ignored when installing from local files.