
Version 6 (modified by fuji, 42 hours ago)

CONDA Virtual Environment

A virtual environment is an isolated execution environment that lets software applications run with their own packages, independently of the software installed on the host.

https://docs.google.com/drawings/d/e/2PACX-1vRT8y0469t50JiKqEbZYCRnDD2hSxdWGqBpWJmRoWRQlksC9-GaXAG0v8KjApAkdCWBVI4Icuv8jS4s/pub

CONDA virtual environment and package manager

CONDA distribution

  • Anaconda: Full CONDA distribution bundled with many Python packages. Maintained by Anaconda Inc. (Preinstalled on Cypress)
  • Miniconda: Minimal distribution with CONDA and Python only. Maintained by Anaconda Inc.
  • Miniforge: Minimal distribution with CONDA and Python only. Community supported.

Anaconda

Several versions of Anaconda are installed on Cypress.

$ module av anaconda

----------------------------------------------- /share/apps/modulefiles ------------------------------------------------
anaconda/2.1.0    anaconda3/2018.12 anaconda3/2020.07 anaconda3/5.1.0
anaconda/2.5.0    anaconda3/2019.03 anaconda3/4.0.0

------------------------------------------- /share/apps/centos7/modulefiles --------------------------------------------
anaconda3/2023.07

The anaconda/2.*.* modules provide Python 2; the anaconda3/* modules provide Python 3. Modules under /share/apps/centos7/modulefiles work on CentOS 7 nodes only.

Because software installed via Conda may not be compatible with older operating systems, it's recommended to use the latest Anaconda distribution.

Using anaconda3/2023.07
  • Start an interactive session on a CentOS 7 node.
    idev --partition=centos7 -t 8
    
    This allocates a single CentOS 7 node for a duration of 8 hours.
  • Load the module.
    module load anaconda3/2023.07
    
  • Display the version of CONDA.
    $ conda --version
    conda 23.7.3
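
Before going further, it can help to confirm that the module load above took effect in the current shell. The following is a minimal sketch; the `conda_status` variable is a name chosen here for illustration and is not part of the Cypress setup.

```shell
# Check whether the conda command is available in the current shell.
if command -v conda >/dev/null 2>&1; then
    conda_status="found: $(conda --version 2>&1)"
else
    conda_status="not found: run 'module load anaconda3/2023.07' first"
fi
echo "conda ${conda_status}"
```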
    

Basic Commands

Command                               Description
conda init                            Initialize CONDA for your shell.
conda info                            Display CONDA system information.
conda --help                          List the available built-in commands.
conda env list                        List all available environments.
conda create -n VIRT_ENV              Create a virtual environment.
source activate VIRT_ENV              Activate a virtual environment.
conda install PACKAGE-NAME            Install a package into the active environment.
conda deactivate                      Deactivate the active virtual environment.
conda remove --name VIRT_ENV --all    Remove an environment and all of its packages.

Configuring the CONDA environment on Cypress

By default, Conda environments are stored in your home directory. However, we recommend using your group’s project directory instead. You can set the CONDA_ENVS_PATH environment variable to specify where Anaconda should create and locate your environments. For example:

export CONDA_ENVS_PATH=/lustre/project/mygroup/myuser/conda-envs

Alternatively, the conda config --add envs_dirs /path/to/envs command has the same effect. For example:

conda config --add envs_dirs /lustre/project/mygroup/myuser/conda-envs

By default, the package cache directory is also created in your home directory, and it can grow large as you use conda. To change the package cache directory:

conda config --add pkgs_dirs /lustre/project/mygroup/myuser/conda-pkgs

The configuration settings are stored in the ~/.condarc file. You can modify this file manually to customize Conda’s behavior.

Command                                         Description
conda config --add envs_dirs /path/to/envs      Add a path to the list of environment directories.
conda config --add pkgs_dirs /path/to/pkgs      Add a path to the list of package cache directories.
conda config --remove envs_dirs /path/to/envs   Remove a path from the list of environment directories.
vi ~/.condarc                                   Edit the CONDA configuration file manually.
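
As an alternative to editing ~/.condarc, both locations can be set through environment variables, for example from a shell startup file. The sketch below assumes a hypothetical helper variable, `CONDA_BASE_DIR`, which on Cypress would point to something like /lustre/project/mygroup/myuser; it falls back to a scratch directory so the snippet can be tried anywhere. `CONDA_PKGS_DIRS` is the environment-variable counterpart of the pkgs_dirs setting.

```shell
# Hypothetical base directory (an assumption for this sketch). On Cypress
# this would be a project path such as /lustre/project/mygroup/myuser.
CONDA_BASE_DIR="${CONDA_BASE_DIR:-$(mktemp -d)}"

# Create the environment and package cache directories if they are missing.
mkdir -p "${CONDA_BASE_DIR}/conda-envs" "${CONDA_BASE_DIR}/conda-pkgs"

# Where conda creates and looks up environments.
export CONDA_ENVS_PATH="${CONDA_BASE_DIR}/conda-envs"
# Where conda caches downloaded packages.
export CONDA_PKGS_DIRS="${CONDA_BASE_DIR}/conda-pkgs"

echo "environments:  ${CONDA_ENVS_PATH}"
echo "package cache: ${CONDA_PKGS_DIRS}"
```

Environment variables apply only to the shell in which they are set, whereas conda config writes the setting to ~/.condarc.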

Installation commands

Command                                         Description
conda install PACKAGE-NAME                      Install a software package.
conda install PACKAGE-NAME=version              Install a specific version of a package.
conda install PACKAGE-NAME=version -c CHANNEL   Install a specific version of a package from a specific channel.
conda install PACKAGE-NAME1 PACKAGE-NAME2 …     Install multiple packages.

Some useful commands

Command                                    Description
conda search PACKAGE-NAME                  Search for a software package.
conda search PACKAGE-NAME=version --info   Search for a specific version of a package and display its details.
conda update/upgrade PACKAGE-NAME          Update a package to the latest version.
conda uninstall/remove PACKAGE-NAME        Uninstall a package.

Channels

Conda channels are sources or repositories from which Conda downloads packages.

  • Default Channels: These are pre-configured when you install Conda. They’re hosted by Anaconda, Inc.
  • Community Channels: Popular alternatives maintained by the community, such as: conda-forge, bioconda.

Configuring CONDA channels

Command                                   Description
conda config --show channels              List the configured channels.
conda config --prepend channels CHANNEL   Add a channel with the highest priority.
conda config --append channels CHANNEL    Add a channel with the lowest priority.

The configuration settings are stored in the ~/.condarc file. You can modify this file manually to customize Conda’s behavior.
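
To illustrate what the channel section of a ~/.condarc file can look like, the sketch below writes an example to a scratch file rather than to the real configuration file. The choice and order of channels are example values only, and `sample_condarc` is a name introduced here for illustration.

```shell
# Write an example channel configuration to a scratch file
# (not to the real ~/.condarc).
sample_condarc="$(mktemp)"
cat > "${sample_condarc}" <<'EOF'
channels:
  - conda-forge
  - bioconda
  - defaults
EOF

# Channels listed earlier have higher priority, so conda-forge
# comes first in this example.
cat "${sample_condarc}"
```

In this layout, conda config --prepend channels adds an entry at the top of the list and conda config --append channels adds one at the bottom.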

Some widely used channels

Channel         Description
conda-forge     Community supported, general purpose.
bioconda        Community supported, bioinformatics.
nvidia / cuda   Officially supported by NVIDIA.
pytorch         Officially supported by PyTorch.

Managing CONDA environments

  • Export a Conda Environment
    source activate VIRT_ENV
    conda env export > environment.yml 
    
  • Create an environment from an environment.yml file
    conda env create -f environment.yml
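
For orientation, the sketch below writes a small, hand-written environment.yml to a scratch directory. The environment name, channels, and package pin are example values; a file produced by conda env export is usually longer because it lists every installed package.

```shell
# Write a minimal, hand-written environment file to a scratch directory.
# The name, channels, and package pin below are example values.
env_dir="$(mktemp -d)"
cat > "${env_dir}/environment.yml" <<'EOF'
name: myenv1
channels:
  - bioconda
  - conda-forge
dependencies:
  - samtools=1.10
EOF

cat "${env_dir}/environment.yml"

# To build the environment from this file (requires conda and network access):
#   conda env create -f "${env_dir}/environment.yml"
```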
    

Example Installation

Installing samtools

Samtools is a suite of programs designed for working with high-throughput sequencing data. You can download the source code and build it directly on Cypress, or install it in your virtual environment via the bioconda channel.

  • Create a virtual environment, myenv1:
    conda create -n myenv1
    
  • Activate conda virtual environment:
    source activate myenv1
    
  • Search for the Samtools version via the bioconda channel:
    conda search samtools -c bioconda
    
  • Install version 1.10:
    conda install samtools=1.10 -c bioconda
    
  • Check the installed version of samtools in the package list:
    conda list
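
The steps above can be collected into a single script. The following is a sketch under a few assumptions: the `run` helper and the `DRY_RUN`, `ENV_NAME`, and `SAMTOOLS_VERSION` variables are names introduced here for illustration, and the `-y` flag is added so that conda does not stop to ask for confirmation. With `DRY_RUN=1` (the default in this sketch) the commands are only printed, which allows a review before anything is installed.

```shell
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
ENV_NAME="myenv1"          # environment name used in the steps above
SAMTOOLS_VERSION="1.10"    # version used in the steps above

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "${DRY_RUN}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run conda create -y -n "${ENV_NAME}"
run source activate "${ENV_NAME}"
run conda install -y "samtools=${SAMTOOLS_VERSION}" -c bioconda
run conda list samtools
```

Setting `DRY_RUN=0` before running the script executes the commands for real, which requires the anaconda3 module to be loaded.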
    

Troubleshooting

  • CONDA installs all required dependencies, but some may not be compatible with the operating system on Cypress. A common error message is:
    UnsatisfiableError: The following specifications were found to be incompatible with your system:
    
    If the message indicates 'glibc version', see here.

Manually installing the required dependencies may resolve the issue.

  • If the software requires a specific version of Python, create the environment with that version:
    conda create -n newenv python=3.10
    
  • If you're unsure about the error message, please submit an HPC Consultation ticket and include the command you ran along with all output messages, especially the error message.
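
When an error mentions the 'glibc version', it can be useful to record which operating system release and glibc version the current node provides, and to include that in the ticket. The sketch below is one way to collect this; the variable names are chosen here for illustration, and either value falls back to "unknown" if it cannot be determined on the node.

```shell
# Report the operating system release and glibc version of the current node.
os_release="unknown"
if [ -r /etc/os-release ]; then
    # PRETTY_NAME is a standard field in /etc/os-release.
    os_release="$(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")"
fi

# On glibc-based systems, getconf prints a line such as "glibc 2.17".
glibc_version="$(getconf GNU_LIBC_VERSION 2>/dev/null || echo "unknown")"

echo "OS:    ${os_release}"
echo "glibc: ${glibc_version}"
```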

To learn more, see here.
