Conda on M3 (Rocky 9)

Conda is a cross-platform, language-independent package and environment manager. Users often rely on it to create isolated, reproducible environments, particularly for Python, though conda repositories also offer many packages for R, Perl, and other languages.

To use conda on the cluster, load the miniforge3 module. You can then create your own conda environments and activate them with conda activate. The module also includes mamba, a faster drop-in replacement for conda.

If this is your first time using this module, run the configuration commands below once; you will not need to repeat them afterwards.

Configuration (do this only once)

To check whether you have already done this, run the following command.

$ cat ~/.condarc
envs_dirs:
- /scratch/nq46/lexg/conda/envs
pkgs_dirs:
- /scratch/nq46/lexg/conda/pkgs

If you see something similar to the above, where pkgs_dirs and envs_dirs are both set to scratch directories, you can skip this configuration section! Otherwise, run the commands below.

Replace nq46 with your own HPC project ID. If you belong to multiple projects, just pick one.

$ PROJECT=nq46
$ echo "export PS1" >> ~/.bashrc
$ source ~/.bashrc
$ module load miniforge3
$ CONDA_HOME=/scratch/$PROJECT/$USER/conda
$ conda config --add pkgs_dirs $CONDA_HOME/pkgs
$ conda config --add envs_dirs $CONDA_HOME/envs

That’s it, you should never need to read this section again! Read on if you want more detail though…

What do these config commands do?

In short, we ensure conda puts all of its large files outside of your home directory so you don’t go over quota. The first config command puts conda’s package cache in scratch, and the second config command means your future environments will be placed in scratch.

We prefer /scratch/ over /projects/ because conda environments should not be backed up: they contain an incredibly large number of tiny files, which wreaks havoc on our backup systems.
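To get a feel for how many files a single environment holds, you can count them. This is a minimal sketch; the helper name count_files is hypothetical and not something the miniforge3 module provides.

```shell
# Hypothetical helper: count the regular files under a directory,
# for example a single conda environment.
count_files() {
    find "$1" -type f | wc -l
}
```

For example, count_files /scratch/$PROJECT/$USER/conda/envs/test-env would report the file count for the test-env environment created later on this page, assuming the directory layout from the configuration commands.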

The export PS1 line ensures that you will see the name of your active conda environment appear in your shell prompt. For example, when I activate my environment called my-env, I see my prompt is prefixed with (my-env) as below.

[lexg@m3-login3 ~]$ module load miniforge3/24.3.0-0
(base) [lexg@m3-login3 ~]$ conda activate my-env
(my-env) [lexg@m3-login3 ~]$
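One caveat with the echo line in the configuration commands: running it a second time appends a duplicate export PS1 to ~/.bashrc. The duplicate is harmless, but if you want the setup to be safely re-runnable, a guard along these lines is one option. This is a sketch assuming a bash shell; the helper name append_once is hypothetical.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so re-running the setup changes nothing.
append_once() {
    local line="$1"
    local file="$2"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

With this defined, append_once "export PS1" ~/.bashrc could stand in for the echo line.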

Creating and activating a conda environment

As an example, let’s create and activate an environment called test-env with Python 3.11 installed.

[lexg@m3-login3 ~]$ module load miniforge3
(base) [lexg@m3-login3 ~]$ mamba create -y --name test-env python=3.11
(base) [lexg@m3-login3 ~]$ mamba activate test-env
(test-env) [lexg@m3-login3 ~]$

Note we use mamba here since it’s a faster drop-in replacement for conda, but the conda command is still available to you.

Let’s now verify that this environment has Python 3.11.

(test-env) [lexg@m3-login3 ~]$ python --version
Python 3.11.9
(test-env) [lexg@m3-login3 ~]$ which python
/scratch/nq46/lexg/conda/envs/test-env/bin/python
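The same check can be automated, which is handy at the top of a longer script. The sketch below relies on the CONDA_DEFAULT_ENV variable that conda activate sets to the active environment's name (for environments stored in one of your envs_dirs); the helper name require_env is hypothetical.

```shell
# Hypothetical helper: succeed only if the named conda environment is active.
# CONDA_DEFAULT_ENV is set by "conda activate" and is unset when no
# environment has been activated.
require_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" != "$1" ]; then
        echo "expected conda environment '$1', found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
}
```

For example, require_env test-env || exit 1 stops a script early if the activation did not take effect.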

For more details on using conda in general, please see the conda docs.

Using conda in scripts

Using conda inside shell scripts (e.g. when preparing SLURM scripts) is the same as using it outside of scripts! For example, I can activate my test-env environment in a script like so.

[lexg@m3-login3 ~]$ cat test-conda.sh
#!/bin/bash
module load miniforge3
conda activate test-env
python --version
which python
[lexg@m3-login3 ~]$ ./test-conda.sh
Python 3.11.9
/scratch/nq46/lexg/conda/envs/test-env/bin/python

Note

This is only true because of how we have prepared the miniforge3 module. If you are not using our module then conda activate generally does not work inside bash scripts.
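For a batch job, the same pattern applies. The sketch below writes a hypothetical job script, test-conda.slurm, into the current directory. The resource requests (job name, task count, memory, time limit) are placeholder assumptions, not recommended values for M3.

```shell
# Write a minimal SLURM job script. Quoting 'EOF' stops the current
# shell from expanding anything inside the here-document.
cat > test-conda.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=test-conda
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --time=00:05:00

# Same steps as the test-conda.sh example
module load miniforge3
conda activate test-env
python --version
which python
EOF
```

You would then submit it with sbatch test-conda.slurm and find the Python version and path in the job's output file.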

Note on conda init and your bashrc

Warning

You should not run conda init. This is known to break STRUDEL.

To check if you have run this before, run cat ~/.bashrc and search for something like the below snippet. If you see a similar snippet, you should delete it (e.g. with nano or vim, or whichever text editor you prefer!).

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/apps/miniforge3/24.3.0-0/miniforge3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/apps/miniforge3/24.3.0-0/miniforge3/etc/profile.d/conda.sh" ]; then
        . "/apps/miniforge3/24.3.0-0/miniforge3/etc/profile.d/conda.sh"
    else
        export PATH="/apps/miniforge3/24.3.0-0/miniforge3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
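If you would rather not edit the file by hand, the block can also be removed with sed. This is a minimal sketch, assuming the block in your file starts and ends with the same marker lines as the snippet above; the helper name remove_conda_init is hypothetical. It keeps a backup copy with a .bak suffix so you can restore the original if needed.

```shell
# Hypothetical helper: delete the block that "conda init" writes, from its
# opening marker line through its closing marker line. The original file
# is first copied to FILE.bak.
remove_conda_init() {
    local file="$1"
    cp -- "$file" "$file.bak"
    sed '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' "$file.bak" > "$file"
}
```

For example, remove_conda_init ~/.bashrc leaves the rest of your bashrc untouched and saves the previous version as ~/.bashrc.bak.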

Transferring a conda environment to a different location

If you previously used your own miniconda installation or an older anaconda module, you may find that your existing environments broke as part of the upgrade to Rocky 9. In this scenario, you will need to recreate these environments using the new conda. There are two options:

  1. Export the environment’s specifications to a file, then use that file to rebuild the environment.

  2. Clone the environment. For details, see the Conda guide on cloning.

For option 1, export the environment to a file like so:

$ source /path/to/your/miniconda/bin/activate       # or however you activate conda
$ conda activate my-conda-env
$ conda env export > my-conda-env.yml

Then, to build this environment afresh, open a new shell and run:

$ module load miniforge3/24.3.0-0
$ mamba env create -f my-conda-env.yml
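The file written by conda env export normally ends with a prefix: line recording where the old environment lived. How that line is treated can depend on the conda version, so one cautious option is to remove it before rebuilding, which lets the new environment land in your configured envs_dirs. This is a minimal sketch; the helper name strip_prefix_line is hypothetical.

```shell
# Hypothetical helper: copy an exported environment file, dropping the
# top-level "prefix:" line that records the old install location.
# Usage: strip_prefix_line INPUT.yml OUTPUT.yml
strip_prefix_line() {
    grep -v '^prefix:' "$1" > "$2"
}
```

For example, strip_prefix_line my-conda-env.yml my-conda-env-clean.yml produces a file you can pass to mamba env create -f in place of the original.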