Conda on M3 (Rocky 9)#
conda is a cross-platform, language-independent package and environment manager. Users often rely on conda to make isolated and reproducible environments, particularly for Python, though conda repositories also offer many packages for R, Perl, and other languages.
To use conda on the cluster, load the miniforge3 module. You can then create your own conda environments and activate them using conda activate. This module includes mamba, which is a faster drop-in replacement for conda.
If this is your first time using this module, run the configuration commands below once; you will not need to repeat them afterwards.
Configuration (do this only once)#
To check whether you have already done this, run the following command.
$ cat ~/.condarc
envs_dirs:
- /scratch/nq46/lexg/conda/envs
pkgs_dirs:
- /scratch/nq46/lexg/conda/pkgs
If you see something similar to the above, where pkgs_dirs and envs_dirs are both set to scratch directories, then you can skip this configuration section! Otherwise, run the commands below.
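The check above can also be scripted. The following is a minimal sketch, assuming ~/.condarc has the layout shown in the example output. The helper name condarc_is_configured is hypothetical and is not part of conda or the miniforge3 module. It only tests that both keys are present; it does not verify that they point at scratch directories.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the given condarc file sets both
# envs_dirs and pkgs_dirs. Defaults to ~/.condarc when no file is given.
condarc_is_configured() {
    local condarc="${1:-$HOME/.condarc}"
    [ -f "$condarc" ] || return 1
    grep -q '^envs_dirs:' "$condarc" && grep -q '^pkgs_dirs:' "$condarc"
}

if condarc_is_configured; then
    echo "conda looks configured; you can skip this section"
else
    echo "conda is not configured yet; run the configuration commands"
fi
```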
Replace nq46 with your own HPC project ID. If you belong to multiple projects, just pick one.
$ PROJECT=nq46
$ echo "export PS1" >> ~/.bashrc
$ source ~/.bashrc
$ module load miniforge3
$ CONDA_HOME=/scratch/$PROJECT/$USER/conda
$ conda config --add pkgs_dirs $CONDA_HOME/pkgs
$ conda config --add envs_dirs $CONDA_HOME/envs
That’s it, you should never need to read this section again! Read on if you want more detail though…
What do these config commands do?
In short, we ensure conda puts all of its large files outside of your home directory so you don’t go over quota. The first config command puts conda’s package cache in scratch, and the second config command means your future environments will be placed in scratch.
We prefer /scratch/ over /projects/ because conda environments should not be backed up: they contain an incredibly large number of tiny files, which wreaks havoc on our backup systems.
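To get a feel for how many files an environment holds, you can count them. This is a rough sketch: count_files is a hypothetical helper, and the path assumes the /scratch/$PROJECT/$USER/conda layout from the configuration commands, with nq46 standing in as the example project ID.

```shell
#!/bin/bash
# Hypothetical helper: count regular files under a directory tree.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Assumed location of your environments; nq46 is only the example project.
envs_dir="/scratch/${PROJECT:-nq46}/${USER}/conda/envs"
if [ -d "$envs_dir" ]; then
    echo "files under $envs_dir: $(count_files "$envs_dir")"
else
    echo "$envs_dir does not exist yet"
fi
```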
The export PS1 line ensures that the name of your active conda environment appears in your shell prompt. For example, when I activate my environment called my-env, my prompt is prefixed with (my-env), as below.
[lexg@m3-login3 ~]$ module load miniforge3/24.3.0-0
(base) [lexg@m3-login3 ~]$ conda activate my-env
(my-env) [lexg@m3-login3 ~]$
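The prompt prefix is not the only way to see which environment is active. conda sets the CONDA_DEFAULT_ENV variable on activation, which can be handy where no prompt is shown, such as in job logs. A small sketch follows; active_conda_env is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the active conda environment name,
# or "none" if no environment is active.
# CONDA_DEFAULT_ENV is set by conda when an environment is activated.
active_conda_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

active_conda_env
```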
Creating and activating a conda environment#
As an example, let’s create and activate an environment called test-env with Python 3.11 installed.
[lexg@m3-login3 ~]$ module load miniforge3
(base) [lexg@m3-login3 ~]$ mamba create -y --name test-env python=3.11
(base) [lexg@m3-login3 ~]$ mamba activate test-env
(test-env) [lexg@m3-login3 ~]$
Note we use mamba here since it’s a faster drop-in replacement for conda, but the conda command is still available to you.
Let’s now verify that this environment has Python 3.11.
(test-env) [lexg@m3-login3 ~]$ python --version
Python 3.11.9
(test-env) [lexg@m3-login3 ~]$ which python
/scratch/nq46/lexg/conda/envs/test-env/bin/python
For more details on using conda in general, please see the conda docs.
Using conda in scripts#
Using conda inside shell scripts (e.g. when preparing SLURM scripts) is the same as using it outside of scripts! For example, I can activate my test-env environment in a script like so.
[lexg@m3-login3 ~]$ cat test-conda.sh
#!/bin/bash
module load miniforge3
conda activate test-env
python --version
which python
[lexg@m3-login3 ~]$ ./test-conda.sh
Python 3.11.9
/scratch/nq46/lexg/conda/envs/test-env/bin/python
Note
This is only true because of how we have prepared the miniforge3 module. If you are not using our module, then conda activate generally does not work inside bash scripts.
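For the SLURM case mentioned above, the same script needs only a few #SBATCH lines at the top. The sketch below writes such a job script to a file and checks its syntax. The file name test-conda-job.sh and all #SBATCH values are placeholders chosen for illustration, not site recommendations.

```shell
#!/bin/bash
# Write a hypothetical SLURM batch script that activates test-env.
# The #SBATCH values are placeholders; adjust them for your own job.
cat > test-conda-job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=test-conda
#SBATCH --time=00:05:00
#SBATCH --ntasks=1
#SBATCH --mem=1G

module load miniforge3
conda activate test-env
python --version
which python
EOF

# Check the job script for syntax errors without running it.
bash -n test-conda-job.sh && echo "test-conda-job.sh parses cleanly"
```

You would then submit it with sbatch test-conda-job.sh.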
Note on conda init and your bashrc#
Warning
You should not run conda init. This is known to break STRUDEL.
To check whether you have run this before, run cat ~/.bashrc and look for something like the snippet below. If you see a similar snippet, you should delete it (e.g. with nano or vim, or whichever text editor you prefer!).
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/apps/miniforge3/24.3.0-0/miniforge3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/apps/miniforge3/24.3.0-0/miniforge3/etc/profile.d/conda.sh" ]; then
        . "/apps/miniforge3/24.3.0-0/miniforge3/etc/profile.d/conda.sh"
    else
        export PATH="/apps/miniforge3/24.3.0-0/miniforge3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
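If you would rather not edit the file by hand, the block can be removed with sed. This is a sketch only: remove_conda_init_block is a hypothetical helper, and it assumes the block is delimited by exactly the two marker comments shown above. It leaves a backup with a .bak suffix next to the file.

```shell
#!/bin/bash
# Hypothetical helper: delete the block that 'conda init' added to a file,
# keeping a backup copy next to it with a .bak suffix.
# Defaults to ~/.bashrc when no file is given.
remove_conda_init_block() {
    local rc="${1:-$HOME/.bashrc}"
    [ -f "$rc" ] || { echo "$rc not found"; return 1; }
    if grep -q '^# >>> conda initialize >>>' "$rc"; then
        # Delete every line from the opening marker to the closing marker.
        sed -i.bak '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' "$rc"
        echo "removed conda init block from $rc (backup: $rc.bak)"
    else
        echo "no conda init block found in $rc"
    fi
}
```

Call it as remove_conda_init_block ~/.bashrc, then open a new shell and confirm that everything still works before deleting the backup.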
Transferring a conda environment to a different location#
If you previously used your own miniconda installation or even just an older anaconda module, you may find that your existing environments broke as part of the upgrade to Rocky 9. In this scenario, you will need to recreate these environments using the new Conda. There are 2 options here:
1. Export the environment’s specifications to a file, then use that file to rebuild the environment.
2. Clone the environment. For details, see the Conda guide on cloning.
For option 1, export the environment to a file like so:
$ source /path/to/your/miniconda/bin/activate # or however you activate conda
$ conda activate my-conda-env
$ conda env export > my-conda-env.yml
Then to build this environment afresh, open a new shell and do:
$ module load miniforge3/24.3.0-0
$ mamba env create -f my-conda-env.yml
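One optional tidy-up: the file written by conda env export ends with a prefix: line recording where the old environment lived. A common habit is to drop that line before rebuilding. Whether it makes any difference to mamba env create is not something this page states, so treat the sketch below as cosmetic. The helper name strip_env_prefix and the output file name are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: copy an exported environment file to a new file,
# dropping the top-level "prefix:" line that records the old location.
# Usage: strip_env_prefix INPUT.yml OUTPUT.yml
strip_env_prefix() {
    grep -v '^prefix:' "$1" > "$2"
}
```

For example, strip_env_prefix my-conda-env.yml my-conda-env-clean.yml, then pass the cleaned file to mamba env create -f.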