FAQ: Conda and Python Virtual Environment on LRZ HPC Clusters
Conda
Setup/Recommendations
Please check for available conda modules (module av anaconda or module av miniconda). Then load one and proceed:
> module load anaconda3    # or "module load miniconda3"
> conda init bash          # or choose the shell you use
This needs to be done only once. It creates a section in your ~/.bashrc tagged via
# >>> conda initialize >>> ... # <<< conda initialize <<<
We recommend moving this section into a separate file, e.g. ~/.conda_init or so, and placing only a source ~/.conda_init into the ~/.bashrc, or, better, into the ~/.profile. The background of this proposal is two-fold. 1) Cluttering the ~/.bashrc can have strange side effects, and mistakes can exclude you from a successful login. 2) The ~/.bashrc is not sourced automatically in Slurm jobs, which avoids possible side effects inside your job. Activating conda in a Slurm job can then be easily accomplished via
source ~/.conda_init
inside your Slurm script, without any possible hassle from side effects of the ~/.bashrc.
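A minimal sketch of such a Slurm script (the job settings, the environment name myenv, and the script name are placeholders, not LRZ defaults):
#!/bin/bash
#SBATCH -J conda_job            # job name (placeholder)
#SBATCH -o conda_job.%j.out     # stdout file (placeholder)
#SBATCH --time=00:10:00         # walltime (placeholder)
source ~/.conda_init            # makes conda available inside the job
conda activate myenv            # "myenv" is an example environment name
python my_script.py             # hypothetical application script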
Usage
Once this setup is done, you can log in as usual, with conda already being active. You can recognize this by the (base) prefix on the prompt:
(base) <userID>@<host:~>
Now, you can create, modify, or remove environments according to your needs. For instance, a simple Python environment, possibly with a more recent Python version than provided by LRZ, can be created as follows (please always also check the Intel documentation).
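A minimal sketch (the environment name myenv and the Python version are placeholders only):
> conda create -n myenv python=3.11     # create an environment with a specific Python version
> conda activate myenv                  # work inside the environment
(myenv) > conda deactivate              # leave it again
> conda env remove -n myenv             # remove it when no longer needed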
Working with HPC dedicated channels
Python for HPC comes with dedicated packages and channels whose efficiency is optimized for the specific hardware. The performance gain obtained from these packages can be as much as an order of magnitude, especially for applications making use of numerical libraries (numpy/scipy), parallel communication (mpi4py), or ML/AI frameworks.
Except for the GPU cloud, most LRZ machines feature Intel hardware at the time of writing; thus the dedicated packages are provided by the Intel channel. The same build, based on the Intel Distribution for Python, is also provided in the standard python modules (see below).
In order to add the Intel channel to your conda configuration, right after conda init, type:
conda config --add channels https://software.repos.intel.com/python/conda/
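You can verify the resulting configuration afterwards, for example:
> conda config --show channels    # lists the configured channels in order of preference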
Later (see Usage), when creating your environments or installing your packages, you should specify the Intel channel as preference, e.g. via:
conda install -c https://software.repos.intel.com/python/conda numpy scipy sympy mpi4py matplotlib
You can place the channel preference into your ~/.condarc file, with the channels listed in order of preference. For example:
channels:
  - https://software.repos.intel.com/python/conda
  - conda-forge
  - bioconda
  - defaults
report_errors: false
Note that drawing from several different channels at once may cause inconsistencies (less is more, usually).
Within an activated conda environment, you can install additional packages via conda install ....
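For example (the environment name myenv and the packages are placeholders; the channels configured above determine where the packages come from):
> conda activate myenv
(myenv) > conda install numpy scipy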
If you are trying to reproduce a specific build from your workstation or another system, changing the conda channel may alter package versions or specific dependencies. If this happens, please report the occurrence through our Service Desk, as dedicated support for such troubleshooting is available.
The Intel conda channel was removed by Intel. A workaround exists.
Python Virtual Environment
First, check the currently available python modules via module av python. Then load a python module and proceed. For example:
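A minimal sketch (the module version, the path ~/venvs/myenv, and the installed package are placeholders; check module av python for the actual module names):
> module load python                     # pick a concrete version from "module av python"
> python3 -m venv ~/venvs/myenv          # create the virtual environment
> source ~/venvs/myenv/bin/activate      # activate it
(myenv) > pip install numpy              # install packages as needed
(myenv) > deactivate                     # leave the environment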
To remove such a Python virtual environment, only the directory of that environment needs to be deleted.
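For instance, assuming the example path from above:
> rm -rf ~/venvs/myenv                   # deleting the directory removes the environment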
PIPENV
There is another alternative - pipenv. Installation is usually straightforward:
> pip install --user --upgrade pipenv
After installation, pipenv is available. (You may need to manually add ~/.local/bin to your PATH environment variable!)
Usage is rather simple (check the --help option). For example:
> mkdir tmp && cd tmp
~/tmp > pipenv install cmake
~/tmp > cmake --version
cmake version 3.10.2
~/tmp > pipenv run bash
~/tmp > cmake --version
cmake version 3.28.1
(With Ctrl+D you can leave the environment again.) This could also be achieved by directly installing cmake via pip (much like via pipenv above). But the point is that with such an environment, you don't clutter your native home environment, and you can have many pipenv environments in parallel.
Please be aware that this environment concept is also bound to the Python interpreter used. So, preferably use the Python provided by our software stack.
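For instance, a sketch that uses pipenv's --python option to pin the environment to the interpreter of the loaded module (the module name and the interpreter path are assumptions):
> module load python                             # pick a version from "module av python"
> pipenv install --python "$(which python3)"     # create the pipenv environment with that interpreter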
One more example:
> python -c 'import matplotlib'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named matplotlib
> cd tmp
~/tmp > pipenv install matplotlib
~/tmp > pipenv run python -c 'import matplotlib'   # no error thrown now!
Troubleshooting
SuperMUC-NG has no outgoing Internet Access
FAQ: Installing your own applications on SuperMUC-NG