Moving from CentOS 7 to Rocky 9 for Python users¶
The Apocrita cluster has been upgraded from CentOS 7 to Rocky 9 recently. There are some important things Python users need to know, such as how to migrate your existing environments to Rocky 9, as well as how to tackle some common problems during the process.
This is a one-way operation
You should only follow the steps below once you are absolutely sure you are moving your work to Rocky 9 and not going back to CentOS 7. All users should now be actively making this move. Be sure to back up any important environments as detailed below.
CentOS 7 steps¶
Remove packages in ${HOME}/.local/lib¶
The ITSR team get a lot of tickets from users encountering issues with both Python and Conda with errors such as:
  File "/data/home/abc123/.local/lib/python3.10/site-packages/tensorflow/python/training/saver.py", line 38, in <module>
    from tensorflow.python.framework import meta_graph
  File "/data/home/abc123/.local/lib/python3.10/site-packages/tensorflow/python/framework/meta_graph.py", line 18, in <module>
    from packaging import version as packaging_version # pylint: disable=g-bad-import-order
ModuleNotFoundError: No module named 'packaging'
You'll notice that the path begins with a user's home directory (${HOME}). These packages are "orphaned": they were installed incorrectly because no Python virtualenv or Conda environment was correctly activated at the time of installation. In such cases, pip install commands fall back to the --user location of ${HOME}/.local/lib, which in the context of Apocrita causes a huge number of issues.
Now is a good time to clear these out using the following command:
rm -rf ${HOME}/.local/lib/python*
This will remove the directories for every Python version under ${HOME}/.local/lib. This directory should always remain free of any directories named python* (like python3.10, python3.12 etc.). Before raising a ticket in future, make sure this directory doesn't contain any of those first.
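The check described above can be sketched as a small shell function. This is only an illustration: the function name list_orphaned_python_dirs and its optional directory argument are assumptions made for this example and are not part of any Apocrita tooling.

```shell
# List any per-user Python package directories left under a lib directory.
# The function name and its optional argument are illustrative only.
list_orphaned_python_dirs() {
    local lib_dir="${1:-${HOME}/.local/lib}"
    # Nothing to report if the directory does not exist at all.
    [ -d "${lib_dir}" ] || return 0
    # Print only immediate sub-directories whose names start with "python".
    find "${lib_dir}" -mindepth 1 -maxdepth 1 -type d -name 'python*' | sort
}

orphans="$(list_orphaned_python_dirs)"
if [ -n "${orphans}" ]; then
    echo "Orphaned package directories found:"
    echo "${orphans}"
else
    echo "No python* directories under ${HOME}/.local/lib"
fi
```

The function only reports; it deletes nothing, so it is safe to run before deciding whether the rm command above is needed.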
Be sure your Conda environment is fully activated
Python packages will often end up in ${HOME}/.local/lib if you create and activate a Conda environment and then immediately run a pip install command, e.g.:
module load miniforge
mamba create -n myenv
mamba activate myenv
pip install <package>
This happens because an environment created without any packages contains no Python or pip of its own, so the pip command that runs comes from outside the environment. To avoid this, either specify a version of Python during environment creation:
module load miniforge
mamba create -n myenv python=3.12
mamba activate myenv
pip install <package>
Or, make sure you run a mamba install command before any subsequent pip install commands:
module load miniforge
mamba create -n myenv
mamba activate myenv
mamba install pip
pip install <package>
It's best to avoid mixing mamba install and pip install wherever possible. Try to use mamba install exclusively in Conda environments and pip install exclusively in virtualenvs. However, we understand that sometimes not everything you need is packaged in Conda and you have to install certain packages from PyPI. In such cases, tread carefully.
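One way to tread carefully is to confirm, before any pip install, that the pip on your PATH actually belongs to the active environment. The sketch below assumes the usual activation behaviour, where conda/mamba sets CONDA_PREFIX and virtualenv sets VIRTUAL_ENV; the function name pip_is_inside_active_env is made up for this example.

```shell
# Succeed only if the "pip" on PATH lives inside the active environment.
# VIRTUAL_ENV is set by virtualenv activation and CONDA_PREFIX by conda/mamba
# activation; the function name is illustrative.
pip_is_inside_active_env() {
    local env_root="${VIRTUAL_ENV:-${CONDA_PREFIX:-}}"
    local pip_path
    [ -n "${env_root}" ] || return 1          # no environment is active
    pip_path="$(command -v pip)" || return 1  # no pip on PATH at all
    case "${pip_path}" in
        "${env_root}"/*) return 0 ;;          # pip belongs to the environment
        *) return 1 ;;
    esac
}

if pip_is_inside_active_env; then
    echo "pip resolves to $(command -v pip)"
else
    echo "pip is not provided by the active environment; do not run pip install yet"
fi
```

If the second message appears inside a Conda environment, installing Python or pip with mamba install first, as described above, should make the check pass.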
Back up current environments¶
You should use freshly-created Python virtualenvs and Conda environments on Rocky 9 wherever possible, but there are ways to back environments up on CentOS 7 first for re-creation on Rocky 9 later. You should only do this if you require as close a replica of an environment as possible, i.e. with pinned package versions etc. Most users should install fresh, updated and un-pinned versions of any packages required according to their official installation instructions.
Back up Python virtualenvs¶
You can use the pip freeze command to take a snapshot of any Python virtualenv and redirect this to a requirements.txt file. To do this, use a CentOS 7 interactive qlogin session:
qlogin -l centos
source /path/to/virtualenv/bin/activate
pip freeze > requirements.txt
If you have multiple Python environments to migrate, then use a memorable name for each one (myenv1.txt, myenv2.txt etc.) and repeat the process above for each one. Once you have backed them all up you can re-create them on a Rocky 9 node later on.
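If you have several virtualenvs, the repetition can be scripted. The following is a minimal sketch that assumes all of your virtualenvs live as sub-directories of a hypothetical ${HOME}/venvs directory; adjust the path to match your own layout. It calls each environment's own pip directly, which avoids having to activate each one in turn.

```shell
# Snapshot one virtualenv into a named requirements file.
# Calls the environment's own pip directly, so no activation is needed.
freeze_virtualenv() {
    local env_dir="$1"
    local out_file="$2"
    if [ ! -x "${env_dir}/bin/pip" ]; then
        echo "No pip found in ${env_dir}" >&2
        return 1
    fi
    "${env_dir}/bin/pip" freeze > "${out_file}"
}

# Hypothetical layout: every virtualenv is a sub-directory of ~/venvs.
for env_dir in "${HOME}"/venvs/*/; do
    [ -d "${env_dir}" ] || continue      # the glob matched nothing
    name="$(basename "${env_dir}")"
    freeze_virtualenv "${env_dir%/}" "${name}.txt" && echo "Wrote ${name}.txt"
done
```

As with the manual steps, run this from a CentOS 7 qlogin session so that each snapshot reflects the environment as it currently works.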
Back up Conda environments¶
Use Mamba
Remember to use mamba in place of all conda commands as it is markedly faster. See this blog post for more information.
You can export Conda environments to a YML file that can then be used to re-create them later. To do this, again, use a CentOS 7 interactive qlogin session:
qlogin -l centos
module load anaconda3
mamba activate myenv
mamba env export > environment.yml
Replace myenv with the name of your Conda environment. If you are migrating multiple environments, then use a memorable name for each one (myenv1.yml, myenv2.yml etc.) and repeat the process above for each one. Once you have backed them all up you can re-create them on a Rocky 9 node later on.
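For several environments, the export can be wrapped in a loop. This sketch assumes that your mamba accepts the -n option on env export to select an environment by name (as conda env export does), so nothing needs activating; the function name export_conda_envs and the environment names in the example call are hypothetical.

```shell
# Export each named Conda environment to <name>.yml in the current directory.
# Assumes "mamba env export -n <name>" is accepted by your mamba version.
export_conda_envs() {
    local name
    for name in "$@"; do
        mamba env export -n "${name}" > "${name}.yml" || return 1
        echo "Wrote ${name}.yml"
    done
}

# Example call with hypothetical environment names:
# export_conda_envs myenv1 myenv2
```

The function stops at the first environment that fails to export, so a typo in a name is noticed straight away rather than leaving an empty file unnoticed among the good ones.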
Clear caches¶
Clear Python cache¶
You should clear your PyPI cache entirely before moving to Rocky 9. Any previously cached packages were written to disk whilst you were using CentOS 7 and will cause issues with operating system libraries if you try to re-use them on Rocky 9. To clear them, use a CentOS 7 qlogin session:
$ qlogin -l centos
$ module load python
$ pip cache purge
Files removed: 435
Note, this may take some time depending on the number of cached files you have.
Clear Conda cache¶
You should completely remove your Conda pkgs directory before moving to Rocky 9, so that it is re-created afresh with a cache that is compatible with Rocky 9. Any previously cached packages were written to disk whilst you were using CentOS 7 and will cause issues with operating system libraries if you try to re-use them on Rocky 9.
By default, this package cache is stored in:
${HOME}/.conda/pkgs/cache
Use the correct location
If you have a modified ${HOME}/.condarc file then this path may be elsewhere, such as your scratch directory. If so, adjust the instructions below as necessary.
To remove your pkgs cache, use a CentOS 7 qlogin session:
$ qlogin -l centos
$ module load anaconda3
$ rm -rf ${HOME}/.conda/pkgs
$ mamba clean -a
There are no unused tarball(s) to remove.
There are no index cache(s) to remove.
There are no unused package(s) to remove.
There are no tempfile(s) to remove.
There are no logfile(s) to remove.
Note, this may take some time depending on the number of cached files you have.
Rocky 9 steps¶
Anaconda and Miniconda are no longer available¶
On CentOS 7, there were modules for anaconda3 and miniconda available. Due to licensing issues, these have been removed, and all Conda users need to use the miniforge module instead. See our documentation for more detailed information.
Don't use the defaults channel, use nodefaults ONLY¶
Previous documentation may have led you to use the defaults channel in your ~/.condarc file. This is actively discouraged in the official Mamba documentation (see here for more information about Mamba and why we recommend all users use it). Instead, you should only use nodefaults, which will disable the defaults channel and use only the conda-forge channel.
Your ~/.condarc file should look like this:
channels:
- nodefaults
ssl_verify: true
## Optional - store Conda environments in an alternative location
envs_dirs:
- /data/scratch/abc123/anaconda/envs
pkgs_dirs:
- /data/scratch/abc123/anaconda/pkgs
Please do not define any additional channels, or extra configuration like:
channel_priority: flexible
auto_activate_base: false
This is highly likely to lead to package installation failures. If your package requires additional channels to install, please define them using the -c flag during installation in the order you want to use them, e.g.:
mamba install -c bioconda -c conda-forge cellrank
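To check an existing ~/.condarc against the advice above, something like the following sketch may help. It only looks for the three items named in this section (a defaults channel entry, channel_priority and auto_activate_base); it does not detect other additional channels, and the function name find_discouraged_condarc_lines is an assumption made for this example.

```shell
# Print (with line numbers) condarc lines that this guide advises against:
# a "defaults" channel entry, channel_priority and auto_activate_base.
# Succeeds when at least one such line is found. "nodefaults" is not flagged.
find_discouraged_condarc_lines() {
    local condarc="${1:-${HOME}/.condarc}"
    [ -f "${condarc}" ] || return 1
    grep -nE '^[[:space:]]*(-[[:space:]]*defaults[[:space:]]*$|channel_priority:|auto_activate_base:)' "${condarc}"
}

if find_discouraged_condarc_lines; then
    echo "Review the lines listed above in ~/.condarc"
else
    echo "Nothing flagged in ~/.condarc"
fi
```

The pattern matches a list item that is exactly defaults, which is why the recommended nodefaults entry passes untouched.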
Re-creating environments on Rocky 9¶
Not all environments can be easily re-created
Whilst some environments will be easily re-created on Rocky 9, not all of them will. If you experience issues re-creating environments as detailed below, please raise a ticket and we will offer more detailed support.
Re-creating Python virtualenvs¶
Presuming you have correctly backed up your virtualenvs, then you can restore them in a Rocky 9 qlogin session:
qlogin -l rocky
module load python
virtualenv myenv
source myenv/bin/activate
pip install -r requirements.txt
You may want to load a specific Python version module rather than the default, in which case use module load python/<version> (e.g. module load python/3.12) instead of just module load python.
Repeat the process if you have multiple virtualenvs frozen to requirements.txt files.
Re-creating Conda environments¶
Check for the defaults channel
Please review any environment.yml files and check if the defaults channel is defined in the channels: section; if it is, remove it before proceeding with the instructions below (as per the information above).
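Removing the channel by hand is straightforward, but for several files a filter can do it. The sketch below writes a cleaned copy rather than editing in place, so the original export is kept; the function name strip_defaults_channel and the output file name in the example call are hypothetical. It drops every list item that is exactly defaults, on the assumption that no other list in the file contains such an item.

```shell
# Write a copy of an exported environment file without the "defaults" channel.
# Only list items that are exactly "defaults" are dropped; every other line,
# including the channels: key and any other channels, is kept as-is.
strip_defaults_channel() {
    local in_file="$1"
    local out_file="$2"
    grep -Ev '^[[:space:]]*-[[:space:]]*defaults[[:space:]]*$' "${in_file}" > "${out_file}"
}

# Example call with a hypothetical output name:
# strip_defaults_channel environment.yml environment-nodefaults.yml
```

It is still worth opening the cleaned file to confirm that the channels: section lists what you expect before re-creating the environment.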
Presuming you have correctly backed up your Conda environments, then you can re-create them in a Rocky 9 qlogin session:
qlogin -l rocky
module load miniforge
mamba env create -f environment.yml
Repeat the process if you have multiple Conda environments exported to environment.yml files.
Frequently Asked Questions¶
I'm seeing Conda errors about cache files being modified by another program¶
If you see errors such as:
warning libmamba Cache file "/data/home/abc123/.conda/pkgs/cache/d4808d92.json" was modified by another program
nodefaults/linux-64 (check zst) Checked 0.4s
or:
Preparing transaction: done
Verifying transaction: \
SafetyError: The package for r-base located at /data/home/abc123/.conda/pkgs/r-base-4.4.2-hc737e89_2
appears to be corrupted. The path 'lib/R/doc/html/packages.html'
has an incorrect size.
reported size: 3423 bytes
actual size: 61857 bytes
ClobberError: The package 'conda-forge/noarch::sysroot_linux-64-2.17-h0157908_18' cannot be installed due to a
path collision for 'x86_64-conda-linux-gnu/sysroot/lib'.
This path already exists in the target prefix, and it won't be removed
by an uninstall action in this transaction. The path is one that conda
doesn't recognize. It may have been created by another package manager.
You need to make sure you have cleared your Conda cache as detailed above.
My Conda installs keep failing with "perhaps a missing channel"¶
Please check your ~/.condarc file is correct as detailed above.
My Conda installs keep failing with "error libmamba Could not open lockfile"¶
Please check your ~/.condarc file is correct as detailed above.
My job failed with "ERROR: Unable to locate a modulefile for 'python/3.10'"¶
There is no module for Python 3.10 on Rocky 9 nodes. You'll either need to use an available version (running module avail python on a Rocky 9 node will list all available versions), or if you really need Python 3.10, you'll need to create a Conda environment specifying python=3.10 at the time of creation.
My job failed with "python: /lib64/libc.so.6: version `GLIBC_2.34' not found"¶
Once you have re-created your environments on a Rocky 9 node, they will expect the newer version of GLIBC on Rocky 9 (2.34) to be present. If you receive an error message like version `GLIBC_2.34' not found, make sure you are working on a Rocky 9 node and not a CentOS 7 one (where GLIBC is 2.17).
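A quick way to confirm which kind of node a job or session is running on is to compare the node's GLIBC version with 2.34. The sketch below is an illustration only: the helper name version_at_least is made up here, and it relies on sort -V (version sort) being available, as it is in GNU coreutils.

```shell
# Succeed if version $1 is at least version $2 (e.g. "2.34" versus "2.17").
# Relies on "sort -V" (version sort), which GNU coreutils provides.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# "getconf GNU_LIBC_VERSION" prints e.g. "glibc 2.34" on glibc-based systems.
node_glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')" || node_glibc=""
if [ -n "${node_glibc}" ] && version_at_least "${node_glibc}" 2.34; then
    echo "glibc ${node_glibc}: new enough for environments built on Rocky 9"
else
    echo "glibc ${node_glibc:-unknown}: older than 2.34, probably not a Rocky 9 node"
fi
```

Placing a check like this at the top of a job script makes it obvious from the job output when the job has landed on a CentOS 7 node.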