Update to Rocky 9

Update of the operating system on Helios' compute nodes to Rocky Linux 9

Starting October 27th, the GPU and CPU compute nodes' operating system will be updated to Rocky Linux 9, from the current RHEL 8 and SLES 15 mix. Due to the change of the underlying OS, software stacks are changing accordingly.

Our software team prepared new software stacks (excluding CPE) that are available through the module system, with backward compatibility in mind.

The OS upgrade brings many features and improvements (stability, performance) and should resolve some of the issues we have faced.

Important

Changes will be implemented on October 27th, 10:00 CEST.

How to minimise the impact of the upgrade (before October 27th):

Users are strongly encouraged to test their setup in the new environment using dedicated reservations that give early access to the new OS:

  • rocky9-cpu
  • rocky9-gpu-gh200

To use a reservation, add --reservation=rocky9-cpu or --reservation=rocky9-gpu-gh200 to the sbatch or srun options of your job.
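As an illustration, the reservation can also be set inside a job script with an #SBATCH directive. The sketch below writes a minimal test script; the file name rocky9-test.sh, the job name, and the time limit are assumptions chosen for the example, the account suffix and partition follow the CPU srun command in Step 3, and <grant-name> must be replaced with your grant name.

```shell
# Write a minimal test job script (file name and resource values are illustrative).
cat > rocky9-test.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=rocky9-test
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:10:00
#SBATCH --account=<grant-name>-cpu
#SBATCH --partition=plgrid
#SBATCH --reservation=rocky9-cpu

# Print the OS name to confirm that the job ran on the new image.
grep PRETTY_NAME /etc/os-release
EOF

# Submit it with: sbatch rocky9-test.sh
```

For an existing script, the same option can be given on the command line instead, e.g. sbatch --reservation=rocky9-cpu job.sh, where job.sh stands for your own script.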

Quick troubleshooting

In case of:

  • missing modules: contact Helpdesk, report the missing items, and provide an example script;
  • an execution error due to a missing library: recreate the virtual environment; if the problem persists, contact Helpdesk.

Breaking changes and incompatibilities

  • On the plgrid-gpu-gh200 partition (NVIDIA GH200 Superchip), the vast majority of the modules have been rebuilt to maintain compatibility.

  • If you encounter a missing module, please notify us via Helpdesk and provide the job ID, the module name, and the partition type (CPU/GPU).

  • CPE (Cray Programming Environment) and dependent modules like:

    • all cray-* modules like cray-mpich, cray-fftw;
    • CrayCCE modules and the modules that depend on them (CrayCCE/23.12, GROMACS/2024.1, GROMACS/2025.1, HDF5/1.14.3, Wannier90/3.1.0)

    are not available after the upgrade. Users of CPE and dependent modules should move to other environments, e.g. foss.

    Example: instead of loading GROMACS module version 2025.1 using command

    module load CrayCCE/23.12 GROMACS/2025.1

    use command

    module load foss/2023b GROMACS/2025.1

    All available builds of a given version, together with the modules needed to load them, are shown by the command module spider GROMACS/2025.1.

  • OpenSSL 3.0: Red Hat Enterprise Linux 9 (and Rocky Linux 9) has moved to OpenSSL 3.0. Codes that require the earlier OpenSSL 1.1 need to load the dedicated module (module load OpenSSL/1.1).

  • System Python: the default Python interpreter will be upgraded to version 3.9. If you have virtual environments based on the system Python rather than on a Python provided by modules (e.g. ML-bundle), they may not work properly. In that case, please recreate the virtual environment as described below.
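Three of the items above (CPE modules, OpenSSL 1.1, and venvs built on the system Python) can be checked before the upgrade with a few shell helpers. This is a minimal sketch, not an official tool: the function names are hypothetical, the CrayCCE and cray- patterns are taken from the list above, and the venv check assumes that the system interpreter lives in /usr/bin, which the home key of a venv's pyvenv.cfg would then record.

```shell
# Hypothetical helpers; the names are illustrative and not part of any site tooling.

# Print lines in files under directory $1 that mention CPE modules.
find_cpe_usage() {
    grep -rnE 'CrayCCE|cray-[a-z]' "$1"
}

# Read `ldd` output on stdin; succeed if it references OpenSSL 1.1 libraries.
links_openssl11() {
    grep -Eq 'lib(ssl|crypto)\.so\.1\.1'
}

# Succeed if the venv in directory $1 was created from an interpreter
# in /usr/bin (assumed here to be the system Python).
venv_uses_system_python() {
    grep -Eq '^home[[:space:]]*=[[:space:]]*/usr/bin/?$' "$1/pyvenv.cfg"
}
```

Possible usage, with placeholder paths: find_cpe_usage ~/jobs lists affected job scripts; ldd ./my_app | links_openssl11 && echo "needs OpenSSL/1.1" flags a binary built against OpenSSL 1.1; venv_uses_system_python ~/venvs/old && echo "recreate this venv" flags a venv that depends on the system Python.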

How to recreate virtual environments - step by step

  1. Test whether running your workload using the old venv causes any problems.
  2. Create a requirements.txt file.
  3. Start an interactive session (on the new image).
  4. [Optional] Load the chosen Python module.
  5. Create a new venv.
  6. Install packages using the requirements.txt file.
  7. Test the same calculations with the new venv.

Step 1

Please test the Python application/script using a virtual environment (venv) in a job submitted via sbatch. Running scripts on the login node gives unreliable results because it is a shared resource. Additionally, executing the script in a job automatically documents the most important parameters. After the test, please pay special attention to whether:

  • errors or warnings appear,
  • the execution time is comparable to previous runs.

Memory usage and computation efficiency can also be checked using hpc-jobs-history after the job finishes.

Running the test on the old OS is necessary for a proper comparison. Please perform the comparative test during the trial period or using an appropriate reservation afterwards.

After the test, if everything is normal, there is no need to proceed further. Only if errors or significant slowdowns appear should you continue with rebuilding the venv.
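To judge whether a slowdown is significant, the elapsed times of the two runs can be compared numerically. The sketch below assumes elapsed times in HH:MM:SS form (without a days field); the helper name and the two sample times are made up for illustration and should be replaced with the values of your own jobs.

```shell
# Convert an elapsed time in HH:MM:SS form to seconds.
elapsed_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Example values only; substitute the elapsed times of your own jobs.
old=$(elapsed_to_seconds "00:12:30")   # run on the old OS
new=$(elapsed_to_seconds "00:13:45")   # run on the new OS

# Integer percentage change relative to the old run.
change=$(( (new - old) * 100 / old ))
echo "runtime change: ${change}%"
```

What counts as a significant change depends on the workload; run-to-run variation on shared systems is normal, so repeating the measurement is advisable before rebuilding the venv.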

Step 2

Activate the old venv and run:

python3 -m pip --require-virtualenv freeze > requirements.txt

The requirements.txt file will contain the names and versions of the packages installed in the old venv.

[Optional] Use this variant if version conflicts appear in Step 6, or if you decide to update the venv to newer software versions. After creating requirements.txt, run:

cut -d '=' -f1 requirements.txt > requirements-no-ver.txt

Then repeat the remaining steps, replacing every instance of requirements.txt with requirements-no-ver.txt.
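To show what the version-stripping command does, the sketch below applies it to a small sample file. The file names with the -demo suffix and the package list are illustrative, chosen so that a real requirements.txt is not overwritten; the example covers lines pinned with ==, which is the usual form of pip freeze output.

```shell
# Sample freeze output (illustrative packages and versions).
printf 'numpy==1.26.4\nrequests==2.31.0\n' > requirements-demo.txt

# Keep only the text before the first '=' on each line, i.e. the package name.
cut -d '=' -f1 requirements-demo.txt > requirements-demo-no-ver.txt

cat requirements-demo-no-ver.txt
```

With the versions removed, pip is free to pick versions that are compatible with the Python interpreter on the new image.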

Step 3

  • [Alternative 1] from October 27th at 10:00 onwards.

    Start an interactive session with one of the following commands:

    x86 (for CPU use)

    srun -N 1 --ntasks-per-node=1 --mem=4GB -A <grant-name>-cpu -p plgrid --time=01:00:00 --pty /bin/bash
    
    aarch64 (for GPU GH200 use)

    srun -N 1 --ntasks-per-node=1 --mem=4GB -A <grant-name>-gpu-gh200 -p plgrid-gpu-gh200 --time=01:00:00 --gres=gpu:1 --pty /bin/bash
    
    Replace <grant-name> with your grant name. Adjust the time (and optionally the memory) to what you expect to need for installing all packages from requirements.txt.

  • [Alternative 2] before October 27th at 10:00.

    Use one of the commands above, adding the option --reservation=rocky9-cpu or --reservation=rocky9-gpu-gh200 to use the new image.
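Once the interactive session has started, it can be worth confirming that the node really runs the new image and the expected architecture. A minimal sketch, assuming the standard /etc/os-release file is present on the node; the variable names are arbitrary.

```shell
# Run inside the interactive session.
# /etc/os-release is the standard OS identification file on Linux;
# it is sourced in a subshell so its variables do not leak into the session.
if [ -r /etc/os-release ]; then
    os_name=$(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")
else
    os_name="unknown"
fi
arch=$(uname -m)   # x86_64 on CPU nodes, aarch64 on GH200 nodes

echo "OS: $os_name"
echo "Architecture: $arch"
```

If the reported OS is not Rocky Linux 9 before October 27th, the --reservation option was most likely omitted.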

Step 4 [Optional]

Perform this step if the old venv required loading a Python module, or if you want to create a venv for a Python version other than 3.9. In the interactive session on the new image, run:

ml purge
hpc-modules x86 -s python
hpc-modules gh200 -s python

Choose a Python module of the form Python/3.XX.Y; other modules containing python in the name are not suitable. Load the appropriate GCCcore and Python modules, e.g.:

ml GCCcore/13.2.0 Python/3.11.5

Remember that future use of the new venv will require loading these modules. If this step is skipped, the default Python (3.9) will be used to create the venv.

Step 5

Check the Python version:

python3 --version

Then navigate to a directory outside $HOME. This is important because venvs can quickly fill up $HOME. Using a team (project) directory is recommended. Create a new venv:

python3 -m venv myenv

Replace myenv with any name you like; it needs to differ from the old venv's name only if both are in the same directory.
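A quick guard against creating the venv inside $HOME by accident could look like the sketch below. The function name is hypothetical, and the check is a plain string comparison of paths, so a directory reached through a symbolic link is judged by the path as typed.

```shell
# Succeed if directory $1 is $HOME or lies below it (plain string comparison;
# symbolic links are not resolved).
inside_home() {
    case "$1/" in
        "$HOME"/*) return 0 ;;
        *) return 1 ;;
    esac
}

if inside_home "$PWD"; then
    echo "warning: $PWD is inside \$HOME; choose another location for the venv"
else
    echo "ok: $PWD is outside \$HOME"
fi
```

Running it in the target directory just before python3 -m venv gives a reminder if the working directory is still under $HOME.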

Step 6

Activate the venv:

source myenv/bin/activate

Upgrade pip:

python3 -m pip --no-cache-dir --require-virtualenv install --upgrade pip

Then install packages:

python3 -m pip --no-cache-dir --require-virtualenv install -r requirements.txt

Note: Always use --no-cache-dir --require-virtualenv to ensure temporary files do not go into the home cache and that packages are installed only into the venv, not globally for the user.
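After installation, it can help to see which package versions differ between the old and the new venv, especially when requirements-no-ver.txt was used. The sketch below compares two freeze listings with sort and comm; the -demo file names and their contents are illustrative stand-ins for the old requirements.txt and for a fresh python3 -m pip --require-virtualenv freeze taken in the new venv.

```shell
# Demo data standing in for the old and new freeze outputs (illustrative versions).
printf 'numpy==1.26.4\nscipy==1.11.4\n' > old-freeze-demo.txt
printf 'numpy==1.26.4\nscipy==1.13.0\n' > new-freeze-demo.txt

# comm needs sorted input; -3 hides lines common to both files, leaving
# old-only lines in column 1 and new-only lines (tab-indented) in column 2.
sort old-freeze-demo.txt > old-sorted-demo.txt
sort new-freeze-demo.txt > new-sorted-demo.txt
comm -3 old-sorted-demo.txt new-sorted-demo.txt
```

Packages that appear in the output changed version or exist in only one of the environments, which is useful context when results or run times differ in Step 7.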

Step 7

Test the application/script as in Step 1. There is no need to repeat the comparative run; reuse the result from Step 1.

If problems persist after following this guide, please contact Helpdesk.

Please include:

  • the requirements.txt file,
  • job IDs for the jobs run:
    • on the old image in the old venv,
    • on the new image in the old venv,
    • on the new image in the new venv.