GPU / ARM / GH200 on Helios

Python, external libraries, and machine learning on Helios

Nodes with GH200 GPU superchips have arm64 (aarch64) CPUs, so they require modules built specifically for this architecture.
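
To confirm that a shell or job is actually running on an ARM node, a check along these lines may help. This is a minimal sketch; `is_arm64` is a hypothetical helper name, not something provided by Helios.

```shell
# Return success if the given machine string denotes 64-bit ARM.
# Linux reports "aarch64"; some other systems and tools report "arm64".
is_arm64() {
    case "$1" in
        aarch64|arm64) return 0 ;;
        *) return 1 ;;
    esac
}

if is_arm64 "$(uname -m)"; then
    echo "ARM64 node: use arm64 builds of modules and wheels"
else
    echo "Not an ARM64 node: $(uname -m)"
fi
```

Run on a login node, this check may report a different architecture than the GH200 compute nodes, so it is most useful inside a job.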

Anaconda should NOT be used for virtual environment management when working with Python. Conda environments ship with their own Python interpreter installations, which may have compatibility issues with the ARM architecture. To create virtual environments, use Python's standard venv module instead.
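
A job script could guard against an accidentally active conda environment before creating or activating a venv. The sketch below assumes that conda sets the `CONDA_PREFIX` variable when an environment is active; `no_conda_active` is a hypothetical helper name.

```shell
# Succeeds when no conda environment is active.
# Takes the value of CONDA_PREFIX as an argument so the check is easy to test.
no_conda_active() {
    [ -z "$1" ]
}

if no_conda_active "${CONDA_PREFIX:-}"; then
    echo "OK: no conda environment is active"
else
    echo "Warning: conda environment active at ${CONDA_PREFIX}; run 'conda deactivate' first" >&2
fi
```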

For deep learning applications, we provide a dedicated module, ML-bundle/25.10, containing software commonly used by AI libraries. Always load this module before installing or building any packages, and before running Python programs that rely on GPUs on GH200 nodes.

Python ML environment

ML-bundle/25.10 should always be loaded as the first step of a job, before any virtual environment is activated.

module add ML-bundle/25.10
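
To check from within a script that the module really is loaded, one option is to inspect the `LOADEDMODULES` variable, which module systems such as Lmod and Environment Modules maintain as a colon-separated list. This is a sketch under that assumption; `has_ml_bundle` is a hypothetical helper name.

```shell
# Succeeds if the colon-separated module list in $1 contains ML-bundle/<version>.
has_ml_bundle() {
    case ":$1:" in
        *:ML-bundle/*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_ml_bundle "${LOADEDMODULES:-}"; then
    echo "ML-bundle is loaded"
else
    echo "ML-bundle is not loaded; run: module add ML-bundle/25.10" >&2
fi
```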

We also provide a custom pip repository with popular machine learning packages pre-built, in several versions, with GPU support for the ARM architecture. Packages from this repository can be installed directly via pip by specifying the correct package name and version tag. To see the available libraries and their tags, list the contents of the directory $PIP_EXTRA_INDEX_URL.
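
Listing the repository for a particular package could look like the sketch below. It assumes that ML-bundle has been loaded so that `PIP_EXTRA_INDEX_URL` is set and, as described above, points at a directory. `list_wheels` is a hypothetical helper, and the stripping of a `file://` prefix is a defensive assumption rather than documented behaviour.

```shell
# Print entries of the wheel repository whose names contain the given string.
# Arguments: $1 = repository path, $2 = package name fragment (e.g. torch).
list_wheels() {
    repo_dir="${1#file://}"   # assumption: tolerate an optional file:// prefix
    if [ ! -d "$repo_dir" ]; then
        echo "Not a directory: '$repo_dir' (is ML-bundle loaded?)" >&2
        return 1
    fi
    ls "$repo_dir" | grep -i -- "$2"
}

# Show everything in the repository that mentions torch.
list_wheels "${PIP_EXTRA_INDEX_URL:-}" torch || true
```

The names printed this way show which version tags can be passed to pip, such as the one used in the installation example.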

Examples

An example script that creates a virtual environment and installs packages from the Helios custom wheel repository and from a requirements.txt file:

#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=4G
#SBATCH --time=01:00:00
#SBATCH --account=<your-grant-account>
#SBATCH --partition=plgrid-gpu-gh200
# NOTE: Slurm does not create missing directories; out_files/ must exist before submission
#SBATCH --output=out_files/out.out
#SBATCH --error=out_files/err.err
#SBATCH --gres=gpu:1

# IMPORTANT: load the modules for machine learning tasks and libraries
ml ML-bundle/25.10

cd $SCRATCH

# create and activate the virtual environment
python -m venv my_venv_name/
source my_venv_name/bin/activate

# install one of the torch versions available in the Helios wheel repository
pip install --no-cache-dir torch==2.9.0+cu129

# install the remaining requirements, for example from a requirements file
pip install --no-cache-dir -r requirements.txt
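
As a quick sanity check after installation, something like the following could be appended to the job script. It is only a sketch: `check_torch` is a hypothetical helper, and it assumes the virtual environment is still active so that `python` resolves to the environment's interpreter.

```shell
# Print the installed torch version and whether CUDA is usable.
# Argument: $1 = Python interpreter to use (defaults to "python").
# Returns non-zero if the interpreter is missing or torch cannot be imported.
check_torch() {
    "${1:-python}" -c 'import torch; print(torch.__version__, torch.cuda.is_available())'
}

check_torch python || echo "torch is not importable in this environment" >&2
```

If the second printed value is `False`, the installed build cannot see a GPU. In that case, check that the job requested one and that ML-bundle was loaded before the environment was activated.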

An example script that uses the created virtual environment to run a Python program:

#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=4G
#SBATCH --time=01:00:00
#SBATCH --account=<your-grant-account>
#SBATCH --partition=plgrid-gpu-gh200
#SBATCH --output=out_files/out.out
#SBATCH --error=out_files/err.err
#SBATCH --gres=gpu:1

# IMPORTANT: load the modules for machine learning tasks and libraries
ml ML-bundle/25.10

cd $SCRATCH

# activate the virtual environment
source my_venv_name/bin/activate

# run the program
python my_script_name.py
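
If the job may start before the environment has been created, the activation step can be made to fail early with a clear message. The following is a minimal sketch; `activate_venv` is a hypothetical helper, not part of the Helios tooling.

```shell
# Activate the virtual environment in directory $1, or fail with a message.
activate_venv() {
    if [ ! -f "$1/bin/activate" ]; then
        echo "No virtual environment found in '$1'" >&2
        return 1
    fi
    # Source the activation script into the current shell.
    . "$1/bin/activate"
}

# In a job script, after loading ML-bundle:
#   activate_venv "$SCRATCH/my_venv_name" || exit 1
#   python my_script_name.py
```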