Helios supercomputer

Access to Helios

Computing resources on Helios are assigned based on PLGrid computing grants. To perform computations on Helios, you need to obtain a computing grant through the PLGrid Portal and apply for access to Helios.

Available login nodes:

login01.helios.cyfronet.pl

System description

Helios is a hybrid cluster. CPU nodes use x86_64 CPUs, while the GPU partition is based on GH200 superchips, each combining an NVIDIA Grace ARM CPU with an NVIDIA Hopper GPU. HPE Slingshot is used as the interconnect.

The login node uses an x86_64 CPU and RHEL 8.

Please keep this in mind when compiling software. Knowing the target CPU architecture and operating system is important for selecting the proper modules and software. Each architecture has its own set of modules; to see the complete list for a given architecture, run module avail on a node of that type.
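For example, you can confirm which architecture a node uses before picking modules. This is a minimal sketch; the `module` command is provided by the cluster's environment and is only meaningful on Helios nodes, so it is guarded here:

```shell
# Determine the node's CPU architecture: x86_64 on the login and CPU
# nodes, aarch64 on the GH200 GPU nodes. Each has its own module set.
arch=$(uname -m)
echo "Running on ${arch}"

# List the modules available for this architecture (only works where
# the modules system is installed, i.e. on Helios nodes).
if command -v module >/dev/null 2>&1; then
    module avail
fi
```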

Nodes details

Node specification can be found below:

| Partition | Number of nodes | Operating system | CPU | RAM | RAM available for job allocations | Proportional RAM for one CPU | Proportional RAM for one GPU | Proportional CPU for one GPU | Accelerator |
|---|---|---|---|---|---|---|---|---|---|
| plgrid (includes plgrid-long) | 272 | Rocky Linux 9 | 192 cores, x86_64, 2x AMD EPYC 9654 96-Core Processor @ 2.4 GHz | 384 GB | 384000 MB | 2000 MB | n/a | n/a | n/a |
| plgrid-bigmem | 160 | Rocky Linux 9 | 192 cores, x86_64, 2x AMD EPYC 9654 96-Core Processor @ 2.4 GHz | 768 GB | 773170 MB | 4000 MB | n/a | n/a | n/a |
| plgrid-gpu-gh200 | 110 | Rocky Linux 9 | 288 cores, aarch64, 4x NVIDIA Grace CPU 72-Core @ 3.1 GHz | 480 GB | 489600 MB | n/a | 120 GB | 72 | 4x NVIDIA GH200 96 GB |

Important

Note that Helios was upgraded to Rocky Linux 9 on 27th October. More info: Update to Rocky 9.

Job submission

Helios uses the Slurm resource manager; jobs should be submitted to the following partitions:

| Name | Time limit | Resource type (account suffix) | Access requirements | Description |
|---|---|---|---|---|
| plgrid | 72 h | -cpu | Generally available. | Standard partition. |
| plgrid-long | 166 h | -cpu | Requires a grant with a maximum job runtime of 168 h. | Used for jobs with extended runtime. |
| plgrid-bigmem | 72 h | -cpu-bigmem | Requires a grant with CPU-BIGMEM resources. | Used for jobs requiring an extended amount of memory. |
| plgrid-gpu-gh200 | 48 h | -gpu-gh200 | Requires a grant with GPGPU resources. | GPU partition. |
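A batch script for the standard partition might look like the sketch below. The grant name, module, and executable are hypothetical placeholders; note how the account suffix (-cpu) matches the partition's resource type, and how modules are loaded inside the script so they resolve against the compute node's architecture:

```shell
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=plgrid
#SBATCH --account=plgexamplegrant-cpu  # hypothetical grant name; -cpu suffix matches the partition
#SBATCH --time=01:00:00                # must stay within the 72 h partition limit
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --mem-per-cpu=2000M            # matches the proportional RAM per CPU on plgrid

# Load modules on the compute node, inside the job script, so the
# architecture-specific module set is used.
module load GCC                        # illustrative module name

srun ./my_program                      # hypothetical executable
```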

If you are unsure of how to properly configure your job on Helios please consult this guide: Batch system.

Accounts and computing grants

Please familiarize yourself with how to specify an account: Accounts and grants.

Billing

The general billing process is described here: Billing.

Storage

To avoid problems, use the environment variables instead of full paths to the filesystems.

Current usage, capacity and other storage attributes can be checked by issuing the hpc-fs command, see HPC tools.

Please check if your workload may benefit from using ramdisk or localfs: Storage features.

Important

In the $SCRATCH space, data older than 30 days and job directories older than 7 days are automatically deleted.
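For instance, refer to the scratch filesystem through its variable rather than a hard-coded mount path. This is a minimal sketch; the input file name is a hypothetical example, and $SCRATCH is only defined on Helios:

```shell
# Stage job data in scratch via the environment variable, never via a
# hard-coded filesystem path (mount points may change).
mkdir -p "$SCRATCH/myjob"
cp "$HOME/input.dat" "$SCRATCH/myjob/"   # hypothetical input file

# Remember: data older than 30 days and job directories older than
# 7 days are deleted from $SCRATCH automatically.
```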

Software

Applications and libraries are available through the Modules system.

Important

Modules for ARM and x86 CPUs are not interchangeable, and selecting the right module for the target architecture is critical for getting software to work! Please load the proper modules on the node, inside the job script!

Note

Module names on Helios are case-sensitive.

The modules on the Helios supercomputer are organized hierarchically. This means that in order to load a module, its main dependencies must be loaded first. For this reason, the command module avail will not show all modules available on the cluster, but only those that can be loaded at the moment. To get a list of all modules, use the command module spider.
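A typical workflow with the hierarchy can be sketched as follows. The module names and versions here are illustrative assumptions, not a guaranteed list of what Helios provides:

```shell
# Search the full module tree, including modules hidden behind the
# hierarchy that 'module avail' does not show.
module spider OpenMPI          # illustrative name; lists available versions
module spider OpenMPI/4.1.6    # reports which prerequisite modules must be loaded first

# Load the reported prerequisites, then the module itself.
module load GCC/12.3.0         # illustrative prerequisite and version
module load OpenMPI/4.1.6
```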

Check the hpc-modules command.

Using GPU (NVIDIA GH200)

How to use GH200 superchips is described here: GPU / ARM / GH200.

Support

Keep in mind the general rules.

In case of problems check the Support page.