# Helios supercomputer
## Access to Helios
Computing resources on Helios are assigned based on PLGrid computing grants. To perform computations on Helios you need to obtain a computing grant through the PLGrid Portal and apply for access to Helios.
Available login nodes:

- login01.helios.cyfronet.pl
## System description
Helios is a hybrid cluster. CPU nodes use x86_64 CPUs, while the GPU partition is based on NVIDIA GH200 superchips, each combining an NVIDIA Grace ARM CPU with an NVIDIA Hopper GPU. HPE Slingshot is used as the interconnect.
The login node uses an x86_64 CPU and runs RHEL 8.
Please keep this in mind when compiling software: knowing the target CPU architecture and operating system is important for selecting the proper modules and software. Each architecture has its own set of modules; to see the complete list of modules, run `module avail` on a node of the chosen type.
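The steps above can be sketched as follows; the grant account name `plgxxxx-gpu-gh200` is a placeholder, and the exact `srun` options may need adjusting to your grant:

```shell
# Check which CPU architecture you are on (x86_64 on the login node,
# aarch64 on the GH200 nodes):
uname -m

# Start an interactive shell on a GH200 node to inspect its module set;
# "plgxxxx-gpu-gh200" is a placeholder for your own grant account:
srun -p plgrid-gpu-gh200 -A plgxxxx-gpu-gh200 --gres=gpu:1 \
     --time=00:15:00 --pty /bin/bash -l

# Then, on the allocated node:
module avail
```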
## Node details

Node specifications are listed below:
| Partition | Number of nodes | Operating system | CPU | RAM | RAM available for job allocations | Proportional RAM for one CPU | Proportional RAM for one GPU | Proportional CPU for one GPU | Accelerator |
|---|---|---|---|---|---|---|---|---|---|
| plgrid (includes plgrid-long) | 272 | Rocky Linux 9 | 192 cores, x86_64, 2x AMD EPYC 9654 96-Core Processor @ 2.4 GHz | 384 GB | 384000 MB | 2000 MB | n/a | n/a | |
| plgrid-bigmem | 160 | Rocky Linux 9 | 192 cores, x86_64, 2x AMD EPYC 9654 96-Core Processor @ 2.4 GHz | 768 GB | 773170 MB | 4000 MB | n/a | n/a | |
| plgrid-gpu-gh200 | 110 | Rocky Linux 9 | 288 cores, aarch64, 4x NVIDIA Grace CPU 72-Core @ 3.1 GHz | 480 GB | 489600 MB | n/a | 120 GB | 72 | 4x NVIDIA GH200 96 GB |
**Important**

Note that Helios was upgraded to Rocky Linux 9 on 27th October. More info: Update to Rocky 9.
## Job submission
Helios uses the Slurm resource manager. Jobs should be submitted to the following partitions:
| Name | Time limit | Resource type (account suffix) | Access requirements | Description |
|---|---|---|---|---|
| plgrid | 72 h | -cpu | Generally available. | Standard partition. |
| plgrid-long | 166 h | -cpu | Requires a grant with a maximum job runtime of 168 h. | Used for jobs with extended runtime. |
| plgrid-bigmem | 72 h | -cpu-bigmem | Requires a grant with CPU-BIGMEM resources. | Resources used for jobs requiring an extended amount of memory. |
| plgrid-gpu-gh200 | 48 h | -gpu-gh200 | Requires a grant with GPGPU resources. | GPU partition. |
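As an illustration, a minimal batch script for the standard partition might look like the sketch below; the grant name `plgxxxx` and the resource requests are placeholders to adapt to your own grant:

```shell
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=plgrid
## Account = grant name + resource-type suffix from the table above;
## "plgxxxx" is a placeholder for your own grant name.
#SBATCH --account=plgxxxx-cpu
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=2000M
#SBATCH --time=01:00:00

# Load modules on the compute node, inside the job script, so they match
# the node's architecture:
module load GCC  # illustrative module name

srun ./my_program
```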
If you are unsure of how to properly configure your job on Helios please consult this guide: Batch system.
## Accounts and computing grants
Please familiarize yourself with how to specify an account: Accounts and grants.
## Billing
General billing process is described here: Billing.
## Storage
To avoid problems, use the provided environment variables instead of full paths to the filesystems.
Current usage, capacity, and other storage attributes can be checked with the `hpc-fs` command; see HPC tools.
Please check if your workload may benefit from using ramdisk or localfs: Storage features.
**Important**

In the $SCRATCH space, data older than 30 days and job directories older than 7 days are automatically deleted.
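A sketch of how these variables are typically used in a job script (the file names are illustrative):

```shell
# Stage job data under $SCRATCH rather than a hard-coded path; the
# variable keeps scripts working even if the filesystem is remounted.
WORKDIR="$SCRATCH/$SLURM_JOB_ID"
mkdir -p "$WORKDIR"
cp "$HOME/input.dat" "$WORKDIR/"
cd "$WORKDIR"

./my_program input.dat > results.out

# $SCRATCH is purged automatically, so copy results you want to keep
# back to a persistent location before the job ends:
cp results.out "$HOME/results/"
```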
## Software
Applications and libraries are available through the Modules system.
**Important**

Modules for ARM and x86_64 CPUs are not interchangeable, and selecting the module built for the target architecture is critical for getting software to work! Please load the proper modules on the compute node, inside the job script!
**Note**

Module names on Helios are case sensitive.
The modules on the Helios supercomputer are organized hierarchically: to load a module, its main dependencies must be loaded first. For this reason, `module avail` will not show all modules available on the cluster, only those that can be loaded at the moment. To search the complete list of modules, use `module spider`.
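For example (the package names and versions below are illustrative; check the actual `module spider` output on Helios):

```shell
# Search the full module tree for a package, including modules that are
# hidden until their dependencies are loaded:
module spider OpenMPI

# "module spider <name>/<version>" prints the modules that must be
# loaded first. A typical hierarchical sequence then looks like:
module load GCC/12.3.0      # illustrative dependency
module load OpenMPI/4.1.5   # illustrative version
```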
See also the `hpc-modules` command.
## Using GPUs (NVIDIA GH200)
How to use GH200 superchips is described here: GPU / ARM / GH200.
## Support
Keep in mind the general rules.
In case of problems, check the Support page.