NVIDIA Modulus 22.09 module on Sunbird

NVIDIA Modulus 22.09 was recently installed as a module on Sunbird. It can be loaded as follows:

module load modulus/22.09

Loading the module sets the environment variable $MODULUS_IMG, which points to the Apptainer image:

[s.1915438@sl1(sunbird) ~]$ echo $MODULUS_IMG
/apps/local/tools/modulus/22.09/modulus.sif
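
Before using the variable in a script, it can be worth confirming that it is set and that the image is readable. The following is a minimal sketch; the function name check_modulus_img is a hypothetical helper and not part of the module.

```shell
# Hypothetical helper (not provided by the modulus module):
# verify that MODULUS_IMG is set and points to a readable file.
check_modulus_img() {
    if [ -z "${MODULUS_IMG:-}" ]; then
        echo "MODULUS_IMG is not set; run 'module load modulus/22.09' first" >&2
        return 1
    fi
    if [ ! -r "$MODULUS_IMG" ]; then
        echo "Image not readable: $MODULUS_IMG" >&2
        return 2
    fi
    echo "Using image: $MODULUS_IMG"
}
```

Calling check_modulus_img after module load modulus/22.09 prints the image path on success and returns a non-zero status otherwise.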

Interactive shell

For simplicity, we will run the container interactively using apptainer shell. First, request an interactive job with one GPU and open a shell on the allocated node:

salloc --nodes=1 --account=scw1901 --partition=accel_ai --gres=gpu:1
srun --pty /bin/bash
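
Once the shell opens, a quick sanity check can confirm that it is running inside a Slurm job with a GPU assigned. The sketch below assumes that Slurm exports SLURM_JOB_ID inside the job and, as is common when GPUs are requested through --gres, sets CUDA_VISIBLE_DEVICES; whether the latter is set depends on the site configuration. The function name in_gpu_allocation is hypothetical.

```shell
# Hypothetical helper: report whether the current shell looks like a GPU job.
# Assumption: Slurm sets SLURM_JOB_ID, and CUDA_VISIBLE_DEVICES for --gres=gpu jobs.
in_gpu_allocation() {
    if [ -z "${SLURM_JOB_ID:-}" ]; then
        echo "not inside a Slurm job" >&2
        return 1
    fi
    if [ -z "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo "job $SLURM_JOB_ID has no GPU visible in this shell" >&2
        return 2
    fi
    echo "job $SLURM_JOB_ID, GPUs: $CUDA_VISIBLE_DEVICES"
}
```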

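Now we can start the Apptainer container with the following command. The --nv flag is required so that the container can access the host's NVIDIA GPU and driver libraries. The --contain and --cleanenv flags isolate the container from the host file system and environment, and --bind makes the current directory available inside the container at /data.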

apptainer shell --nv --contain --cleanenv --bind "$(pwd)":/data,/tmp:/tmp  $MODULUS_IMG

This opens a shell inside the container, where we can confirm that the GPU is visible and that the Python modules import correctly:

[s.1915438@scs2041(sunbird) ~]$ apptainer shell --nv --contain --cleanenv --bind "$(pwd)":/data,/tmp:/tmp  $MODULUS_IMG
/usr/bin/rm: cannot remove '/usr/local/cuda/compat/lib': Read-only file system
rm: cannot remove '/usr/local/cuda/compat/lib': Read-only file system
Singularity> nvidia-smi
Tue Sep 12 12:27:36 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  Off  | 00000000:A3:00.0 Off |                    0 |
| N/A   36C    P0    43W / 250W |      0MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Singularity> python --version
Python 3.8.13
Singularity> python
Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10)
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import modulus
>>> import pysdf.sdf
>>>
Singularity> exit
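
The same import check can also be run non-interactively with apptainer exec, which is convenient inside a job script. This is a sketch under stated assumptions: it reuses the flags from the apptainer shell command above, expects MODULUS_IMG to have been set by module load modulus/22.09, and the function name modulus_check and the DRY_RUN switch are hypothetical conveniences.

```shell
# Sketch: run the import check from the interactive session in one command.
# Assumption: MODULUS_IMG was set by `module load modulus/22.09`.
modulus_check() {
    # Build the command as the function's positional parameters.
    set -- apptainer exec --nv --contain --cleanenv \
        --bind "$(pwd)":/data,/tmp:/tmp "$MODULUS_IMG" \
        python -c 'import torch, modulus, pysdf.sdf; print(torch.cuda.is_available())'
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Hypothetical switch: print the command instead of running it.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

Setting DRY_RUN=1 before calling modulus_check prints the assembled command, which makes it possible to inspect the flags on a login node before running the check on a GPU node.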