Singularity Containers¶
Singularity is an open source container solution developed specifically for HPC environments. With Singularity, HPC users can safely bring their own execution environments to the cluster. Unlike other container solutions, Singularity does not require root-level permissions to run containers, which allows users to freely control the software stack they wish to use. A container image can be provisioned locally on the user's machine or on Singularity Hub, and the resulting image can then be securely executed on any machine with Singularity installed. This also aids reproducibility: a user can share a single Singularity image file that ensures a consistent execution environment wherever it is run.
Singularity is a virtualization solution, and there are several kinds of virtualization. On one side are System Virtual Machines, which provide a substitute for a real machine and supply all the functionality needed to execute entire operating systems; in some cases the virtualization even emulates a different architecture, allowing execution of software and operating systems written for another CPU. On the other side is OS-level virtualization, where the kernel allows the existence of multiple isolated user-space instances. Singularity belongs to the latter.
Containers are similar to Virtual Machines; however, the differences are enough to consider them distinct technologies, and those differences matter a great deal for HPC. Virtual Machines take up a lot of system resources: each Virtual Machine (VM) runs not just a full copy of an operating system, but a virtual copy of all the hardware that the operating system needs to run. This quickly adds up to a lot of RAM and CPU cycles, valuable resources for HPC.
In contrast, all that a container requires is enough of an operating system, supporting programs and libraries, and system resources to run a specific program. From the user's perspective, a container is in most cases a single file that contains the file system, i.e., a rather complete Unix filesystem tree with all the libraries, executables, and data needed for a given workflow or scientific computation.
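For example, once Singularity is activated and the image path variable is set (both described below), you can ask a container to report its own operating system, which is independent of the host's:
# prints the OS release inside the container, not the host's
singularity exec ${SNG_PATH}/jupyter.simg cat /etc/os-release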
There are several container solutions; one prominent example is Docker. Docker is intended to run trusted containers launched by trusted users. This model does not fit well in HPC, where computer clusters are accessed by many users whose privileges cannot be escalated to superuser level. For that reason, Singularity uses a very different security paradigm, a required feature for any multi-user environment with untrusted users running untrusted containers.
For more information about Singularity and complete documentation see: https://singularity.lbl.gov/quickstart
Activating Singularity¶
There are two main components in Singularity containers: the runtime executable and the Singularity image.
The container runtime is designed to work with Singularity images. The runtime program is called singularity and is capable not only of running Singularity images but also of creating them.
To activate Singularity in your current terminal, on both Spruce and Thorny, execute:
module load singularity/2.5.2
This modifies the $PATH variable, allowing you to execute the main singularity command:
singularity
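A quick way to confirm the activation worked is to query the runtime directly:
singularity --version   # should report 2.5.2
singularity help        # lists the available subcommands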
Singularity Images¶
The second component is the Singularity image. On Spruce and Thorny we host a number of centrally managed images under:
/shared/software/containers
or using the variable:
$SNG_PATH
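For instance, you can list all the centrally managed images from the command line:
ls ${SNG_PATH}/*.simg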
The list of Singularity images on Spruce and Thorny is the same; the table below shows the images currently available.
Image Name | Description | Based on | GUI | GPU
---|---|---|---|---
centos-final.simg | Example image using libgraph 1.0.2; run the examples "circle", "julia" and "sample" | centos:latest | Yes | No
dakota-6.10-release-public-rhel7.simg | Dakota 6.10: Sandia software for optimization and Uncertainty Quantification | centos:latest | No | No
docs_hpc_wvu.simg | Contains Sphinx, pandoc and scripts to create documentation for https://docs.hpc.wvu.edu | ubuntu:bionic | No | No
dolmades.simg | Windows apps under Linux using Singularity (via Wine) | c1t4r/dolmades | Yes? | No
Grass-7.4.0.simg | GRASS (Geographic Resources Analysis Support System), open source GIS | ubuntu:trusty | Yes | No
Grass-7.4.simg | GRASS (Geographic Resources Analysis Support System), open source GIS | ubuntu:trusty | Yes | No
Grass-7.6.1.simg | GRASS (Geographic Resources Analysis Support System), open source GIS | ubuntu:bionic | Yes | No
Jupyter-5.2.2_Python-3.6.8.simg | Python 3.6.8 with a number of scientific packages and Jupyter Notebooks | ubuntu:bionic | Yes | No
jupyter_conda.simg | Python 3.6.8 with a number of scientific packages and Jupyter Notebooks | continuumio/miniconda3 | Yes | No
jupyter.simg | Python 3.6.8 with a number of scientific packages and Jupyter Notebooks | ubuntu:bionic | Yes | No
jupyter-xenial.simg | Python 3.5.2 with a number of scientific packages and Jupyter Notebooks | ubuntu:xenial | Yes | No
Keras-2.1.4_TensorFlow-1.5.0.simg | Neural networks and deep learning with Keras 2.1.4 and TensorFlow 1.5.0 | gw000/keras-full:2.1.4 | Yes | Yes
loos.simg | Lightweight Object-Oriented Structure library (LOOS) | ubuntu:trusty | No | No
miniconda3_firefox.simg | Jupyter from miniconda with Firefox | continuumio/miniconda3 | Yes | No
miniconda3.simg | Jupyter from miniconda without Firefox | continuumio/miniconda3 | Yes | No
ParaView-5.6.0.simg | ParaView 5.6: open-source, multi-platform data analysis and visualization | ubuntu:bionic | Yes | No
RStudio-desktop-1.2.1335_R-3.4.4.simg | RStudio Desktop 1.2 with R 3.4.4 | jekriske/r-base | Yes | No
RStudio-server-1.2.1335_R-3.4.4.simg | RStudio Server 1.2 with R 3.4.4 | nickjer/singularity-r | Yes | No
singularity-rstudio.simg | RStudio Server 1.2 | nickjer/singularity-r | Yes | No
Stacks-2.1.simg | Stacks: pipeline for building loci from short-read sequences such as Illumina | ubuntu:trusty-20170817 | No | No
Stacks-2.4.simg | Stacks: pipeline for building loci from short-read sequences such as Illumina | ubuntu:trusty | No | No
Tensorflow-1.13.1-gpu-py3-jupyter.simg | TensorFlow with GPU support | tensorflow/tensorflow:1.13.1-gpu-py3-jupyter | Yes | Yes
Tensorflow-1.13.1-py3-jupyter.simg | TensorFlow without GPU support | tensorflow/tensorflow:1.13.1-py3-jupyter | Yes | No
TensorFlow_gpu_py3.simg | TensorFlow with GPU support | tensorflow/tensorflow:latest-gpu-py3 | No | Yes
Visit-2.13.2.simg | VisIt: interactive, scalable visualization, animation and analysis tool | ubuntu:trusty | No | No
Visit-3.0.simg | VisIt: interactive, scalable visualization, animation and analysis tool | centos:latest | No | No
wkhtmltox-0.12.simg | wkhtmltopdf command line tools to render HTML into PDF and other image formats | ubuntu:trusty | No | No
wkhtmltox.simg | wkhtmltopdf command line tools to render HTML into PDF and other image formats | ubuntu:trusty | No | No
Interactive Job with X11 forwarding¶
Several images above are intended for interactive computing. In those cases, ensure that you connect to the cluster with X11 forwarding before asking for an interactive job. From Linux or macOS you can connect via SSH with X11 forwarding using:
ssh -X <username>@spruce.hpc.wvu.edu
If you are using macOS you need an X Window System on your Mac; you can install XQuartz to get one: https://www.xquartz.org/
If you are using Windows you will need an X11 server, for example MobaXterm: https://mobaxterm.mobatek.net/
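Before requesting the interactive job you can verify that X11 forwarding is active. If the DISPLAY variable is empty, forwarding is not set up:
echo $DISPLAY   # should print something like localhost:10.0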
Once you have logged into the cluster, create an interactive job with the following command line. In this case we use standby as the queue, but any other queue is valid:
qsub -X -I -q standby
Once you get inside a compute node, load the module:
module load singularity/2.5.2
After loading the module, the singularity command is available for use, and you can get a shell inside the image with:
singularity shell ${SNG_PATH}/<Image Name>
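For example, with the ParaView image from the table above you can launch the GUI over the forwarded X11 display; the paraview command shown inside the container is an assumption based on the image description:
singularity shell ${SNG_PATH}/ParaView-5.6.0.simg
# inside the container shell:
paraview   # assumed entry point for the GUI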
Non-interactive execution with Submission scripts¶
In this case you do not need X11 forwarding; just log in to Spruce or Thorny:
ssh <username>@spruce.hpc.wvu.edu
Once you have logged into the cluster, create a submission script (name the file runjob.pbs, for example). In this case we use standby as the queue, but any other queue is valid:
#!/bin/sh
#PBS -N JOB
#PBS -l nodes=1:ppn=1
#PBS -l walltime=04:00:00
#PBS -m ae
#PBS -q standby
module load singularity/2.5.2
singularity exec ${SNG_PATH}/<Image Name> <command_or_script_to_run>
Submit your job with:
qsub runjob.pbs
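As a concrete example, the exec line for the wkhtmltox.simg image from the table above could look as follows; page.html is a hypothetical input file:
singularity exec ${SNG_PATH}/wkhtmltox.simg wkhtmltopdf page.html page.pdf
You can then monitor the queued job with qstat -u <username>.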
GPU Support and Singularity¶
To get access to GPUs from inside the container, add the --nv argument to either the shell or exec subcommand.
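In both cases the flag goes right after the subcommand; a sketch of the two forms:
singularity shell --nv ${SNG_PATH}/<Image Name>
singularity exec --nv ${SNG_PATH}/<Image Name> <command_or_script_to_run>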
Let's demonstrate this with an interactive example using TensorFlow on Spruce. Assuming that you are now logged into Spruce, execute:
$> qsub -I -q comm_gpu
After a few seconds you get into a compute node:
salg0001:~$>
The next step is to activate Singularity:
$> module load singularity/2.5.2
Let's use, for example, Keras-2.1.4_TensorFlow-1.5.0.simg, one of the centrally managed images located at $SNG_PATH:
$> singularity shell --nv $SNG_PATH/Keras-2.1.4_TensorFlow-1.5.0.simg
Let's check that the GPUs are visible from inside the image:
$> nvidia-smi
Wed Sep 25 18:06:29 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.26 Driver Version: 396.26 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K20m Off | 00000000:08:00.0 Off | 0 |
| N/A 39C P0 76W / 225W | 96MiB / 4743MiB | 44% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K20m Off | 00000000:24:00.0 Off | 0 |
| N/A 40C P0 49W / 225W | 0MiB / 4743MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla K20m Off | 00000000:27:00.0 Off | 0 |
| N/A 34C P0 52W / 225W | 0MiB / 4743MiB | 91% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| |
+-----------------------------------------------------------------------------+
Now we can use IPython and TensorFlow:
$> ipython3
Python 3.5.3 (default, Jan 19 2017, 14:11:04)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import tensorflow as tf
In [2]: tf.test.is_built_with_cuda()
Out[2]: True
In [3]: tf.test.is_gpu_available()
2019-09-25 18:18:37.476402: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-09-25 18:18:40.271869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: Tesla K20m major: 3 minor: 5 memoryClockRate(GHz): 0.7055
pciBusID: 0000:08:00.0
totalMemory: 4.63GiB freeMemory: 4.48GiB
2019-09-25 18:18:40.390182: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 1 with properties:
name: Tesla K20m major: 3 minor: 5 memoryClockRate(GHz): 0.7055
pciBusID: 0000:24:00.0
totalMemory: 4.63GiB freeMemory: 4.56GiB
2019-09-25 18:18:40.508266: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 2 with properties:
name: Tesla K20m major: 3 minor: 5 memoryClockRate(GHz): 0.7055
pciBusID: 0000:27:00.0
totalMemory: 4.63GiB freeMemory: 4.56GiB
2019-09-25 18:18:40.508594: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Device peer to peer matrix
2019-09-25 18:18:40.508681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1126] DMA: 0 1 2
2019-09-25 18:18:40.508697: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 0: Y N N
2019-09-25 18:18:40.508705: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 1: N Y Y
2019-09-25 18:18:40.508713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1136] 2: N Y Y
2019-09-25 18:18:40.508730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla K20m, pci bus id: 0000:08:00.0, compute capability: 3.5)
2019-09-25 18:18:40.508742: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: Tesla K20m, pci bus id: 0000:24:00.0, compute capability: 3.5)
2019-09-25 18:18:40.508753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: Tesla K20m, pci bus id: 0000:27:00.0, compute capability: 3.5)
Out[3]: True
These two checks ensure that TensorFlow was indeed compiled with GPU support and that TensorFlow is able to see the 3 GPUs installed on the machine.
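The same two checks can be run non-interactively with exec, which is convenient inside a submission script. A minimal sketch, assuming python3 is on the image's PATH (as the ipython3 session above suggests):
singularity exec --nv ${SNG_PATH}/Keras-2.1.4_TensorFlow-1.5.0.simg \
    python3 -c "import tensorflow as tf; print(tf.test.is_built_with_cuda()); print(tf.test.is_gpu_available())"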
Now we can run a very simple calculation using the GPUs:
In [4]: with tf.device('/gpu:0'):
...: a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
...: b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
...: c = tf.matmul(a, b)
...:
...: with tf.Session() as sess:
...: print (sess.run(c))
...:
2019-09-25 18:22:40.750833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla K20m, pci bus id: 0000:08:00.0, compute capability: 3.5)
2019-09-25 18:22:40.750901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: Tesla K20m, pci bus id: 0000:24:00.0, compute capability: 3.5)
2019-09-25 18:22:40.750914: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: Tesla K20m, pci bus id: 0000:27:00.0, compute capability: 3.5)
[[ 22. 28.]
[ 49. 64.]]
Notice that the calculation was performed on /gpu:0; as the machine has 3 GPUs, you can also compute on /gpu:1 and /gpu:2.
Another way of checking the available devices is with:
In [5]: with tf.Session() as sess:
...: devices = sess.list_devices()
...:
2019-09-25 18:27:51.067844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Tesla K20m, pci bus id: 0000:08:00.0, compute capability: 3.5)
2019-09-25 18:27:51.067891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: Tesla K20m, pci bus id: 0000:24:00.0, compute capability: 3.5)
2019-09-25 18:27:51.067904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:2) -> (device: 2, name: Tesla K20m, pci bus id: 0000:27:00.0, compute capability: 3.5)
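Putting the pieces together, a GPU job can also be run non-interactively. The following is a sketch of a submission script; train.py is a hypothetical user script, and the gpus resource syntax may vary with the scheduler configuration:
#!/bin/sh
#PBS -N TF_GPU
#PBS -l nodes=1:ppn=1:gpus=1
#PBS -l walltime=04:00:00
#PBS -q comm_gpu

module load singularity/2.5.2
singularity exec --nv ${SNG_PATH}/Keras-2.1.4_TensorFlow-1.5.0.simg python3 train.py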