Dolly Sods (2023-)

Dolly Sods is WVU’s latest HPC Cluster. It was deployed in July/August 2023 and was funded by NSF Major Research Instrumentation (MRI) Grant Award #2117575. The cluster has a total of 37 nodes. The system has 155 NVIDIA GPU cards. They are distributed as 120 NVIDIA A30, 19 NVIDIA A40, and 16 NVIDIA A100.

Overview

(This text can be used for proposals to Grant Funding Agencies)

Dolly Sods is a hardware-accelerated High-Performance Computing (HPC) cluster. It serves the HPC needs of West Virginia University (WVU) and other higher education institutions in West Virginia. It is hosted in Morgantown, WV, and was built with funding from NSF Major Research Instrumentation (MRI) Grant Award #2117575.

Dolly Sods is a cluster of 37 nodes. All compute nodes have hardware accelerators in the form of NVIDIA GPUs. There is a total of 155 GPU cards, distributed as follows: 30 compute nodes, each with 4 NVIDIA(R) A30 24 GB PCIe GPUs; 4 compute nodes, each with 4 NVIDIA(R) A40 48 GB PCIe GPUs; and 2 compute nodes, each with 8 NVIDIA(R) A100 40 GB PCIe GPUs. The login node also has 3 NVIDIA(R) A40 48 GB PCIe GPUs.
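The GPU totals quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not an official inventory tool: the variable names are illustrative, and treating the login node as one extra entry next to the compute nodes is an assumption about how the 37-node figure is reached.

```python
# Node inventory as described in the overview: (node count, GPUs per node, GPU model).
# Assumption: the login node is counted as one additional node.
inventory = [
    (30, 4, "A30"),
    (4, 4, "A40"),
    (2, 8, "A100"),
    (1, 3, "A40"),  # login node
]

# Accumulate the number of GPU cards per model.
gpus_per_model = {}
for nodes, gpus_per_node, model in inventory:
    gpus_per_model[model] = gpus_per_model.get(model, 0) + nodes * gpus_per_node

total_nodes = sum(nodes for nodes, _, _ in inventory)
total_gpus = sum(gpus_per_model.values())

print(gpus_per_model)           # {'A30': 120, 'A40': 19, 'A100': 16}
print(total_nodes, total_gpus)  # 37 155
```

The per-model counts match the 120 A30, 19 A40, and 16 A100 cards stated at the top of this page.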

Dolly Sods uses SLURM for workload management and it has a variety of compilers, numerical libraries, and scientific software specifically compiled and optimized for the hardware architecture.

Acknowledgment Message

We ask our users to acknowledge the use of Dolly Sods in all publications made possible by this resource. The message in the acknowledgment section could be as follows:

Computational resources were provided by the WVU Research Computing Dolly Sods HPC cluster, which is funded in part by NSF OAC-2117575.

Hardware Acceleration

The specifications of the three kinds of GPU cards on Dolly Sods are shown in the table below:

| GPU Card Model | GPU Memory | CUDA Cores | Tensor Cores | Max Power Consumption | Compute Capability |
|---|---|---|---|---|---|
| NVIDIA A30 | 24 GB HBM2 | 3584 | 224 | 165 W | 8.0 |
| NVIDIA A40 | 48 GB GDDR6 ECC | 10752 | 336 | 300 W | 8.6 |
| NVIDIA A100 | 40 GB HBM2e ECC | 6912 FP32, 3456 FP64 | 432 | 250 W | 8.0 |

Full specifications for the GPU cards can be found for the NVIDIA A30, NVIDIA A40, and NVIDIA A100.
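As a rough illustration of how the per-card figures combine with the card counts, the sketch below estimates the aggregate GPU memory of the system. It assumes the counts given in the overview (120 A30, 19 A40, 16 A100, the A40 count including the login node) and the per-card memory listed in the table; the dictionary layout and names are illustrative only.

```python
# Card counts from the overview and per-card memory (GB) from the table above.
cards = {
    "A30": {"count": 120, "memory_gb": 24},
    "A40": {"count": 19, "memory_gb": 48},
    "A100": {"count": 16, "memory_gb": 40},
}

# Aggregate memory per model, then across the whole system.
memory_per_model = {name: c["count"] * c["memory_gb"] for name, c in cards.items()}
total_memory_gb = sum(memory_per_model.values())

print(memory_per_model)  # {'A30': 2880, 'A40': 912, 'A100': 640}
print(total_memory_gb)   # 4432
```

This is a nominal sum of card capacities; memory is not shared between cards, so the usable memory for a single job depends on which GPUs it is allocated.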

The GPUs in Dolly Sods have different compute capabilities. The compute capability of a device is represented by a version number, sometimes also called its “SM version”. This version number identifies the features supported by the GPU hardware and is used by applications at runtime to determine which hardware features and instructions are available on the present GPU.

The compute capability comprises a major revision number X and a minor revision number Y and is denoted by X.Y. Devices with the same major revision number are of the same core architecture; the major revision number is 8 for devices based on the NVIDIA Ampere GPU architecture. All the GPUs in Dolly Sods share major revision number 8, but their minor revision numbers differ: the A30 and A100 have compute capability 8.0, and the A40 has compute capability 8.6.
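The X.Y notation can be made concrete with a short sketch. It splits each compute capability into its major and minor parts and derives the corresponding `sm_XY` architecture name, which is the form the NVIDIA `nvcc` compiler uses for its target architectures. The helper names `split_capability` and `sm_name` are hypothetical and exist only for this example; the capability values come from the table above.

```python
# Compute capabilities taken from the hardware table above.
compute_capability = {"A30": "8.0", "A40": "8.6", "A100": "8.0"}

def split_capability(version):
    """Split an 'X.Y' compute capability into (major, minor) integers."""
    major, minor = version.split(".")
    return int(major), int(minor)

def sm_name(version):
    """Return the 'sm_XY' architecture name, e.g. 'sm_86' for '8.6'."""
    major, minor = split_capability(version)
    return f"sm_{major}{minor}"

# Distinct major revisions and distinct architecture targets on the cluster.
majors = {split_capability(v)[0] for v in compute_capability.values()}
targets = sorted({sm_name(v) for v in compute_capability.values()})

print(majors)   # {8}
print(targets)  # ['sm_80', 'sm_86']
```

A single major revision confirms that every card is Ampere-based, while the two targets show that code tuned specifically for the A40 (`sm_86`) is not the same target as code built for the A30 and A100 (`sm_80`). Which targets to pass to the compiler depends on the CUDA toolkit in use, so treat the names here as a starting point rather than a recommended build configuration.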

You can see Compute Capabilities for other GPU cards.