Infrastructure and Services¶
The WVU Research Computing (WVU-RC) group provides access to centrally managed computational systems, including several High-Performance Computing systems and data analysis clusters. We also offer infrastructure that supports researchers, such as large and secure data storage and a high-speed data transfer channel using a demilitarized zone (DMZ), along with support in High-Performance Computing, parallel programming, and visualization, and training in those areas via workshops and seminars.
Here we summarize the main areas in which WVU-RC operates:
High-Performance Computing¶
We operate three High-Performance Computing clusters: a general-purpose HPC cluster called Thorny Flat, a GPU-oriented HPC cluster called Dolly Sods, and a HIPAA-compliant HPC cluster that we maintain on behalf of the West Virginia Clinical and Translational Science Institute (WVCTSI).
Our High-Performance Computing facilities were funded by the National Science Foundation. Thorny Flat was partially funded by NSF Major Research Instrumentation (MRI) Grant Award #1726534. Dolly Sods was funded by NSF Major Research Instrumentation (MRI) Grant Award #2117575. These facilities are also funded by the WVU Research Office and faculty investments.
Our current HPC clusters are the latest in a series of systems that have kept HPC infrastructure available to West Virginia researchers. Two earlier HPC clusters served this role in the past and have since been decommissioned.
Mountaineer was WVU’s first centrally managed HPC cluster, a 384-core system built on Intel Xeon Westmere processors. Each node had 12 cores and 48 GB of RAM, an average of 4 GB per core. Storage was provided by a direct-attached SAN unit with 10 TB of formatted disk space and a network-attached storage system with 60 TB of capacity. Mountaineer stopped operating in April 2019, when the new cluster Thorny Flat entered production.
Spruce Knob was WVU’s second centrally managed HPC system. It had 176 compute nodes and 3,376 CPU cores based on Intel Xeon Sandy Bridge, Ivy Bridge, Haswell, and Broadwell processors. Spruce Knob followed a condo model in which faculty members could purchase direct access to nodes on the cluster, making them part owners of the system. Spruce Knob was decommissioned in 2023 when Dolly Sods entered operation.
Thorny Flat is our general-purpose HPC cluster currently in production. It has 185 compute nodes, a total of 6,796 CPU cores, and 47 GPU cards of several models, including NVIDIA P6000, RTX 6000, and A100. The total RAM across the compute nodes is over 30 TB.
Dolly Sods is our newest HPC cluster. It is smaller than Thorny Flat but includes a much larger number of GPU cards, making it specialized for jobs that can use them. It has 37 compute nodes, a total of 1,216 CPU cores, and 155 GPU cards of several models, including NVIDIA A30, A40, and A100. The total RAM across the compute nodes is over 10 TB.
Data Analysis Cluster¶
The GoFirst cluster is a computing resource dedicated to the WVU M.S. in Business Data Analytics program, allowing its students to gain experience in a controlled, secure cloud-computing environment. GoFirst is built from four compute nodes that share an HDFS filesystem and run Hadoop and Spark jobs, with RStudio as the frontend interface.
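As an illustration of the kind of distributed job such a cluster supports, the sketch below shows a minimal Spark job that reads a dataset from HDFS and runs a simple aggregation. It is written in Python (PySpark) for brevity; GoFirst users would typically work through the RStudio frontend instead, and the dataset path and column name are hypothetical.

    # Minimal PySpark sketch: read a CSV from HDFS and aggregate it.
    # The HDFS path and the "region" column are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sales-summary").getOrCreate()

    # Read a CSV dataset stored on the cluster's shared HDFS filesystem.
    df = spark.read.csv("hdfs:///data/sales.csv", header=True, inferSchema=True)

    # A simple aggregation distributed across the compute nodes.
    df.groupBy("region").count().show()

    spark.stop()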
Research Exchange¶
REX is a Science DMZ, or demilitarized zone, a dedicated “express lane” network for research data traffic within the University’s larger network. It was funded through a nearly $487,000 cyberinfrastructure grant that WVU Research Corp. received in 2014 from the National Science Foundation.
REX allows Information Technology Services to separate research traffic from other Internet traffic, guarantee high-speed Internet2 access for WVU researchers, and facilitate data exchanges with off-campus collaborators. The upgrades also give WVU researchers greater access to off-campus resources such as national scientific supercomputing centers. The grant funded the development and deployment of two Data Transfer Nodes, high-performance data transfer “depots” that improve the ability to move large science data sets. These Data Transfer Nodes have 640 TB of raw disk storage, giving researchers a high-speed staging location when transferring large data sets.
HPC storage¶
WVU-RC offers researchers access to two storage tiers through our DataDirect Networks (DDN) GRIDScaler system. Our standard tier of storage is free to all users, but users can also purchase dedicated group storage on the system.
The GRIDScaler system provides access to high-speed parallel GPFS storage and currently delivers over 7 GB/s of throughput and 1 PB of raw storage.
All users of HPC systems have access to more than 400 TB of high-speed scratch storage. Scratch storage is intended for temporary files and gives researchers a place to process large amounts of data. In addition, each user is provided 10 GB of home directory space and 10 GB of group storage space upon request.
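As a quick sanity check before staging a large dataset, the short Python sketch below reports the free space on the scratch filesystem. The scratch path is a placeholder for the directory assigned on the cluster, and it reports filesystem-wide free space rather than a per-user quota.

    # Minimal sketch: report free space on the scratch filesystem.
    # The path is a placeholder; use the scratch directory assigned
    # to your account on the cluster. This reports filesystem-wide
    # free space, not a per-user quota.
    import shutil

    usage = shutil.disk_usage("/scratch/my_username")
    print(f"free: {usage.free / 1e12:.1f} TB of {usage.total / 1e12:.1f} TB")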
Some researchers prefer dedicated group storage on the HPC cluster so they can keep large amounts of data staged for processing without fear of it being removed. These researchers can purchase dedicated storage for their group, which also offers an easy way to share data among researchers in the same group.
Data Depot¶
The WVU Research Data Depot is a centrally managed, reliable, secure, and fast data storage system designed to meet the university’s diverse research storage needs. It can handle files of all sizes, from large numbers of small KB-sized files to very large multi-GB files. Researchers who use this service have access to their data both on and off campus and can also use it to collaborate with researchers outside of WVU.
ITS designed the Data Depot to be easy to use. Researchers can drag and drop files through the interface they are accustomed to as users of Windows, macOS, or Linux file managers. Command-line tools such as sftp and Linux commands such as mount can also be used to access the files from lab PCs and servers.
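As a sketch of scripted command-line access, the example below transfers files over SFTP using Python's paramiko library. The hostname, username, and paths are placeholders; the actual Data Depot endpoint and share paths are provided by WVU-RC.

    # Hypothetical sketch: copy files to and from the Data Depot over SFTP
    # with paramiko. Hostname, username, and paths are placeholders.
    import paramiko

    client = paramiko.SSHClient()
    client.load_system_host_keys()
    client.connect("datadepot.example.wvu.edu", username="my_wvu_username")

    sftp = client.open_sftp()
    sftp.put("results.csv", "/depot/my_lab/results.csv")  # upload a local file
    sftp.get("/depot/my_lab/inputs.dat", "inputs.dat")    # download to local disk
    sftp.close()
    client.close()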
More information: Research Data Depot
Seminars and workshops¶
WVU-RC supports the mission of educating users on High-Performance Computing, Parallel Programming, and Data Analysis via seminars and workshops.