Tinker9 
=======

The Tinker molecular modeling software is a complete and general package for molecular mechanics and dynamics, with some special features for biopolymers. Tinker has the ability to use any of several common parameter sets, such as Amber (ff94, ff96, ff98, ff99, ff99SB), CHARMM (19, 22, 22/CMAP), Allinger MM (MM2-1991 and MM3-2000), OPLS (OPLS-UA, OPLS-AA), Merck Molecular Force Field (MMFF), Liam Dang's polarizable model, AMOEBA (2004, 2009, 2013, 2017, 2018) polarizable atomic multipole force fields, AMOEBA+ that adds charge penetration effects, and our new HIPPO (Hydrogen-like Interatomic Polarizable POtential) force field. Parameter sets for other force field models are under consideration for future releases.

Tinker9 is a complete rewrite and extension of the canonical Tinker software, currently Tinker8. Tinker9 is implemented as C++ code with OpenACC directives and CUDA kernels providing excellent performance on GPUs. At present, Tinker9 builds against the object library from Tinker8, and provides GPU versions of the Tinker ANALYZE, BAR, DYNAMIC, MINIMIZE and TESTGRAD programs. Existing Tinker file formats and force field parameter files are fully compatible with Tinker9, and nearly all Tinker8 keywords function identically in Tinker9. Over time we plan to port much or all of the remaining portions of Fortran Tinker8 to the C++ Tinker9 code base.

These instructions describe how tinker9 was compiled on Thorny Flat with GPU support for the particular GPU cards present on this cluster.

Clone the repository
--------------------

Tinker9 sources are on Github and uses submodules. This command will download the latest version and the submodules::

	$> git clone --recurse-submodules https://github.com/TinkerTools/tinker9.git

Create a build folder
---------------------

It is a best practice to use a separate build folder to compile the code. The sources remain untouched and it is easy to just remove the content of the folder and recompile::

    $> cd tinker9
    $> mkdir build
    $> cd build

Modules for compilation
-----------------------

Tinker9 uses cmake for building the software.
The version of cmake on RHEL 7.x is usually too old so it is a good idea to load a more recent version.
RHEL 7.x comes with cmake version 2.8.12.2 and the CMakeLists request the minimal version being 3.12.

Tinker9 is implemented as C++ and takes advantage of GPUs via OpenACC directives and CUDA programming.
The natural compiler for a code that uses OpenACC is NVIDIA HPC suite as OpenACC is better implemented.
For the CUDA side the latest version of the CUDA Toolkit at the time of writing this (April 2022) is CUDA 11.5.
These are the modules loaded for compiling tinker9::

    $> module load dev/cmake/3.21.1 dev/git/2.29.1 lang/nvidia/nvhpc/21.11 parallel/cuda/11.5

Configuring tinker9
-------------------

Configuring a code with cmake will prepare the sources, and create the Makefiles for building the binaries using the compilers and libraries declared with the arguments presented to cmake.
The cmake line used was this::

    $> cmake -DCUDA_DIR=${CUDA_HOME} -DCOMPUTE_CAPABILITY=61,75,80 \
    -DCMAKE_Fortran_COMPILER=gfortran -DCMAKE_BUILD_TYPE=release \
    -DCMAKE_INSTALL_PREFIX=/shared/software/atomistic/tinker9 \
    -DCMAKE_CUDA_COMPILER=`which nvcc` -Wno-dev ..

The CUDA compute capabilities can be found on  `CUDA Website <https://developer.nvidia.com/cuda-gpus>`_
For the particular case of Thorny Flat, we have 3 kinds of GPUs: P6000, RTX6000, and A100.
The compute capabilities for these cards are:

+------------------+--------------------+
| GPU Model        | Compute Capability |
+==================+====================+
| Quadro P6000     |       61           |
+------------------+--------------------+
| Quadro RTX 6000  |       75           |
+------------------+--------------------+
| NVIDIA A100      |       80           |
+------------------+--------------------+

After configuring the code execute make with a reasonable number of compilation threads. On the head node of Thorny Flat we have 24 cores, using 12 compilation threads is reasonable and the code will compile relatively fast::

	$> make -j12


Running the tests
-----------------

To run the tests lauch and interactive job on a queue allowing GPUs::

	$> qsub -I -l nodes=1:ppn=16:gpus=2 -q be_gpu

Go to the build folder and check the GPU present on the node::

	$> cd /shared/src/tinker9/build
	$> nvidia-smi 
	Fri Apr  1 12:35:39 2022       
	+-----------------------------------------------------------------------------+
	| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
	|-------------------------------+----------------------+----------------------+
	| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
	| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
	|                               |                      |               MIG M. |
	|===============================+======================+======================|
	|   0  Quadro RTX 6000     Off  | 00000000:C1:00.0 Off |                    0 |
	| N/A   25C    P8    14W / 250W |      0MiB / 22698MiB |      0%   E. Process |
	|                               |                      |                  N/A |
	+-------------------------------+----------------------+----------------------+
	|   1  Quadro RTX 6000     Off  | 00000000:C2:00.0 Off |                    0 |
	| N/A   25C    P8    15W / 250W |      0MiB / 22698MiB |      0%   E. Process |
	|                               |                      |                  N/A |
	+-------------------------------+----------------------+----------------------+
																				   
	+-----------------------------------------------------------------------------+
	| Processes:                                                                  |
	|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
	|        ID   ID                                                   Usage      |
	|=============================================================================|
	|  No running processes found                                                 |
	+-----------------------------------------------------------------------------+

Load the modules::

	$> module load dev/cmake/3.21.1 dev/git/2.29.1 lang/nvidia/nvhpc/21.11 parallel/cuda/11.5

Execute the test suite::

	$> make test
	[ 69%] Built target tinkerObjF                                                                     
	[ 69%] Built target tinkerObjCpp           
	[ 69%] Built target tinkerFToCpp                 
	[ 83%] Built target tinker9_cpp                                                                    
	[ 88%] Built target __t9_all_tests_o                                                               
	[ 89%] Built target tinker9_f                                                                      
	[ 89%] Built target tinker9_version                                                                
	[ 97%] Built target tinker9_acc                                                                    
	[100%] Built target tinker9_cu                                                                     
	[100%] Built target all.tests                                                                      
	Filters: info                                                                                      
	 Primary GPU package :  CUDA                    
													 
	 GPU Device :  Setting Device ID to 0 from GPU utilization                                         
																									   
																									   
	 Program Information                                                                               
																									   
		Version:                             1.0.0 GIT 1e34a417                                        
		Synchronized with Tinker commit:     5aa9948d 
		C++ compiler:                        nvc++ 21.11.0
		Size of real (bytes):                4
		Size of mixed (bytes):               8
		Using deterministic force:           true
		Debug mode:                          off
		Platform:                            CUDA and OpenACC
		Primary GPU package:                 CUDA
		Latest CUDA supported by driver:     11.5
		CUDA runtime version:                11.5
		Thrust version:                      1.13.1 patch 0
		CUDA compiler:                       nvcc 11.5.119
		OpenACC compiler:                    nvc++ 21.11.0
		GPU detected:                        2
		GPU 0:                              
		   PCI:                              0000:C1:00.0
		   Name:                             Quadro RTX 6000
		   Maximum compute capability:       7.5
		   Single double perf. ratio:        32
		   Compute mode:                     exclusive process
		   Error-correcting code (ECC):      on
		   Clock rate (kHz):                 1620000
		   Number of Multiprocessors:        72
		   Number of CUDA cores:             9216
		   Used/Total GPU memory:            0.79 % / 22.17 GB
		GPU 1:                              
		   PCI:                              0000:C2:00.0
		   Name:                             Quadro RTX 6000
		   Maximum compute capability:       7.5
		   Single double perf. ratio:        32
		   Compute mode:                     exclusive process
		   Error-correcting code (ECC):      on
		   Clock rate (kHz):                 1620000
		   Number of Multiprocessors:        72
		   Number of CUDA cores:             9216
		   Used/Total GPU memory:            0.79 % / 22.17 GB
	===============================================================================
	test cases: 1 | 1 passed
	assertions: - none -

	Filters: [ff],[util]

    .....######################################################################    
    ...##########################################################################  
    ..###                                                                      ### 
    .###            Tinker9  --  Software Tools for Molecular Design            ###
    .##                                                                          ##
    .##                      Version 1.0.0-rc  January 2021                      ##
    .##                                                                          ##
    .##                 Copyright (c)  Zhi Wang & the Ponder Lab                 ##
    .###                           All Rights Reserved                          ###
    ..###                                                                      ### 
    ...##########################################################################  
    .....######################################################################    

	 Compiled at:  12:18:38  Apr  1 2022
	 Commit Date:  Fri Apr 1 04:23:24 2022 -0500
	 Commit:       1e34a417

	 Primary GPU package :  CUDA

	 GPU Device :  Setting Device ID to 0 from GPU utilization

	...
	...
	...

	===============================================================================
	All tests passed (65640 assertions in 58 test cases)

	[100%] Built target test