Fortran, C and C++

Scientific computing traces its roots to the very origin of computers. The first electromechanical computers were built to carry out large numerical calculations much faster than humans could. As computers evolved, so did ways of expressing algorithms that were closer to human language yet still close to the internal operation of the machine. On the electronic computers of the 1950s, programmers used assembly language (or assembler language), a low-level programming language with a close correspondence between its instructions and the architecture’s machine code instructions. Programming in assembly is very time consuming and error prone. New programming languages evolved to make it easier to translate algorithms into code, almost entirely removing the need to write assembler. Today, the three dominant compiled languages for scientific computing are Fortran, C and C++.

Fortran is the oldest of the three, and many algorithms that are fundamental to most problems in science were first written in it. Fortran 77 was for many years the de facto standard for scientific computing, and the language has evolved over the years with Fortran 90, 95, 2003 and 2008.

C is a general-purpose language used in most of the basic software written for Unix environments. The language itself is not aimed at scientific programs, so it is less expressive than Fortran for vectors and matrices.

C++ is in many ways an extension of C with higher-level constructs such as classes. C++ is widely used in desktop applications, games and scientific applications.

Fortran, C and C++ share a number of attributes. All three rely on a compiler that translates source code into machine code. Parallelization interfaces such as MPI are written with explicit support for these three languages, and compiler suites such as GCC, the Intel compilers and the NVIDIA HPC SDK include them by default.

Implementations of C, C++, and Fortran Compilers

On an HPC cluster it is customary to include compilers for Fortran, C and C++ by default. On Linux machines the usual compilers come from the GNU Compiler Collection (GCC). Apart from GCC, other vendors offer compilers for these three languages. Each vendor brings its own advantages: particular options to optimize the resulting binaries, support for CPU extensions, support for language extensions, and support for the various language standards, which are standardized and revised over time. Intel® oneAPI HPC Toolkit and NVIDIA HPC SDK are two examples of alternative compilers that are available on many HPC clusters. Each vendor uses its own command name for the compiler of each language. The table below shows the names of the compilers for each of the three languages from each vendor.

Name of compiler commands from different vendors

Language   GNU Compiler Collection (GCC)   Intel® oneAPI HPC Toolkit   NVIDIA HPC SDK
Fortran    gfortran                        ifort                       nvfortran
C          gcc                             icc                         nvc
C++        g++                             icpc                        nvc++

The compilers installed by default with the operating system are usually old. Production systems, HPC clusters included, favor stable software packages over recent ones. As a result, the system compilers are too old for many scientific packages: they lack support for recent CPU extensions and have limited or no support for language extensions such as OpenMP or OpenACC. On Thorny Flat the default version of the GCC compilers is 4.8. One way of identifying the version of the GCC compilers is to run the command:

$> gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)

Environment Modules

It is usually better to use environment modules to access newer versions of GCC or compilers from other vendors such as Intel or NVIDIA. The table below shows the modules that can be used on Thorny Flat to load compilers from different vendors, in several versions.

Vendor   Modules                  Advantages

GCC      lang/gcc/9.3.0           De facto standard on Linux/UNIX
         lang/gcc/12.2.0          Most codes compile correctly with GCC
                                  Relatively good support for OpenMP and AVX extensions

Intel    lang/intel/2018          Compilers that optimize particularly well on Intel hardware
         lang/intel/2019          Good support for OpenMP but not OpenACC
         compiler/2021.4.0
         compiler/2022.1.0
         compiler/2023.1.0

NVIDIA   lang/nvidia/nvhpc/22.3   Good support for OpenACC in addition to OpenMP
         lang/nvidia/nvhpc/23.3   Informative parallelization information
         lang/nvidia/nvhpc/23.7

Compiling Source Code

We will now demonstrate how to compile simple codes using compilers from the three vendors above. As an example, consider an algorithm that computes the Sieve of Eratosthenes, which finds all the prime numbers up to a given limit. The same algorithm is implemented in C and in Fortran. The code can be copied into files, which we will then compile.

To download the code you can execute the following commands directly on the cluster:

$> wget https://docs.hpc.wvu.edu/_static/sieve.c
$> wget https://docs.hpc.wvu.edu/_static/sieve.f90

Or download the files sieve.c and sieve.f90.

If you download the files to your own computer, you can later upload them to the cluster using File Transfer (Globus).

C version (sieve.c)

#include <stdio.h>
#include <stdlib.h>

void sieve(unsigned char *, int);
   
int main(int argc, char *argv[])
{
  /* Declaring variables */
  unsigned char *array;
  int n;

  /* Reading command line arguments */
  if ( argc != 2 ) /* argc should be 2 for correct execution */
    {
      /* We print argv[0] assuming it is the program name */
      printf( "usage: %s max_number\n", argv[0] );
    }
  else
    {
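      n=atoi(argv[1]);
      array =(unsigned char *)malloc((n+1)*sizeof(unsigned char));
      if ( array == NULL ) /* allocation can fail for very large n */
        {
          fprintf(stderr, "could not allocate memory for n=%d\n", n);
          return 1;
        }
      printf(" Sieve for prime numbers up to %d\n", n);
      sieve(array,n);
      free(array);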
    }
  return 0;
}

void sieve(unsigned char *a, int n)
{
  int i=0, j=0, counter=0;

  for(i=2; i<=n; i++) {
    a[i] = 1;
  }

  for(i=2; i<=n; i++) {
    if(a[i] == 1) {
      for(j=i; j<= (int)(n/i + 1); j++) {
        if ((i*j)<=n) a[(i*j)] = 0;
      }
    }
  }
      
  if (n<=1000) {
      for(i=2; i<=n; i++) {
        if(a[i] == 1)
          printf("%d ", i);
      }
      printf("\n\n");
  }
  else {
      for(i=2; i<=n; i++) {
        if(a[i] == 1)
          counter+=1;
      }
      printf(" Total number of primes: %d \n", counter);
  }
}

Fortran version (sieve.f90)

module str2int_mod
contains

  elemental subroutine str2int(str,int,stat)
    implicit none
    ! Arguments
    character(len=*),intent(in) :: str
    integer,intent(out)         :: int
    integer,intent(out)         :: stat

    read(str,*,iostat=stat)  int
  end subroutine str2int

end module

program sieve

  use str2int_mod
  implicit none

  integer :: i, stat, i_max=0, counter=0
  logical, dimension(:), allocatable :: is_prime
  character(len=32) :: arg

  i = 0
  do
    call get_command_argument(i, arg)
    if (len_trim(arg) == 0) exit

    i = i+1
    if ( i == 2 ) then
       call str2int(trim(arg), i_max, stat)
       write(*,*) "Sieve for prime numbers up to", i_max
    end if

  end do

  if (i_max .lt. 1) then
     write (*,*) "Enter the maximum number to search for primes"
     call exit(1)
  end if

  allocate(is_prime(i_max))

  is_prime = .true.
  is_prime (1) = .false.
  do i = 2, int (sqrt (real (i_max)))
    if (is_prime (i)) is_prime (i * i : i_max : i) = .false.
  end do

  if (i_max <= 1000 ) then
      do i = 1, i_max
        if (is_prime (i)) write (*, '(i0, 1x)', advance = 'no') i
      end do
      write (*, *)
  else
      do i = 1, i_max
        if (is_prime (i)) counter=counter+1
      end do
      write (*,*) 'Total number of primes: ', counter
  end if

end program sieve

Using the GNU Compiler Collection (GCC)

We will start by compiling this code with a modern version of GCC. Load the module for GCC 12.2:

$> module load lang/gcc/12.2.0

The module for GCC 12.2 includes compilers for Fortran, C, and C++. We can compile the codes above with the commands:

$> gcc sieve.c -o sieve_c_gcc122
$> gfortran sieve.f90 -o sieve_f90_gcc122

Now we can execute the binaries:

$> ./sieve_c_gcc122 100000000
 Sieve for prime numbers up to 100000000
 Total number of primes: 5761455

$> ./sieve_f90_gcc122 100000000
 Sieve for prime numbers up to   100000000
 Total number of primes:      5761455

Using NVIDIA HPC SDK

Another suite that can be used for compiling this code is the NVIDIA HPC SDK. Load the module for NVIDIA HPC SDK 23.7:

$> module load lang/nvidia/nvhpc/23.7

The module for the NVIDIA compilers includes compilers for Fortran, C, and C++. We can compile the codes above with the commands:

$> nvc sieve.c -o sieve_c_nv237
$> nvfortran sieve.f90 -o sieve_f90_nv237

Now we can execute the binaries:

$> ./sieve_c_nv237 100000000
 Sieve for prime numbers up to 100000000
 Total number of primes: 5761455

$> ./sieve_f90_nv237 100000000
 Sieve for prime numbers up to   100000000
 Total number of primes:      5761455

Using Intel® oneAPI HPC Toolkit

Intel currently offers several suites. Intel® oneAPI HPC Toolkit includes compilers for Fortran, C, and C++. It also includes Intel MKL, an optimized implementation of the BLAS and LAPACK linear algebra routines.

Load the module for the Intel compilers:

$> module load compiler/2021.4.0

The module includes Intel compilers for Fortran, C, and C++. We can compile the codes above with the commands:

$> icc sieve.c -o sieve_c_intel2021
$> ifort sieve.f90 -o sieve_f90_intel2021

Now we can execute the binaries:

$> ./sieve_c_intel2021 100000000
 Sieve for prime numbers up to 100000000
 Total number of primes: 5761455

$> ./sieve_f90_intel2021 100000000
 Sieve for prime numbers up to   100000000
 Total number of primes:      5761455