Apocrita says Hello, world! Basic use of our cluster¶

A common first program to write in a new language is a "Hello world" example where we print a simple line of output. In this tutorial we first look at examples written in C, C++ and Fortran. To run the examples we'll learn about interactive sessions on compute nodes, modules and compiling source code. We'll also look at examples in MATLAB, Python and R. For these we'll see how to use modules to select suitable interpreters.

After completing this tutorial the reader should:

understand basic module usage on Apocrita;
be able to compile and run simple programs on a compute node;
be able to use interpreters on a compute node;
know the difference between interactive and batch jobs.

We run all of our jobs on compute nodes on Apocrita, to avoid using excessive resources on the login node. We're first going to use an interactive session on one of the nodes before creating a job script which can be run when we aren't logged in.

Using `qlogin` to create an interactive session¶

Modules on Apocrita modify our user environment to give us access to tools. They allow us to switch between different versions of a tool, or to select a particular tool when there are several conflicting options.

For example, we can load a module to select which compiler suite and its version we wish to use. We may also use a module to set up our environment to use a particular version of Python.

Running a C, C++ or Fortran program¶

On Apocrita, the plain text files that contain a C, C++ or Fortran program cannot be run directly. Instead, these source files must first be compiled to give us an executable program. For example, if we have a file called hello.f90 with the following Fortran program

  print '("Hello, world!")'
end

we need to use a Fortran compiler to process this file.

Apocrita has a number of different compiler suites available, each containing compilers for C, C++ and Fortran source. The available GCC compilers can be listed with the module avail command

$ module avail gcc/
-------- /share/apps/environmentmodules/centos7/devtools ---------
gcc/4.8.5  gcc/6.3.0  gcc/7.1.0(default)  gcc/8.2.0  gcc/10.2.0

The NVIDIA HPC SDK and Intel compilers may be seen similarly:

$ module avail nvidia-hpc-sdk/
-------- /share/apps/environmentmodules/centos7/devtools ---------
nvidia-hpc-sdk/21.3(default)
$ module avail intel/
-------- /share/apps/environmentmodules/centos7/devtools ---------
intel/2017.1  intel/2017.3  intel/2018.1  intel/2018.3(default)

We can choose a particular version of one of the compiler suites with the module load command, giving the name of a module:

module load intel/2018.1

This load command says that we want to access the Intel compiler suite, version "2018.1".

If we are happy with the "default" version of a compiler suite, we can load that with the command

$ module purge
$ module load gcc
$ module list
Currently Loaded Modulefiles:
 1)gcc/7.1.0(default)

When we load one of these modules, we have access to the compilers of that suite. Although the compiler commands have the same name for each version of a suite, the names differ between suites. That is, the Intel Fortran compiler is called ifort, the GCC Fortran compiler is called gfortran and NVIDIA's is called nvfortran. Fortunately, our modules set environment variables that give us a common command name: our C compiler can be used with ${CC}, our C++ compiler with ${CXX} and our Fortran compiler with ${FC}:

$ module purge
$ module load intel/2018.1
$ echo ${FC}
ifort
$ module switch intel/2018.1 gcc/8.2.0
$ echo ${FC}
/share/apps/centos7/gcc/gcc/4.8.5/8.2.0/bin/gfortran
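Before compiling, it can be useful to confirm that a compiler module has actually set these variables. The following is a minimal shell sketch, not part of the Apocrita tooling: show_compilers is a hypothetical helper name, and the only assumption about the modules is the one stated above, that they set CC, CXX and FC.

```shell
# show_compilers: print the command behind each compiler variable, if set
show_compilers() {
  for var in CC CXX FC; do
    # Look up the value of the variable whose name is stored in $var
    eval "value=\${${var}:-}"
    if [ -n "$value" ]; then
      echo "${var}=${value}"
    else
      echo "${var} is not set; load a compiler module first"
    fi
  done
}

show_compilers
```

With no compiler module loaded, each variable is reported as not set; after a module load the same call shows the commands that ${CC}, ${CXX} and ${FC} will run.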

When we've loaded a compiler module, we can compile the Fortran source file we have:

${FC} hello.f90 -o hello

This calls the Fortran compiler (${FC}), asking it to compile the source file hello.f90 and produce an executable program called hello (-o hello). We can see that this new file has been created with the command

ls -l hello

Finally, we can run this program with the command:

$ ./hello
Hello, world!

{% include note.html title="Using ./hello instead of hello" content="We've compiled the source file to give us a program in our current directory. This directory isn't in our search path, so for our shell to find the program we must give a path to it. The './' component at the start of the command tells the shell that it's the program in the current directory that we wish to use." %}
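The search-path behaviour described in the note can be checked directly. This is a minimal shell sketch under some assumptions: the scratch directory and the stand-in script called hello are created only for this illustration (so no compiler is needed), and mktemp -d is assumed to be available, as it is on most Linux systems.

```shell
# Work in a scratch directory so nothing in the real working directory is touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the compiled program: a tiny executable script called "hello"
printf '#!/bin/sh\necho "Hello, world!"\n' > hello
chmod +x hello

# A bare command name is looked up only in the directories listed in PATH
if command -v hello >/dev/null 2>&1; then
  bare_lookup="found on PATH"
else
  bare_lookup="not found on PATH"
fi
echo "bare name: ${bare_lookup}"

# An explicit path such as ./hello bypasses the PATH search entirely
explicit_output=$(./hello)
echo "explicit path: ${explicit_output}"
```

Unless a different program called hello happens to be installed on the system, or the current directory has been added to PATH, the bare name is not found while ./hello runs as expected.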

For the C and C++ source files below, we can compile these in the same way using the corresponding compilers:

$ ${CC} hello.c -o hello && ./hello
Hello, world!
$ ${CXX} hello.cpp -o hello && ./hello
Hello, world!

C:

#include <stdio.h>

int main() {
  printf("Hello, world!\n");
  return 0;
}

C++:

#include <iostream>

int main() {
  std::cout<<"Hello, world!"<<std::endl;
}

Once we've compiled the program we can run it again as many times as we like without recompiling it. It's only when we make changes to the source code that we need to go through the compilation step before running.
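One way to recompile only when the source has changed is to compare file timestamps. The following is a minimal sketch rather than a recommended build system: needs_rebuild is a hypothetical helper, and it assumes a shell whose test command supports the -nt ("newer than") comparison, which bash and most sh implementations provide although POSIX does not require it.

```shell
# needs_rebuild SRC EXE: succeed if EXE is missing or older than SRC
needs_rebuild() {
  [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# Recompile only when needed; FC is assumed to be set by a loaded compiler module
if [ -f hello.f90 ] && needs_rebuild hello.f90 hello; then
  "${FC}" hello.f90 -o hello
fi
```

For anything larger than a single source file, a tool such as make does the same timestamp comparison for us.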

{% include important.html title="Needing the compiler module to run the program" content="In the example here we loaded the compiler module (intel/2018.3) to access the compilers (icc, icpc and ifort) to compile. The modules remain loaded when we later run the executable program. If in our job we are simply running the program, rather than compiling, we often still need to load the module to access the compiler's run-time environment." %}

Running a MATLAB, Python or R program¶

In contrast to the examples of C, C++ and Fortran above, some programming language environments on Apocrita come with interpreters. With the language R, for example, we have an interpreter we can load and then, in an interactive session, type in commands and see the results fed back to us.

As with the compilers, we first load a module to select a language interpreter and its version:

$ module avail R/
--------- /share/apps/environmentmodules/centos7/general ---------
R/3.3.2  R/3.4.0  R/3.4.2  R/3.4.3  R/3.5.1  R/3.5.3  R/3.6.1(default)  R/4.0.2
$ module load R/3.6.1
Loading R/3.6.1
  Loading requirement: java/1.8.0_171-oracle
$ R

R version 3.6.1 (2019-07-05) -- "Action of the Toes"
...
>

Here the > is a prompt within the R interpreter, rather than our shell. As we type commands in there, they are interpreted within the R environment:

> print('Hello, world!')
[1] "Hello, world!"
> quit(save='no')
$

Our R statement print('Hello, world!') is run to give the output on the second line. We can then exit the R interpreter, returning to our shell, with the function quit().

Similarly, the Python interpreter can be used:

$ module load python/3.6.3
$ python
Python 3.6.3 (default, Oct  4 2017, 15:04:38)
...
>>> print('Hello, world!')
Hello, world!
>>> exit()
$

MATLAB on Apocrita defaults to the desktop GUI, which we usually want to avoid for simple interactive use:

$ module load matlab/2019a
$ matlab -nodisplay -nojvm

                            < M A T L A B (R) >
                  Copyright 1984-2019 The MathWorks, Inc.
                  R2019a (9.6.0.1072779) 64-bit (glnxa64)
                               March 8, 2019

For online documentation, see https://www.mathworks.com/support
For product information, visit www.mathworks.com.

>> disp 'Hello, world!'
Hello, world!
>> exit
$

We can use the interpreters mentioned above interactively in a session on a compute node, typing commands in and getting the results back in the terminal, but we can also use them to process whole scripts. If we put our desired commands in text files, here hello.R, hello.py and hello.m for the R, Python and MATLAB examples respectively, we can run them non-interactively.
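The script files can be as small as one line each. As a sketch, the following shell commands would create them in the current directory. The contents mirror the statements typed at the interactive prompts above; the trailing exit in hello.m is an assumption, added so that MATLAB returns to the shell once the script has finished.

```shell
# R: the same statement as typed at the R prompt
cat > hello.R <<'EOF'
print('Hello, world!')
EOF

# Python: the same statement as typed at the Python prompt
cat > hello.py <<'EOF'
print('Hello, world!')
EOF

# MATLAB: "exit" (an assumption here) closes MATLAB after the script has run
cat > hello.m <<'EOF'
disp 'Hello, world!'
exit
EOF
```

The quoted 'EOF' delimiter stops the shell from expanding anything inside the here-documents, so the text is written to the files exactly as shown.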

For R we use the command Rscript to process a script file (instead of R for the interactive session):

$ Rscript hello.R
[1] "Hello, world!"

For Python:

$ python hello.py
Hello, world!

For MATLAB we use the -r hello option to execute commands from the file hello.m (the statement hello in MATLAB will be sourced from a file hello.m found in the search path, which in this case is the current directory):

$ matlab -nodesktop -nodisplay -nosplash -nojvm -r hello

                            < M A T L A B (R) >
                  Copyright 1984-2019 The MathWorks, Inc.
                  R2019a (9.6.0.1072779) 64-bit (glnxa64)
                               March 8, 2019


For online documentation, see https://www.mathworks.com/support
For product information, visit www.mathworks.com.

Hello, world!

Using script files in this way allows us to repeat a whole program without entering each line and waiting for the response. As we'll see in the next part of the tutorial, such scripts may also be used in a non-interactive mode in a submitted batch job.

Writing and submitting job scripts¶

In the previous part of the tutorial we used an interactive qlogin session to run programs on a compute node. Here we'll use a batch job to run a compiled program or script in a non-interactive way.

Running a pre-compiled C, C++ or Fortran program¶

Once we've compiled our program source in an interactive session, we have an executable program which we can run whenever we want. To run the program on compute nodes we switch to using batch jobs. If we have the program hello in our current directory we create the following job submission script, called hello-compiled.sh, to run it:

#!/bin/sh
#$ -cwd              # Run the job in the current directory
#$ -pe smp 1         # Request a single core on the node
#$ -l h_vmem=0.5G    # Request 0.5GB memory for the job
#$ -l h_rt=0:10:0    # The job should complete in 10 minutes
#$ -N hello          # Call the job "hello"
#$ -o hello.out      # Write output to the file hello.out

# Load the compiler module
module load intel/2018.3

# Run the compiled program
./hello

Before running the program in the job, we load the same module that we used to compile it, to ensure that the compiler's run-time environment is available.

The job script can be submitted using the qsub command, and we can see the state of the submitted job in the queue using qstat:

qsub hello-compiled.sh
qstat

Once the job has finished, which can be seen by the job no longer appearing in the qstat output, we can look at the output file created by the job:

$ cat hello.out
Hello, world!
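Rather than re-running qstat by hand, the wait can be scripted. This is a minimal sketch under stated assumptions, not an Apocrita-provided tool: wait_for_job is a hypothetical helper, and it assumes that qstat -j returns a non-zero exit status once the scheduler no longer knows the job and that qsub accepts -terse to print only the job ID, as Grid Engine schedulers generally do.

```shell
# wait_for_job JOBID [INTERVAL]: poll until the scheduler no longer lists the job
wait_for_job() {
  job_id=$1
  interval=${2:-30}   # seconds between checks; keep this generous
  # Assumption: "qstat -j JOBID" fails once the job has left the queue
  while qstat -j "$job_id" >/dev/null 2>&1; do
    sleep "$interval"
  done
}

# Typical use on the cluster (shown as comments, not run here):
#   job_id=$(qsub -terse hello-compiled.sh)
#   wait_for_job "$job_id" && cat hello.out
```

Polling every 30 seconds or more keeps the load on the scheduler low; for a short test job this is usually all that's needed.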

Running in batch a MATLAB, Python or R program¶

Running a script non-interactively in a batch job is much the same as how we ran it non-interactively in the session earlier. As with the compiled program we create a submission script hello-scripts.sh:

#!/bin/sh
#$ -cwd              # Run the job in the current directory
#$ -pe smp 1         # Request a single core on the node
#$ -l h_vmem=0.5G    # Request 0.5GB memory for the job
#$ -l h_rt=0:10:0    # The job should complete in 10 minutes
#$ -N hello          # Call the job "hello"
#$ -o hello.out      # Write output to the file hello.out

# Load the interpreter modules
module load matlab/2019a python/3.6.3 R/3.6.1

# Run the interpreters with the scripts
# MATLAB
matlab -nodesktop -nodisplay -nosplash -nojvm -r hello

# Python
python hello.py

# R
Rscript hello.R

Conclusion¶

In this tutorial we've looked at how we can run basic programs on the Apocrita cluster. We used modules to set up our environment to gain access to compilers or interpreters. For C, C++ and Fortran we saw how to compile source code into runnable programs and how to run those programs; for MATLAB, Python and R we ran interactive sessions and submitted scripts which ran non-interactively.

For readers with a QMUL GitHub Enterprise account, we provide a repository with example code and job submission scripts.