Using compiler options to help debug programs

Once we've written a program more advanced than our "Hello, world!" example, we're going to make mistakes. In this post, we'll look at how the very compilers we use to build our programs can pick up on some of these mistakes.

Introduction

To ensure that program components do what is intended, they should have appropriate specifications and tests. There are, however, a lot of common mistakes in implementation that our usual development tools can pick up. Here, we'll look at how the C and Fortran compilers we use to build our program can also detect some errors.

Each compiler differs in its ability to diagnose the types of errors discussed here. The options required to enable diagnostics will vary and some compilers may detect some of the problems by default. For concreteness, we'll focus on the compiler suites most commonly used on Apocrita: GCC and Intel. For the examples below, unless otherwise stated, we'll use the compiler modules gcc/10.2.0 and intel/2020.4 on Apocrita.

Errors and compiler flags

Incorrect variable names

In Fortran code we can use implicit typing of variables to avoid having to declare them explicitly:

program example
  i = 1    ! Implicitly an integer variable
  x = 1.2  ! Implicitly a (default) real variable
end program example

This is (very) prone to error. For example, a mistyped name silently creates a new variable instead of assigning to the one we intended:

program example
  integer :: mycomplicatedvariable = 1
  mycomp1icatedvariable = 27
  print '(I5)', mycomplicatedvariable
end program example

With code we are writing now, we'd liberally use implicit none to ensure that no variable can be typed implicitly, giving a compile-time error for the example above. However, we can also use compiler flags to compile as though we'd put implicit none in all suitable scopes.

For example, with gfortran we can compile with -fimplicit-none:

$ gfortran -fimplicit-none example.f90
example.f90:3:23:

    3 | mycomp1icatedvariable = 27
      |                     1
Error: Symbol ‘mycomp1icatedvariable’ at (1) has no IMPLICIT type; did you mean ‘mycomplicatedvariable’?

With ifort we can use -warn declarations:

$ ifort -warn declarations example.f90
example.f90(3): warning #6717: This name has not been given an explicit type.   [MYCOMP1ICATEDVARIABLE]
  mycomp1icatedvariable = 27
--^

This compiler diagnostic is very crude: it cannot tell accidental use of implicit typing from intentional use, so for code that deliberately relies on implicit typing it would give many false positives. Nonetheless, it can be helpful in checking that implicit none has been used in all the right places.

Accessing arrays out of bounds

In C and Fortran programs that use arrays, we may attempt to access elements outside the bounds of an array. For example, we may have the C program

int main (void) {
  float x[10], y;

  x[10] = 1.2;
  y = x[10];
  return 0;
}

or the Fortran program

program example
  implicit none

  real :: x(10)
  integer :: i

  do i = 0, 10
    x(i) = 1.2
  end do
end program example

If we are lucky, our programs with out-of-bounds access will crash when we run them. However, we may be unlucky enough for them not to crash, instead finishing as though nothing were wrong.

Compiling the C program with gcc's bounds checking enabled (-fsanitize=bounds) we see errors detected when running:

$ gcc -o example example.c -fsanitize=bounds
$ ./example
example.c:4:4: runtime error: index 10 out of bounds for type 'float [10]'
example.c:5:8: runtime error: index 10 out of bounds for type 'float [10]'

Similarly, when compiled with icc and -check-pointers=rw we have a (less helpful) message like:

$ icc -o example example.c -check-pointers=rw
$ ./example
CHKP: Bounds check error ptr=0x7ffff028eca8 sz=4 lb=0x7ffff028ec80 ub=0x7ffff028eca7 loc=0x400cb3

Compiling the Fortran program with gfortran's bounds checking enabled (-fbounds-check, a deprecated alias for -fcheck=bounds) we see an error when running:

$ gfortran -o example example.f90 -fbounds-check
[ ... ]
$ ./example
At line 8 of file example.f90
Fortran runtime error: Index '0' of dimension 1 of array 'x' below lower bound of 1

For the same Fortran program compiled with ifort (-check bounds) we also see an error when running:

$ ifort -o example example.f90 -check bounds
$ ./example
forrtl: severe (408): fort: (3): Subscript #1 of the array X has value 0 which is less than the lower bound of 1

In simple cases we may even see bounds violations in Fortran programs being detected when compiling, reported by the compiler either as a warning or as an error. For example, when compiling the example with gfortran the compile-time message elided above looks like:

example.f90:8:6:

    7 |   do i = 0, 10
      |              2
    8 |     x(i) = 1.2
      |      1
Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2)
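
Conceptually, the run-time checks above amount to a comparison made before each array access. As a rough illustration only, the following C sketch performs that comparison by hand. The helper names in_bounds and checked_store are hypothetical, introduced here for the example, and are not part of any compiler or library.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when index i is valid for an array of n elements. */
static bool in_bounds(size_t i, size_t n) {
  return i < n;
}

/* Hypothetical helper: store value in x[i] only if i is in bounds.
   Returns true on success, false (leaving x untouched) otherwise. */
static bool checked_store(float *x, size_t n, size_t i, float value) {
  if (!in_bounds(i, n)) {
    return false;
  }
  x[i] = value;
  return true;
}
```

With float x[10], the call checked_store(x, 10, 10, 1.2f) returns false rather than writing past the end of the array, which is the access the C example above gets wrong. Hand-written checks like this are easy to forget, which is why letting the compiler insert them during development is attractive.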

Referencing an undefined variable

In our C and Fortran programs, we may forget to give a variable a value before we try to use that value. For example, we may miss out an initialization expression such as in the C program

int main(void) {
  float x;
  x = x + 1.2;
  return 0;
}

or the Fortran program

program example
  implicit none

  real :: x
  x = x + 1.2
end program example

In such cases the compiled program may just use whatever "random" value the variable happens to hold and carry on, eventually reporting "wrong" results.

With the GCC compilers (gcc for C and gfortran for Fortran) we can request compile-time detection of such uninitialized references using the -Wuninitialized compiler option.

For C:

$ gcc example.c -Wuninitialized
example.c: In function ‘main’:
example.c:3:9: warning: ‘x’ is used uninitialized in this function [-Wuninitialized]
    3 |   x = x + 1.2;
      |       ~~^~~~~

For Fortran:

$ gfortran example.f90 -Wuninitialized
example.f90:5:0:

    5 |   x = x + 1.2
      |
Warning: ‘x’ is used uninitialized in this function [-Wuninitialized]

We can also try to detect references to uninitialized variables when running the program. For example, with the Intel C compiler (icc) we have the option -check=uninit:

$ icc -o example example.c -check=uninit
$ ./example

Run-Time Check Failure: The variable 'x' is being used in example.c(3,3) without being initialized

Abort

For the Intel Fortran compiler (ifort) we have -check uninit:

$ ifort -o example example.f90 -check uninit
$ ./example
forrtl: severe (194): Run-Time Check Failure. The variable 'example_$X' is being used in 'example.f90(5,3)' without being defined

With gfortran we can attempt similar run-time checks by using a compiler option to initialize real variables or the parts of complex variables to "signalling NaN" values and then trapping the resulting signal when referenced:

$ gfortran -o example example.f90 -finit-real=snan -ffpe-trap=invalid
$ ./example

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

This, however, makes the compiler option -Wuninitialized ineffective for real and complex variables and has no equivalent for other types (except for the components of derived types when combined with -finit-derived). Optimizations performed by the compiler may also render this diagnostic impotent.
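
The idea of marking "not yet defined" variables with a NaN can also be imitated by hand. The C sketch below is only an analogy to the gfortran option, not how the compiler implements it: it uses a quiet NaN and an explicit test rather than a signalling NaN and a trapped signal. The names undefined_value, is_undefined and add_increment are hypothetical.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical sentinel marking a float that has not been given a value yet. */
static float undefined_value(void) {
  return NAN;  /* quiet NaN from <math.h> */
}

/* True if x still holds the sentinel, i.e. was never assigned a real value. */
static bool is_undefined(float x) {
  return isnan(x);
}

/* Adds 1.2 to *x, refusing (and returning false) if *x is still undefined. */
static bool add_increment(float *x) {
  if (is_undefined(*x)) {
    return false;
  }
  *x = *x + 1.2f;
  return true;
}
```

A variable set with float x = undefined_value(); is rejected by add_increment(&x) until it has been assigned. The sketch shares a weakness with the compiler option: a NaN that results from a genuine calculation cannot be told apart from the sentinel.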

Incorrect procedure references

When calling a procedure, it's possible for us to provide the wrong arguments or parameters. For example, if we have an external Fortran procedure like

subroutine set_x(x)
  implicit none
  real :: x
  x = 1.2
end subroutine set_x

in the file set_x.f90 it's possible for us to call that subroutine in a program with a mismatched argument:

program example
  implicit none

  external set_x
  double precision x

  call set_x(x)
  print *, x

end program example

Here the actual argument x, which is double precision, doesn't match the default real dummy argument that set_x expects. Using ifort and the -warn interfaces option we can detect these cases by automatically checking interfaces:

$ ifort -warn interfaces example.f90 set_x.f90
example.f90(7): error #6633: The type of the actual argument differs from the type of the dummy argument.   [X]
  call set_x(x)
-------------^
compilation aborted for example.f90 (code 1)
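
For comparison, C has a loosely similar mechanism: when a function prototype is in scope, the compiler can compare each call's arguments with the declared parameters. The following sketch is a C counterpart written for this comparison, assuming a pointer-based set_x; it is not a translation the Fortran compilers use.

```c
/* Prototype: with this declaration in scope, the compiler can compare
   each call's argument types against the parameter list. */
void set_x(float *x);

/* Definition, matching the prototype above. */
void set_x(float *x) {
  *x = 1.2f;
}

/* A call such as
     float x;  set_x(&x);
   matches the prototype. A call such as
     double d; set_x(&d);
   passes a pointer of the wrong type and should draw a diagnostic. */
```

As in the Fortran case, the check only works when the compiler can see the interface at the point of the call, which is why such prototypes are normally kept in a shared header file.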

More helpful output

If one of the runtime checks fails and the program stops with an error, we may see output relating to where in the program the error was detected. For example, running our example with the array bounds violation we may see output (using ifort) like:

forrtl: severe (408): fort: (3): Subscript #1 of the array X has value 0 which is less than the lower bound of 1

Image              PC                Routine            Line        Source
example            00000000004062EF  Unknown               Unknown  Unknown
example            00000000004038A8  Unknown               Unknown  Unknown
example            0000000000403822  Unknown               Unknown  Unknown
libc-2.17.so       00007FF73AA8E555  __libc_start_main     Unknown  Unknown
example            0000000000403729  Unknown               Unknown  Unknown

This stack trace by itself doesn't seem very useful. We can, however, ask for more detail when compiling by using the -g and -traceback options. When we run the program after compiling with these options, we instead see output with additional information about the location of the erroneous reference:

forrtl: severe (408): fort: (3): Subscript #1 of the array X has value 0 which is less than the lower bound of 1

Image              PC                Routine            Line        Source
example            000000000040639F  Unknown               Unknown  Unknown
example            00000000004038C9  MAIN__                      8  example.f90
example            0000000000403822  Unknown               Unknown  Unknown
libc-2.17.so       00007F623AEB7555  __libc_start_main     Unknown  Unknown
example            0000000000403729  Unknown               Unknown  Unknown

With this additional output we know not just that the array x was used with an incorrect subscript, but that the particular access is in line 8, in the main program (MAIN__) in the file example.f90.

The equivalent options for gfortran are -g and -fbacktrace.

In some cases with the -g option, compiler optimizations can make the information presented slightly inaccurate. When using -g it's usually clearer to limit the optimizations performed using the -O0 option (for Intel and GCC compilers) or -Og (for GCC compilers).

Conclusion

In this post we've looked at a number of types of errors that we can make when writing C or Fortran code. Using compiler options we can often detect these errors, either when compiling or when later running the programs.

We saw options for each type of error but we can easily turn on several checks at once. With gfortran we can use -Wall -fcheck=all, for example, and with ifort -warn all -check all. For the GCC compilers gcc and gfortran we can use the sanitizer options -fsanitize=... to enable several diagnostics. Some of the sanitizer options are incompatible with each other, so there is no -fsanitize=all option and checks have to be enabled individually.

There are many types of errors that cannot be detected by our compilers here, and several errors that can be detected that we haven't seen in this post. Documentation for each compiler will give additional detail on other options.

When developing programs it's good practice to compile with diagnostic checks and to use different compilers to benefit from the strengths of each. As compilers develop, their diagnostic abilities may also improve.