Using compiler options to help debug programs
Once we've written a program more advanced than our "Hello, world!" example, we're going to make mistakes. In this post, we'll look at how we can use the very compilers we're using to compile our program to pick up on some of these mistakes.
Introduction
To ensure that program components do what is intended, they should have appropriate specifications and tests. There are, however, a lot of common mistakes in implementation that our usual development tools can pick up. Here, we'll look at how the C and Fortran compilers we use to build our program can also detect some errors.
Each compiler differs in its ability to diagnose the types of errors discussed here. The options required to enable diagnostics vary, and some compilers may detect some of the problems by default. For concreteness, we'll focus on the compiler suites most commonly used on Apocrita: GCC and Intel. For the examples below, unless otherwise stated, we'll use the compiler modules gcc/10.2.0 and intel/2020.4 on Apocrita.
Errors and compiler flags
Incorrect variable names
In Fortran code we can use implicit typing of variables to avoid having to declare them explicitly:
program example
  i = 1    ! Implicitly an integer variable
  x = 1.2  ! Implicitly a (default) real variable
end program example
This is (very) prone to error, such as with
program example
  integer :: mycomplicatedvariable = 1
  mycomp1icatedvariable = 27
  print '(I5)', mycomplicatedvariable
end program example
With code we are writing now, we'd liberally use implicit none to ensure that no variable can be typed implicitly, giving a compile-time error for the example above. However, we can also use compiler flags to compile as though we'd put implicit none in all suitable scopes.
For example, with gfortran we can compile with -fimplicit-none:
$ gfortran -fimplicit-none example.f90
example.f90:3:23:
3 | mycomp1icatedvariable = 27
| 1
Error: Symbol ‘mycomp1icatedvariable’ at (1) has no IMPLICIT type; did you mean ‘mycomplicatedvariable’?
With ifort we can use -warn declarations:
$ ifort -warn declarations example.f90
example.f90(3): warning #6717: This name has not been given an explicit type. [MYCOMP1ICATEDVARIABLE]
mycomp1icatedvariable = 27
--^
This compiler diagnostic is very crude: it flags every implicitly typed name, so it picks up accidental use of implicit typing but gives many false positives in code that uses implicit typing intentionally. Nonetheless, it can be helpful in checking that implicit none has been used in all the right places.
Accessing arrays out of bounds
In C and Fortran programs that use arrays, we may attempt to access elements outside the bounds of an array. For example, we may have the C program
int main (void) {
  float x[10], y;

  x[10] = 1.2;
  y = x[10];
  return 0;
}
or the Fortran program
program example
  implicit none

  real :: x(10)
  integer :: i

  do i = 0, 10
    x(i) = 1.2
  end do
end program example
If we are lucky, our programs with out-of-bounds access will crash when we run them. However, we may be unlucky enough for them not to crash, instead finishing as though nothing were wrong.
Compiling the C program with gcc's bounds checking enabled (-fsanitize=bounds) we see errors detected when running:
$ gcc -o example example.c -fsanitize=bounds
$ ./example
example.c:4:4: runtime error: index 10 out of bounds for type 'float [10]'
example.c:5:8: runtime error: index 10 out of bounds for type 'float [10]'
Similarly, when compiled with icc and -check-pointers=rw we have a (less helpful) message like:
$ icc -o example example.c -check-pointers=rw
$ ./example
CHKP: Bounds check error ptr=0x7ffff028eca8 sz=4 lb=0x7ffff028ec80 ub=0x7ffff028eca7 loc=0x400cb3
Compiling the Fortran program with gfortran's bounds checking enabled (-fbounds-check, a deprecated alias for -fcheck=bounds) we see an error when running:
$ gfortran -o example example.f90 -fbounds-check
[ ... ]
$ ./example
At line 8 of file example.f90
Fortran runtime error: Index '0' of dimension 1 of array 'x' below lower bound of 1
For the same Fortran program compiled with ifort (-check bounds) we also see an error when running:
$ ifort -o example example.f90 -check bounds
$ ./example
forrtl: severe (408): fort: (3): Subscript #1 of the array X has value 0 which is less than the lower bound of 1
In simple cases we may even see bounds violations in Fortran programs being detected when compiling, reported by the compiler either as a warning or as an error. For example, when compiling the example with gfortran the compile-time message elided above looks like:
example.f90:8:6:
7 | do i = 0, 10
| 2
8 | x(i) = 1.2
| 1
Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2)
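Where no compiler check is available, the same test can be written by hand. The following is only a minimal sketch of that idea in C, not something the compilers above generate: the names N, checked_set and fill are hypothetical, and the loop is the C counterpart of the Fortran loop with valid indices 0 to N-1.

```c
#include <stdbool.h>
#include <stddef.h>

#define N 10  /* array length, matching the examples above (hypothetical name) */

/* Hypothetical helper: store value in x[i] only when i is a valid index.
   Returns true on success and false when i is out of bounds. */
bool checked_set(float x[N], size_t i, float value) {
    if (i >= N) {
        return false;  /* index N is one past the end, like x[10] above */
    }
    x[i] = value;
    return true;
}

/* Fill every element and return how many stores succeeded.
   The loop condition i < N keeps every access within bounds. */
size_t fill(float x[N], float value) {
    size_t stored = 0;
    for (size_t i = 0; i < N; i++) {
        if (checked_set(x, i, value)) {
            stored++;
        }
    }
    return stored;
}
```

A hand-written check like this costs a comparison on every access, which is one reason to prefer the compiler options during development and testing.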
Referencing an undefined variable
In our C and Fortran programs, we may forget to give a variable a value before we try to use that value. For example, we may omit an initialization, as in the C program
int main(void) {
  float x;
  x = x + 1.2;
  return 0;
}
or the Fortran program
program example
  implicit none
  real :: x
  x = x + 1.2
end program example
In such cases the compiled program may just use whatever "random" value the variable happens to hold and carry on, eventually reporting "wrong" results.
With the GCC compilers, gcc for C and gfortran for Fortran, we can request detection during compilation of such uninitialized references using the -Wuninitialized compiler option.
For C:
$ gcc example.c -Wuninitialized
example.c: In function ‘main’:
example.c:3:9: warning: ‘x’ is used uninitialized in this function [-Wuninitialized]
3 | x = x + 1.2;
| ~~^~~~~
For Fortran:
$ gfortran example.f90 -Wuninitialized
example.f90:5:0:
5 | x = x + 1.2
|
Warning: ‘x’ is used uninitialized in this function [-Wuninitialized]
We can also try to detect references to uninitialized variables when running the program. For example, with the Intel C compiler (icc) we have the option -check=uninit:
$ icc -o example example.c -check=uninit
$ ./example
Run-Time Check Failure: The variable 'x' is being used in example.c(3,3) without being initialized
Abort
For the Intel Fortran compiler (ifort) we have -check uninit:
$ ifort -o example example.f90 -check uninit
$ ./example
forrtl: severe (194): Run-Time Check Failure. The variable 'example_$X' is being used in 'example.f90(5,3)' without being defined
With gfortran we can attempt similar run-time checks by using a compiler option to initialize real variables or the parts of complex variables to "signalling NaN" values and then trapping the resulting signal when referenced:
$ gfortran -o example example.f90 -finit-real=snan -ffpe-trap=invalid
$ ./example
Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
This, however, makes the compile-time option -Wuninitialized ineffective for real and complex variables and has no equivalent for other types (except for the components of derived types when combined with -finit-derived). Optimizations performed by the compiler may also defeat this diagnostic.
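The idea of pre-filling variables with NaN can be imitated by hand in C. The following is a rough sketch under stated assumptions: poisoned and result_is_tainted are hypothetical names, and the NAN macro gives a quiet NaN, so nothing is trapped here. The NaN simply propagates through the arithmetic and has to be checked for afterwards.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: stands in for a variable that has been pre-filled
   with NaN rather than left with an indeterminate value. */
float poisoned(void) {
    return NAN;  /* quiet NaN from <math.h>; no signal is raised on use */
}

/* Use the "uninitialized" value in arithmetic and report whether the
   result is NaN, which reveals that a poisoned value was referenced. */
bool result_is_tainted(void) {
    float x = poisoned();
    x = x + 1.2f;     /* NaN + 1.2 is still NaN */
    return isnan(x);
}
```

Unlike the trapping approach, this only shows that some poisoned value reached the result, not where it was first referenced.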
Incorrect procedure references
When calling a procedure, it's possible for us to provide the wrong arguments or parameters. For example, if we have an external Fortran procedure like
subroutine set_x(x)
  implicit none
  real :: x
  x = 1.2
end subroutine set_x
in the file set_x.f90, it's possible for us to call that subroutine in a program with a mismatched argument:
program example
  implicit none

  external set_x
  double precision x

  call set_x(x)
  print *, x
end program example
Here the actual argument x, which is double precision, doesn't match the real dummy argument that the subroutine expects. Using ifort and the -warn interfaces option we can detect these cases by automatically checking interfaces:
$ ifort -warn interfaces example.f90 set_x.f90
example.f90(7): error #6633: The type of the actual argument differs from the type of the dummy argument. [X]
call set_x(x)
-------------^
compilation aborted for example.f90 (code 1)
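In C, the comparable safeguard is a function prototype that is visible at every call site. The following is an illustrative sketch only: get_x is a hypothetical caller, and in a real project the prototype would normally live in a shared header (a hypothetical set_x.h) rather than alongside the definition.

```c
/* Prototype: in a real project this would sit in a shared header
   (hypothetical set_x.h) included by every file that calls set_x. */
void set_x(float *x);

/* Definition, matching the prototype above. */
void set_x(float *x) {
    *x = 1.2f;
}

/* Hypothetical caller using the matching type. With the prototype in
   scope, passing a double * instead would be diagnosed by the compiler
   as an incompatible pointer type. */
float get_x(void) {
    float x = 0.0f;
    set_x(&x);
    return x;
}
```

The check relies on the prototype being in scope at the call, just as the Fortran check relies on the compiler knowing the interface of the procedure.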
More helpful output
If one of the runtime checks fails and the program stops with an error, we may see output relating to where in the program the error was detected. For example, running our example with array bounds violation we may see output (using ifort) like:
forrtl: severe (408): fort: (3): Subscript #1 of the array X has value 0 which is less than the lower bound of 1
Image PC Routine Line Source
example 00000000004062EF Unknown Unknown Unknown
example 00000000004038A8 Unknown Unknown Unknown
example 0000000000403822 Unknown Unknown Unknown
libc-2.17.so 00007FF73AA8E555 __libc_start_main Unknown Unknown
example 0000000000403729 Unknown Unknown Unknown
This stack trace by itself doesn't seem very useful. We can, however, ask for more detail when compiling by using the -g and -traceback options. When run having compiled with these options, we instead see output with additional information about the location of the erroneous reference:
forrtl: severe (408): fort: (3): Subscript #1 of the array X has value 0 which is less than the lower bound of 1
Image PC Routine Line Source
example 000000000040639F Unknown Unknown Unknown
example 00000000004038C9 MAIN__ 8 example.f90
example 0000000000403822 Unknown Unknown Unknown
libc-2.17.so 00007F623AEB7555 __libc_start_main Unknown Unknown
example 0000000000403729 Unknown Unknown Unknown
With this additional output we know not just that the array x was used with an incorrect subscript, but that the particular access is in line 8, in the main program (MAIN__) in the file example.f90.
The equivalent options for gfortran are -g and -fbacktrace.
In some cases with the -g option, compiler optimizations can make the information presented slightly inaccurate. When using -g it's usually clearer to limit the optimizations performed using the -O0 option (for Intel and GCC compilers) or -Og (for GCC compilers).
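For C programs on systems that use glibc, a similar trace can be requested by hand. The following is a minimal sketch, assuming glibc's execinfo.h is available; print_trace and MAX_FRAMES are hypothetical names, and this is not what -traceback or -fbacktrace do internally.

```c
#include <execinfo.h>  /* backtrace, backtrace_symbols_fd (glibc extension) */
#include <unistd.h>    /* STDERR_FILENO */

#define MAX_FRAMES 32  /* hypothetical limit on the depth of the trace */

/* Print the current call stack to standard error and return the number
   of stack frames that were captured. */
int print_trace(void) {
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    return count;
}
```

With the GNU linker, function names generally appear in this output only when the program is linked with -rdynamic; otherwise the trace may show little more than addresses, much like the first ifort trace above.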
Conclusion
In this post we've looked at a number of types of errors that we can make when writing C or Fortran code. Using compiler options we can often detect these errors, either when compiling or when later running the programs.
We saw options for each type of error but we can easily turn on several checks at once. With gfortran we can use -Wall -fcheck=all, for example, and with ifort -warn all -check all. For the GCC compilers gcc and gfortran we can use the sanitizer options -fsanitize=... to enable several diagnostics. Some of the sanitizers are incompatible with each other, so there is no -fsanitize=all option and checks have to be enabled individually.
There are many types of errors that cannot be detected by our compilers here, and several errors that can be detected that we haven't seen in this post. Documentation for each compiler will give additional detail on other options.
When developing programs it's good practice to compile with diagnostic checks and to use different compilers to benefit from the strengths of each. As compilers develop, their diagnostic abilities may also improve.