Memory debugging
Memory debugging is the process of tracking down bugs related to the allocation and deallocation of dynamic memory. This page starts with a short introduction to Valgrind for this purpose, then shows how to use it with parallel code. See also the Cray documentation page about debugging tools in the Cray Programming Environment.
Valgrind
The Valgrind memcheck tool detects memory errors and memory leaks. It monitors every access, allocation, and deallocation of memory, which allows you to debug the following problems:
- bad memory accesses
- uses of uninitialized values
- memory leaks
- double frees or mismatched frees
- overlapping source and destination memory blocks to memory copy functions
To use Valgrind, compile your application with the debug flag (-g) so that Valgrind can point you to the faulty lines of code.
Next, run your application under Valgrind by prefixing the command that starts it with valgrind.
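As a minimal sketch, assuming a C source file my_app.c and a binary my_app (both placeholder names) and the Cray compiler wrapper cc, the two steps might look like this. The --leak-check=full and --track-origins=yes options are optional memcheck settings chosen for illustration, not something this page prescribes.

```shell
# Build with debug symbols; -O0 keeps reported line numbers close to the source.
cc -g -O0 -o my_app my_app.c

# memcheck is Valgrind's default tool, so --tool=memcheck can be omitted.
valgrind --leak-check=full --track-origins=yes ./my_app
```

Valgrind prints its report to standard error once the program exits.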
Warning
When using Valgrind, your application will run much slower than normal and use more memory. Take this into consideration when you set up your job.
Valgrind also includes a heap profiler named massif. This tool measures how much heap memory your program uses and is selected by passing --tool=massif to valgrind.
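A possible invocation, again with my_app as a placeholder binary name, is sketched below. Massif writes its results to a file named massif.out.<pid>, which the ms_print utility shipped with Valgrind turns into a readable report. The process ID 12345 is made up for the example.

```shell
# Profile heap usage; results go to massif.out.<pid> in the current directory.
valgrind --tool=massif ./my_app

# Render the profile as text; replace 12345 with the PID in the actual file name.
ms_print massif.out.12345
```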
See also:
- Valgrind memcheck documentation
- Valgrind massif documentation
Valgrind4HPC
Debugging serial applications with Valgrind is quite straightforward. However, for a parallel application with multiple ranks, the output can become messy. That is where Valgrind4hpc comes in handy.
Valgrind4hpc aggregates any duplicate messages across ranks to help provide an understandable picture of program behavior. Valgrind4hpc manages starting and redirecting output from many copies of Valgrind, as well as deduplicating and filtering Valgrind messages.
To use Valgrind4hpc, you first have to load its module.
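Assuming the module carries the name of the tool, as is usual in the Cray Programming Environment, loading it would look like the sketch below. The module name is an assumption and may differ on your system.

```shell
# Module name is an assumption; confirm it with: module avail valgrind4hpc
module load valgrind4hpc
```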
The next step is to launch your application with Valgrind4hpc. It will take care of running your application through Slurm. The general form of the command is:
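A sketch of that form is shown here, with OPTIONS, ./my_app, and APP_ARGS as placeholders rather than literal text:

```shell
# Everything after "--" is passed to the application, not to Valgrind4hpc.
valgrind4hpc [OPTIONS] ./my_app -- [APP_ARGS]
```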
If your application does not take arguments, the -- at the end is optional.
The table below summarizes the basic options that you can use with Valgrind4hpc.
Option | Description |
---|---|
--num-ranks=<num ranks> | Specify the number of ranks to run |
--launcher-args=<arguments> | Arguments to the application launcher (srun) |
--valgrind-args=<arguments> | Specify Valgrind arguments |
--from-ranks=<ranks> | Only show Valgrind output from certain ranks |
For example, to run your application with 16 ranks on two nodes, set --num-ranks=16 and request the two nodes from the launcher through --launcher-args.
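The snippet below is a sketch of how such a command could be assembled. It assumes the 16 ranks are spread evenly over the two nodes, uses ./my_app as a placeholder binary, and picks --leak-check=full as an illustrative Valgrind argument. Site-specific srun options such as the partition or account are left out. It only prints the command so that it can be inspected first.

```shell
# Hypothetical job layout: 16 ranks spread evenly over 2 nodes.
num_ranks=16
num_nodes=2
tasks_per_node=$(( num_ranks / num_nodes ))

# Options forwarded to srun and to Valgrind; ./my_app is a placeholder binary.
launcher_args="--nodes=${num_nodes} --ntasks-per-node=${tasks_per_node}"
valgrind_args="--leak-check=full"

# Dry run: print the assembled command for inspection before launching it.
echo valgrind4hpc --num-ranks="${num_ranks}" \
     --launcher-args="\"${launcher_args}\"" \
     --valgrind-args="\"${valgrind_args}\"" \
     ./my_app
```

The printed line can be pasted into a job script or an interactive allocation as is.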