Valgrind

Valgrind [5] is a powerful tool for detecting memory management problems in programs [6]. The kinds of problems it can detect are often very difficult to find by other means and often cause crashes that are difficult to diagnose. Valgrind can be used with existing executables without recompiling or relinking, although the output it produces will be much more useful if you have compiled with the -g flag.

Valgrind is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. You can also use Valgrind to build new tools. Valgrind is a multipurpose code profiling and memory debugging tool for Linux on the x86 and, as of version 3, AMD64 architectures. It allows you to run your program in Valgrind's own environment, which monitors memory usage such as calls to malloc and free (or new and delete in C++).

Valgrind is basically an x86 emulator that checks all reads and writes of memory and intercepts all calls to allocate and deallocate memory. The memcheck tool of valgrind (which is the main tool and the only one covered in this chapter) can detect the following:

  • Use of uninitialised memory
  • Reading/writing memory after it has been free'd
  • Reading/writing off the end of malloc'd blocks
  • Reading/writing inappropriate areas below the stack
  • Memory leaks
  • Mismatched use of malloc/new/new[] vs free/delete/delete[]
  • Overlapping src and dst pointers in memcpy() and related functions
  • Doubly freed memory
  • Passing unaddressable bytes to a system call
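
The overlapping-copy item in the list above is perhaps the least familiar one, so here is a minimal sketch of it. The function names are hypothetical and are not taken from the appendix program. The first function copies between overlapping ranges with memcpy, which is the pattern memcheck flags; the second does the same job with memmove, which is defined for overlapping ranges.

```cpp
#include <cstring>
#include <string>

// Hypothetical example: shift the characters of a string right by one place.
// Source [0, n-1) and destination [1, n) overlap, so memcpy is not allowed here.
std::string shift_right_by_one_buggy(std::string text) {
    if (text.size() < 2) {
        return text;
    }
    char* data = &text[0];
    std::memcpy(data + 1, data, text.size() - 1);  // overlapping src and dst
    return text;
}

// Same operation, but memmove is specified to handle overlapping ranges.
std::string shift_right_by_one_fixed(std::string text) {
    if (text.size() < 2) {
        return text;
    }
    char* data = &text[0];
    std::memmove(data + 1, data, text.size() - 1);
    return text;
}
```

Only the memcpy variant should draw a complaint; the memmove variant leaves the first character in place and moves the rest along by one.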

A note to those running the 2.6 version of the Linux kernel: Valgrind versions ≤ 2.1.1 do not work on the 2.6 kernel. You either need a later version (which has not been released as of the time of this writing) or a version compiled from CVS. You also need to run sysctl -w kernel.vdso=0 as root in order for valgrind to run.

A simple program that contains nine different tests, each of which shows an example of an error that valgrind can catch, is included in an appendix. You can compile it with the command g++ -g valgrind-tests.cc -o valgrind-tests, and can run it by specifying the test number (1-9) at the command line (e.g. ./valgrind-tests 2).

Let me walk through running one of the tests under valgrind and the output it produces. To run test 2 of the valgrind-tests program in the appendix under valgrind, run valgrind --logfile=valgrind.output ./valgrind-tests 2. (Note that in the development versions of valgrind, --tool=memcheck must also appear on the command line; it is no longer assumed by default.) The logfile is not necessary, but I prefer the output of my program and the output from valgrind to be separated. Also, the --logfile option is slightly misleading in that the string you specify is not actually the logfile used, but rather is just part of the name of the logfile. A string of the form .pid5313, where 5313 is the process id number of the valgrind-tests program when it runs, will be appended. The code that this specific test runs is

After you launch valgrind like this from a terminal, the log file will contain a number of messages. The beginning of the logfile will look something like

Note that each line begins with a 5313 (the 5313 is the process id number and thus will be different each time the program runs). The reason for this flag on each line is that the output of the program is normally interspersed with the output of valgrind, and a way to tell the output from the two programs apart is needed. Of course, that need does not exist when a log file is specified as we have done, but valgrind still includes it anyway. The other information on the lines printed by valgrind should be clear.

After this header, valgrind prints any errors that it comes across. The test case we ran contained an invalid write to an already freed chunk of memory, so the messages from valgrind in the log file reflect that:

The way to read the first five lines of this is that an invalid write occurred (to some data structure that was four bytes long) at line 37 of valgrind-tests.cc, which was called from line 134 of valgrind-tests.cc, which was called from __libc_start_main, the outermost function of the valgrind-tests program. Line 37 of valgrind-tests.cc corresponds to the *i = 4 statement, which came right after the delete i statement in the code. Valgrind then tries to be more helpful and state why the write was invalid. The extra information it provides states that the invalid write was to a freed block (valgrind even goes so far as to state how big that freed block was and where inside it the invalid write landed), and then lists the stack trace of the functions involved in freeing that block: the block was freed at line 244 of replace_malloc.c, which was called in line 253 of vg_replace_malloc.c, which was called by line 36 of valgrind-tests.cc (which is where the delete i statement is), which was called by... I think you get the picture.
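
For readers who do not have the appendix at hand, the pattern being described can be sketched roughly as follows. This is an assumed reconstruction based only on the delete i and *i = 4 statements mentioned above; the function names are hypothetical and the real test in valgrind-tests.cc may differ.

```cpp
// Hypothetical sketch of a write to an already freed block.
int write_after_delete_buggy() {
    int* i = new int;
    delete i;   // the block is released here
    *i = 4;     // invalid write of size 4 to the freed block
    return 0;
}

// The same steps in a valid order: write first, then release.
int write_before_delete_fixed() {
    int* i = new int;
    *i = 4;
    int value = *i;
    delete i;
    return value;
}
```

The four-byte size in the report matches sizeof(int) on the x86 platforms discussed here.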

At the end of the log file, a summary of errors that valgrind found is printed.

This output is self-explanatory for the most part. The suppressed comment refers to the fact that valgrind allows errors to be suppressed--something that can come in handy since valgrind also reports on errors in all libraries that your application is linked to.

At this point, you really know everything you need to know to get started with valgrind; all it requires is adding 'valgrind' (plus maybe some options) at the beginning of your command line and then reading the output that is produced. A good thing to do at this point is to run the nine tests in the valgrind-tests program in the appendix under valgrind and get used to valgrind's output for different types of errors.

Valgrind does have a few notable disadvantages which are helpful to be aware of:

  • Valgrind does not work with statically linked binaries (in the development versions this has changed, but it still does not work as well as with dynamically linked binaries).
  • Valgrind uses a lot of memory, and programs run very slowly under it (25-50 times slower than natively). Of course, debugging is normally a slow process and 50 times slower than normal CPU speed is nothing compared to the time and frustration involved in manually tracking down bugs that valgrind can spot.
  • Optimized code can cause valgrind to wrongly report uninitialized value errors. The authors know how to fix this, but it would make valgrind much slower (and it is already quite slow). The suggested fix for this is to not optimize when trying to debug code with valgrind. Not optimizing when debugging is a good rule of thumb anyway.

There are many other options and ways in which you can run valgrind. You can learn more at http://valgrind.kde.org/. Also note that there are graphical frontends to valgrind, among them alleyoop, a front-end using the Gnome libraries. In the remainder of the section, I will try to list some of the more common options to pass to valgrind.

Errors from the usage of uninitialized memory are a little bit different in that they are not triggered instantly, meaning that they do not occur when uninitialized values are merely copied from one location to another. This means that you may need to check one of the functions further down the stack trace than the top one listed by valgrind. Sometimes, the default of valgrind to list no more than 4 functions in the stack trace will not be enough to see where the real cause of the problem was. You can increase the number of functions that valgrind lists with the --num-callers= option. Also, note that using a single uninitialized variable can result in many errors if that value is copied multiple times, something that can easily happen if that value is passed to another function. An example of this is the first test case of the valgrind-tests program.
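
To make the delayed reporting concrete, here is a small sketch with hypothetical names (it is not the first test case from the appendix). The uninitialized value is created in one function, copied through another without complaint, and only reported where it finally decides a branch, assuming the code is compiled without optimization.

```cpp
// Merely copies its argument; copying an uninitialized value is not reported.
int pass_through(int value) {
    return value;
}

// The branch here depends on the value, so this is where a report would land.
bool is_positive(int value) {
    if (value > 0) {
        return true;
    }
    return false;
}

// The real cause is in this function, one frame further down the stack trace.
bool check_buggy() {
    int unset;                        // never initialized
    int copy = pass_through(unset);   // silent copy
    return is_positive(copy);         // use of the copy triggers the error
}

// Initializing the variable removes the error at its source.
bool check_fixed() {
    int counter = 0;
    int copy = pass_through(counter);
    return is_positive(copy);
}
```

The top frame of the report would point into is_positive, even though the line that needs fixing is the declaration in check_buggy.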

By default, valgrind does not check for memory leaks. This can be changed by specifying the --leak-check=yes option. Even without this option, however, valgrind's end summary will still state whether memory is still in use, and, if there is, suggest that the user rerun with --leak-check turned on. Note that valgrind has a somewhat confusing default of merging reported leaks based on ignoring all but a few frames of the stack trace. This can be turned off by also specifying --leak-resolution=high.

There is also a --gdb-attach=yes option which allows one to attach gdb to the running program when an error is encountered in order to learn more about what is going on. Note that this option conflicts with the use of any log file, and it will probably require that the path to gdb be specified by using --gdb-path=/path/to/gdb.

Finally, a few other options of note are the -v option for more verbosity; the -fno-inline option, a compiler flag for C++ that makes it easier to see the function-call chain; the --gen-suppressions=yes option, which helps in the generation of suppressions files; and the --skin=addrcheck option (the syntax has changed to --tool=addrcheck in development versions), which causes valgrind to do fewer checks but runs about twice as fast and uses less memory. Suppressions files are a longer topic than I want to cover in this short tutorial, but if you ever find that valgrind displays lots of errors for a library that you are linked to and you do not want to see those errors, then --gen-suppressions can come in handy.

Gnome applications tend to have deep stack traces, much of which comes from the Glib main loop. So it tends to be important to specify a large value for --num-callers (say, 40 or so, just to be safe). Also, if checking for leaks, be sure to specify --leak-resolution=high.

Common valgrind options

  • --num-callers=number: Determines the number of function calls (i.e. depth of stack trace) to display as part of showing where an error occurs within a program. The default is a measly 4.
  • --leak-check=yes: Enables leak checking, which has valgrind search for memory leaks (i.e. allocated memory that has not been released) when the program finishes.
  • --leak-resolution=high: An option that should be used when doing leak checking, since all other settings result in confusing reports.
  • --show-reachable=yes: An option that makes leak checking more helpful by requesting that valgrind report whether pointers to unreleased blocks are still held by the program.
  • -v: Run in more verbose mode.
  • -fno-inline: A compiler flag (passed to g++, not to valgrind) for C++ programs which makes it easier to see the function-call chain.
  • --gen-suppressions=yes: A simple way to generate a suppressions file in order to facilitate ignoring certain errors in future runs of the same code.
  • --skin=addrcheck: Selects the specific tool of valgrind that will run. Memcheck (the only tool covered here) is the default. (Note that the name of this option has become --tool and has become mandatory for the development release.)
  • --logfile=file-basename: Record all errors and warnings to file-basename.pid<pid>, where <pid> is the process id number.


[5] Only works on Linux-x86, although people are working on porting it to FreeBSD.

[6] Actually, it can also profile cache hits and misses and detect data races in multithreaded programs, but those uses will not be covered here.

Valgrind is a program that checks for both memory leaks and runtime errors. A memory leak occurs whenever you allocate memory with new or malloc without subsequently deleting or freeing that memory before the program exits. Runtime errors, as their name implies, are errors that occur while your program is running. These errors may not cause your code to crash, but they can cause unpredictable results and should be resolved.

Memory Leaks

If your program has memory leaks, they will appear at the bottom of the Valgrind output, as follows:

Ultimately, your goal is to get each of these categories down to 0 bytes in 0 blocks. When this happens, the Leak Summary output will change to look as follows:
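
As an illustration of what produces a nonzero entry in that summary, here is a minimal sketch. The function names are hypothetical. The first version allocates an array and returns without releasing it; the second releases it before returning.

```cpp
#include <cstddef>

// Hypothetical example: the array is allocated but never released,
// so count * sizeof(int) bytes are leaked on every call.
long sum_indices_leaky(std::size_t count) {
    int* values = new int[count];
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        values[i] = static_cast<int>(i);
        total += values[i];
    }
    return total;   // values goes out of scope without delete[]
}

// Same computation, with the matching delete[] before returning.
long sum_indices_fixed(std::size_t count) {
    int* values = new int[count];
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        values[i] = static_cast<int>(i);
        total += values[i];
    }
    delete[] values;
    return total;
}
```

Both versions return the same result, which is why leaks like this tend to go unnoticed until a run with --leak-check=yes points them out.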

Runtime Errors

Valgrind outputs runtime errors together with your program's output. For example, consider the following Valgrind segment:

In the middle of running my code, Valgrind discovered a runtime error, in this case, Invalid write of size 4. Notice that it gives us a 'stack trace', in other words, a list of which functions were called to get to the point where this error occurred. In this case, main() called function1(), which called function2(), which called function3(), which called function4(). function4() is where the error occurred. Notice that Valgrind even gives a line number: our error occurred in the file valgrind.cpp on line 4.

Sometimes, Valgrind will report that an error occurred in a function that you didn't even write. For example:

As you can see in the stack trace, an error occurred in the file write.c on line 27. But you didn't even write that! When this happens, the best thing to do is to follow the stack trace backwards until you find a function that you did write. In this example, we would follow the stack trace back until we found the main() function, specifically line 180 in valgrind.cpp.

Let's now consider the most frequent Valgrind errors that you might face in this class:

Conditional jump or move depends on uninitialized value(s)

This error message means that you have a loop or if statement that depends on an uninitialized variable (a variable whose value you haven't set). In the example below, notice that the integer x is only set if userInput is equal to true. But then, no matter what, we use x in the for-loop condition, even though it might not be set. This creates problems, because a variable you never set holds whatever value happened to be in that memory.
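
The original listing is not reproduced here, so the following is a sketch of the situation just described. The names x and userInput come from the text; the function names and the value 3 are assumptions made for illustration.

```cpp
// x is only assigned when userInput is true, yet the loop always reads it.
int count_iterations_buggy(bool userInput) {
    int x;                          // not initialized
    if (userInput) {
        x = 3;
    }
    int iterations = 0;
    for (int i = 0; i < x; ++i) {   // condition depends on x even when unset
        ++iterations;
    }
    return iterations;
}

// Giving x a value on every path removes the error.
int count_iterations_fixed(bool userInput) {
    int x = 0;
    if (userInput) {
        x = 3;
    }
    int iterations = 0;
    for (int i = 0; i < x; ++i) {
        ++iterations;
    }
    return iterations;
}
```

Initializing every variable at its declaration is usually the simplest way to rule this error out.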

Invalid write of size n

This error message most often occurs if you have written outside the bounds of memory you have allocated. For example, suppose you allocate an array of integers with 5 items, but then try to access the 8th element:
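
A minimal sketch of that case, with hypothetical function names and an arbitrary stored value, might look like this. The size n in the message would be 4 here because each int is four bytes on the platforms discussed.

```cpp
// Writes to index 7 (the 8th element) of a 5-element array.
int write_past_end_buggy() {
    int* numbers = new int[5];
    numbers[7] = 42;      // outside the allocated block: invalid write
    delete[] numbers;
    return 0;
}

// Valid indices for a 5-element array are 0 through 4.
int write_last_element_fixed() {
    int* numbers = new int[5];
    numbers[4] = 42;
    int value = numbers[4];
    delete[] numbers;
    return value;
}
```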

Mismatched (or Invalid) free() / delete / delete[] / realloc()

This error message can have several causes. The most common reason is forgetting to use delete[] with dynamically allocated arrays. In this example, the valgrind error could be corrected by simply adding the square brackets after delete.
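
Since the example itself is missing from this copy, here is an assumed illustration with hypothetical names. The only difference between the two functions is the form of delete used on memory that came from new[].

```cpp
#include <cstddef>

// Allocates with new[] but releases with plain delete: mismatched.
double average_index_buggy(std::size_t count) {
    if (count == 0) {
        return 0.0;
    }
    double* samples = new double[count];
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        samples[i] = static_cast<double>(i);
        sum += samples[i];
    }
    delete samples;       // should be delete[] samples
    return sum / static_cast<double>(count);
}

// Arrays allocated with new[] are released with delete[].
double average_index_fixed(std::size_t count) {
    if (count == 0) {
        return 0.0;
    }
    double* samples = new double[count];
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        samples[i] = static_cast<double>(i);
        sum += samples[i];
    }
    delete[] samples;
    return sum / static_cast<double>(count);
}
```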

This can also occur if you don't correctly release the memory you've allocated. Anytime you use new, you must release the memory with delete. Anytime you use malloc, you must release it with free. Attempting to mix and match, as seen below, will cause a Valgrind error:
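
The sketch below stands in for the missing listing; the function names and the stored value are hypothetical. Memory obtained from malloc is handed to delete in the first function and to free in the second.

```cpp
#include <cstdlib>

// Allocates with malloc but releases with delete: mismatched.
int mix_malloc_and_delete_buggy() {
    int* value = static_cast<int*>(std::malloc(sizeof(int)));
    if (value == nullptr) {
        return -1;
    }
    *value = 7;
    int result = *value;
    delete value;         // should be std::free(value)
    return result;
}

// malloc is paired with free.
int malloc_and_free_fixed() {
    int* value = static_cast<int*>(std::malloc(sizeof(int)));
    if (value == nullptr) {
        return -1;
    }
    *value = 7;
    int result = *value;
    std::free(value);
    return result;
}
```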

One other thing that can cause this problem is deleting the same thing twice:
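
A sketch of a double delete, again with hypothetical names, is shown here together with one common way to guard against it: setting the pointer to nullptr after the first delete, since deleting a null pointer does nothing.

```cpp
// Releases the same block twice.
int delete_twice_buggy() {
    int* value = new int(5);
    int result = *value;
    delete value;
    delete value;         // second delete of the same block: invalid
    return result;
}

// After the first delete the pointer is reset, so a later delete is harmless.
int delete_once_fixed() {
    int* value = new int(5);
    int result = *value;
    delete value;
    value = nullptr;
    delete value;         // deleting nullptr is a no-op
    return result;
}
```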

Stack overflow in thread

In labs that use recursion, such as BST, you may run into this error. Most frequently this occurs when you don't have exhaustive base cases for a recursive function. Without all the necessary base cases, your recursive function will continually call itself until there is no more memory on the stack to call it.
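
As a simple illustration that does not depend on any particular lab, consider the hypothetical function below. Its base case only covers n equal to 0, so a negative argument never reaches it and the calls pile up until the stack is exhausted (assuming the compiler has not optimized the recursion away).

```cpp
// Base case misses negative input: sum_down_buggy(-1) recurses without end.
long sum_down_buggy(long n) {
    if (n == 0) {
        return 0;
    }
    return n + sum_down_buggy(n - 1);
}

// The base case now covers every input that should stop the recursion.
long sum_down_fixed(long n) {
    if (n <= 0) {
        return 0;
    }
    return n + sum_down_fixed(n - 1);
}
```

For non-negative input both versions agree; the difference only shows up on the inputs the first base case forgot.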

Process terminating with default action of signal 11 (SIGSEGV)

This means that your program crashed with a Segmentation Fault. A segmentation fault generally occurs if you try to access memory that your program is not allowed to access. In this class, this most commonly occurs if you try to access a data member of a NULL pointer, like this:
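
The listing is missing from this copy, so here is an assumed example. The Node type and the function names are hypothetical. The first function reads a data member through a pointer without checking it; the second checks for a null pointer first.

```cpp
// Hypothetical linked-list node.
struct Node {
    int value;
    Node* next;
};

// Crashes with a segmentation fault when head is null.
int first_value_buggy(const Node* head) {
    return head->value;
}

// Checks the pointer before using it and returns a caller-supplied fallback.
int first_value_fixed(const Node* head, int fallback) {
    if (head == nullptr) {
        return fallback;
    }
    return head->value;
}
```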

Process terminating with default action of signal 15 (SIGTERM)

In the autograder output, you will almost always see this error when your program's execution timed out (in this class, the time limit is 5 seconds). Like the stack overflow issue, this can happen with recursive functions that are missing necessary base cases. If this is the case, your Valgrind output will look something like this:

Notice that function1() continually called itself over and over until we ran out of time.