Valgrind provides several mechanisms to locate memory problems in your code. Paul Floyd shows us how to use them.
In the previous part of this series I covered basic use of memcheck. In this article, I’ll expand on that and cover the difficult cases that I touched on previously:
- Compiling with Valgrind macros.
- Attaching a debugger.
- Using memory pools.
Compiling with Valgrind macros
When you are testing an application with memcheck, you have a passive role in interacting with Valgrind. Valgrind will only generate output if an error occurs (not counting the header, which contains copyright information, the Valgrind options in effect and the shared libraries and functions intercepted, and the footer, with its summary of errors found and suppressions used). You have no access to the internals of the VEX virtual machine or the state of memory. Valgrind provides you with macros that allow you to actively control output and interact with the VM.
In order to track down the precise origins of an error, you might want to generate output at points prior to the error. Alternatively, you might want to examine memory even when there are no errors. You can think of the macros for this purpose as being a bit like printf statements, with the output going into Valgrind’s output (the console or the log file). In addition to Valgrind’s output, the macros may return a value, either ‘directly’ from the macro as a status, or through inout arguments to the macro. There are also macros to trigger Valgrind actions like performing a leak check (which otherwise will only happen when the application under test terminates).
In your C or C++ source file, you have to include the appropriate header, e.g.,
#include "valgrind/memcheck.h"
(you might prefer to use <memcheck.h> if Valgrind is installed with its include files in the system header directories).
Then you need to add the include path to the compiler flags, if the headers are not in the system include path. For instance, in a GNU makefile
CPPFLAGS += -I "/Applications/valgrind/include"
Then you can use the macros in your source. Since Valgrind does not link any extra libraries, these macros use a different mechanism. The macros contain a sequence of machine instructions that no known compiler would ever issue and that have no side effects. The Valgrind virtual machine detects this sequence and instigates a client request. When not running under Valgrind, there is no effect other than a very small time penalty. For example, on x86 the following is used:
#define __SPECIAL_INSTRUCTION_PREAMBLE \
    "roll $3, %%edi ; roll $13, %%edi\n\t" \
    "roll $29, %%edi ; roll $19, %%edi\n\t"
which rotates EDI by a total of 64 bits (3 + 13 + 29 + 19), that is, two complete turns of the 32-bit register, leaving it unchanged.
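The arithmetic behind the preamble can be checked with a small sketch. The names rotl32 and applyPreambleRotations are hypothetical, invented for this illustration; they model the four roll instructions in plain C++ and are not part of Valgrind.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value left by n bits (0 < n < 32), as the x86 roll
// instruction does. Hypothetical helper, for illustration only.
std::uint32_t rotl32(std::uint32_t value, unsigned n)
{
    assert(n > 0 && n < 32);
    return (value << n) | (value >> (32 - n));
}

// Apply the same four rotation counts as the preamble: 3, 13, 29 and 19.
// They sum to 64, so the value comes back unchanged.
std::uint32_t applyPreambleRotations(std::uint32_t edi)
{
    edi = rotl32(edi, 3);
    edi = rotl32(edi, 13);
    edi = rotl32(edi, 29);
    edi = rotl32(edi, 19);
    return edi;
}
```

Whatever value is passed in, applyPreambleRotations returns it untouched, which is why the sequence is harmless when the program runs natively.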
There are numerous such macros: Listing 1 shows the client macros in memcheck.h .
VALGRIND_MAKE_MEM_NOACCESS(_qzz_addr,_qzz_len)
VALGRIND_MAKE_MEM_UNDEFINED(_qzz_addr,_qzz_len)
VALGRIND_MAKE_MEM_DEFINED(_qzz_addr,_qzz_len)
VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE(_qzz_addr,_qzz_len)
VALGRIND_CREATE_BLOCK(_qzz_addr,_qzz_len, _qzz_desc)
VALGRIND_DISCARD(_qzz_blkindex)
VALGRIND_CHECK_MEM_IS_ADDRESSABLE(_qzz_addr,_qzz_len)
VALGRIND_CHECK_MEM_IS_DEFINED(_qzz_addr,_qzz_len)
VALGRIND_CHECK_VALUE_IS_DEFINED(__lvalue)
VALGRIND_DO_LEAK_CHECK
VALGRIND_DO_ADDED_LEAK_CHECK
VALGRIND_DO_CHANGED_LEAK_CHECK
VALGRIND_DO_QUICK_LEAK_CHECK
VALGRIND_COUNT_LEAKS(leaked, dubious, reachable, suppressed)
VALGRIND_COUNT_LEAK_BLOCKS(leaked, dubious, reachable, suppressed)
VALGRIND_GET_VBITS(zza,zzvbits,zznbytes)
VALGRIND_SET_VBITS(zza,zzvbits,zznbytes)
Listing 1
Let’s take a look at an example (Listing 2).
// main.c
// clientreq
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "memcheck.h"

struct two_bit
{
    char foo:2;
    char :0;
};

int main (int argc, const char * argv[])
{
    const size_t size = 2*sizeof(int);
    int *pi = malloc(size);
    short *ps;
    struct two_bit *ptb;
    pi[0] = 1;
    ps = (short *)pi;
    ps[2] = 2;
    ptb = (struct two_bit *)pi;
    ptb[7].foo = 3;
    unsigned long addressable = VALGRIND_CHECK_MEM_IS_ADDRESSABLE (pi, size);
    printf("addressable %lx\n", addressable);
    addressable = VALGRIND_CHECK_MEM_IS_ADDRESSABLE (pi, size+1);
    printf("addressable %lx\n", addressable);
    int status = 0;
    unsigned char bits[8];
    memset(bits, 0, 8);
    status = VALGRIND_GET_VBITS(pi,bits,size);
    for (int i = 0; i < size; ++i) {
        printf("byte %d bits %x\n", i, (unsigned int)bits[i]);
    }
    free(pi);
    return 0;
}
Listing 2
This is intended to be built on a 64-bit system, though the results should be similar on a 32-bit system.
A pointer to int, pi, is assigned two ints’ worth of heap memory (8 bytes). The first int is initialized. Then I do some nasty casting, first to initialize the first half (2 bytes) of the second int. Then, with recourse to a struct with a bitfield, I initialize just two bits in the last byte of the second int. So of the 4 bytes in that second int, the first two are initialized, the third is uninitialized and the fourth has 2 bits initialized.
After all of the initialization (or not) come Valgrind client request macros. The first checks if the 8 bytes allocated are addressable. The second checks if 9 bytes are addressable. The third gets the initialization status of each of the bits that were allocated.
If I compile this and run it outside of Valgrind I get Listing 3.
addressable 0
addressable 0
byte 0 bits 0
byte 1 bits 0
byte 2 bits 0
byte 3 bits 0
byte 4 bits 0
byte 5 bits 0
byte 6 bits 0
byte 7 bits 0
Listing 3
However, running it under Valgrind gives Listing 4.
addressable 0
==3089== Unaddressable byte(s) found during client check request
==3089==    at 0x100000D46: main (in /Users/paulf/Library/Developer/Xcode/DerivedData/clientreq-einugynxilcucqauaotevhsuanfx/Build/Products/Debug/clientreq)
==3089==  Address 0x1000040e8 is 0 bytes after a block of size 8 alloc'd
==3089==    at 0xD6D9: malloc (vg_replace_malloc.c:266)
==3089==    by 0x100000C3E: main (in /Users/paulf/Library/Developer/Xcode/DerivedData/clientreq-einugynxilcucqauaotevhsuanfx/Build/Products/Debug/clientreq)
==3089==
addressable 1000040e8
byte 0 bits 0
byte 1 bits 0
byte 2 bits 0
byte 3 bits 0
byte 4 bits 0
byte 5 bits 0
byte 6 bits ff
byte 7 bits fc
Listing 4
As expected, the check whether the 8 bytes were addressable returns 0, meaning that they are all addressable. The check whether 9 bytes are addressable provokes an ‘Unaddressable byte(s) found during client check request’ message with information and returns the address of the first unaddressable byte. The loop over the 8 bytes that were allocated shows that the first 6 bytes have been initialized, byte 6 is uninitialized and byte 7 has the bottom 2 bits initialized and the top 6 bits uninitialized.
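Reading the V bits printed in Listing 4 by eye gets tedious for anything longer than a few bytes. As a minimal sketch, the helper below (describeVBits is a hypothetical name, not a Valgrind function) renders one byte of V bits as text, relying on the convention visible in the listing that a set bit means ‘undefined’.

```cpp
#include <cassert>
#include <string>

// Render one byte of V bits as eight characters, most significant bit first:
// 'U' for an undefined bit (V bit set), 'D' for a defined bit (V bit clear).
// Hypothetical helper, for illustration only.
std::string describeVBits(unsigned char vbits)
{
    std::string out;
    for (int bit = 7; bit >= 0; --bit) {
        out += ((vbits >> bit) & 1U) ? 'U' : 'D';
    }
    assert(out.size() == 8);
    return out;
}
```

For the last two bytes of Listing 4, describeVBits(0xff) gives UUUUUUUU and describeVBits(0xfc) gives UUUUUUDD, matching the description of bytes 6 and 7.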
So at the cost of having to change how the executable was built, we’ve gained bit-level access to the memory status of the executable. That’s great, but it does have the drawback of being static: you can’t easily change at runtime what is analysed. Since Valgrind 3.7.0, there is a way to have more dynamic access to the internals while the executable is running, and that is to use the built-in gdbserver. Not only can you access information like that shown above, you can also (almost) debug the application like a real application directly under gdb.
Let’s see an example of using the gdbserver. First of all, some example code, with a print function that reads one element more than was initialized in the array that is passed to it (Listing 5).
#include <iostream>
#include <unistd.h>

using std::cout;

template<typename T>
void init(size_t size, T* ptr)
{
    for (size_t i = 0; i < size; ++i) {
        ptr[i] = 0;
    }
}

template<typename T>
void print(size_t size, T* ptr)
{
    for (size_t i = 0; i < size; ++i) {
        cout << "element " << i << " " << ptr[i] << "\n";
    }
}

int main()
{
    //sleep(10);
    int *pi = new int[11];
    init(10, pi);
    print(11, pi);
    delete [] pi;
}
Listing 5
If I compile and run it, I get ‘element 0 0’ to ‘element 10 0’. Running it under valgrind with the -v option causes the following to be included in the output (Listing 6).
==12922== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==12922==   /path/to/gdb ./vg_gdb
==12922== and then give GDB the following command
==12922==   target remote | /usr/lib/valgrind/../../bin/vgdb --pid=12922
==12922== --pid is optional if only one valgrind process is running
Listing 6
I was using xterms to do this, and if you are using terminals, either you need to be very good at coping with the spliced gdb/application under test input and output, or you just use two terminals, which is what I did. The ‘sleep’ was uncommented to give a bit of time to attach gdb. In the first terminal,
gdb ./vg_gdb
(to be ready with the gdb prompt)
then in the second terminal
valgrind -v ./vg_gdb
Select the ‘target remote’ line in Valgrind’s output, then quickly switch back to the first terminal and paste it
(gdb) target remote | /usr/lib/valgrind/../../bin/vgdb --pid=12922
Then I could use all of the usual gdb commands like n(ext), s(tep), p(rint) and so forth. I stepped as far as the print function.
In order to examine ptr I issued the command
(gdb) monitor get_vbits 0x59ff040 44
and got back
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffffffff
monitor is the command that gdb uses to communicate with a remote server. You should only use it in cases like this and not when you are debugging an application directly.
As expected, the last int is not initialized, as shown by the fs.
You don’t have to use terminals; this will also work with GUI applications and GUI wrappers for gdb (like ddd).
You can use the gdbserver for several things besides get_vbits:
- Getting information about errors that have been detected.
- Changing logging options.
- Changing the accessibility flags for given memory.
- Checking that memory is addressable.
- Checking for leaks.
See monitor help for details.
If you are using Valgrind prior to 3.7.0, then you will not have this feature available. You’ll need to use either or both of the macros (as described above) and the --db-attach=yes option. With this option set, when memcheck encounters an error, it will ask you if you want to attach a debugger, like this:
==9654== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----
If you type y or Y it will launch gdb and attach it to the application under test. With the attached debugger you have a static image of the application: you can do things like go up and down the stack and examine variables, but you can’t step or run the application. When you quit gdb, control returns to memcheck running the application under test. Valgrind defaults to using gdb. You can specify another debugger with the option:
--db-command=<command>
I’ve had trouble with this when I’ve used it in the .valgrindrc file. When the commands are parsed, they are split on spaces, and this option usually contains spaces and %f for the application file and %p for the pid. So if my .valgrindrc contains
--db-command="ddd %f %p"
and I run
valgrind --db-attach=yes xemacs
then I get
valgrind: Bad option: %f
This will work if you put the commands on the command line
valgrind --db-attach=yes --db-command="ddd %f %p" xemacs
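To see why the .valgrindrc entry falls apart, it helps to split the line the way the text describes. The function below is a deliberately naive sketch: splitOnSpaces is a hypothetical name, and this is an assumption about the effect of the parsing, not Valgrind’s actual parser.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split a line into tokens on whitespace, paying no attention to quotes.
// Naive stand-in for illustration only; not Valgrind's real option parser.
std::vector<std::string> splitOnSpaces(const std::string &line)
{
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token) {
        assert(!token.empty()); // operator>> never yields an empty token
        tokens.push_back(token);
    }
    return tokens;
}
```

Fed the .valgrindrc line above, this yields three tokens: the option up to and including ddd, then %f on its own, then %p with the trailing quote. The second token is consistent with the ‘Bad option: %f’ message. On the command line the shell handles the quotes first, so the option arrives as a single argument.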
Another thing that can be difficult is if you use gdb with a command line application. In this case the output of the application will be mixed with the output (and input) of gdb.
For the last section in this article, I’ll look at using memory pools. Let’s start with a little noddy application with a memory pool (Listing 7).
#include <iostream>

class MemPool
{
public:
    MemPool();
    ~MemPool();
    int *allocInt();
    void freeInt(int *ptr);
private:
    int *pool;
    unsigned int freeMap;
    static const size_t poolSize = 32;
};

MemPool::MemPool() : pool(new int[poolSize]), freeMap(0U)
{
}

MemPool::~MemPool()
{
    delete [] pool;
}

int *MemPool::allocInt()
{
    for (size_t i = 0; i < poolSize; ++i) {
        if (!(freeMap & 1U << i)) {
            freeMap |= 1 << i;
            return &pool[i];
        }
    }
    return 0;
}

void MemPool::freeInt(int *ptr)
{
    for (size_t i = 0; i < poolSize; ++i) {
        if (ptr == &pool[i]) {
            freeMap &= ~(1 << i);
            return;
        }
    }
}

int main (int argc, const char * argv[])
{
    MemPool mempool;
    int *ptrs[3];
    ptrs[0] = mempool.allocInt();
    ptrs[1] = mempool.allocInt();
    ptrs[2] = mempool.allocInt();
    mempool.freeInt(ptrs[0]);
    mempool.freeInt(ptrs[2]);
}
Listing 7
Note the obvious ‘leak’: 3 calls to allocInt but only 2 calls to freeInt. Compiling and running this with memcheck detects no errors (Listing 8).
==9886== HEAP SUMMARY:
==9886==     in use at exit: 0 bytes in 0 blocks
==9886==   total heap usage: 1 allocs, 1 frees, 128 bytes allocated
==9886==
==9886== All heap blocks were freed -- no leaks are possible
==9886==
==9886== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)
Listing 8
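Memcheck sees only the single new and delete of the pool, so the only record of the chunk that was never returned is the pool’s own bitmap. As a rough cross-check, and purely as a sketch, the pool could count the bits still set in freeMap before it is destroyed. The function countOutstanding below is a hypothetical addition, not part of Listing 7.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Count the bits set in the pool's bitmap. Each set bit stands for one
// chunk that has been handed out and not yet returned to the pool.
// Hypothetical helper, for illustration only.
std::size_t countOutstanding(unsigned int freeMap)
{
    std::size_t count = 0;
    while (freeMap != 0U) {
        freeMap &= freeMap - 1U; // clear the lowest set bit
        ++count;
    }
    assert(count <= sizeof(unsigned int) * CHAR_BIT);
    return count;
}
```

After the calls in main of Listing 7, only bit 1 of the bitmap is still set, so countOutstanding would return 1. That says a chunk is outstanding, but not where it was allocated.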
Let’s now add the Valgrind machinery to instrument the memory pool. The parts that need to be changed are:
- The constructor, to tell Valgrind about the memory pool and to mark it as ‘noaccess’.
- The destructor, to actually perform the leak checks and to tell Valgrind that the memory pool is no longer used.
- The allocator, so that Valgrind knows when a chunk in the pool is used.
- The deallocator, so that Valgrind knows when chunks in the pool are released (Listing 9).
#include <iostream>
#include "valgrind/memcheck.h"

class MemPool
{
public:
    MemPool();
    ~MemPool();
    int *allocInt();
    void freeInt(int *ptr);
private:
    int *pool;
    unsigned int freeMap;
    static const size_t poolSize = 32;
};

MemPool::MemPool() : pool(new int[poolSize]), freeMap(0U)
{
    VALGRIND_MAKE_MEM_NOACCESS(pool, poolSize*sizeof(int));
    VALGRIND_CREATE_MEMPOOL(pool, poolSize*sizeof(int), 0);
}

MemPool::~MemPool()
{
    VALGRIND_DO_LEAK_CHECK;
    VALGRIND_DESTROY_MEMPOOL(pool);
    delete [] pool;
}

int *MemPool::allocInt()
{
    for (size_t i = 0; i < poolSize; ++i) {
        if (!(freeMap & 1U << i)) {
            freeMap |= 1 << i;
            VALGRIND_MEMPOOL_ALLOC(pool, &pool[i], sizeof(int));
            return &pool[i];
        }
    }
    return 0;
}

void MemPool::freeInt(int *ptr)
{
    for (size_t i = 0; i < poolSize; ++i) {
        if (ptr == &pool[i]) {
            VALGRIND_MEMPOOL_FREE(pool, ptr);
            freeMap &= ~(1 << i);
            return;
        }
    }
}

int main (int argc, const char * argv[])
{
    MemPool mempool;
    int *ptrs[3];
    ptrs[0] = mempool.allocInt();
    ptrs[1] = mempool.allocInt();
    ptrs[2] = mempool.allocInt();
    mempool.freeInt(ptrs[0]);
    mempool.freeInt(ptrs[2]);
}

valgrind -v --leak-check=full --show-reachable=yes ./main

==9971== Searching for pointers to 1 not-freed blocks
==9971== Checked 180,728 bytes
==9971==
==9971== 4 bytes in 1 blocks are still reachable in loss record 1 of 1
==9971==    at 0x400C28: MemPool::allocInt() (main.cpp:46)
==9971==    by 0x400D6C: main (main.cpp:73)
==9971==
==9971== LEAK SUMMARY:
==9971==    definitely lost: 0 bytes in 0 blocks
==9971==    indirectly lost: 0 bytes in 0 blocks
==9971==      possibly lost: 0 bytes in 0 blocks
==9971==    still reachable: 4 bytes in 1 blocks
==9971==         suppressed: 0 bytes in 0 blocks
Listing 9
Note that the memory ‘leaked’ from the pool is marked as ‘still reachable’ rather than as one of the ‘lost’ categories.
That wraps it up for memcheck. Before I go, a few production notes. On my Mac (with Mac OS X 10.6.8 on an Intel CPU) I couldn’t get the memory pool example to work. On the Linux install that I used for the same example (openSUSE 11.4) the Valgrind headers were missing and I had to add the valgrind-devel package.
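One way to cope with a build machine where the Valgrind headers may be missing is to make them optional at compile time. The following is a minimal sketch, assuming a compiler that supports __has_include (standard since C++17). HAVE_VALGRIND_HEADERS and rangeLooksAddressable are hypothetical names invented here, and the fallback macro is a stand-in that always reports success, just as the real macro does when the program is not running under Valgrind.

```cpp
#include <cassert>
#include <cstddef>

// Use the real client request macro when the Valgrind headers are installed.
// HAVE_VALGRIND_HEADERS is a hypothetical name chosen for this sketch.
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_VALGRIND_HEADERS 1
#  endif
#endif

// Otherwise fall back to a no-op stand-in that evaluates to 0, so that
// the code still builds without the headers.
#ifndef HAVE_VALGRIND_HEADERS
#  define VALGRIND_CHECK_MEM_IS_ADDRESSABLE(addr, len) \
       ((void)(addr), (void)(len), 0UL)
#endif

// Returns true if memcheck raised no objection to the range. Outside
// Valgrind, or without the headers, this is always true.
bool rangeLooksAddressable(const void *addr, std::size_t len)
{
    assert(addr != 0);
    return VALGRIND_CHECK_MEM_IS_ADDRESSABLE(addr, len) == 0;
}
```

The same pattern could be extended to the memory pool macros, at the price of the instrumentation silently disappearing when the headers are absent.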
In my next article, I’ll cover Callgrind, a tool for time profiling applications.