Dr. Memory: A Memory Checker Faster Than Valgrind

vinkelhake · on Jan 15, 2016

Another option is AddressSanitizer. Available in GCC and Clang. Compile with -fsanitize=address.

http://clang.llvm.org/docs/AddressSanitizer.html

hannob · on Jan 15, 2016

I'm often amazed how many people simply are not aware of address sanitizer.

It's orders of magnitude faster than valgrind - and it can find bugs that are simply impossible to find with a runtime-only tool (e.g. most stack oob accesses). However it can't find uninitialized memory (there is msan for that, but that's a bit tricky to set up and not available in gcc).
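A minimal sketch of the kind of stack out-of-bounds read meant here (the function name sum_first is made up for illustration). The array is local, so the bytes just past it are still valid stack memory, which is why a tool that only sees the finished binary generally cannot flag the access. Built with something like `cc -g -fsanitize=address`, a call with count = 5 should be reported as a stack-buffer-overflow; calls with count <= 4 are well defined.

```c
#include <stddef.h>

/* Hypothetical example function, not taken from any real project.
   Sums the first `count` elements of a 4-element local array.
   Valid for count <= 4; count == 5 reads one int past the end. */
int sum_first(size_t count)
{
    int values[4] = {1, 2, 3, 4};
    int sum = 0;

    for (size_t i = 0; i < count; i++)
        sum += values[i];   /* out of bounds once i reaches 4 */

    return sum;
}
```

Calling sum_first(4) returns the sum of all four elements; the interesting case for the sanitizer is sum_first(5).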

Peaker · on Jan 15, 2016

I think when I tried to use it, it enlarged all my stack allocations. When using precise stack allocations (user threads), it causes all the stacks to blow up.

slavik81 · on Jan 15, 2016

I think it uses a lot of extra memory as part of a design tradeoff that keeps CPU overhead to a minimum. The documentation above mentions the memory cost in the limitations section.

By the way, there's a great talk from C++ Going Native 2013, called The Care and Feeding of C++'s Dragons [1], that covers a number of new C++ tools, including AddressSanitizer. It's been two years since then, but it's probably still a good introduction.

[1] https://channel9.msdn.com/Events/GoingNative/2013/The-Care-a...

wumpus · on Jan 15, 2016

Dr. Memory and Valgrind both work on unaltered binaries -- quite different tools from a compiler option. I expect that AddressSanitizer is quite a bit faster because it's compiled in.

en4bz · on Jan 15, 2016

There's also TSAN for detecting race conditions as well. Personally I've been using valgrind less and less since ASAN and TSAN were released simply because they're so much faster.

wyldfire · on Jan 15, 2016

And UBSan for finding undefined behavior.

aidenn0 · on Jan 16, 2016

While I like -fsanitize=address, the project I spend most of my time on may have tens of thousands of lightweight threads, which makes it a non-starter. Valgrind works fine on it (albeit with a significant performance impact).

versteegen · on Jan 16, 2016

Why is it that valgrind/memcheck can handle that but AddressSanitizer can't?

aidenn0 · on Jan 16, 2016

-fsanitize=address adds significant memory overhead on the stack; each thread has its own stack, so this overhead is multiplied by a large number.

haldean · on Jan 15, 2016

asan and valgrind are complementary tools; AddressSanitizer catches memory errors and is fast enough that you can keep it turned on and kind of forget about it, but it doesn't catch leaked memory. Valgrind has been indispensable for leak finding for me, and asan hasn't replaced it.

slavik81 · on Jan 15, 2016

There is a LeakSanitizer which is enabled by default when you use AddressSanitizer under x86_64 Linux, though it's not supported on other platforms.

http://clang.llvm.org/docs/AddressSanitizer.html#memory-leak...

ucsdrake · on Jan 15, 2016

While it's probably a biased blog, folks might find it interesting to read Julian Seward's (valgrind's maintainer) thoughts after experimenting with drmemory:

https://blog.mozilla.org/jseward/2015/10/05/dr-memory-a-memo...

zitterbewegung · on Jan 15, 2016

Looking at the post, it seems the issue is that the underlying host is much more complex, and it is hard for the program to have a complete picture of what is happening. Does Microsoft have a solution similar to drmemory or valgrind?

mockery · on Jan 16, 2016

Microsoft has App Verifier: See: https://msdn.microsoft.com/en-us/library/ms220948(v=vs.90).a... Or: https://randomascii.wordpress.com/2011/12/07/increased-relia...

tkinom · on Jan 15, 2016

For better or worse, I prefer to write my own memory leak checker for most of the projects I work on.

After doing it a few times, it normally takes less than a day to put in the code: basically wrappers for malloc, free, new, delete, etc. Add some table code to track bytes allocated and freed per filename/line number.

It must have a very concise, high-level table dump of the memory allocated and freed for each file:lineno:


   file:lineno:  #Alloc_ByteCnt #Free_ByteCnt

Advantages: It works very well on apps that need to run forever. One can easily track memory usage/leakage over time by adding a trigger that dumps the memory table without killing the app. The trigger can be a test API call or an external URL.

If you have a full functional regression test system in place (all my projects have one), you can easily trigger the memory table dump at certain test points. It is then fairly easy to identify memory leaks associated with very high-level functionality, such as:

   * a certain image load triggers a 200K leak per test loop
   * a certain URL access triggers a 64-byte leak at file:line per test loop
   * etc.


Once all of this is set up, memory leaks are fairly trivial to solve. I have not seen any user-space memory leak that took more than an hour to solve and verify.

The most complex memory leak I had to debug was an EGL app that triggered an EGL kernel driver memory leak of ~20MB per overnight test run. That was much harder to debug, because kernel/system memory was leaking, not just user app memory.
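A rough sketch of what such a wrapper could look like in C. Every name here (TMALLOC, track_malloc, track_free, track_dump, track_outstanding, MAX_SITES) is invented for illustration and is not from the comment above. It assumes C11 (for max_align_t), covers only malloc/free, and is not thread-safe.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SITES 256   /* assumed upper bound on distinct call sites */

/* One table row per allocation site (file:line). */
struct site {
    const char *file;
    int line;
    size_t alloc_bytes;
    size_t free_bytes;
};

static struct site g_sites[MAX_SITES];
static int g_nsites;

/* Header placed in front of every tracked block. The max_align_t
   member pads the header so the user pointer stays aligned. */
union hdr {
    struct { size_t size; int site; } info;
    max_align_t align;
};

/* Returns the row index for file:line, adding a row if needed. */
static int find_site(const char *file, int line)
{
    for (int i = 0; i < g_nsites; i++)
        if (g_sites[i].line == line && strcmp(g_sites[i].file, file) == 0)
            return i;
    if (g_nsites == MAX_SITES)
        return -1;                      /* table full */
    g_sites[g_nsites].file = file;
    g_sites[g_nsites].line = line;
    return g_nsites++;
}

void *track_malloc(size_t n, const char *file, int line)
{
    if (n > SIZE_MAX - sizeof(union hdr))
        return NULL;                    /* size would overflow */
    int s = find_site(file, line);
    if (s < 0)
        return NULL;
    union hdr *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->info.size = n;
    h->info.site = s;
    g_sites[s].alloc_bytes += n;
    return h + 1;                       /* user memory follows the header */
}

void track_free(void *p)
{
    if (p == NULL)
        return;
    union hdr *h = (union hdr *)p - 1;  /* step back to the header */
    g_sites[h->info.site].free_bytes += h->info.size;
    free(h);
}

/* Records the caller's file and line automatically. */
#define TMALLOC(n) track_malloc((n), __FILE__, __LINE__)

/* Prints the table as: file:lineno: alloc_bytes free_bytes */
void track_dump(FILE *out)
{
    for (int i = 0; i < g_nsites; i++)
        fprintf(out, "%s:%d: %zu %zu\n", g_sites[i].file, g_sites[i].line,
                g_sites[i].alloc_bytes, g_sites[i].free_bytes);
}

/* Total bytes allocated through TMALLOC and not yet freed. */
size_t track_outstanding(void)
{
    size_t total = 0;
    for (int i = 0; i < g_nsites; i++)
        total += g_sites[i].alloc_bytes - g_sites[i].free_bytes;
    return total;
}
```

Calling track_dump(stderr) from a test hook gives the per-site table; a row whose two byte counts keep drifting apart from one test loop to the next is the place to look.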

kazinator · on Jan 16, 2016

Valgrind can also dump leak information while the program is running. For that, you have to use the Valgrind API. There is a request for it.

   #include <valgrind/memcheck.h>
   ...

   // in the code:

       VALGRIND_DO_LEAK_CHECK;

I like to put

   #ifdef CONFIG_VALGRIND_SUPPORT

around this stuff so it can be compiled out. The Valgrind API call macros expand to some machine code instruction patterns which execute harmlessly on a regular CPU, and have no effect, but are recognized as triggers by the Valgrind execution core. These instructions add bloat and burn cycles.
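One way the #ifdef could be arranged so that call sites stay clean, as a sketch only. LEAK_CHECKPOINT and run_test_step are hypothetical names; CONFIG_VALGRIND_SUPPORT is the build flag mentioned above, and VALGRIND_DO_LEAK_CHECK is the request from memcheck.h.

```c
#ifdef CONFIG_VALGRIND_SUPPORT
#include <valgrind/memcheck.h>
/* Asks Memcheck for a leak report right now; harmless outside Valgrind. */
#define LEAK_CHECKPOINT() VALGRIND_DO_LEAK_CHECK
#else
/* Compiled out entirely: no extra instructions at the call site. */
#define LEAK_CHECKPOINT() ((void)0)
#endif

/* Hypothetical test-point hook: call it between regression test steps.
   Returns the number of the next step. */
int run_test_step(int step)
{
    /* ... the real work for this step would go here ... */

    LEAK_CHECKPOINT();
    return step + 1;
}
```

Without -DCONFIG_VALGRIND_SUPPORT the header is never included, so the build does not need the Valgrind development files.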

Valgrind has the advantage that it has a real leak checker: i.e. reports of still-allocated memory that is definitely unreachable by any pointer. It can also report blocks which are reachable only by pointers into their interior, not to their base address.

Having the stack trace for each allocation is also nice. If something is leaked, it can sometimes be useful to know where it was allocated (by what chain of calls). Of course you might know that every dynamic struct foo came from foo_alloc. But who called foo_alloc for a given leaked struct foo? And who called that caller?

Jabbles · on Jan 15, 2016

Memory profilers are far cheaper to use than valgrind.

Too · on Jan 16, 2016

Valgrind will also warn you if you try to use unassigned memory; I don't think wrapping malloc could possibly do that.

forrestthewoods · on Jan 15, 2016

"Dr. Memory currently targets 32-bit applications only."

:(

a_e_k · on Jan 15, 2016

Yes, this could be interesting but that one's a dealbreaker for me. Guess I'll keep using Valgrind.

rincebrain · on Jan 15, 2016

They're working on it.

They've got experimental 64-bit and ARM support in the repo, but they're still ironing out some false positives. [1][2]

[1] - https://github.com/DynamoRIO/drmemory/issues/1839

[2] - https://github.com/DynamoRIO/drmemory/commit/dcb3af0836e0d75...

pjc50 · on Jan 15, 2016

Add a 32-bit target? Or is the memory usage too high?

xorblurb · on Jan 15, 2016

Some applications are not designed to be portable to 32-bit targets, regardless of memory usage... (Whether or not this is a bad thing is another story, but once such a non-trivial application exists and has to be checked, this is just a fact you can't retroactively bypass.)

jcoffland · on Jan 15, 2016

Some bugs only show up on one arch.

jcoffland · on Jan 15, 2016

Google's gperf is much faster than valgrind and has been around for several years. I've used it successfully in large products to track down hard to find memory leaks.

TwoBit · on Jan 15, 2016

I think you mean gperftools and not gperf, which is a hash generator.

jcoffland · on Jan 16, 2016

That's right. I call it gperf for short.

mhandb · on Jan 16, 2016

What about "Application Verifier"? It is only for Windows. I am not aware of any advantage of using drmemory or valgrind over appverif.exe (except the multi-platform part). Can you tell me the advantages, if any?

nickpsecurity · on Jan 15, 2016

Love what I see in the first two paragraphs. Also comes with a descriptive paper and source. My favorite kind of research effort. :)

JoeAltmaier · on Jan 15, 2016

Sounds good! I've never found memory-leak detection to work well; hopefully this one will do a good job.

jcoffland · on Jan 15, 2016

Really? I've found the devs who say this are just having trouble understanding the results these tools put out. No offence intended but do you think that could be the case?

JoeAltmaier · on Jan 15, 2016

Nope. Been doing this 20 years, tried them all. The biggest mistake is that they report all memory allocations as 'leaks', instead of just repeated allocations without a corresponding free. So you get 1,000 reports, of which 1 or 2 are really problems.

Then they like to carp about exactly how the memory was freed, which primitive etc. Which can matter when multiple heap disciplines are in use. But when they aren't, it's just more noise.

I remember an embedded fibre channel router we did, 16 processors and hundreds of server message handlers. We tried a heap validation tool of some sort (valgrind?) and after a week of struggle we found exactly one sort-of problem (a small leak) out of the thousands of false alarms. Definitely not worth the effort.

kazinator · on Jan 16, 2016

Valgrind's leak detector separately reports:

- still reachable allocated blocks

- possibly leaked blocks: blocks reachable only via pointers to their interior, not to their base address

- definitely leaked blocks: blocks not reachable by any pointer in the program image.
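To make the three categories concrete, here is a small sketch (the function and variable names are hypothetical) that sets up one block of each kind. The category noted next to each block is what Memcheck would be expected to print at exit when the program is built without optimization; an optimizing compiler may remove the unused allocation, and a stale copy of a pointer left in a register or on the stack can change the classification.

```c
#include <stdlib.h>

/* Globals are part of the program image that the leak checker scans. */
static char *g_base_ptr;      /* will point at the start of a block  */
static char *g_interior_ptr;  /* will point into the middle of a block */

/* Hypothetical helper: creates one block per leak category.
   Returns 0 on success, -1 if an allocation failed. */
int make_leak_examples(void)
{
    char *a = malloc(64);
    char *b = malloc(64);
    char *c = malloc(64);

    if (a == NULL || b == NULL || c == NULL) {
        free(a);
        free(b);
        free(c);
        return -1;
    }

    g_base_ptr = a;           /* expected: "still reachable" */
    g_interior_ptr = b + 16;  /* expected: "possibly lost"   */
    c[0] = 'x';               /* c is used once, then its only pointer
                                 goes out of scope at the return below;
                                 expected: "definitely lost" */
    return 0;
}
```

Running the program under something like `valgrind --leak-check=full --show-leak-kinds=all` lists each category separately.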

Even the simplest tools like dmalloc do not report all allocations ever performed as leaks (even ones that have been freed). The simple tools report the still-in-use blocks, regardless of whether they are reachable or not.

That can be "good enough". And anyway, sometimes blocks that are still reachable are de facto leaks. Programs can have "semantic leaks" whereby some dynamic set data structure keeps growing and growing as processing marches on. For instance, a global hash table from which nothing is removed even though the entries have ceased to be relevant and the table is the only thing which refers to them. (Someone forgot to make the hash table a weak hash).

jcoffland · on Jan 16, 2016

This is exactly what I'm talking about. You can enable options to filter through these warnings. Too many devs get overwhelmed by the output and give up early. It takes some practice to use correctly but tools like valgrind are invaluable.

Also, if you are ignoring warnings about using the wrong deallocator in C++ then you are making a big mistake. Your code may work but this causes nasty bugs. E.g. if you allocate with 'new []' and then deallocate with plain old 'delete' or worse 'free()'.

JoeAltmaier · on Jan 18, 2016

I understand all that. In the end, I've experienced no benefit from these tools.

The example about new[]/delete is a good one. That is important only if the element has a ctor/dtor. Yet the tools flag simple scalar arrays as well, pointlessly adding to the noise.

sushisource · on Jan 15, 2016

And it works on Windows? Sign me up.

kensai · on Jan 15, 2016

The problem with valgrind is that they don't update it often enough. Is once a year enough?

fdej · on Jan 15, 2016

Why does it need to be updated? Valgrind is that rare piece of software that has always worked absolutely flawlessly for me.

Elv13 · on Jan 16, 2016

If you wish to use new CPU features such as the imaginary AVX-1337, have fun waiting a year before being able to run your leak check.

And no, I don't want to maintain multiple code paths on my system, even for compiler-generated AVX[2,512,SSE4.1] code. If you do so, you end up having a dev system unrepresentative of the production one.

rincebrain · on Jan 15, 2016

If nothing else, it needs to be updated for various new syscalls/x86 instructions periodically on the different platforms it supports (if you've never tried running valgrind on something that isn't Linux, you might not have run into this.)

karcass · on Jan 16, 2016

+1 for the Firesign Theatre reference. Anybody else catch that?

designer_alex · on Jan 16, 2016

Read me, Dr. Memory?