No more leaks with sanitize flags in gcc and clang

If you are programming in C and C++, you are probably wasting at least some of your time hunting down memory problems. Maybe you allocated memory and forgot to free it later.

A whole industry of tools has been built to help us trace and solve these problems. On Linux and macOS, the state of the art has been valgrind. Build your code as usual, then run it under valgrind, and memory problems should be identified.

Tools are nice but a separate check breaks your workflow. If you are using recent versions of the GCC and clang compilers, there is a better option: sanitize flags.

Suppose you have the following C program:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
   char * buffer = malloc(1024);
   sprintf(buffer, "%d", argc);
   printf("%s",buffer);
}

Save this file as s.c. The program simply prints argc, the number of command-line arguments, which includes the program name itself (so running it with no arguments prints 1). Notice the call to malloc that allocates a kilobyte of memory. There is no accompanying call to free, so the kilobyte of memory is “lost” and only recovered when the program ends.

Let us compile the program with the appropriate sanitize flags (-fsanitize=address -fno-omit-frame-pointer):

gcc -ggdb -o s s.c -fsanitize=address -fno-omit-frame-pointer

When you run the program, you get the following:

$ ./s

=================================================================
==3911==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1024 byte(s) in 1 object(s) allocated from:
    #0 0x7f55516b644a in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x9444a)
    #1 0x40084e in main /home/dlemire/tmp/s.c:6
    #2 0x7f555127eec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)

SUMMARY: AddressSanitizer: 1024 byte(s) leaked in 1 allocation(s).

Notice how it narrows down to the line of code where the memory leak came from?

It is even nicer: the return value of the command will be non-zero, meaning that if this code were run as part of software testing, you could automagically flag the code as being buggy.

While you are at it, you can add other sanitize flags such as -fsanitize=undefined to your code. The undefined sanitizer will warn you if you are relying on undefined behavior as per the C or C++ specifications.

These flags represent significant steps forward for people programming in C or C++ with gcc or clang. They make it a lot more likely that your code will be reliable.

Really, if you are using gcc or clang and you are not using these flags, you are not being serious.

22 thoughts on “No more leaks with sanitize flags in gcc and clang”

    1. The caveat is that they are only available on recent versions of the compilers but I stress that they are no longer “experimental” or “bleeding edge”. They work out-of-the-box without fiddling, without unnecessary bugs.

      I don’t know whether they work for all targets, but I suspect that they must.

  1. Will this functionality find memory leaks in more complex scenarios? Like when you allocate memory at some point, pass it around the app, and then forget about deleting it?

    BTW: Is there something similar for MSVC? You can use clang with VS 2015… so maybe that way you can take advantage of it somehow?

    1. Will this functionality find memory leaks in more complex scenarios? Like when you allocate memory at some point, pass it around the app, and then forget about deleting it?

      Of course. Though it will only tell you where the memory was allocated, not where it should have been freed.

      BTW: Is there something similar for MSVC? You can use clang with VS 2015… so maybe that way you can take advantage of it somehow?

      I don’t think you can “use clang with VS 2015”. As far as I can tell, Microsoft only allows you to use the clang parser. These sanitizers have to do with the generated code, not merely the parser. So it is different.

      1. aaa… right, so the sanitizers cannot be invoked from VS and thus you cannot use this feature.
        I hope VS will create something similar soon…

  2. It’s a bad idea to run with these sanitizers outside of testing environments though, definitely not in production.

    As far as I’m aware there’s been no effort to ensure the security of the sanitizer runtimes themselves, so even if they protect against memory bugs in application code, there are pretty huge security holes in the runtimes. See: http://seclists.org/oss-sec/2016/q1/363

    They’re great for testing though (we run address-sanitizer builds as part of our regular testing).

    1. It’s a bad idea to run with these sanitizers outside of testing environments (…)

      Though I was maybe not sufficiently clear in my blog post, I meant to refer to these sanitizers as superior alternatives (or complements) to other testing and debugging tools like valgrind.

      However, since they can help produce better code, I think that they may end up generating more secure software.

  3. Hello Daniel, I would like to ask you a question. Do you know why the AddressSanitizer would be taking a whole different set of libraries?

    For instance, I was trying to recreate strcmp, but what I realized is that compiling it normally it just gives me the difference, but with -fsanitize=address it gives me 1, 0, -1 outputs.

    Thanks

    1. Do you know why the AddressSanitizer would be taking a whole different set of libraries?

      I very much doubt that it is what it is doing.

      I was trying to recreate strcmp, but what I realized is that compiling it normally it just gives me the difference, but with -fsanitize=address it gives me 1, 0, -1 outputs.

      Can you post your code?

      1. Here is my source on the left and the two different outputs on the right:
        http://imgur.com/uU2SZyB

        You can clearly see that output of libc strcmp changes from difference to hardcoded outputs of 1, 0, -1 only when the -fsanitize=address is used.

        Btw, this is my testfile:

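        /* The header names and the opening of main were lost from the
           original post; they are reconstructed here from what the code uses. */
        #include <stdio.h>
        #include <stdlib.h>
        #include "libft.h"
        #include <string.h>

        int main(void)
        {
            int a, b, i, n;
            char *ra, *rb;

            i = 0;
            n = 1000;
            while (i <= n)
            {
                if (i < n)
                {
                    /* Random numbers rendered as strings (ft_itoa comes from libft.h). */
                    ra = strdup(ft_itoa(arc4random()));
                    rb = strdup(ft_itoa(arc4random()));
                }
                else
                {
                    /* Last iteration: two identical strings. */
                    ra = "cba";
                    rb = "cba";
                }
                a = ft_strcmp(ra, rb);
                b = strcmp(ra, rb);
                if (a != b)
                    printf("\033[1m\033[31m[ FAIL ]\x1b[0m: str1: [%s] \t| str2: [%s] \t| ft_strcmp: %d\t| strcmp: %d\n", ra, rb, a, b);
                else
                    printf("\033[1m\033[32m[ OK ]\x1b[0m: str1: [%s] \t| str2: [%s] \t| ft_strcmp: %d\t| strcmp: %d\n", ra, rb, a, b);
                i++;
            }
        }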

  4. For future readers, here a code sample to reproduce the issue:

    #include <stdio.h>
    #include <string.h>
    
    int main() {
      const char * ra = "1375154539";
      const char * rb = "-497308599";
      printf("%d \n", strcmp(ra, rb));
    }
    
    1. Were you able to get a different output as well with the -fsanitize=address?
      I am on OSX 10.11.6 btw

      And this is the configuration of gcc/clang in this machine:
      Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/include/c++/4.2.1
      Apple LLVM version 8.0.0 (clang-800.0.38)
      Target: x86_64-apple-darwin15.6.0
      Thread model: posix
      InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

      1. Yes. I am able to reproduce the issue and it can be explained by looking at the code of the sanitizer in LLVM:

        https://github.com/llvm-mirror/compiler-rt/blob/35f212efc287a7b582afcb41d86bdff7a29e7367/lib/sanitizer_common/sanitizer_common_interceptors.inc#L757

        As you can see in this code, the sanitizer has its own implementation of the memcmp function. It calls CharCmpX which you can find in the file above, and that returns -1, 0, 1.

        There is a set of functions that are re-implemented with various safety checks in this manner.

        So it is not loading a whole other library, it is simply the compiler handling these functions as special cases.

        Note that if your code relied on getting specific values out of memcmp, then it was wrong as per the standard.

        1. Thank you very much for this answer.
          Everyone else I asked was handwaving it or not really caring about this by just saying that “you should not use strcmp that way anyways, so why bother”.

          But now my next question is, why would they decide to intercept strcmp and the other functions? Is it really a security risk to be doing s1 - s2?

          1. But now my next question is, why would they decide to intercept strcmp and the other functions? Is it really a security risk to be doing s1 - s2?

            It is definitely wrong for your code to assume a specific implementation of strcmp.

            1. Yeah, I get that, so are you saying that the reason AddressSanitizer intercepts strcmp is to check whether someone is using the strcmp implementation improperly?

              *Scratching my head*

              I would like to know what is the rationale for the devs to add these intercepts:

              #if SANITIZER_INTERCEPT_STRCMP
              static inline int CharCmpX(unsigned char c1, unsigned char c2) {
              return (c1 == c2) ? 0 : (c1 < c2) ? -1 : 1;
              }

              I am interested in the decision making process and the reasons behind them, thanks!

              1. I guess you would want to see…

                int CharCmpX(unsigned char c1, unsigned char c2) {
                  return c1 - c2;
                }
                

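                As far as I can tell, on mainstream platforms both operands are promoted to int, so the subtraction is well defined and yields a value between -255 and 255. Only on an unusual platform where unsigned char is as wide as int would the operands be promoted to unsigned int, making the conversion of the result to a signed integer implementation-defined. Either way, the standard only specifies the sign of the result, so returning -1, 0, or 1 is just as valid.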

                1. Btw my colleagues in school are telling me to just not use AddressSanitizer and to use “leaks” or “valgrind” instead.

                  And that AddressSanitizer actually doesn’t work on OS X.

                  Thank you a lot for your patience.
                  What would be your input on that?

                  1. I comment on valgrind in my blog post and on why I think that using the sanitizers in your compiler is better. And yes, the sanitizers work under macOS; they are officially supported by Apple (as of Xcode 7).

                    1. Yes, I just wanted to reconfirm, thank you very much.
                      If you have a bitcoin address, let me tip you!

                      I learned a lot!
