The myth of signed arithmetic wraparound

Pop quiz time! Suppose you’re on a machine where int is 32 bits wide and thus INT_MAX, the maximum possible value for an int, is 2147483647. What is the output of the following C program?

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
 
int main (int argc, char *argv[])
{
        int sum = INT_MAX + 1;
        printf("INT_MAX + 1 == %d\n", sum);
        return EXIT_SUCCESS;
}

If you’re like most C programmers, you probably answered the following:

INT_MAX + 1 == -2147483648

And if pressed to explain how you could add two positive numbers together and wind up with a negative number, you’d explain that it’s because of how two’s complement arithmetic works:

 2147483647 == 0111 1111 1111 1111 1111 1111 1111 1111
+         1 == 0000 0000 0000 0000 0000 0000 0000 0001
-----------    ---------------------------------------
-2147483648 == 1000 0000 0000 0000 0000 0000 0000 0000

It turns out that you’d be wrong. In C, the behavior of signed arithmetic overflow — that is, what happens when the result of an arithmetic operation is outside the range of what’s representable with that data type — is undefined. You might witness the wraparound behavior you’d expect from two’s complement arithmetic, but it’s certainly not guaranteed, since C doesn’t require the use of two’s complement arithmetic.
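As an aside, if you genuinely need to detect potential overflow, the portable pattern is to test the operands before doing the arithmetic, so the overflow never happens at all. Here’s a minimal sketch of that pattern (the function name is my own, not anything from the standard library):

#include <limits.h>
#include <stdbool.h>

/* Overflow-safe signed addition: check the operands *before* adding,
   so the undefined overflow never occurs. Returns false if a + b
   would not fit in an int. */
bool safe_add(int a, int b, int *result)
{
        if (a >= 0 && b > INT_MAX - a)
                return false;   /* a + b would exceed INT_MAX */
        if (a < 0 && b < INT_MIN - a)
                return false;   /* a + b would fall below INT_MIN */
        *result = a + b;
        return true;
}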

But, you might protest, that’s an awfully pedantic point to make, since pretty much any system you’ll encounter these days uses two’s complement arithmetic, with alternative representations like one’s complement or sign-and-magnitude being extremely difficult to find outside of legacy hardware. Two’s complement arithmetic and the wraparound behavior it exhibits may not technically be part of the C standard, but in any practical application it’s a de facto standard, because the underlying processor uses two’s complement, right?

Actually, no, not at all; real-world compilers exploit the fact that the C standard does not define the behavior of signed integer overflow when they optimize your program. As a result, you can’t rely on wraparound happening at all. For instance, consider the following program:

#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
 
int process_something (int fd, int size)
{
        // Catch integer overflow
        if (size > size + 1)
                abort();
 
        char *string = malloc(size + 1);
        int count = read(fd, string, size);
        string[count] = 0;
        int len = strlen(string);
        free(string);
        return len;
}
 
int main (int argc, char *argv[])
{
        int fd = open("/etc/issue", O_RDONLY);
        int len = process_something(fd, INT_MAX);
        close(fd);
        return EXIT_SUCCESS;
}

The function process_something wants to allocate size + 1 bytes and is trying to be careful not to accidentally pass a negative value to malloc, which would happen if size == INT_MAX. The code tries to exploit how two’s complement works by checking size > size + 1 to detect integer overflow, and aborting if that’s the case.

This code does not work as intended when compiled with gcc, depending on the optimization level. At -O2 and above, gcc realizes that size > size + 1 can never be true mathematically, and so it eliminates the entire if block. What if integer overflow does occur? The C standard says the behavior of the program is undefined if that happens, so the compiler doesn’t need to worry about it.
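You can watch this branch elimination happen in isolation. The following throwaway function is my own example, not part of the program above; with a typical gcc at -O2 (the exact behavior depends on the compiler and version), it gets compiled as if it simply returned 0, because the compiler is allowed to assume the signed addition never overflows:

/* A minimal sketch: an optimizing compiler that assumes signed overflow
   never happens can fold this comparison to a constant 0. */
int would_wrap(int n)
{
        return n > n + 1;
}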

When compiled at optimization level -O0 or -O1, the output of the program is:

Aborted

That’s what you see when abort gets called, and it’s the behavior the programmer expected.

But when compiled at optimization level -O2 or -O3, the code that calls abort gets optimized away, and this happens (or at least, could happen) instead:

Segmentation fault

Why? On x86 Linux, malloc returns NULL when it receives a negative argument, since size + 1 is negative due to two’s complement wraparound. Because malloc returned a NULL pointer, a segmentation fault happens when the code that follows attempts to dereference it.

This is at least what happens on my computer. But since the program invokes undefined behavior, it could do anything at all and be considered correct, from a technical standpoint. Other possibilities are: successfully reading the input file anyway, failing silently without crashing, sending threatening e-mails to your boss, drinking your coffee when you’re not looking, or making demons fly out of your nose. Undefined behavior is undefined. Anything is possible.

The point is, relying on undefined behavior is a bad idea, even if you think you know what’s actually going to happen. Part of the reason that languages like C leave areas where behavior is undefined, as opposed to languages like Java that try to specify what must happen in every circumstance, is that it gives compilers a lot more flexibility in optimizing code. A compiler can assume, for example, that a < a + 1 without having to first prove that a will never be the maximum possible value for its type; if process_something in our previous example were in a separate library, the compiler would have no way of knowing what values its size parameter could take at runtime, and without that freedom it would have to assume the worst and keep the check.
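If you do want a guard like the one in process_something, write it so that the overflow never occurs; a comparison that can genuinely be true is not something the compiler can discard. Here’s a minimal sketch of one way to rewrite it (allocate_buffer is a name I made up for illustration, not part of the original program):

#include <limits.h>
#include <stdlib.h>

/* Rewritten guard: compare size against INT_MAX directly instead of
   computing size + 1 first. No signed overflow occurs, so the compiler
   has no license to remove the check. */
char *allocate_buffer(int size)
{
        if (size < 0 || size == INT_MAX)
                return NULL;    /* size + 1 would not fit in an int */
        return malloc((size_t)size + 1);
}

Another option, if you really do want wraparound semantics, is gcc’s -fwrapv flag, which defines signed overflow as wrapping; but that just hard-codes a reliance on behavior the language itself doesn’t promise.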

For more about the gotchas of undefined behavior in C, read What Every C Programmer Should Know About Undefined Behavior (part 2) (part 3) on the LLVM Project Blog, which inspired this post and showed me why my own answer to the question that opens this post would have been wrong.

7 Responses

  1. That’s a silly example … surely the problem is in the API, which should look like this:

    int process_something(int fd, size_t size)

  2. Yes, this is a bit of a contrived example to illustrate how, when a program invokes undefined behavior, the compiler is free to do whatever it wants, and as a result changing the optimization level can change the run-time behavior of the program. The example doesn’t work with the more sensible size_t, since the C standard does specify that wraparound happens with unsigned integral types.

  3. elseiffor!

  4. “On x86 Linux, malloc returns NULL when it receives a negative argument”

    Not exactly. malloc() *cannot* receive a negative argument, since its parameter is of type size_t, an unsigned type.

    In the call malloc(size + 1), size is of type int, and in this case has the value INT_MAX. size + 1 is still of type int, but has an undefined value (worse than that, the program’s behavior is undefined). *If* size + 1 yields some int value, that value will be converted to size_t. Conversion to an unsigned type is well defined; it wraps around. So without optimization, malloc() is probably called with an argument of 2147483648 (of type size_t).

    And in fact, on my x86 Linux system, when I remove the abort() call, malloc(2147483648) *doesn’t* return a null pointer. This is because Linux malloc() allocates virtual memory. If I actually tried to use that memory, it wouldn’t be able to map all of it to physical memory, and the OOM killer would start shutting down processes (not necessarily the offending one).

  5. Interesting. On my system (x86_64), if I insert a “printf(“%p\n”,string);” right after the call to malloc, I see that malloc is returning NULL. But if I just call malloc(2147483648) directly, I get a valid pointer. This probably has something to do with the fact that on my system, sizeof(int) == 4 and sizeof(size_t) == 8.

    Even though Linux overcommits memory, it still seems that it will fail on extremely large values passed to it: malloc(18446744073709551615) returns NULL (where 18446744073709551615 is the maximum value of size_t). So there is some point where Linux doesn’t even pretend to be able to satisfy the request.

    Hmm, so what may be happening is that size + 1 gets treated as INT_MIN, which then gets extended to 64 bits and treated as an unsigned value, resulting in a sufficiently enormous value that Linux’s malloc can’t cope with. So in the course of trying not to allocate INT_MAX bytes, the incorrect bounds-checking actually lets it try to allocate some 2^30 times more memory than that.

  6. That is *exactly* what I was thinking, Paul. Great minds, dude.

  7. Hehe, Renee

Comments are closed.