The surprising cleverness of modern compilers

I wanted to know how a modern C compiler like clang would process the following C code:

#include <stdint.h>
int count(uint64_t x) {
  int v = 0;
  while(x != 0) {
    x &= x - 1;
  return v;

Can you guess?

popcntq	%rdi, %rax

That is right. A fairly sophisticated C function, one that might puzzle many naive programmers compiles down to a single instruction. (Tested with clang 3.8 using -O3 -march=native on a recent x64 processor.)

What does that mean? It means that C is a high-level language. It is not “down to the metal”. It might have been back when compilers were happy to just translate C into correct binary code… but these days are gone. One consequence of the cleverness of our compilers is that it gets hard to benchmark “algorithms”.

In any case, it is another example of externalized intelligence. Most people, most psychologists, assume that intelligence is what happens in our brain. We test people’s intelligence in room, disconnected from the Internet, with only a pencil. But my tools should get as much or even more credit than my brain for most of my achievements. Left alone in a room with a pencil, I’d be a mediocre programmer, a mediocre scientist. I’d be no programmer at all. And this is good news. It is hard to expand or repair the brain, but we have a knack for building better tools.

16 thoughts on “The surprising cleverness of modern compilers”

  1. Note, however, that this optimization in LLVM is highly-specialized, and it pattern-matches against the exact implementation you gave and turns it into a popcnt. There is no magic going on, and any number of semantics-preserving rewrites of the C code will cause LLVM to miss the optimization, because the pattern isn’t recognized any longer.

  2. I probably wouldn’t understand an in-depth explanation, but I would still like to know – at least on superficial level – how clang does this. It seems to me that an extremely sophisticated algorithm would be needed to achieve this.

    What if you change the code slightly? Like change x != 0 to x != 1 or something.

    1. It does that by transforming the AST in a normal form and then performing some pattern matching on that, trying to find common subsection of codes it can optimize. It can work wonderfully but all these patterns are handcrafted and to be really useful they often require the programmer to know how to write the code the make it easy for the compiler to recognize them.

  3. That’s not magic when you have a basic understanding on hamming weight.

    C is a general-purpose language. You can use it as you like.

    Disable all compiler optimization and than look at the asm code.

  4. And how many CPU cycles does it take for popcnt to complete? Compiler might save on program size, but you’re ultimately dependent on microcode for non-trivial instructions. That’s not magic either.

Leave a Reply

Your email address will not be published. Required fields are marked *