Bit packing is fast, but integer logarithm is slow

In How fast is bit packing?, we saw how to store non-negative integers smaller than 2N using N bits per integer by a technique called bit packing. A careful C++ bit packing implementation is fast: e.g., over 1 billion integers per second.

However, before you pack the integers, you might need to scan them to determine the number of bits needed (N). Unfortunately, it is a relatively expensive process.

Given a positive integer x, we seek the smallest integer N such that the integer x is less than 2N. The value N is often called the integer logarithm of x.

There are several clever techniques to compute the integer logarithm using portable C code. Yet you can do better using processor-specific instructions. The GNU GCC compiler makes this easy with a special function that counts the number of leading zeros for 32-bit integers (__builtin_clz). Even so, it is relatively slow.

Thankfully, you can avoid computing the integer logarithm of each integer by a simple test involving a right shift:

if((x>>b) !=0)
b = integer_logarithm(x);

With proper loop unrolling, this is nearly as fast as bit packing.

Update: Preston Bannister correctly points out that you can do much better. Simply compute the logical or between all integers and then compute the integer logarithm of the result. It is much, much faster.

To experiment with this problem, I wrote a small program which finds the maximum integer logarithm of a large array of random integers. It then packs the integers using this logarithm.

  • I find that I can pack between 1 billion and 2 billions integers per second.
  • I compute the maximum integer logarithm at a rate of 3 billion integers per second.

When plotting the speeds as functions of the actual maximum integer logarithm, we see that the computation of the logarithm is not sensitive to the value of the actual logarithm, except for the approach based on the __builtin_clz function which is slower when the logarithm is less than 8.

In my tests, I used the GNU GCC 4.6.2 compiler on an Intel core i7 processor. My code is freely available.

Conclusion When packing an array of integers, finding the maximum logarithm can take anywhere from 1/4 to 1/3 of the running time. However, brute force techniques that compute the integer logarithm of every integer are much slower.

Published by

Daniel Lemire

A computer science professor at the Université du Québec (TELUQ).

4 thoughts on “Bit packing is fast, but integer logarithm is slow”

  1. Thanks for the blog and the code! I like your blog.
    You should get a github/bitbucket account so we can browse your past code easier.
    Thanks!

Leave a Reply

Your email address will not be published. Required fields are marked *

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax