Decoding over 4 billion integers per second in C

Programmers routinely work with lists of integers. We recently showed how to compress such lists of integers close to their entropy, while being able to decompress billions of integers per second.

To ensure anyone could do it, we published the FastPFOR C++ library. We also published the JavaFastPFOR Java library and there is even a corresponding Go library.

However, I wanted to provide also a low-level C library that advanced programmers could embed deep in their own software without the inconvenience of a bulky C++ research library.

So we wrote the SIMDComp library. It is a minimalist C library. On a recent PC, it will decompress integers at over 4 billion integers per second: less than one CPU cycle per integer. We use a liberal open source license so it should be suitable for all your projects.

To test it out for yourself, grab a copy, and type “make example; ./example”.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

8 thoughts on “Decoding over 4 billion integers per second in C”

  1. In my tests SIMD bitpacking offer no speed advantage over optimized scalar bitpacking when used with large buffers (see simplebenchmark in FastPFor). This is valable for most applications (ex. inverted index). A realistic benchmark should compare SIMD/Scalar bitpacking only on large buffers.

  2. @powturbo

    If you are going to take the data from RAM, bring it all the way to L1 cache, load it in registers, then push it out all the back to RAM… you are IO bound… your CPU runs empty and so, saving CPU cycles becomes irrelevant. To make things worse, you can pretty much forget about using more than one core because your L3 cache is going to be overwhelmed by one core.

    So? So you avoid decompressing whole arrays to RAM.

    We have demonstrated directly the benefit of SIMD bit packing in our latest paper (see

  3. If you’re out of disk-space, is there a way to handle updates in a way that won’t require additional scratch space?

  4. @Garen

    This particular library does not handle disk storage at all (by design). However, there is no particular problem with updates and this library. In fact, it compresses super fast so recompressing updating blocks should be quite fast.

Leave a Reply

Your email address will not be published.

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    Markdown is turned off in code blocks:
     [This is not a link](

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see

You may subscribe to this blog by email.