Your software should follow your hardware: the CLHash example

The new Intel Skylake processors released this year (2015) have been met with disappointment. It is widely reported that they improved over the two years old Haswell (2013) processors by a mere 5%. Intel claims that it is more like 10%.

Intel is able to cram many more transistors per unit area in Skylake. The Skylake die size is about 50% smaller. It also uses less power per instruction and can maybe execute more instructions per cycle.

Still. A 5% to 10% gain in two years is hardly exciting, is it?

A few weeks ago, I reported on a new family of hash functions (CLHash) designed to benefit from the new instructions available in recent processors to compute the “carryless multiplication” (or “polynomial multiplications”). This multiplication used to be many times slower than the regular integer multiplications, and required sophisticated software libraries. It was something that only cryptographers cared about. Today, it is a single cheap CPU instruction that anyone who cares about high performance should become familiar with.

I had not had a chance to test it out with Skylake processors earlier, but today I did. A Haswell processor is able to execute one carryless multiplication every two cycles, the Skylake throughput is double: one instruction per cycle.

What does it mean? It means that hash functions based on carryless multiplications can “smoke” conventional hash functions. For example, the following tables gives us the CPU cycles per input byte when hashing 4kB inputs:

Haswell Skylake Progress (Skylake vs. Haswell)
CLHash 0.16 0.096 1.7× faster
CityHash 0.23 0.23 no faster
SipHash 2.1 2.0 5% faster

So if you stick with your old generic code, there might not be anything exciting for you in the new hardware. Minute gains every few years. To really benefit from new hardware, we need to change our software.

Your software should follow your hardware.

Reference: The clhash C library.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

3 thoughts on “Your software should follow your hardware: the CLHash example”

  1. How do you write software to take advantage of new instructions? It would seem that either a compiler needs to support it first (and support it well, including optimization), or you have to get down to assembler.

Leave a Reply

Your email address will not be published.

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    Markdown is turned off in code blocks:
     [This is not a link](

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see

You may subscribe to this blog by email.