Caching hash values for speed (Swift-language edition)

In my posts Should you cache hash values even for trivial classes? and When accessing hash tables, how much time is spent computing the hash functions?, I showed that caching hash values could accelerate operations over hash tables, sets and maps… even when the hash tables do not fit in CPU cache.

To be clear, I do not mean that computing the hash values is necessarily the bottleneck. Rather, the whole computation, including the latency of the operations, slows you down more than you might expect, even when dealing with out-of-cache data structures.

In my earlier posts, I used the Java programming language. Java already relies on precomputed hash values in its standard library. Indeed, Java strings have precomputed hash values.

It is always prudent to check observations using different programming languages. So I decided to reproduce a similar test using the Swift programming language.

I create a trivial class containing three integer values just like I did in Java:

class Triple: Hashable, Equatable {
    let x, y, z: Int
    init(x: Int, y: Int, z: Int) {
        self.x = x
        self.y = y
        self.z = z
    }
    final var hashValue: Int {
        return x &+ 31 &* y &+ 961 &* z  // wrapping arithmetic avoids overflow traps
    }
    static func == (lhs: Triple, rhs: Triple) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z
    }
}

Then I recreate a slightly different version where the hash value is precomputed:

class BufferedTriple: Hashable, Equatable {
    let x, y, z: Int
    private let hash: Int
    init(x: Int, y: Int, z: Int) {
        self.x = x
        self.y = y
        self.z = z
        hash = x &+ 31 &* y &+ 961 &* z
    }
    final var hashValue: Int {
        return hash
    }
    static func == (lhs: BufferedTriple, rhs: BufferedTriple) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z
    }
}

My benchmark involves creating a large set of these points, and checking how many of 100 other points are in this set. The code is simple:

for i in a {
    if bs.contains(i) {
        counter += 1
    }
}
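Putting the pieces together, here is a self-contained sketch of this kind of benchmark; the set size, the probe values, and the helper names are illustrative choices of mine, not taken from the original code:

```swift
import Dispatch

// Sketch of the benchmark; names and sizes are illustrative.
class Triple: Hashable {
    let x, y, z: Int
    init(x: Int, y: Int, z: Int) { self.x = x; self.y = y; self.z = z }
    var hashValue: Int { return x &+ 31 &* y &+ 961 &* z }
    static func == (lhs: Triple, rhs: Triple) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z
    }
}

class BufferedTriple: Hashable {
    let x, y, z: Int
    private let hash: Int
    init(x: Int, y: Int, z: Int) {
        self.x = x; self.y = y; self.z = z
        hash = x &+ 31 &* y &+ 961 &* z  // precomputed once, at construction
    }
    var hashValue: Int { return hash }
    static func == (lhs: BufferedTriple, rhs: BufferedTriple) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z
    }
}

// Time how many of 100 probe points fall in a set of one million points.
func countHits<T: Hashable>(_ make: (Int) -> T) -> (hits: Int, ns: UInt64) {
    let bs = Set((0..<1_000_000).map(make))  // large, likely out of CPU cache
    let a = (500_000..<500_100).map(make)    // 100 probes, all present
    let start = DispatchTime.now().uptimeNanoseconds
    var counter = 0
    for i in a where bs.contains(i) { counter += 1 }
    let ns = DispatchTime.now().uptimeNanoseconds - start
    return (counter, ns)
}

let plain = countHits { Triple(x: $0, y: 2 &* $0, z: 3 &* $0) }
let buffered = countHits { BufferedTriple(x: $0, y: 2 &* $0, z: 3 &* $0) }
print("Triple: \(plain.hits) hits in \(plain.ns) ns")
print("BufferedTriple: \(buffered.hits) hits in \(buffered.ns) ns")
```

Only the lookup loop is timed, so the cost of precomputing the hashes in the BufferedTriple constructor is deliberately excluded here; I discuss construction cost separately below.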

I ran this benchmark using Linux, Swift 4.1 and a Skylake processor. Roughly speaking, in my tests, the buffered version (that precomputes the hash values) is about twice as fast:

Triple          BufferedTriple
20 us           12 us

But maybe it takes much longer to create BufferedTriple instances? In fact, no. It takes about 1.5 us to construct the 100 instances, whether they are Triple or BufferedTriple instances. Constructing the BufferedTriple instances takes slightly longer, but the difference is less than 0.2 us.

My code is available.

My point is not that you should always or often precompute hash values. There are obvious downsides to this memoization approach even if it did not stop the Java engineers from doing it for all strings. Consider this instead: if you think that when working with large, out-of-cache, data structures, computational speed is not important, your mental model of software performance is incomplete.

Further reading: For greater speed, try batching your out-of-cache data accesses

Published by

Daniel Lemire

A computer science professor at the Université du Québec (TELUQ).

4 thoughts on “Caching hash values for speed (Swift-language edition)”

  1. Well this is expected since you didn’t include the extra time it takes to init the BufferedTriples, no? The more you get to re-use the same instances the more beneficial this is.

    1. you didn’t include the extra time it takes to init the BufferedTriples, no?

I did not. However, once you have the hash value, you can reuse it many times.

      I do not advocate pre-computing the hash values in general.

  2. Hi Daniel, I find your post interesting and I observed similar measurements. However, you still have to compute the hash value, be it in the constructor of your object or at lookup time. So unless you reuse the same object instances over and over, you will just move the computation to a different code path. In the last few cases where I had to profile hashmap lookups, it was with caches. I would check whether a given key object (tuple of something) exists in the cache. In this typical use case, keys are rarely available in final form and must be constructed at the point of lookup, and that construction dominates execution time.

    1. Though you are correct, I think it is a bit less trivial than what your arguments might imply. If you try running my tests (Swift runs well on Linux), you’ll notice that I record the time required to construct Triples and BufferedTriples and the difference is small.
