Predicting the truncated xorshift32* random number generator

Software programmers need random number generators. For this purpose, the often use functions with outputs that appear random. Gerstmann has a nice post about Better C++ Pseudo Random Number Generator. He investigates the following generator:

uint32_t xorshift(uint64_t *m_seed) {
  uint64_t result = *m_seed * 0xd989bcacc137dcd5ull;
  *m_seed ^= *m_seed >> 11;
  *m_seed ^= *m_seed << 31;
  *m_seed ^= *m_seed >> 18;
  return (uint32_t)(result >> 32ull);
}

This “truncated xorshift32*” function returns 32-bit “random” integers, and takes in a 64-bit state function. Each time you call the function, the state is updated, so that following random integers will vary. Thus the state and the returned random numbers are distinct concepts. The PCG family of random number generators also uses this nice trick.

Gerstmann asks whether the generator is “predictable” and writes “Unknown?” as an answer. What is the missing answer? The answer is that it is predictable.

What does predictable means?

Suppose that I tell you that the first random number generated is 1 and the second is 2… can you infer what the state is? If you try to setup the probably mathematically, you may find that the problem is quite vexing.

But, in fact, it is easy. I wrote a small program that gives you the answer in 4 seconds, using brute force. And once you know what the state is, you can predict all following random integers.

How does it work? Simply put, from the first 32-bit output of the function (1 in my case), you know the equivalent of 32 bits of the state. Thus you only have a bit over 4 billion possibilities. That’s too much for a human being, but remember that your processor does billions of instructions per second… so 4 billion possibilities is not very many.

My source code is available.

To be clear, it is not an argument against this particular generator.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

Leave a Reply

Your email address will not be published.

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax

You may subscribe to this blog by email.