Science and Technology links (September 22nd, 2017)

Mass is preserved. When you lose fat, where does the fat go? It has to go somewhere. No, it is not transformed into “energy” or “heat”: neither “energy” nor “heat” has atomic mass. I asked the question on social networks and most people got the wrong answer. This YouTube video gives the correct answer.

I had been assuming that the US consumption of electricity was on the rise. I have gadgets everywhere around me, and they all use electricity, right? Actually, no: electricity usage, in absolute terms, is stagnant in the US, which means that per capita usage is down:

Economic growth in advanced economies does not require increased energy consumption. Real GDP rose 12.7% in the U.S. between 2008 and 2015. During the same time period, electric consumption declined by 0.3%.

Climate scientists badly failed to predict CO2 emissions:

Global CO2 emission intensity increased despite all major scenarios projecting a decline.

Influential climate scientists have also revised their predictions:

Computer modelling used a decade ago to predict how quickly global average temperatures would rise may have forecast too much warming, a study has found. (…) The Earth warmed more slowly than the models forecast, meaning the planet has a slightly better chance of meeting the goals set out in the Paris climate agreement, including limiting global warming to 1.5C above pre-industrial levels. (…) The study, published this week in the journal Nature Geoscience, does not play down the threat which climate change has to the environment, and maintains that major reductions in emissions must be attained.(…) But the findings indicate the danger may not be as acute as was previously thought.

So you think you should take your antibiotics even after you feel fine? No. Exposing your body to antibiotics longer than needed is counterproductive, as it helps develop antibiotic resistance. Doctor Brad Spellberg is the chief medical officer for Los Angeles County, and he writes:

There are some chronic infections, such as tuberculosis, where you do indeed have to take long courses of antibiotics, not to prevent resistance but rather to cure the infection. But for most acute bacterial infections, short courses of antibiotics result in equivalent cure rates and with less chance of causing the emergence of antibiotic resistance among the bacteria in and on your body.

Llewelyn et al. (2017) write:

However, the idea that stopping antibiotic treatment early encourages antibiotic resistance is not supported by evidence, while taking antibiotics for longer than necessary increases the risk of resistance.

(Source: HackerNews).

Uber has taken the world by storm, using a small mobile app that lets people either offer or hail rides, without the infrastructure that normally supports cab services. Transport for London has decided not to renew Uber’s license. The regulator fears that Uber is a threat to Londoners’ safety and security.

Human beings’ death rate follows the Gompertz law, which means that your probability of dying goes up exponentially, year after year. This explains why, even though there are billions of us, none of us get to live beyond 120 years. Not all living beings follow the Gompertz law. Plants do not. However, wooden utility poles do follow the Gompertz law, just like us.

P. D. Mangan reports that exercise helps keep cancer away:

Exercise prevents cancer (…) A recent meta-analysis found up to 42% less risk for 10 different cancers, including esophageal, liver, lung, kidney, and many other cancers. (…) An interesting question is how exercise prevents cancer, and some recent research sheds light on this. (…) Fat tissue generates cytokines that promote the proliferation of cancer cells, and physical activity diminishes or abolishes the effect, which is dose-dependent, i.e. more exercise means less cancer promoting cytokines. (…) In animals (mice) that were implanted with tumor cells, voluntary running reduced tumor growth by over 60%. The researchers believe that exercise mobilized natural killer (NK) cells, which attack cancer.

Swift as a low-level programming language?

Modern processors come with very powerful instructions that are simply not available in many high-level languages such as JavaScript or Python. Here are a few examples:

  • Most programming languages allow you to multiply two 64-bit integers and to get the 64-bit result (it is typically written as x = a * b). But the full product of two 64-bit integers occupies 128 bits. Reasonably enough, most programming languages (including C, Java, Go, Swift, C++,…) do not have 128-bit integers as a standard data type. So these programming languages have no way to represent the result, other than as two 64-bit integers. In practice, they often offer no direct way to get to the most significant 64 bits. Yet 64-bit processors (x64, ARM Aarch64…) have no problem computing the most significant 64 bits.
  • Similarly, processors have specialized instructions to compute population counts (also called Hamming weight). A population count is the number of ones contained in the binary representation of a machine word. It is a critical operation in many advanced algorithms and data structures. You can compute it with shifts and additions, but it is much, much slower than using the dedicated instructions that all modern processors have to solve this problem. And I should stress that you can beat the dedicated instructions with vector instructions.

This disconnect between programming languages and processors is somewhat problematic because programmers have to get around the problem by doing more work, essentially letting the perfectly good instructions that processors offer go to waste. To be clear, that’s not an issue for most people, but most people do not deal with difficult programming challenges.

That is, 95% of all programming tasks can be solved with Python or JavaScript. Then, out of the remaining 5%, the vast majority are well served with something like Java or C#. But for the top 1% of all programming problems, the most challenging ones, you need all of the power you can get your hands on. Historically, people have been using C or C++ for these problems.

I like C and C++, and they are fine for most things. These languages are aging well, in some respects. However, they also carry a fair amount of baggage.

But what else is there?

Among many other good choices, there is Apple’s Swift.

Swift is progressing fast. Swift 4.0 is a step forward. And, in some sense, it is beating C and C++. Let me consider the two problems I mentioned: getting the most significant bits of a product and computing the population count. C and C++ offer no native way to solve these problems. At best, there are common, non-portable, extensions that help.
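To give a concrete idea, here is what such non-portable extensions look like with GCC and Clang (a quick sketch, not part of the Swift comparison): __uint128_t exposes the full 128-bit product, and __builtin_popcountll maps to the popcnt instruction when the target supports it (e.g., when compiling with -mpopcnt or -march=native).

#include <stdint.h>

// most significant 64 bits of the full 128-bit product of a and b
uint64_t mulhi64(uint64_t a, uint64_t b) {
  return (uint64_t)(((__uint128_t)a * b) >> 64);
}

// population count (Hamming weight) of x
int popcount64(uint64_t x) {
  return __builtin_popcountll(x);
}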

Swift 4.0 solves both problems optimally in my view:

  • value1.multipliedFullWidth(by: value2).high gets mapped to the mulx instruction on my x64 laptop
  • value.nonzeroBitCount gets mapped to the popcnt instruction on my x64 laptop

(The phrase nonzeroBitCount is a weird way to describe the population count, but I can live with it.)

If you call these operations in a tight loop, they seem to generate very efficient code. In fact, consider the case where you repeatedly call value.nonzeroBitCount over an array:

func popcntArray( _ value : [UInt64] ) -> Int {
    return value.reduce( 0, { x, y in
        x &+ y.nonzeroBitCount
    })
}

The compiler does not use popcnt, but rather the more efficient vectorized approach (see Muła et al.). That’s because Swift benefits from the powerful LLVM machinery in the back-end.

I wrote a small Swift 4.0 program to illustrate my points. You can compile a swift program for your hardware using a command such as swiftc myprogram.swift -O -Xcc -march=native. You can then use the lldb debugger to automatically get the assembly produced by a given function.

Conclusion. If you are not targeting iOS, it is crazy to use Swift as a high-performance low-level programming language at this time. However, it is getting less and less crazy.

Visiting all values in an array exactly once in “random order”

Suppose that you want to visit all values in an array exactly once in “random order”. You could do it by shuffling an array of indexes, but that requires extra storage.

You want your code to use just a tiny bit of memory, and you want the code to be super fast. You do not want to assume that your array size is a power of two.

One way to do it is to use the fact that (a x + b) modulo n will visit all integer values in [0,n) exactly once as x iterates through the integers in [0, n), as long as a is coprime with n. Being coprime just means that the greatest common divisor between a and n is 1. There are fast functions to compute the greatest common divisor between a and n.

A trivial coprime number would be a = 1, but that’s bad for obvious reasons. So we pick a coprime number in [n/2,n) instead. There is always at least one no matter what n is.

Enumerating all coprime numbers in [n/2,n) could get tiresome when n is very large, so maybe we just look at up to 100,000 of them. There is no need to actually store them in memory: we can just select one at random, so very little memory is required.
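Here is a minimal C sketch of that selection step (the names gcd64 and pick_coprime are mine, and the use of rand() is only illustrative): pick a random starting point in [n/2, n) and walk forward until you hit a value coprime with n. Since n - 1 is always coprime with n, the walk terminates.

#include <stdint.h>
#include <stdlib.h>

static uint64_t gcd64(uint64_t a, uint64_t b) {
  while (b != 0) {
    uint64_t t = a % b;
    a = b;
    b = t;
  }
  return a;
}

// assumes n >= 2; returns some a in [n/2, n) with gcd(a, n) == 1
static uint64_t pick_coprime(uint64_t n) {
  uint64_t lo = n / 2;
  uint64_t a = lo + (uint64_t)rand() % (n - lo); // crude random start, good enough here
  while (gcd64(a, n) != 1) {
    a = (a + 1 < n) ? a + 1 : lo; // wrap around within [n/2, n)
  }
  return a;
}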

To see why the mathematics work, suppose that (a x + b) modulo n = (a x' + b) modulo n. Then a (x - x') modulo n = 0, which only happens when (x - x') is a multiple of n, because a and n are coprime. Thus if you map a consecutive range of n values x, you will get n distinct values (a x + b) modulo n. The choice of the parameter a is critical however: if you set a to 1 or 2, even if it is coprime with n, the result will not look random.

The running code is ridiculously simple:

    public int getCurrentValue() {
      // prime is coprime with maxrange, so this map visits each index exactly once
      return (int) (((long) index * prime + offset) % maxrange);
    }

    public boolean hasNext() {
      return index < maxrange;
    }

    public int next() {
      int answer = getCurrentValue();
      index++;
      return answer;
    }

There are various ways to optimize this code further (e.g., you can increment a counter instead of doing a multiplication), but it is likely faster than most other approaches people come up with in practice. Of course, it is not really random in the sense that no (good) statistician should accept the result as a fair shuffle of the indexes. Still, it might be “good enough” to fool your colleagues into thinking that it is random.

While my implementation assumes that you are visiting the values in order, you can go back in time, or jump forward and backward arbitrarily.

I make my Java code available. It can be made more elegant, but it should work just fine in your projects.

(As pointed out by Leonid Boytsov, this approach is reminiscent of the Linear congruential generators that are used to produce random numbers.)

If you can find ways to make the result “look” more random without significantly making it slower and without increasing memory usage, please let us know.

You can find ready-made solutions to visit all values in an array with a power of two number of elements. And by restricting your traversal to the subset of elements in [0,n) from a larger virtual array having a power of two size, you will have an alternative to the approach I describe, with the caveat that your main code will require branching. The worst-case computational complexity of a call to “next” becomes O(n), whereas I use a small, fixed number of instructions.
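For illustration, here is a rough C sketch of that power-of-two alternative, under my own naming and with a placeholder permutation: walk some bijection g over a power-of-two range [0, mask+1) that covers [0, n), and branch to skip the outputs that fall outside [0, n).

#include <stdint.h>

// placeholder bijection on [0, mask]: multiplying by an odd constant and
// adding 1, modulo a power of two, permutes the range
// (mask + 1 must be a power of two no smaller than n)
static uint64_t g(uint64_t x, uint64_t mask) {
  return (x * 2862933555777941757ULL + 1) & mask;
}

typedef struct { uint64_t x, n, mask; } po2_visitor;

// returns the next value in [0, n); call it exactly n times, starting from x = 0
static uint64_t po2_next(po2_visitor *v) {
  uint64_t candidate;
  do {
    candidate = g(v->x++, v->mask); // the branching mentioned above
  } while (candidate >= v->n);
  return candidate;
}

A single call may skip many out-of-range candidates, which is where the O(n) worst case comes from.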

Computing the inverse of odd integers

Given x, its (multiplicative) inverse is another value y such that x y = y x = 1. We all know that the multiplicative inverse of x is 1/x and it exists as long as x is non-zero. That’s for real numbers, or at least, rational numbers.

But the idea of a multiplicative inverse is more general.

It certainly fails for integers in general. For instance, there is no integer x such that 2 x is 1. But, maybe surprisingly, all odd integers have an inverse if you use normal computer arithmetic. Indeed, when your processor computes x y, it actually outputs x y modulo 2^32 or x y modulo 2^64, depending on whether you use 32-bit or 64-bit instructions. (The value x y modulo 2^64 can be defined as the remainder of the division of x y by 2^64.)

Let me briefly explain why there must be an inverse. You can skip this part if you want. Take any odd integer x. Because x is odd, it is not divisible by 2. In fact, that’s what it means to be odd. But this also means that powers of x are also odd: x^k is odd for any integer k. So x^k modulo 2^64 is never going to be zero. The only way it could be zero is if x^k were divisible by 2^64, but that’s impossible because x^k is odd. At the same time, we only have a finite number of distinct odd values smaller than 2^64. So it must be the case that x^k modulo 2^64 = x^(k') modulo 2^64 for some pair of distinct powers k and k'. Assume without loss of generality that k is larger than k'. Then we have that x^(k-k') modulo 2^64 = 1 (I am not proving this last step, but you can figure it out from the previous one). And thus it follows that x^(k-k'-1) is the inverse of x. If you did not follow this sketch of a proof, don’t worry.

So how do you find this inverse? You can brute force it, which works well for 32-bit values, but not so well for 64-bit values.

Wikipedia has a page on this, entitled modular multiplicative inverses. It points to the extended Euclidean algorithm, which appears to rely on repeated divisions.

Thankfully, you can solve for the inverse efficiently using very little code.

One approach is based on “Newton’s method”. That is, we start with a guess and from the guess, we get a better one, and so forth, until we naturally converge to the right value. So we need some formula f(y), so that we can repeatedly call y = f(y) until y converges.

A useful recurrence formula is f(y) = y (2 - y x) modulo 2^64. You can verify that if y is the 64-bit inverse of x, then this will output y. So the formula passes a basic sanity test. But would calling y = f(y) repeatedly converge to the inverse?

Suppose that y is not quite the inverse, suppose that x y = 1 + z 2^N for some z and some N that is smaller than 64. So y is the inverse “for the first N bits” (where “first” means “least significant”). That is, x y modulo 2^N = 1.

It is easy to find such a y for N greater than zero. Indeed, let y = 1: then x y = x = 1 + z 2^1 for some z, because x is odd.

Ok, so substituting x y = 1 + z 2^N in y (2 - y x) modulo 2^64, we get y (2 - (1 + z 2^N)) modulo 2^64, or y (1 - z 2^N) modulo 2^64. So I set y' = f(y) = y (1 - z 2^N) modulo 2^64. What is x y'? It is (1 + z 2^N) (1 - z 2^N) modulo 2^64, or (1 - z^2 2^(2N)) modulo 2^64.

That is, if y was the inverse “for the first N bits”, then y' = f(y) is the inverse “for the first 2N bits”. In some sense, I double my precision each time I call the recurrence formula. This is great! This means that I will quickly converge to the inverse.

Can we do better, as an initial guess, than y = 1? Yes. We can start with a very interesting observation: if we use 3-bit words, instead of 32-bit or 64-bit words, then every odd number is its own inverse. E.g., you can check that 3*3 modulo 8 = 1.

(Marc Reynolds pointed out to me that you can get 4 bits of accuracy by starting out with x * x + x - 1 .)

So a good initial guess is y = x, and that already buys us 3 bits. The first call to the recurrence formula gives me 6 bits, then 12 bits for the second call, then 24 bits, then 48 bits, then 96 bits, and so forth. So, we need to call our recurrence formula 4 times for 32-bit values and 5 times for 64-bit values. I could actually go to 128-bit values by calling the recurrence formula 6 times.

Here is the code to compute the inverse of a 64-bit integer:

uint64_t f64(uint64_t x, uint64_t y) {
  // one Newton iteration: doubles the number of correct least significant bits
  return y * (2 - y * x);
}

static uint64_t findInverse64(uint64_t x) {
  uint64_t y = x;  // correct to 3 bits
  y = f64(x, y);   // 6 bits
  y = f64(x, y);   // 12 bits
  y = f64(x, y);   // 24 bits
  y = f64(x, y);   // 48 bits
  y = f64(x, y);   // 96 bits, more than enough for 64-bit values
  return y;
}

I wrote a complete command-line program that can invert any odd number quickly.
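As a quick sanity check (this is just a small harness of my own, not that program), you can verify that x * findInverse64(x) wraps around to exactly 1 for any odd x. The snippet assumes the f64 and findInverse64 definitions above are in the same file.

#include <inttypes.h>
#include <stdio.h>

int main(void) {
  uint64_t odds[] = {1, 3, 12345, 0xDEADBEEF12345677ULL}; // arbitrary odd values
  for (int i = 0; i < 4; i++) {
    uint64_t x = odds[i];
    uint64_t y = findInverse64(x);
    // the product is computed modulo 2^64, so it should print 1
    printf("%" PRIu64 " * %" PRIu64 " = %" PRIu64 "\n", x, y, x * y);
  }
  return 0;
}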

Each call to the recurrence formula should consume about 5 CPU cycles, so the whole function should take no more than about 25 cycles, which is no more than the cost of a single integer division. Actually, it might be cheaper than a single integer division.

Because of the way we construct the inverse, if you somehow knew the 32-bit inverse, you could call the recurrence formula just once to get the 64-bit inverse.

How did we arrive at this formula (y (2 - y x))? It is actually a straightforward application of Newton’s method as one would apply it to finding the zero of g(y) = 1/y - x. Newton’s update is y' = y - g(y)/g'(y); with g'(y) = -1/y^2, this simplifies to y' = y (2 - x y). So there is no magic involved.

My code seems to assume that I am working with unsigned integers, but the same algorithm works with signed integers, and in binary form, it will provide the same results.

Reference and further reading: Granlund and Montgomery, SIGPLAN Not. (1994). Some people pointed me to On Newton-Raphson iteration for multiplicative inverses modulo prime powers by Dumas (2012).

Credit: Marc Reynolds asked on Twitter for an informal reference on computing the multiplicative inverse modulo a power of two. It motivated me to write this blog post.

The Xorshift1024* random number generator fails BigCrush

In software, we use random number generators to simulate randomness. They take the form of functions which return numbers that appear random. To test their quality, we apply statistical tests. The gold standard is a statistical test suite called BigCrush. It tests the quality of 32-bit random values.

In an earlier post, I reported that contrary to what you can read on the Xorshift Wikipedia page, the Xorshift128+ random number generator fails BigCrush. This is somewhat significant because Xorshift128+ has been widely adopted.

The Xorshift Wikipedia page also states that a more expensive generator, xorshift1024*, passes BigCrush. So I wondered, does it, indeed, pass BigCrush?

So I used my testing framework to run BigCrush. I use four different “seeds” and I only worry about a failure if it occurs with all four seeds, and with an excessively improbable p-value. Because xorshift1024* generates 64-bit outputs, and BigCrush requires 32-bit inputs, I only keep the 32 least significant bits of each word produced by xorshift1024*.
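Here is roughly what that preprocessing looks like (these helper functions are illustrative, not the actual testing framework): keep the least significant 32 bits of each 64-bit output, and, for the “bit reverse” runs, reverse the order of those 32 bits.

#include <stdint.h>

static uint32_t low32(uint64_t x) {
  return (uint32_t)x; // least significant 32 bits
}

static uint32_t reverse_bits32(uint32_t x) {
  x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u); // swap adjacent bits
  x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u); // swap bit pairs
  x = ((x & 0x0F0F0F0Fu) << 4) | ((x >> 4) & 0x0F0F0F0Fu); // swap nibbles
  x = (x << 24) | ((x & 0xFF00u) << 8) |
      ((x >> 8) & 0xFF00u) | (x >> 24);                     // swap bytes
  return x;
}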

Here are my results:

 ./summarize.pl testxorshift1024star*.log
reviewing xorshift1024star lsb 32-bits
Summary for xorshift1024star lsb 32-bits (4 crushes):
- #81: LinearComp, r = 29: FAIL!! -- p-values too unlikely (1 - eps1, 1 - eps1, 1 - eps1, 1 - eps1) -- ALL CRUSHES FAIL!!

reviewing xorshift1024star lsb 32-bits (bit reverse)
Summary for xorshift1024star lsb 32-bits (bit reverse) (4 crushes):
- #80: LinearComp, r = 0: FAIL!! -- p-values too unlikely (1 - eps1, 1 - eps1, 1 - eps1, 1 - eps1) -- ALL CRUSHES FAIL!!

So xorshift1024* fails BigCrush systematically when providing the values as is (log file 1, 2, 3, 4), and it fails again when reversing the bit order (log file 1, 2, 3, 4).

So the Wikipedia entry is misleading. Both xorshift128+ and xorshift1024* fail BigCrush.

My code is available for review, and you can rerun the tests for yourself.

How fast are malloc_size and malloc_usable_size in C?

When programming in C, we allocate memory dynamically using the malloc function, by passing how many bytes we wish to allocate. If successful, the function returns a pointer to the newly allocated memory. Later we can free the allocated memory with the free function.

In general, unless you are keeping track of it yourself, you cannot recover the number of bytes you wanted to allocate. That’s because the malloc function can allocate more memory than you request. For example, if you call malloc(1), you should not assume that only one byte was allocated.

However, you can ask the system to report the number of allocated bytes. It will always be at least as much as you requested but could be more (possibly a lot more). Under macOS, you call malloc_size and under Linux you call malloc_usable_size.
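Here is a minimal sketch for Linux (on macOS you would include malloc/malloc.h and call malloc_size instead):

#include <malloc.h>  // malloc_usable_size (Linux, glibc)
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  char *p = malloc(1); // we ask for a single byte
  if (p == NULL) return 1;
  // the allocator may have given us more than we asked for
  printf("requested 1 byte, got %zu usable bytes\n", malloc_usable_size(p));
  free(p);
  return 0;
}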

I have never seen these functions actually used in the wild. But they could possibly be useful. For example, if you allocate memory for a string buffer, why would you painfully keep track of how much memory you wanted when you can simply ask for the (potentially larger and more useful) number of allocated bytes?

At a minimum, these functions could help with debugging: you could easily check dynamically that you have enough memory allocated throughout the life of your program.

One reason to avoid these functions is that they are not standard. This means that if your code relies on them, it is possible that you could have difficulties porting your code to a new platform in the future.

Another good reason to track the number of bytes yourself is performance.

So how fast are they?

                     macOS    Linux
  Time (CPU cycles)  ~50      ~10

On Linux, malloc_usable_size is fast enough for almost every purpose, except maybe for your most performance-critical code. On macOS, malloc_size should be used sparingly.

This matches a performance pattern I have observed: the standard C libraries under Linux have very fast memory allocation compared to Apple platforms.

My source code is available.

Credit: This blog post was inspired by Slava Pestov.

Science and Technology links (September 15th, 2017)

An actual robot that can write in Japanese passed university entrance tests in Japan. How well did it do? It bested 80% of all human candidates. It is not good enough to enter the very best Japanese university, but it is good enough to enter many other universities. The robot’s artificial intelligence shows better reading comprehension, as measured by the tests, than most human beings. The lead designer stresses that the robot does not really understand what it reads, but then it seems that something like 30% of high school students in Japan can’t pass basic reading comprehension tests. I stress that the robot had to use the same forms that the students used. It had to write with a pen.

China uses facial scanners to monitor university dorms. To enter the dorms, students must have their card and they must be recognized by the camera.

This week, Apple released the iPhone X. You unlock the phone through facial recognition (by looking at the phone). The new phones have a new processor, the A11 Bionic. According to some reports, it matches recent high-end Intel chips in both single-threaded and multi-threaded performance. It is hard to say whether this comparison is meaningful, but the results do indicate that it is a significantly faster processor than what the older iPhone 7 uses (Apple refers to a 30% gain). The term “bionic” is a reference to the support, in silicon, for machine-learning (or AI) code. Also, Apple took to designing its own graphics processors, instead of licensing an external design. Apple also released a version of its watch that can be used to make phone calls, without relying on a nearby phone.

Another point worth stressing is that the new iPhones are dropping support for 32-bit applications, at the hardware level. It might allow Apple to optimize the silicon somewhat, though I am unclear as to how much of a benefit this move provides in terms of performance and/or power usage. In any case, this means that it is all 64-bit applications from now on. In comparison, Microsoft Windows is still available in a 32-bit edition.

The current therapies for tooth decay are rather invasive. Scientists in Belfast believe that tooth decay could be reversed using the body’s own repair mechanism, with the help of aspirin.

The next step is to go and try to figure out how you are going to apply the aspirin to the teeth, to regenerate the dentine and to replace the need for fillings.

And before you try it: putting an actual aspirin on your tooth won’t work.

Wired reports that we have achieved a breakthrough in human-computer interfaces. Using an armband, you can interact with your computer, which means you can type as if on a keyboard without having an actual keyboard. This product from CTRL-Labs sounds more powerful than the Myo armband that you can get cheaply on Amazon. However, the CTRL-Labs armband is unreleased and the Wired article reads a bit like an ad.

Scholarship is conservative, tinkering is progressive

We are not rational beings. We cannot string together simple sequences of logical arguments without making mistakes. We cannot reason probabilistically.

There is ample evidence that rational arguments fail to convince. That’s not surprising given how poor we are at evaluating rational arguments.

We teach what we have learned. We repeat what we have heard.

People hold on to their conservative models… science and scholarship are fundamentally conservative. Being a scholar is a bet against progress. If you write a book about a given field, and the field gets fundamentally transformed, all your hard-earned expertise becomes worthless.

And yet we make progress. Not on all fronts, and not always fast enough… but we move forward. Kuhn advocated that progress happens through “paradigm shifts”.

How is that possible if every intellectual is a conservative?

Not everybody, not every culture, is betting against progress. For every theoretician that insists on keeping his model, there is an engineer or an entrepreneur whose mind is set on constructing reality.

To construct a new reality, you cannot rely on a comforting theoretical framework, as these tend to be fundamentally conservative. So you have to accept that you will be tinkering.

Long before we had Maxwell’s laws, we had tinkerers designing applications for electricity. Long before we had thermodynamics, we had useful engines.

How many people place their bet on progress? This depends very much on your culture. You can very easily make it untenable to be a tinkerer if you ensure that progress cannot happen. Somehow the Western civilization embraced tinkerers, at least in part.

But tinkering is dangerous. Inventing electricity can make the world safer or riskier. You can’t tell which one ahead of time. You have to try.

It is not immediately obvious why a civilization would allow tinkering. But I think it has to do with the fact that without some tinkering, civilizations tend to be fragile, in Nassim Taleb’s sense.

We do not live in a static world. There are strong external shocks. We deplete our resources, we face new diseases, we enter into unforeseen wars…

In comparison, tinkering causes harm to the civilization, but at a fairly low intensity. And a tinkering civilization becomes more robust because it is kept on its toes. That’s ultimately a lot better than waiting for a massive collapse.

The Xorshift128+ random number generator fails BigCrush

In software, we generate randomness with algorithms called “pseudo-random number generators”, sometimes simply called “random number generators” or RNGs. A popular random number generator is xorshift128+: it is used by many JavaScript engines. It was designed by an influential computer science professor, Sebastiano Vigna, who has done a lot of great work. I suspect that the JavaScript engineers made a fine choice by selecting xorshift128+.

How can you tell that your random number generator has a good quality? A standard test is TestU01 designed by professor L’Ecuyer from the University of Montreal. TestU01 comes with different sets of tests, but BigCrush is the gold standard. A good random number generator should pass BigCrush when you pass the values as-is, as well as when you reverse the bits produced by the random number generator. Indeed, Vigna writes about another popular random number generator:

A recent example shows the importance of testing the reverse generator. Saito and Matsumoto [2014] propose a different way to eliminate linear artifacts (…) However, while their generator passes BigCrush, its reverse fails systematically (…) which highlights a significant weakness in its lower bits.

Passing BigCrush, even after reversing the bits, is not an impossible task. The SplittableRandom class in Java appears to pass (Steele et al., 2014), and so does the PCG family. And, of course, all cryptographic random number generators pass BigCrush, irrespective of the order of the bits.

On Wikipedia, we can read about xorshift128+ passing BigCrush robustly:

the following xorshift128+ generator uses 128 bits of state (…) It passes BigCrush, even when reversed.

Is Wikipedia correct? It offers a reference by Vigna who invented xorshift128+ (Vigna, 2017). Vigna’s journal article states:

(…) we propose a tightly coded xorshift128+ generator that does not fail any test from the BigCrush suite of TestU01 (even reversed) (…) xorshift128+ generator (…) is currently the fastest full-period generator we are aware of that does not fail systematically any BigCrush test (not even reversed)

That seems like a pretty definitive statement. It admits that there might be statistical failure, but no systematic one. So it would seem to support the Wikipedia entry.

The xorshift128+ code is not difficult:

#include <stdint.h>
uint64_t s[2];
uint64_t next(void) {
  uint64_t s1 = s[0];
  uint64_t s0 = s[1];
  uint64_t result = s0 + s1;
  s[0] = s0;
  s1 ^= s1 << 23; // a
  s[1] = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5); // b, c
  return result;
}

Can we check the claim? The BigCrush benchmark is available as free software, but it is a pain to set it up and run it. So I published a package to test it out. Importantly, you can check my software, compile it, run it, review it… at your leisure. I encourage you to do it! I use Vigna’s own C implementation.

Statistical tests are never black and white, but you can use p-values to arrive at a reasonable decision as to whether the test passes or not. The BigCrush implementation does this analysis for us. To make things more complicated, random number generators rely on an initial “seed” that you input to initiate the generator. Provide a different seed and you will get different results.

So I did the following when running BigCrush:

  • I use four different input seeds. I only care if a given test fails for all four seeds. I use 64-bit seeds from which I generate the needed 128-bit seed using another generator (splitmix64, sketched after this list), as recommended by Vigna. I just chose my seeds arbitrarily, you can try with your own if you have some free time!
  • I only care when BigCrush gives me a conclusive p-value that indicates a clear problem.
  • I use the least significant 32 bits produced by xorshift128+, in reversed bit order. (BigCrush expects 32-bit inputs.)
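For reference, here is the seed expansion (Vigna’s splitmix64, with my own wrapper name seed_xorshift128plus): it stretches a 64-bit seed into the 128-bit state s[2] of the xorshift128+ code shown earlier.

#include <stdint.h>

static uint64_t splitmix64(uint64_t *state) {
  uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
  z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
  z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
  return z ^ (z >> 31);
}

// fills the global state s[2] used by the xorshift128+ code above
static void seed_xorshift128plus(uint64_t seed) {
  s[0] = splitmix64(&seed);
  s[1] = splitmix64(&seed);
}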

Let us review the results.

seed: 12345678

 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):
       Test                          p-value
 ----------------------------------------------
 68  MatrixRank, L=1000, r=0          eps  
 71  MatrixRank, L=5000               eps  
 80  LinearComp, r = 0              1 - eps1
 ----------------------------------------------

seed: 412451

The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):
       Test                          p-value
 ----------------------------------------------
 68  MatrixRank, L=1000, r=0          eps  
 71  MatrixRank, L=5000               eps  
 80  LinearComp, r = 0              1 - eps1
 -----------------------------------------------

seed: 987654

The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):
       Test                          p-value
 ----------------------------------------------
 68  MatrixRank, L=1000, r=0          eps  
 71  MatrixRank, L=5000               eps  
 80  LinearComp, r = 0              1 - eps1
 ----------------------------------------------

seed: 848432

 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):
       Test                          p-value
 ----------------------------------------------
 24  ClosePairs mNP1, t = 9          9.2e-5
 24  ClosePairs NJumps, t = 9        1.0e-4
 68  MatrixRank, L=1000, r=0          eps  
 71  MatrixRank, L=5000               eps  
 80  LinearComp, r = 0              1 - eps1
 ----------------------------------------------

Analysis

Xorshift128+ fails BigCrush when selecting the least significant 32 bits and reversing the bit order. The evidence is clear: I used four different seeds, and it failed MatrixRank and LinearComp in all four cases. Thus the Wikipedia entry is misleading in my opinion.

While I reversed the bit orders, you can also get systematic failures by simply reversing the byte orders. You select the least significant 32 bits, and reverse the resulting four bytes.

The recommended replacement for xorshift128+, xoroshiro128+, also fails BigCrush in a similar manner (as you can verify by yourself). Yet the xoroshiro128+ Wikipedia page reads like an unequivocal recommendation:

Xoroshiro is the best general purpose algorithm currently available. (…) Mersenne Twister might still be a better choice for highly conservative projects unwilling to switch to such a new algorithm, but the current generation of statistically tested algorithms brings a baseline of assurance from the outset that previous generations lacked.

I feel that this would need to be qualified. In my tests, Xoroshiro is no faster than SplittableRandom (which I call splitmix64 following Vigna’s terminology), and SplittableRandom passes BigCrush while Xoroshiro does not. Recall that the PCG functions also pass BigCrush and they are quite fast. There are other choices as well.

To be clear, there is probably no harm whatsoever in using xorshift128+, but it does systematically fail reasonable statistical tests. If your core argument for choosing a generator is that it passes hard statistical tests, and it fails them, I think you have to change your argument somewhat.

Science and Technology links (September 8th, 2017)

I always naively assumed that the discovery that the speed of light is finite was recent. Yet back in the 17th century, scientists had already concluded that the speed of light must be finite due to astronomical aberrations.

Scientists rejuvenated the brain of old mice using a single protein (ARC).

We can use software to write fake reviews that can pass as being useful. This can be used to potentially spam review sites.

Nature reports that injecting the brains of monkeys with stem cells seems to help combat Parkinson’s disease. By the way, we can transform skin cells directly into neurons.

Magic Leap is a company that claims to be working on very advanced augmented reality glasses. They have received more money than God. Their technology is supposedly beyond anything others are working on. We now have access to one of their patents. This might be the first sign of the future. Or not.

As we age, our mitochondria (power cells) become less efficient, fewer in numbers and damaged. UCLA scientists showed how to improve matters in fruit flies by dosing them with a single protein.

Microsoft released Seeing AI, an iPhone app that lets you take pictures and then tries to guess what the picture represents. If you take a picture of an individual, it will try to guess the mood and the age of the individual. The app is free and works well.

The magazine Wired reports on experiments where researchers took away all forms of sugar from mice. It seems beneficial. A related Canadian study found that eating protein throughout the day can help you remain more muscular. In an extensive study based on 18 countries, sugars (carbohydrates) were associated with higher mortality whereas fat was associated with lower mortality.

If you have a PlayStation 4 and you are looking for a relaxing game this weekend, I recommend Uncharted: The Lost Legacy by Naughty Dog. It is a beautiful game.

Nature reports on a new class of senolytics, which are products that kill the senescent cells we accumulate as we age. If you have been paying attention to my posts, you know that senescent cells are currently the “next big thing” in anti-aging research. We have pretty much established that if we could safely remove senescent cells from your body, you would be more fit, assuming that you are past your prime. The research question is whether we can safely remove senescent cells, at an affordable cost.