Enough with the intrusive updates!

This weekend, I went to my gaming PC in my living room. The PC did not respond when I grabbed the mouse. Puzzled, I pressed the “on” button on the PC. Then I saw that Microsoft had seen fit to update my PC while I wasn’t looking. I had configured this particular PC to my liking, and many of my careful settings were gone.

Every time the PC restarts, it seems that “Steam” (a popular game store) has to self-update, which can often take several minutes. Who are these engineers who think that an application should do a lengthy self-update with each reboot? That’s hostile toward your users.

I told myself: “no matter, I will grab my Apple laptop and do some relaxing work”. Only, my Apple laptop needed to be booted up, and rebooting triggered an update installation… So I waited and I waited… and then the laptop came up with a mysterious message… basically, the update had failed and the laptop was left in an unusable state.

Here is what you find online about this problem:

(…) when I got to the office, and turned on my Mac, it rebooted into recovery, with the Installer Log open, and a dialog that read, “The macOS Installation couldn’t be completed”. I called Apple and they had me run a check disk, (…) Since posting this, Apple called me back and they said that I should be able to reinstall macOS Sierra without formatting the drive and I should be back to the state I was before the failure.
(…) Fortunately I was able to simply reinstall macOS High Sierra by booting into Recovery Mode (Command ⌘ + R) and choosing Reinstall macOS High Sierra.

Five minutes of Googling gave me a solution. A whole evening of downloads gave me back a semi-working laptop… except that it is once again bugging me to update it to the very latest version.

Going back to the gaming PC, the Nvidia graphics card driver insisted on an update… which I did… only to be left with a machine whose disk throughput was wasted on something called GeForce Experience… I understand that this invasive Experience software is nearly useless for most people. If anything should not be taxing your machine, it is a piece of nearly useless software. Here is what you find online about this “Experience”:

User 1: Hello, whenever I open the GeForce Experience, I cannot use the program because of a box which says there is an update and I should install it. Doing so starts a program but nothing is shown, except in Task Manager where the GeForce program is running at 100% on the Disk. The same thing happens to the disk when I attempt to uninstall the GeForce Experience.

User 2: solution: don’t install GFE.

I also visited my mother-in-law. Thankfully, her PC is configured to never update anything because I cannot risk her PC becoming unusable following an update; I run the updates manually from time to time. In the task bar, I found a few pieces of software that tried to update themselves, including Java and some “zip” utility. I refused the updates. Obviously. Why on Earth would my mother-in-law even need Java in the first place? And whatever this zip utility is, I am sure it can continue just fine without updates.

All in all, the experience left me in a terrible mood. I also wasted a lot of time. I am quite good with computers, so I could fix all the problems, but I wonder… what do other people do?

What is annoying is how intrusive all these updates are. I realize that it is hard to do updates entirely silently, without the user noticing. But what we have now is the opposite. The updates are “in your face”.

I can excuse Microsoft more easily… because they have to support an untold number of different configurations… but Apple should do better. I understand that security updates require a reboot, but these updates should literally never fail. It is intolerable to be left with an unusable laptop after an update.

Updates should be as unintrusive as possible. They should not disturb or annoy us.

So how do we end up in such a sorry state of affairs?

My theory is that within software companies, the very worst engineers end up in charge of software updates. “Well, Joe can’t be trusted to work on the actual driver, but maybe he can maintain the ‘Experience’ software in charge of updates?” And Joe does a terrible job, but nobody cares because these are just updates, right?

Science and Technology links (April 22nd, 2018)

  1. You probably can’t write the two forms of the letter g, even if you have seen them thousands and thousands of times.
  2. Some neurodegenerative diseases might result from a fungal infection. This would include diseases like Parkinson’s. The theory seems to be that many of us get infected with nasty things that remain dormant in most of us, but not in the weakest or unluckiest among us. That is, you may already be infected with whatever is going to kill you later. Scary.
  3. Mangan suggests that the sometimes-reported health benefits of moderate alcohol consumption might have to do with its bactericidal effect.
  4. Researchers use machine learning to identify cells under a microscope without having to intervene on the cells themselves (e.g., with fluorescent labels). This might turn out to be a big deal as it could reduce the cost of research drastically.
  5. Google appears to be able to isolate voices when several people talk at the same time.
  6. Google launches Talk to Books. It is a tool where you can ask questions and get back answers from a large collection of books. It works well and reminds me of Vernor Vinge’s Rainbows End, where some entrepreneurs wanted to digitize books to build an artificial intelligence. It seems we are well on schedule to realize the program set forth in the novel (set in 2025).
  7. Two eggs per day do not adversely affect the biomarkers associated with heart diseases, but increase satiety throughout the day in a young healthy population.
  8. Microsoft will distribute its own version of Linux. Bill Gates, Microsoft’s founder, repeatedly opposed the GPL, the license under which Linux is distributed. So it is definitely a post-Bill-Gates era at Microsoft. I have mixed feelings regarding the GPL myself, but I like the new directions that Microsoft is taking. It is much easier to like Microsoft today; they seem much less confrontational.
  9. An extra robot per 1,000 workers reduces the employment to population ratio by 0.18-0.34 percentage points and wages by 0.25-0.5%. Should you worry? Maybe not:

    These are sizable effects. But it should also be noted that even under the most aggressive scenario, we are talking about a relatively small fraction of employment in the US economy being affected by robots. There is nothing here to support the view that new technologies will make most jobs disappear and humans largely redundant.

  10. Google published do-it-yourself artificial intelligence maker kits.
  11. Coffee consumption may reduce the total cancer incidence and it also has an inverse association with some types of cancer. (Credit: P. D. Mangan)
  12. When I was a teenager, I assumed that strong people and smart people were distinct groups. Not so, it seems. Grip strength is associated with cognitive performance, according to a study of over half a million participants. It is unclear which way the causality goes. Fit people are likely to be both smart and strong. I don’t know how you can get smarter directly, but I know how you can get stronger (lift weights and such). Could it be that getting stronger makes you smarter? For older people, this might very well be true.
  13. Patients with major depression exhibited higher epigenetic aging in blood and brain tissue, suggesting that they are biologically older than their corresponding chronological age.
  14. Lenovo sells a high-quality virtual reality headset for $200.
  15. According to Ceci and Williams, current policies to increase female representation in science are misguided:

    Explanations for women’s underrepresentation in math-intensive fields of science often focus on sex discrimination in grant and manuscript reviewing, interviewing, and hiring. Claims that women scientists suffer discrimination in these arenas rest on a set of studies undergirding policies and programs aimed at remediation. More recent and robust empiricism, however, fails to support assertions of discrimination in these domains. To better understand women’s underrepresentation in math-intensive fields and its causes, we reprise claims of discrimination and their evidentiary bases. Based on a review of the past 20 years of data, we suggest that some of these claims are no longer valid and, if uncritically accepted as current causes of women’s lack of progress, can delay or prevent understanding of contemporary determinants of women’s underrepresentation. We conclude that differential gendered outcomes in the real world result from differences in resources attributable to choices, whether free or constrained, and that such choices could be influenced and better informed through education if resources were so directed. Thus, the ongoing focus on sex discrimination in reviewing, interviewing, and hiring represents costly, misplaced effort: Society is engaged in the present in solving problems of the past, rather than in addressing meaningful limitations deterring women’s participation in science, technology, engineering, and mathematics careers today.

    In fact, it seems that there are strong biases favoring women as it is:

    Comparisons of oral non–gender-blind tests with written gender-blind tests for about 100,000 individuals observed in 11 different fields over the period 2006–2013 reveal a bias in favor of women that is strongly increasing with the extent of a field’s male-domination. This bias turns (…) to about 10 percentile ranks for women in math, physics, or philosophy.

    If I had to guess, I would think that I have a bias favorable to women when it comes to recruiting. So the question is: why so few women?

    A colleague of mine works on female equity issues. She gave me a tip on how to recruit women: stress that there are already other women around. Women will be hesitant to join a lab that is made exclusively of men. Sadly, that’s kind of a vicious circle.

  16. Cloning horses appears to be a profitable business.
  17. According to the New York Times, robots can assemble Ikea furniture semi-autonomously. It is not clear that the robots actually start from Ikea’s instructions. And they clearly start from already laid out parts. (Credit: Peter Turney)
  18. Gary Bernhardt writes:

    Reminder to people whose “big data” is under a terabyte: servers with 1 TB RAM can be had [for] about $20k. Your data set fits in RAM.

    Gary is correct. In 2011, I wrote:

    Many information systems have storage costs which are proportional to the number of individuals. I call them sapien-bound systems. (…) Soon, all sapien-bound systems will fit in RAM cheaply.

    To describe my own research, I prefer to use the term “data engineering” rather than “big data”. However, processing data quickly and efficiently is not a solved problem. Having a terabyte of RAM does not make your computing problems go away and, in some sense, it highlights them.

  19. Our atmosphere is filled with trillions of viruses according to the New York Times:

    Generally it’s assumed these viruses originate on the planet and are swept upward, but some researchers theorize that viruses actually may originate in the atmosphere. Whatever the case, viruses are the most abundant entities on the planet by far. While Dr. Suttle’s team found hundreds of millions of viruses in a square meter, they counted tens of millions of bacteria in the same space.

    (Source: P.D. Mangan)

  20. Human beings transformed the planet in deeper ways than is often assumed:

    “Our assumption that modern ecosystems are ‘normal’ is flawed,” says Theodor. “They’re not necessarily functioning in the way that they did even 11,000 years ago.”

  21. Machine-learning and artificial intelligence are popular today, and people tend to want to use them everywhere. Top researchers in these fields are paid millions of dollars.

    It is possible that current machine-learning techniques are being overrated. Makridakis et al. found that, on prediction tasks, the accuracy of machine-learning models (“artificial intelligence”) is below that of simple old-school statistical models. The motivation of their work is interesting:

    The motivation for writing this paper was an article published in Neural Networks in June 2017. The aim of the article was to improve the forecasting accuracy of stock price fluctuations and claimed that “the empirical results show that the proposed model indeed display a good performance in forecasting stock market fluctuations”. In our view, the results seemed extremely accurate for stock market series that are essentially close to random walks so we wanted to replicate the results of the article and emailed the corresponding author asking for information to be able to do so. We got no answer and we, therefore, emailed the Editor-in-Chief of the Journal asking for his help. He suggested contacting the other author to get the required information. We consequently, emailed this author but we never got a reply. Not being able to replicate the result and not finding research studies comparing ML methods with alternative ones we decided to start the research leading to this paper.

    A major fault in contemporary research is that people fail to honestly and fairly compare their “fancy” work with simpler alternatives, as if simplicity were a fault. It is not! You always want to use the simplest thing that works.

  22. Jordan has a worthy essay entitled Artificial Intelligence — The Revolution Hasn’t Happened Yet. He points out that what we have is good statistical algorithms, but that it remains to build a new kind of engineering which we might call “artificial intelligence engineering”.

    I’m also a computer scientist, and it occurred to me that the principles needed to build planetary-scale inference-and-decision-making systems of this kind, blending computer science with statistics, and taking into account human utilities, were nowhere to be found in my education. And it occurred to me that the development of such principles — which will be needed not only in the medical domain but also in domains such as commerce, transportation and education — were at least as important as those of building AI systems that can dazzle us with their game-playing or sensorimotor skills.

  23. A technique called “repetition lag training” can significantly improve performance on a particular type of memory task:

    Repetition lag training improved objective measures of recollection, eliminated the age-related recollection decrement, and these improvements maintained over three months.

    That is, elderly people who train become as good as young people at recollection tests. The results appear robust even with people who suffer from mild cognitive impairment. However, participants fail to report an improvement in their daily life. That is, you can train for a memory task and improve to a youthful level even if you are very old, but you only improve at what you train. This result is often viewed in a pessimistic light, but I think it suggests that “training” becomes increasingly important as you get older, or if you suffer from cognitive limitations.

  24. The prescription drugs rapamycin and metformin regenerate hair.
  25. Matt Ridley points out that we have not extended the maximal lifespan of human beings, and I called him out on his pessimism on Twitter. He wrote a book entitled The Rational Optimist.

By how much does AVX-512 slow down your CPU? A first experiment.

Intel is finally making available processors that support the fancy AVX-512 instruction sets and that can fit nicely in a common server rack. So I went to Dell and ordered a server with a Skylake-X microarchitecture: an Intel Xeon W-2104 CPU @ 3.20GHz.

This processor supports several interesting AVX-512 instruction sets. They are made of very powerful instructions that can manipulate 512-bit vectors.

On the Internet, the word going around is that using AVX-512 in your application will slow down your whole server, so you should just give up and never use AVX-512 instructions.

Vlad Krasnov from Cloudflare wrote:

If you do not require AVX-512 for some specific high-performance tasks, I suggest you disable AVX-512 execution on your server or desktop, (…)

Table 15-16 in Intel’s optimization manual describes the impact of the various instructions on “Turbo Boost” (one of Intel’s frequency scaling technologies). The type of instructions you use determines the “license” you are in. If you avoid AVX-512 and heavy AVX2 instructions (floating-point instructions and multiplications), you get the best boost. If you use light AVX-512 instructions or heavy AVX2 instructions, you get less of a boost… and you get the worst results with heavy AVX-512 instructions.

Intel refers us to a sheet of frequencies. Unfortunately, a quick look did not turn up anything about my particular processor (Intel Xeon W-2104).

Intel is not being very clear:

Workloads that execute Intel AVX-512 instructions as a large proportion of their whole instruction count can gain performance compared to Intel AVX2 instructions, even though they may operate at a lower frequency. It is not always easy to predict whether a program’s performance will improve from building it to target Intel AVX-512 instructions.

What I am most interested in is the theory that if you use AVX-512 sparingly, it is going to bring down the performance of your whole program. How could I check this theory?

I picked up a benchmark program that computes the Mandelbrot set. Then, using AVX-512 intrinsics, I added AVX-512 instructions to the program at select places. These instructions do nothing to contribute to the solution, but they cannot be trivially optimized away by the compiler. I used both light and heavy AVX-512 instructions. There are few enough of them that the overhead is negligible… but if they slowed down the processor in a significant manner, we should be able to measure a difference.
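To give a concrete idea, here is a minimal sketch of what such spurious instructions can look like, written with AVX-512F intrinsics. This is my own illustration, not necessarily the exact code of my benchmark, and the function names are hypothetical:

#include <immintrin.h>

// global sinks: storing to them prevents the compiler from discarding
// the spurious instructions as dead code
__m512i sink_light;
__m512d sink_heavy;

// a "light" AVX-512 instruction: a 512-bit integer OR
void spurious_light(void) {
  sink_light = _mm512_or_si512(sink_light, _mm512_set1_epi32(1));
}

// a "heavy" AVX-512 instruction: a 512-bit double-precision multiplication
void spurious_heavy(void) {
  sink_heavy = _mm512_mul_pd(sink_heavy, _mm512_set1_pd(1.0));
}

Compiled with -mavx512f, calls to such functions should place the core in the corresponding frequency license without contributing anything to the Mandelbrot computation.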

The results?

mode            running time (average over 10)
no AVX-512      1.48 s
light AVX-512   1.48 s
heavy AVX-512   1.48 s

Using spurious AVX-512 instructions made no difference to the running time in my tests. I don’t doubt that the frequency throttling is real, as it is described by Intel and widely reported, but I could not measure it.

This suggests that, maybe, it is less likely to be an issue than is often reported, at least on the type of processors I have. Or else I made a mistake in my tests.

In any case, we need reproducible simple tests. Do you have one?

My code and scripts are available.

Iterating in batches over data structures can be much faster…

We often need to iterate over the content of data structures. It is surprisingly often a performance bottleneck in big-data applications. Most iteration code works one value at a time…

for value in datastructure {
  do something with value
}

There is a request to the data structure for a new value at each iteration. Alternatively, we can query the data structure far less often by asking the data structure to fill a buffer…

for blockofvalues in datastructure {
  for value in blockofvalues {
      do something with value
  }
}

It is not automatically faster: you have to store values to a buffer and then read them again. It involves copying data from registers to memory and back. There is some inherent latency and it is an extra step.

However, if you make your buffer large enough but not too large (e.g., 1kB), the latency will not matter much and you will remain in CPU cache (fast memory). Thus you should, in the worst case, be only slightly slower. What do I mean by “slightly”? Basically, you are adding the equivalent of a memory copy over a small buffer.

When accessing data over a network, or even across processes on the same machine, it is worth it to process the data in batches because the cost of each transaction is high. When working with data structures that live in your own process, the transaction cost might be low: repeated function calls in a loop are cheap, and they can become free after inlining. To my knowledge, batched iteration is not typically available in standard libraries.

Thus, until recently, I did not pay much attention to the idea of iterating in batches over data structures. I could imagine some gains, but I expected them to be small.

In the Go implementation of Roaring bitmaps, Ben Shaw contributed a way to iterate over values in batches, recovering many values in a buffer with each function call. It helped the performance considerably (almost doubling the speed on some tests). Richard Startin then did the same in the Java implementation. It also helped a lot:

The batch iterator is always at least twice as fast as the standard iterator (…) Depending on the contents and size of the bitmap, the batch iterator can be 10x faster.

So I started to wonder… is this an underrated strategy?

I modified the popular Go bitset library, and on one iteration test, the batched iteration was nearly twice as fast!

The batched code is more complex, but not so terrible:

buffer := make([]uint, 256)
j := uint(0)
// NextSetMany fills the buffer with the indexes of set bits starting at
// position j; it returns the index of the last bit found together with a
// slice of the buffer sized to the number of values retrieved.
j, buffer = bitmap.NextSetMany(j, buffer)
for ; len(buffer) > 0; j, buffer = bitmap.NextSetMany(j, buffer) {
    for k := range buffer {
        // do something with buffer[k]
    }
    j += 1 // resume the search right after the last index found
}

Then I modified the cbitset library. I saw, again, almost a doubling of the speed. The code is once more a bit more complicated:

size_t buffer[256];
size_t howmany = 0;
// nextSetBits fills the buffer with the indexes of up to 256 set bits,
// starting the search at startfrom; it returns how many bits were found
// and moves startfrom to the last index found.
for (size_t startfrom = 0;
     (howmany = nextSetBits(b1, buffer, 256, &startfrom)) > 0;
     startfrom++) {
    for (size_t i = 0; i < howmany; i++) {
        // do something with buffer[i];
    }
}

These good results depend on what kind of data you iterate over, how you use the data, and what kind of data structure you have. Obviously, it is useless to batch iterations over an array of values. Yet my few tests provide enough evidence to conclude that batch iteration is worth investigating when speed is a limitation.

On Twitter, Milosz Tanski explained the result as follows:

One thing to remember about CPU and optimization in general is that almost [all] hardware is designed to operate at maximum speed when it’s doing similar work on similar data. Branch prediction, prefetch, caches, op code level parallelization all make this assumption.

Introducing GapminderVR: Data Visualization in Virtual Reality

I am a big fan of sites such as Gapminder and Our World in Data. Such data visualization sites are like intellectual pornography. You want to know which countries are doing better? Which continents drink more alcohol? How is alcohol consumption related to GDP? Have people been getting fatter recently, or is that a long trend? You don’t need to read thick books; you can just browse graphs.

I’m also a big fan of virtual reality (VR). In February 2016, I placed a bet against Greg Linden to the effect that by 2019, we would sell at least 10 million VR units a year worldwide. I might very well lose my bet, but what is surely correct is that VR is part of our future. It is going to be everywhere, soon… where “soon” remains to be defined, but it is not 30 years away.

What if you could mix data visualization and VR? What would happen? Could you make new things that nobody could think of before? I think so. That’s another one of my bets… but unlike my bet with Greg Linden, it is a bet I took by investing time and money in my lab.

To be fair, the idea that you could use data visualization in VR has been around for at least 20 years. It has gone exactly nowhere.

Why would it be different now?

The most obvious factor is cost. VR is much cheaper than it ever was, and it is getting cheaper by the day. And it is not just the hardware. The software is getting better and cheaper. This means that many of us can try new things and iterate much faster than ever before. If we can gather enough heads together, we will get somewhere.

If the work remains limited to a few academics, it is never going to take off. We need engineers, designers, and programmers to jump in.

In my lab, we decided to spend a few months building prototypes of what is possible in VR. We are publishing as demos two interesting cases:

  • Rail of time is a demo where you can control the time dimension with your feet (by walking). Walk forward and the time goes forward. Walk backward and the time goes backward. (YouTube)
  • Museum is a demo where you can visit a museum where the statues represent countries at a given time. The various attributes of the statues represent the various dimensions of the data. (YouTube)

If you have a VR headset, you can try our demos on our site: GapminderVR.org. The name of the site is meant to pay homage to Gapminder and to the work of Hans Rosling. All our code is public and you can “steal” it. Or get in touch and we will help you. We hope to inspire future work. If you are interested in helping out, get in touch. If you can do better, please let us know about your work.

The design and programming work was done by Niko Girardelli. He is a super brilliant engineering student, and someone ought to offer him a six-figure job. Yet, to be fair, the programming is less demanding than you might expect. It is all JavaScript in the browser. And yes, the performance is decent. We owe a lot of credit to Mozilla and their work on WebVR. It is amazing.

Science and Technology links (April 13th, 2018)

  1. Somewhat depressingly, there is very little evidence that you can improve people’s overall cognitive abilities:

    Although cognitive ability correlates with domain-specific skills—for example, smarter people are more likely to be stronger chess players and better musicians—there is little evidence that chess or music instruction makes people smarter. Rather, smarter individuals are more likely to engage and excel in these fields.

    That is, if you receive musical training, you may get better at playing an instrument. However, this will not make you better at programming in Java.

    Time spent in school does not significantly improve students’ cognitive skills. However, smarter people spend more time in school.

    This failure to increase overall cognitive skills destroys one of the main premises of schooling. It also means that if you have kids, you should not push them into classes in the hope of making them smarter.

    The same works at the scale of countries. Richer countries educate their people more… but it is not the added education that makes them richer.

    Of course, you can make yourself smarter. Just buy a computer.

  2. A woman’s voice changes after pregnancy. It sounds more masculine, for a time at least.
  3. Women (>45 years old) who consume high-fat dairy products gain less weight over time.
  4. The youngest children in a school class are more likely than their classmates to receive pharmacological treatment for attention deficit/hyperactivity disorder (ADHD). This result suggests that it may be difficult to differentiate immaturity from ADHD.
  5. There are contradictory reports as to whether human beings create new neurons throughout their life. The LA Times has a nice article on neurogenesis: Scientists find signs of new brain cells in adults as old as 79.
  6. In small monkeys, eating less is strongly associated with a longer and healthier life according to a study published in Nature:

    Compared to control animals, caloric restriction extended lifespan by 50% (from 6.4 to 9.6 years, median survival), reduced aging-associated diseases and preserved loss of brain white matter in several brain regions.

    Importantly, this means that aging can be manipulated far more easily than might have been thought. That caloric restriction prolongs life and health has deep consequences. It means that your body does not age because it lacks energy or entropy. It also means that your body does not try to maximize your healthspan and lifespan.

    It is unknown whether caloric restriction has the same effect on human beings. For the record, I do not practice caloric restriction. I eat all the time. I am, however, quite thin and small.

  7. Unsurprisingly, late-life depression is associated with an increased risk for all-cause dementia, vascular dementia and Alzheimer’s disease. This is yet more evidence that we need to take depression seriously. It is not normal to be depressed beyond a certain point.
  8. Kell and Pretorius have published a speculative article where they suggest that many diseases of old age (Alzheimer’s, Parkinson’s, atherosclerosis, arthritis) could be due to dormant microbes that “wake up” later in life, in part due to iron dysregulation. This is not a new theory, but this paper is well written and under open access.

    If you like this theory, you might like Mangan’s book: Dumping Iron. He believes that many of us have too much iron in our bodies. There are two main ways to reduce your iron store: chelation therapy (using drugs) or bloodletting (giving blood).

    It seems probable that iron dysregulation is real. Sadly, there is weak evidence that you can prevent it: the risk of Parkinson’s disease was higher among men who reported recent multiple blood donations, and total iron intake was not associated with an increased risk of Parkinson’s disease.

    The trouble is that iron can take many forms and can go to different places in your body.

For greater speed, try batching your out-of-cache data accesses

In software, we use hash tables to implement sets and maps. A hash table works by first mapping a key to a random-looking address in an array.

In a recent series of blog posts (1, 2, 3), I have documented the fact that precomputing the hash values often accelerates hash tables.

Some people thought that I was merely making the trivial point that precomputing the hash values saved you the time to compute the hash values. That is true, but there is more to it.

On recent Intel processors, batching your load requests can be very helpful. Let me illustrate with some code.

I am going to use a simple hash function that takes an integer value and returns a “mixed up” value:

#include <stdint.h>

// the 32-bit finalizer of MurmurHash3: it "mixes up" the input bits
uint32_t murmur32(uint32_t h) {
  h ^= h >> 16;
  h *= UINT32_C(0x85ebca6b);
  h ^= h >> 13;
  h *= UINT32_C(0xc2b2ae35);
  h ^= h >> 16;
  return h;
}

This function is not very expensive, but it is efficient at generating random-looking outputs.

In a complete loop, it takes between 7 and 8 cycles to compute and store a bunch of these hash values (modulo the array size) using a recent Intel processor.
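That timing corresponds to a loop of the following form (a minimal sketch; the function and array names are my own):

#include <stddef.h>

// store the hash of each index, reduced modulo the array size
// (murmur32 is the function defined above)
void compute_hashes(uint32_t *hashes, size_t size) {
  for (size_t k = 0; k < size; k++) {
    hashes[k] = murmur32((uint32_t)k) % size;
  }
}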

Let us put this function to good use by randomly picking values in a large array and summing them up:

uint64_t sumrandom(uint64_t *values, size_t size) {
  uint64_t sum = 0;
  for (size_t k = 0; k < size; ++k) {
    sum += values[murmur32(k) % size ];
  }
  return sum;
}

You would expect the bulk of the time needed to execute this code to be spent on data accesses (given a large array). And indeed, for arrays exceeding my cache, it takes about 46 cycles per value to compute the sum.

So it means that about 40 cycles per value are due to the random look-ups (the 46 cycles in total, minus the 7 to 8 cycles spent hashing).

Is that right?

Let us do something more complicated, where we first compute the hash values and then we do the look-ups…

uint64_t sumrandomandindexes(uint64_t *values, uint32_t *indexes, size_t size) {
  uint64_t sum = 0;
  for (size_t k = 0; k < size; ++k) {
    indexes[k] = murmur32(k) % size ;
  }
  for (size_t k = 0; k < size; ++k) {
    sum += values[indexes[k]];
  }
  return sum;
}

This looks more expensive, but it is not. It runs in about 32 cycles per operation. That is, separating the work into two separate loops, and doing overall more stores and loads, is significantly cheaper. It should not make sense, but the result is robust. (Note that simply unrolling the loop a few times might serve us well; I used an extreme example to make my point. A sketch of such a middle ground follows the table below.)

operation                       cost
hash function                   ~8 cycles per value
sum over random values          ~46 cycles
hash function followed by sum   ~32 cycles
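To illustrate the middle ground alluded to above, here is a sketch (my own, not part of the original benchmark) that computes the indexes in small batches within a single pass, so that several independent loads can be in flight at once:

uint64_t sumrandombatched(uint64_t *values, size_t size) {
  uint64_t sum = 0;
  uint32_t indexes[8];
  size_t k = 0;
  for (; k + 8 <= size; k += 8) {
    for (size_t i = 0; i < 8; i++) {
      indexes[i] = murmur32((uint32_t)(k + i)) % size; // a batch of hashes first
    }
    for (size_t i = 0; i < 8; i++) {
      sum += values[indexes[i]]; // then the corresponding loads
    }
  }
  for (; k < size; k++) { // leftover values
    sum += values[murmur32((uint32_t)k) % size];
  }
  return sum;
}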

My code is available. I used the GNU GCC 6 compiler as it gives the best results on this test.

So what is happening? When I use the Linux perf command to count the number of cache misses (perf stat -B -e cache-misses), I find that the approach that computes the hash values separately from the data loads has about 50% fewer cache misses.

Thus Intel processors have an easier time avoiding cache misses when the data loads are batched.

I have not verified how other processors fare. Do you know?

Science and Technology links (April 7th, 2018)

  1. Mammals have a neocortex, some kind of upper layer on top of our ancestral brain. It is believed to be the key evolutionary trick that makes mammals smart. Yet birds have no neocortex, but some of them (parrots and crows) are just as smart as monkeys. Thus Güntürkün and Bugnyar conclude that

    a specific cortical architecture cannot be a requirement for advanced cognitive skills. During the long parallel evolution of mammals and birds, several neural mechanisms for cognition and complex behaviors may have converged despite an overall forebrain organization that is otherwise vastly different.

    That is, the specifics of how mammal brains work are maybe less important than it might appear in our quest to understand intelligence.

  2. Should you go see your doctor each year, just in case? Maybe not:

    Regardless of which screenings and tests were administered, studies of annual health exams dating from 1963 to 1999 show that the annual physicals did not reduce mortality overall or for specific causes of death from cancer or heart disease.

  3. Aspirin is associated with reduced incidence of death from cancers of the colon, lung and prostate. But it increases your risk of bleeding to death.
  4. New neurons can integrate in the adult neocortex, which suggests that progressive cell replacement in the brain is possible.
  5. Up until recently, most laptops had general processors with at most 4 cores. Intel is bringing 6-core processors to some new high-end laptops.
  6. Human beings generate new neurons throughout their lifetime (that’s called neurogenesis). It is believed to be important in the process of generating new memories. A new study reports that healthy older people without cognitive impairment generate new neurons as much as younger individuals do. In fancy terms, neurogenesis does not decline with aging.
  7. You may move faster when reacting out of instinct than when you are acting deliberately.
  8. Apple recruited John Giannandrea from Google. Giannandrea was responsible for “artificial intelligence” at Google, among other things. Giannandrea looks like a hands-on engineering type. He has repeatedly stated that computers are dumb. He believes that computers should augment human beings, not replace them. It is believed that Giannandrea might help Apple with Siri since it is felt that Siri has stagnated. Giannandrea has a degree in computer science from the University of Strathclyde.
  9. Childless unmarried young women in New York City, Los Angeles and San Diego make 17%, 12% and 15% more than their male peers.

Caching hash values for speed (Swift-language edition)

In my posts Should you cache hash values even for trivial classes? and When accessing hash tables, how much time is spent computing the hash functions?, I showed that caching hash values could accelerate operations over hash tables, sets and maps… even when the hash tables do not fit in CPU cache.

To be clear, I do not mean that the computation of the hash values is necessarily the bottleneck. Rather, the whole computation, including the latency of the operations, slows you down more than you would think, even when dealing with out-of-cache data structures.

In my earlier posts, I used the Java programming language. Java already relies on precomputed hash values in its standard library. Indeed, Java strings have precomputed hash values.

It is always prudent to check observations using different programming languages. So I decided to reproduce a similar test using the Swift programming language.

I created a trivial class containing three integer values, just like I did in Java:

class Triple: Hashable, Equatable {
    let x, y, z: Int
    init(x: Int, y: Int, z: Int) {
        self.x = x
        self.y = y
        self.z = z
    }
    // the hash value is recomputed each time it is requested
    final var hashValue: Int {
        return x + 31 &* y + 961 &* z
    }
    // Equatable conformance (field-wise equality), needed for set membership
    static func == (lhs: Triple, rhs: Triple) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z
    }
}

Then I created a slightly different version where the hash value is precomputed:

class BufferedTriple: Hashable, Equatable {
    let x, y, z: Int
    private let hash: Int
    init(x: Int, y: Int, z: Int) {
        self.x = x
        self.y = y
        self.z = z
        hash = x + 31 &* y + 961 &* z // computed once, at construction
    }
    final var hashValue: Int {
        return hash
    }
    // Equatable conformance (field-wise equality), needed for set membership
    static func == (lhs: BufferedTriple, rhs: BufferedTriple) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z
    }
}

My benchmark involves creating a large set of these points, and checking how many of 100 other points are in this set. The code is simple:

for i in a {
    if bs.contains(i) {
        counter += 1
    }
}

I ran this benchmark using Linux, Swift 4.1 and a Skylake processor. Roughly speaking, in my tests, the buffered version (that precomputes the hash values) is about twice as fast:

Triple    BufferedTriple
20 us     12 us

But maybe it takes much longer to create BufferedTriple instances? In fact, no. It takes about 1.5 us to construct the 100 instances, whether they are Triple or BufferedTriple instances. Constructing the BufferedTriple instances takes slightly more time, but the difference is less than 0.2 us.

My code is available.

My point is not that you should always or often precompute hash values. There are obvious downsides to this memoization approach even if it did not stop the Java engineers from doing it for all strings. Consider this instead: if you think that when working with large, out-of-cache, data structures, computational speed is not important, your mental model of software performance is incomplete.

Further reading: For greater speed, try batching your out-of-cache data accesses

Science and Technology links (March 30th, 2018)

  1. People who score higher on intelligence tests tend to have larger brains. Twin studies suggest the same genetic factors influence both brain size and intelligence.
  2. The effect of campaign contact and advertising on Americans’ candidate choices in general elections is zero. A related fact: in 2016, Clinton outspent Trump throughout the election by more than $150 million.
  3. In his article “In defense of skepticism about deep learning” (January 2018), Gary Marcus wrote “deep learning by itself, although useful, was unlikely to lead on its own to artificial general intelligence”.
  4. “We want to synthesize modified versions of all the genes in the human genome in the next few years” (George Church).
  5. Chinese students get 10% of all doctorates awarded in the U.S. There are more Chinese engineers working on AI at U.S. tech companies than in all of China.
  6. Women score higher in anxiety, vulnerability, altruism, and sympathy while men score higher in excitement-seeking and openness to intellect.
  7. Alcohol consumption is positively associated with handgrip strength in the elderly. Handgrip strength is strongly correlated with life expectancy.
  8. While postdocs are necessary for entry into tenure-track jobs, they do not enhance salaries in other job sectors over time.
  9. Many studies in psychology do not hold up to scrutiny:

    We conducted replications of 100 experimental and correlational studies published in three psychology journals using high­-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline.

    While 97% of the original studies had significant results only 36% of the replications had significant results. That is, given a random published study, if you bet that it is bogus, you will win most of the time.

    It also seems not to matter who is doing the research:

    Moreover, correlational evidence is consistent with the conclusion that variation in the strength of initial evidence (such as original P value) was more predictive of replication success than variation in the characteristics of the teams conducting the research (such as experience and expertise).

  10. You will do better if you believe that performing cognitive work energizes you to do more.
  11. It would have been better to invest all your money in gold back in 2000 than in the stock market. The same is true regarding Bitcoin from its starting point. I don’t know what it means.
  12. Almost all Economics Nobel prizes are closely related through their PhD supervisors. This is yet more evidence that “who you know” is really important to your intellectual productivity.
  13. Canada celebrates its 150th birthday by creating several prestigious research chairs (via Antonio Badia).
  14. Our ancestors had sex with the Neanderthals and the Denisovans.
  15. Obesity is a recent phenomenon:

    Between the birth cohorts of 1930 and 1993, the prevalence of obesity rose from 0 to 14% among boys and from 2 to 12% among girls. (…) Among boys, all these increases began after birth year 1970. Among girls, obesity began to rise after birth year 1980 (…).

    and

    While it may be true that body mass indexes (BMIs) have been increasing slowly for a long time, the increases observed in recent decades are much faster and have pushed many adults and children over the obesity threshold in a remarkably short time. The trend is distressing, but to reverse it we only need to turn the clock back to 1980. We don’t need to go back to 1900.

  16. Rats did not spread the plague:

    While it is commonly assumed that rats and their fleas spread plague during the Second Pandemic, there is little historical and archaeological support for such a claim. Here, we show that human ectoparasites, like body lice and human fleas, might be more likely than rats to have caused the rapidly developing epidemics in pre-Industrial Europe.

  17. Alzheimer’s disease might be staved off by nonsteroidal anti-inflammatory drugs. In other words, it could be that aspirin keeps you sharp.