Are your strings immutable?

A value is immutable if it cannot change.

Immutability is a distinct notion from that of a constant. The speed of light in a vacuum is believed to be a universal constant, for example. Constants are immutable in the sense that they cannot change. However, immutability refers to values, not to the assignment of values. For example, the number 3 is immutable. However, if I say that your rank is 3, this rank could change. That’s because your rank is a variable, and variables may change their values even if these values are immutable.

That is, a variable may change its value to point to a different immutable value. That’s a somewhat confusing point for non-programmers. For example, my name is “Daniel”. To say that strings are immutable is to say that I cannot change the string “Daniel”. However, I can certainly go see the government and have my first name changed so that it is “Jack”. Yet this change does not modify the string “Daniel”. If I could change the string “Daniel” then, possibly, all individuals named “Daniel” would see their name changed.

So in the world around us, values are typically immutable. And that’s largely why it is believed that immutable values are safer and easier.

Working with mutable values requires more experience and more care. For example, not only does changing the string “Daniel” affect all people named “Daniel”, but what if two people try to change the string at the same time?

So integer values are always immutable, not only in real life but also in software. There is no programming language where you can redefine the value “3” to be equal to “5”.

Yet I believe that most programming languages in widespread use have mutable arrays. That is, once you have created an array of values, you can always change any one of the entries. Why is that? Because immutability could get costly as any change to an immutable array would need to be implemented as a copy.

Arguably, the most important non-numeric type in software is the string. A string can be viewed as an array of characters so it would not be unreasonable to make it mutable, but strings are also viewed as primitive values (e.g., we don’t think of “Daniel” as an array of 6 characters). Consequently, some languages have immutable strings, others have mutable strings. Do you know whether the strings in your favorite language are mutable?

  • In Java, C#, JavaScript, Python and Go, strings are immutable. Furthermore, Java, C#, JavaScript and Go have the notion of a constant: a “variable” that cannot be reassigned. (I am unsure how well constants are implemented and supported in JavaScript, however.)
  • In Ruby and PHP, strings are mutable.
  • The C language does not really have string objects per se. However, we commonly represent strings as a pointer char *. In general, C strings are mutable. The C++ language has its own string class. It is mutable.

    In both C and C++, string constants (declared with the const qualifier) are immutable, but you can easily “cast away” the const qualifier, so the immutability is weakly enforced (see the short sketch after this list).

  • In Swift, strings are mutable.

    However, if you declare a string to be a constant (keyword let), then it is immutable.
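
As a short illustration (a sketch, not production code), here is how the C++ case plays out: a std::string can be edited in place, and the const qualifier only weakly protects a string passed by constant reference, since const can be cast away. The helper sneaky_edit is just for illustration.

#include <cassert>
#include <string>

// Takes a "read-only" reference to the string... and casts the const away.
void sneaky_edit(const std::string& s) {
  const_cast<std::string&>(s)[0] = 'J';
}

int main() {
  std::string name = "Daniel";
  name[0] = 'J';                 // std::string is mutable: this is fine
  assert(name == "Janiel");

  std::string other = "Daniel";
  sneaky_edit(other);            // the const qualifier did not protect it
  assert(other == "Janiel");
  return 0;
}

Note that string literals themselves (like "Daniel" in the source code) remain immutable: modifying one is undefined behavior in both C and C++.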

Pruning spaces from strings quickly on ARM processors

Suppose that I give you a relatively long string and you want to remove all spaces from it. In ASCII, we can define spaces as the space character (‘ ’) and the line-ending characters (‘\r’ and ‘\n’). I am mostly interested in algorithmic and performance issues, so we can simplify the problem by removing all byte values less than or equal to 32.

In a previous post where I asked how quickly we could prune spaces, the best answer involved vectorization using 128-bit registers (SSSE3). It ends up being between 5 and 10 times faster than the naive approach.

Conveniently enough, 64-bit ARM processors have 128-bit vector (NEON) registers, just like x64 processors. So can we make ARM processors go as fast as x64 processors?

Let us first consider a fast scalar implementation:

size_t i = 0, pos = 0;
while (i < howmany) {
    char c = bytes[i++];
    bytes[pos] = c;            // always write the byte back...
    pos += (c > 32 ? 1 : 0);   // ...but only advance the output when the value is greater than 32
}

This prunes all character values less than or equal to 32, writing back the data in place. It is very fast.

Can we do better with vector instructions? Vector instructions, supported by virtually all modern processors, operate over wide registers (16 bytes or more).

On x64 processors, the winning strategy is to grab 16 bytes of data, quickly compare against white space characters, then extract a mask (or bitset) value made of 16 bits, one bit per character, where each bit indicates whether the value found is a white space. The construction of such a bitset is cheap on an x64 processor, as there is a dedicated instruction (movemask). There is no such instruction on ARM processors. You can emulate movemask using several instructions.
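
For reference, here is one way to emulate a 16-bit movemask with NEON intrinsics (a sketch assuming a 64-bit ARM processor, since vpaddq_u8 is only available on AArch64; the function name neon_movemask is mine). It takes a vector of byte-wise comparison results (0xFF or 0x00) and produces one bit per byte:

#include <arm_neon.h>

static inline uint16_t neon_movemask(uint8x16_t input) {
  const uint8_t bits[16] = {0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80,
                            0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80};
  // keep one distinct bit per byte, then sum adjacent bytes three times
  // to collapse the 16 bytes into 2 bytes (the 16-bit mask)
  uint8x16_t minput = vandq_u8(input, vld1q_u8(bits));
  uint8x16_t tmp = vpaddq_u8(minput, minput);
  tmp = vpaddq_u8(tmp, tmp);
  tmp = vpaddq_u8(tmp, tmp);
  return vgetq_lane_u16(vreinterpretq_u16_u8(tmp), 0);
}

That is a handful of instructions for every 16 bytes of input, which is far more expensive than a single movemask, so the code below takes a different route.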

So we cannot proceed as we did on x64 processors. What can we do?

Just like with SSSE3, we can quickly check whether byte values are less than or equal to 32, thus identifying white space characters:

// returns 0xFF for bytes that are white space (values <= 32), 0x00 otherwise
static inline uint8x16_t is_white(uint8x16_t data) {
  const uint8x16_t wchar = vdupq_n_u8(' ');
  uint8x16_t isw = vcleq_u8(data, wchar);
  return isw;
}

Next we can quickly check whether any of the 16 characters is a white space, by using about two instructions:

// returns a nonzero value if and only if some byte of v is nonzero
static inline uint64_t is_not_zero(uint8x16_t v) {
  uint64x2_t v64 = vreinterpretq_u64_u8(v);
  uint32x2_t v32 = vqmovn_u64(v64);  // saturating narrow: 16 bytes -> 8 bytes
  uint64x1_t result = vreinterpret_u64_u32(v32);
  return result[0];
}

This suggests a useful strategy. Instead of comparing characters one by one, compare 16 characters at once. If none of them is a white space character, just copy the 16 characters back to the input and move on. Otherwise, we fall back on the slow scalar approach, with the added benefit that we do not need to repeat the comparison:

uint8x16_t vecbytes = vld1q_u8((uint8_t *)bytes + i);
uint8x16_t w = is_white(vecbytes);
uint64_t haswhite = is_not_zero(w);
if (!haswhite) {
  // no white space in this block: copy the 16 bytes as-is
  vst1q_u8((uint8_t *)bytes + pos, vecbytes);
  pos += 16;
  i += 16;
} else {
  // w holds 0xFF for white space and 0x00 otherwise; adding 1 turns it
  // into the number of positions (0 or 1) by which to advance the output
  uint8x16_t w0 = vaddq_u8(vdupq_n_u8(1), w);
  for (int k = 0; k < 16; k++) {
    bytes[pos] = bytes[i++];
    pos += w0[k];
  }
}

Most of the benefit of this approach comes when you can often expect blocks of 16 bytes to contain no white space character at all. This seems like a good guess in many applications.

I wrote a benchmark where I try to estimate how long it takes to prune spaces, on a per-character basis, using input data where there are few white space characters, placed at random. My source code is available, but you need an ARM processor to run it. I run the benchmark on a 64-bit ARM processor (made of A57 cores). John Regehr has a few more benchmarks on this same machine. I believe these are the same cores that you find in the Nintendo Switch.

scalar   1.40 ns per character
NEON     0.92 ns per character

The technical specification is sparse. However, the processor runs at 1.7 GHz as one can verify by using perf stat. Here is the number of cycles per character we need…

                              ARM          recent x64
scalar                        2.4 cycles   1.2 cycles
vectorized (NEON and SSSE3)   1.6 cycles   0.25 cycles

(The source code for x64 is available on GitHub.)

In comparison, on an x64 processor, the scalar version uses something like 1.2 cycles per character, which would put the ARM machine at half the performance of a recent x64 processor on a per-cycle basis. That is to be expected as the A57 cores are hardly meant to compete with recent x64 processors on a cycle-per-cycle basis. However, with SSSE3 on an x64 machine, I manage to use as little as 0.25 cycles per character, which is more than 5 times better than what I can do with ARM NEON.

This large difference comes from an algorithmic difference. On x64 processors, we are relying on the movemask/pshufb combo and we end up with a branchless algorithm involving very few instructions. Our ARM NEON version is much less powerful.

There is a lot to like about ARM processors. The assembly code is much more elegant than the equivalent with x86/x64 processors. Even the ARM NEON instructions feel cleaner than the SSE/AVX instructions. However, for many problems, the total lack of a movemask instruction might limit the scope of what is possible with ARM NEON.

But maybe I underestimate ARM NEON… can you do better than I did?

Note: The post has been edited: it is possible on 64-bit ARM processors to reshuffle 16 bits in one instruction as one of the commenters observed.

Note: I get better performance in a follow-up blog post.

Science and Technology links (July 1st, 2017)

Canada is 150 years old today.

The iPhone is 10 years old this year. We can safely say that the iPhone 7 is over a hundred times faster, in almost every way, than the original iPhone. Very few things get 100 times better over 10 years: it requires improving the performance by about 60% each and every year.

Though mammals like us can heal injuries, there is often scarring. Scarring should be viewed as imperfect healing. It is not just a matter of looks: scars make your tissues less functional. As far as skin healing is concerned, scientists have found a way to cause skin to heal without scarring, at least in mice.

Essentially, we can manipulate wound healing so that it leads to skin regeneration rather than scarring, (…) the secret is to regenerate hair follicles first. After that, the fat will regenerate in response to the signals from those follicles. (…) regenerating fat cells in skin can be beneficial for conditions beyond scarring. The process could potentially become a new anti-aging treatment, as the formation of deep wrinkles is thought to result from permanent loss of skin fat.

It seems that fasting (going without food) could be a key to regenerating your immune system:

The study has major implications for healthier aging, in which immune system decline contributes to increased susceptibility to disease as people age. By outlining how prolonged fasting cycles (periods of no food for two to four days at a time over the course of six months) kill older and damaged immune cells and generate new ones, the research also has implications for chemotherapy tolerance and for those with a wide range of immune system deficiencies, including autoimmunity disorders.

Chimpanzees are not that much stronger than we are:

But now a research team reports that contrary to this belief, chimp muscles’ maximum dynamic force and power output is just about 1.35 times higher than human muscle of similar size, a difference they call “modest” compared with historical, popular accounts of chimp “super strength,” being many times stronger than humans.

Human beings are optimized for high endurance:

The flip side is that humans, with a high percentage of slow-twitch fibers, are adapted for endurance, such as long-distance travel, at the expense of dynamic strength and power. When we compared chimps and humans to muscle fiber type data for other species we found that humans are the outlier, suggesting that selection for long distance, over-ground travel may have been important early in the evolution of our musculoskeletal system

So how do you fight a chimpanzee? I would guess that getting the fight to last as long as possible is your best bet as a human being. The chimpanzee will get exhausted first. So I would probably either keep the chimpanzee at bay or run away. If the chimpanzee pursues, I would just wear him down.

A few weeks ago, there was an article in Nature claiming that human lifespan is limited to 115 years. There are very few of us that can hope to ever reach 115 years of age at the present time, but the question is whether it will change. Some people believe that 115 years of age is a hard limit that cannot be exceeded. Several scientists have now issued counterpoints. Siegfried Hekimi from McGill University (Montreal) says that…

You can show the data are compatible with many different trajectories and not at all an ongoing plateau (…) by extending trend lines, we can show that maximum and average lifespans could continue to increase far into the foreseeable future. (…) If this trend continues and our life expectancy of the average person becomes 100, the longest person might make it to 150 (…)

Jim Vaupel from the Max Planck Institute writes:

The evidence points towards no looming limit. At present the balance of the evidence suggests that if there is a limit it is above 120, perhaps much above, and perhaps there is not a limit at all.

Maarten Rozing from the University of Copenhagen writes about a biological clock limiting our lifespan:

We now know not only that the idea of such a clock is highly implausible, but also that ageing is proving to be more amenable to change than used to be supposed

The rebuttals can be found in Nature.

Of course, the real answer at this point is that we do not know how long human beings could live. This being said, Yuval Noah Harari makes a compelling case in his book Homo Deus: A Brief History of Tomorrow that homo sapiens has reached the end of the line. Very solid arguments can be made that, say, in 100 years, there won’t be any homo sapiens left on the planet. So it is entirely possible that we will never find out how long homo sapiens could live.

Video game review… Nier: Automata

Single-player RPGs are having a tough time. Last year I reviewed Deus Ex: Mankind Divided. Though I felt it was an excellent game, it was not a commercial success and it seems that there will not be a follow-up game in the series in the foreseeable future. More recently, I reviewed Mass Effect: Andromeda. I felt that it was a very solid game, but occasional poor writing and some botched graphical models opened up the game to harsh criticism. Again, it looks like Mass Effect might come to an end because of poor sales.

I am currently playing another single-player RPG, this time from Japan, Nier: Automata. Sales-wise, it looks to be one of the top-10 games of all time on the PlayStation 4, so it is doing quite well.

The game mechanic itself is very much that of an old-school game. In fact, a fair amount of time is spent playing the game as if it were a two-dimensional shooter. Otherwise, the game plays quite a bit like a typical and classical action RPG “à la Zelda”.

The game looks good, but it is quite simple, even simplistic. There are only so many different enemy types. Most of the map looks the same. The 3D models are crude at times though always effective. The layouts are simplistic. I get the impression that the game engine must be simple. This gives the game an old-school look and feel. I also suspect that this means that the game is a massive success financially for its producers. A game like Mass Effect: Andromeda has a sophisticated design, with finely tuned non-trivial combat mechanics and lots of massive unique environments, so it has to be far more expensive to develop.

You play as an android that has two modes of attack that can be used simultaneously. Firstly, there is a drone that follows you around, and you can order this drone to shoot continuously at enemies. Given that most enemies, including bosses, have a hard time damaging you if you stay far away, this drone almost trivializes the game. There are entire boss fights that you can win by jumping up a ledge and just having your drone shoot the enemy down. It helps that you have infinite ammunition. Secondly, you can use melee weapons like swords. That’s where the game gets interesting because though your melee weapons can cause a lot of damage, they also open you up to receiving a lot of damage. There is real skill involved in fighting powerful enemies up close.

Because you are an android, you can reprogram yourself by acquiring new programs. For example, you can make it so that whenever your health level falls under a threshold, you automatically heal yourself using one of your “healing potions”. You can also make it so that after receiving some damage, you become invincible for a second or two. Combining these two programs makes you, for most purposes, invincible… as long as you have enough “healing potions”… but these are cheap and widely available in stores.

When I first started playing, I paid little to no attention to these programs, nor did I pay much attention to my choice of weapon. However, it ends up making a critical difference, at least on the default difficulty level.

There are no automatic save points, so you can die and have to restart the game from the beginning. You have to think about saving. If you die, your body will remain where you died along with some of your gear. You can retrieve it by playing again and getting back to your body.

Playing the game requires some skill, but on the default difficulty level, I only ever had trouble with one part of the game… there is a crazy boss at some point, “the Opera boss”: a giant lady with an armored dress. I suspect that I had so much trouble because I did not understand the game very well.

Not everything is absolutely smooth. Several times I was left wondering where I was supposed to go or what I was supposed to do, but I never got stuck long enough to be annoyed.

I have done an entire first playthrough, but the game has this weird mechanic whereby you are supposed to beat the game several times, and each time you do so, you get to see a different side of the story. Looking at the Wikipedia entry for the game, it seems that I will need to play at least two more times through the game to really see the bulk of the story.

The music of the game really adds a lot to the experience. To be honest, I suspect that I play just to be immersed in the music and aesthetic of the game. I find it relaxing.

Though I have not played through the entire game, I know enough to appreciate the story and the theme. The game is set in our far future. It is supposedly very, very far in our future but, oddly, city structures are still holding more or less intact. There is no human being anywhere, though you are told that they reside on the Moon, unseen. You are an android that looks like a young human being, but there are cruder robots all over the surface of the Earth. The crude robots are your enemies, sometimes. Supposedly, there is a war going on between the crude robots and the androids, but looks can be deceiving.

It is probably most accurate to describe the story as being about the post-human era on Earth. Human beings are gone, but intelligent machines remain behind. It is very reminiscent of Stross’ Saturn’s Children. Though everybody around is a machine, you get to care for them, very much so.

That’s maybe the surest sign that the game is a success. You care for the characters, even if they are machines that can be rebooted at will. That is saying a lot because I don’t normally empathize easily with Japanese characters, as I find Japanese culture a bit too strange. So while the game is simple, it is skillfully made.

If you ever liked playing Zelda, and you don’t mind something a bit more serious where Zelda could die, this is a game for you.

Science and Technology links (June 23rd, 2017)

Elon Musk, Making Humans a Multi-Planetary Species, New Space. June 2017, 5(2): 46-61.

Reportedly, Ikea is working on augmented reality software that would allow you to see the furniture in your home before buying it.

Current virtual reality headsets provide a good experience, but if you ever tried to read text while wearing one of the headsets, you may have noticed that it is quite hard. It seems that it is because the image resolution is too low. When using prototypes with very high resolution, you can “read text across the room”.

You would think that a tree that is over 200 years old would have accumulated a lot of (random) genetic mutations. However, it seems that it is not the case: as trees grow old, even very old, they preserve their genes intact. We do not know why or how.

Top speed for top-k queries

Suppose that I give you a stream of values and you want to maintain the top-k smallest or largest values. You could accumulate all these values, sort them all and keep the smallest k values.

You can do somewhat better by using a binary heap to construct a priority queue. And you can also use the QuickSelect algorithm. I compared naive implementations of both the approach based on a binary heap and on QuickSelect in a previous blog post.

Let us review the code of these techniques. I use JavaScript because it is supported everywhere.

We can sort and slice…

var a = new Array();
for (var i = 0; i < streamsize; i++) {
            a.push(rand(i));
}
a.sort(function(a, b) {
            return a - b
});
var answer = a.slice(0, k);

We can dump everything naively into a priority queue (backed by a binary heap)…

var reverseddefaultcomparator = function(a, b) {
    return a > b;
};
var b = new FastPriorityQueue(reverseddefaultcomparator);
for (var i = 0; i < k; i++) {
            b.add(rand(i));
}
for (i = k; i < streamsize; i++) {
            b.add(rand(i));
            b.poll();
}
var answer = b.array.slice(0, k).sort(function(a, b) {
            return a - b
});

Each time we call poll in the priority queue, we remove the largest element. Thus we maintain a list of k smallest elements.

I now want to consider less naive implementations of these ideas.

At all times, we have access to the largest element in the priority queue through the peek function. Long-time reader Kendall Willets pointed out to me that we can make good use of the peek function as follows.

var b = new FastPriorityQueue(reverseddefaultcomparator);
 for (var i = 0; i < k; i++) {
            b.add(rand(i));
 }
 for (i = k; i < streamsize; i++) {
            var x = rand(i);
            if (x < b.peek()) {
                b.add(x);
                b.poll();
            }
 }
 var answer = b.array.slice(0, k).sort(function(a, b) {
            return a - b
 });

This has the potential to substantially reduce the amount of work that the priority queue has to do.

What about the QuickSelect algorithm?

Michel Lemay pointed out to me what the Google Guava library does, which is to use a small buffer (of size k) and to run QuickSelect on this small buffer instead of the whole stream of data.

var a = new Array();
for (var i = 0; i < 2 * k; i++) {
            a.push(rand(i));
}
QuickSelect(a, k, defaultcomparator);
var eaten = 2 * k;
while (eaten < streamsize) {
            for (var i = 0; i < k; i++) {
                a[k + i] = rand(i + eaten);
            }
            QuickSelect(a, k, defaultcomparator);
            eaten += k;
}
var answer = a.slice(0, k).sort(function(a, b) {
            return a - b
});

David Eppstein described to me something slightly more complex where we use our knowledge of the pivot in the QuickSelect algorithm to only merge potential candidates.

var a = new Array();
var i = 0;
for (; i < 2 * k; i++) {
            a.push(rand(i));
}
QuickSelect(a, k, defaultcomparator);
var pivotvalue = a[k];
var locationinbuffer = k;
for (; i < streamsize; i += 1) {
            var newvalue = rand(i);
            if (newvalue < pivotvalue) {
                a[locationinbuffer++] = newvalue;
            }
            if (locationinbuffer == 2 * k) {
                QuickSelect(a, k, defaultcomparator);
                locationinbuffer = k;
            }
}
if (locationinbuffer != k) {
            QuickSelect(a, k, defaultcomparator);
}
var answer = a.slice(0, k).sort(function(a, b) {
            return a - b
});

It is not exactly what David described, but I believe that it captures the essence of his idea.

So how do these techniques fare? I have implemented a benchmark that you can run yourself; here are my raw results:

$ node test.js
Platform: linux 4.4.0-38-generic x64
Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Node version 4.5.0, v8 version 4.5.103.37


starting dynamic queue/enqueue benchmark
FastPriorityQueue x 3,095 ops/sec ±0.15% (97 runs sampled)
FastPriorityQueue-KWillets x 18,875 ops/sec ±0.59% (93 runs sampled)
sort x 490 ops/sec ±0.37% (94 runs sampled)
QuickSelect x 8,576 ops/sec ±0.31% (97 runs sampled)
QuickSelect-naiveGoogleGuava x 6,003 ops/sec ±0.20% (98 runs sampled)
QuickSelect-naiveDavidEppstein x 7,317 ops/sec ±0.22% (98 runs sampled)

So in my particular test, the approach proposed by Kendall Willets, using a priority queue with pruning based on the peek value, is a net winner. Your results might vary, but I think that the Willets approach is attractive. It is simple, it uses little memory and it should provide consistent performance.

Exercise for the reader: I am using a random input in my benchmark. As pointed out by David Eppstein, this means that we can bound the probability that the ith value is in the top-k by min(1, k/i), so that the Willets check only fails O(k log n) times. How well would the different techniques do for nearly sorted data?

Update: As pointed out in the comments, we can further improve the Willets approach by merging the add and poll functions into a single function. The code is similar…

var b = new FastPriorityQueue(reverseddefaultcomparator);
 for (var i = 0; i < k; i++) {
                b.add(rand(i));
 }
 for (i = k; i < streamsize; i++) {
                var x = rand(i);
                if (x < b.peek()) {
                    b.replaceTop(x);
                }
 }
 var answer = b.array.slice(0, k).sort(function(a, b) {
                return a - b
 });

So what is the performance this time?

$ node test.js
Platform: linux 4.4.0-38-generic x64
Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz
Node version 4.5.0, v8 version 4.5.103.37


starting dynamic queue/enqueue benchmark
FastPriorityQueue x 3,137 ops/sec ±0.13% (97 runs sampled)
FastPriorityQueue-KWillets x 19,027 ops/sec ±0.37% (94 runs sampled)
FastPriorityQueue-KWillets-replaceTop x 22,200 ops/sec ±0.46% (95 runs sampled)
sort x 479 ops/sec ±0.42% (92 runs sampled)
QuickSelect x 8,589 ops/sec ±0.31% (98 runs sampled)
QuickSelect-naiveGoogleGuava x 5,661 ops/sec ±0.22% (97 runs sampled)
QuickSelect-naiveDavidEppstein x 7,320 ops/sec ±0.25% (98 runs sampled)

So we improve the performance once more!

Update 2: You can combine a binary heap and QuickSelect using introselect.

Science and Technology links (June 16th, 2017)

How much bandwidth do we have? It seems that each of our eyes transmits about 1 megabyte per second. That’s about 100GB per day, assuming you close one of your eyes and you never sleep.

I found a fascinating article on “academic urban legends” including the myth that spinach is a good source of iron:

First, it falsified an idea that I had carried with me since I was a child, that spinach is an excellent source of iron. The most striking thing, however, was that a single decimal point, misplaced 80 years ago, had affected not just myself and my now deceased parents, but also a large number of others in what we place on our table.

(Read the whole article, it is not as simple as a decimal point.)
(via John D. Cook)

There is an app for the Apple Watch called cardiogram that is taking the wearers of the watch by storm:

Cardiogram uses the Apple Watch’s built-in heart sensor to give you advice about heart health — for instance, showing when you’re unusually stressed out by checking your heart rate.

I really need to get myself one of these watches!

As we age, we tend to lose muscle mass, to get weaker. The process is not entirely well understood, but it seems that it could be mediated at least in part by a loss of activity… As we age, we exercise less and this leads to detrimental changes in our metabolism. There is some hard evidence for this theory:

(…) we find that sedentary but not active humans display an age-related decline in the mitochondrial protein that is associated with muscle loss.

(via P. D. Mangan)

China just switched on the world’s largest floating solar power plant. It is still tiny (40MW) in comparison with many conventional power plants.

You can find melatonin at your local drug store or you can order it from Amazon. Many people take it at night to help them sleep. It seems that it could prevent neurological diseases. I sleep very well, so I am not tempted to take melatonin, but if I ever have trouble sleeping, I might quickly change my mind.

WordPress (the company) is shutting down its San Francisco office because employees never show up, preferring to work from home. I tend to work mostly from home myself, and I might never leave my home, except that I have graduate students and postdoctoral fellows to meet, and I am also department chair, so I have to show up for meetings from time to time. This being said, for programming or writing, I am far more productive at home. If I were a full-time programmer, I would certainly expect my employer to let me work from home. Somewhat ironically, it is part of my job to try to entice graduate students to show up to our lab.

When your skin gets red and your body aches, you are probably suffering from inflammation. It is a crude response from our immune system that helps us fight viruses and bacteria. We can suppress inflammation using something as simple as aspirin. However, inflammation is also involved in tissue repair. A recent paper shows that a protein that drives inflammation is critical for muscle regeneration.

Perutz wrote in Nature:

The number of scientists in the biomedical field is growing exponentially (…)

Peter Rothman comments:

Contrary to some assertions, exponential advances in science and technology are not due to some fundamental law of nature but rather the exponential increase in the number of scientists doing science. (…) Biological research progress doesn’t generally scale with computer chip density although some specific areas might scale with available computational power. In other areas, you still need to spend a large number of hours in the lab actually doing the work.

Science, even medical science, is not homogeneous. Many areas of science advance at a breakneck speed, even when relatively few people are involved. Back in 2012, doing gene editing was science-fiction for most people, but today high school students can do it, affordably. Back a few years ago, advanced computing applications such as face recognition required a Ph.D. in the field and advanced mathematics… whereas today you can get it done in a few hours using free software libraries. Meanwhile, other research areas, like cancer research, remain painfully slow and expensive. What we hope is that the areas that make rapid progress can rescue the slow moving fields. This has happened time and time again in our history. That is, it would have been useless for the Roman Empire to try to build the Internet. No matter how much they would have invested, the progress would have been non-existent. However, the Roman Empire could have benefited greatly from an industrial revolution.

Senescent cells are “old” cells that won’t die. We now have the technology to kill them, though it is not yet perfected to the point where your doctor can get rid of your senescent cells. What do senescent cells do? A lot of bad. A recent article shows that senescent cells drive the “fatty liver” disease.

QuickSelect versus binary heap for top-k queries

In a previous post, I considered the problem of finding the k smallest (or k largest) elements from a stream of values.

The naive approach is to collect all the values in an array, sort the array, and return the first k values. When an index is not available, that is how some relational databases reportedly solve SELECT/ORDER BY/LIMIT queries. It is also how many programmers will solve this problem.

Using a JavaScript benchmark, I showed that an approach based on a binary heap could be several times faster. That is a net win for computer science since an approach based on a binary heap has complexity O(n log k) where n is the number of elements. Moreover, the binary heap is attractive because it has an optimal storage requirement of O(k).

Lots of people objected that we could do even better using an algorithm called QuickSelect (e.g., Rasmus Pagh). Whereas QuickSelect has far higher memory requirements than a binary heap (O(n)) and a worst-case complexity of O(n^2), it has a superior average-case complexity of O(n). (Note: you can combine a binary heap and QuickSelect using introselect; I do not consider this possibility.)

Finally, Anno Langen sent me a code sample in JavaScript, so I no longer have any excuse not to test QuickSelect. I put together a JavaScript benchmark that you can run and review.

In this benchmark, we receive 1408 random integers and we must collect the smallest 128.

The approach using a binary heap can run about 37,000 times a second, whereas QuickSelect runs 45,000 times per second, or about 20% faster. They are both about an order of magnitude faster than the naive sort/slice approach.

For all practical purposes, 20% faster is negligible in this case. I have actually hit a sweet spot where QuickSelect and the binary heap are comparable.

What are other cases of interest?

  • If you only seek the top 5 elements out of an array, then the binary heap is likely to beat QuickSelect, irrespective of how many elements I have. The binary heap will fit in one or two cache lines, and the log k factor will be small.
  • If I keep k at a sizeable 128, but I increase substantially the size of the array, then QuickSelect will start to win big.
  • However, if I keep increasing the array size, the benefits of QuickSelect might start to fade. Indeed, QuickSelect will start to operate in RAM whereas the binary heap will remain in CPU cache. QuickSelect will become increasingly limited by potentially expensive cache misses.
  • QuickSelect still has the worst case quadratic-time scenario that could be triggered by really bad input data. The binary heap is more likely to provide consistent speed.

What else could we do?

  • Martin Cohen suggested a simple insertion sort on the basis that, for large streams, you are unlikely to encounter many candidates for the top-k position. This would probably work well for very long streams, but it could degenerate badly in some cases.
  • Michel Lemay referred to a fancier approach used by Google Guava: maintain a buffer of 2k elements, initially empty. Fill it up. Once it is full, use QuickSelect on it and discard half of it. Rinse and repeat. This seems like an appealing idea and Michel testifies that it provides very good practical performance.

So I am probably going to have to return to this problem.

Follow-up blog post: Top speed for top-k queries.

Science and Technology links (June 9th, 2017)

This week, Apple told us about two new interesting pieces of technology.

  • On the one hand, we have ARKit. It makes it easy for developers to build and deploy “augmented reality” applications to the iPhone in your pocket. What is augmented reality? In the visual space, it is a way to add “virtual” objects on top of what you’d normally see. So, for example, a furniture company could make it easy for you to “see” the new furniture in your living room as if you owned it. That’s nice except that looking at your mobile phone is an awkward way to do augmented reality. We’d prefer to have something like “smart glasses”. It is surely coming in the near future.
  • On the other hand, Apple released Core ML which is a way for app developers to make use of fancy machine learning techniques without too much effort. Suppose you want to build an app that can look at people’s skin to help diagnose a disease. You could train a model using your PC, and then drag-and-drop this model into your app, and then deploy it to all new iPhones.

    Suppose that we had smart watches that measure your pace, your sleep, your heart rate and, one might dream, your glucose level… combine this with Core ML and you might have an app that assesses your health and makes personalized recommendations.

It is interesting to look at the contrast between Apple and the rest of the industry. Google and Microsoft want us to use their services, Apple wants us to buy devices on which we can run apps. This leads to a different focus.

Speaking of Apple, its new iPad Pro uses a processor designed by Apple. It is faster than the Intel processor Apple uses in its cheapest laptops. Given that even the cheapest laptop from Apple is good enough for most people (its cheap laptops are roughly “mid-range” compared to other vendors), we can say that an iPad Pro could, in theory, adequately replace a laptop for most people, if we only cared about performance. To be fair to Intel, the Apple processor used in the iPad Pro has extra cores and it runs at a much higher clock frequency than the mobile Intel processor used in the laptops (though the laptop has more RAM). Still, we are at the point where top-of-the-line mobile phones and tablets are more powerful than mid-range laptops. It will be interesting to see how the trend goes in coming years.

Brent Smith and Greg Linden review two decades of recommender systems at Amazon.com. Do read it no matter what your scientific level is; it is worth your time. The mathematics can be safely skipped if you want.

India Launches 104 Satellites From a Single Rocket.

A professor was fascinated by the fact that cats that had their spinal cord severed at birth could still walk. After decades, he managed to get some positive results in human beings who had their spinal cord damaged. That is, paralyzed human beings can regain some control of their legs. As a researcher, he was the subject of mockery.

It turns out that if you introduce a computerized symptom monitoring system in cancer patients, they do better.

If you combine machine learning and radiology, you can make predictions about one’s health, including longevity.

Digital therapeutics is the idea that you can use software as medicine. I don’t think it is as crazy as it sounds. That you can cure a disease with a pill is not so intuitive, it is only because we are used to it that we take it for granted. I do think we may see software in the future being a strong component of medical therapies. It is much cheaper to deploy software than to distribute and monitor pills.

Dubai has robot cops.

IBM claims to have breakthrough technology regarding the design of smaller and cheaper silicon-based microprocessors.

Sugar is often thought to be bad for you, but a special type of sugar called trehalose seems to protect against cardiovascular diseases when injected in mice. You can find trehalose on Amazon. It is widely used in Japan, so it ought to be safe, at least in modest quantities.

Quickly returning the top-k elements: computer science vs. the real world

A standard problem in computer science is to quickly return the smallest or largest K elements of a list. If the list is made of N elements, then we can solve this problem in time N log (K) using a binary heap.

How would the average programmer solve this problem? My guess is that unless K is 1, then most programmers would simply sort the list and pick the first or last K elements.

That’s not such a bad deal. The difference between log (K) and log (N) might not be so large. The logarithm function grows slowly.

It could even be that a well optimized sort function could beat a poorly optimized binary heap.

To test it out in the real world, I wrote a benchmark using a JavaScript library. In this benchmark, I generate 1408 random integers and I seek to select the largest 128 integers.

You might ask: why JavaScript? Because JavaScript is everywhere and we should care about its performance.

The code with a priority queue backed by a binary heap looks like this… we eat the first 128 integers, and then we do a push/poll dance to always maintain the largest 128 integers. (The blocks parameter is set at 10 in my tests, generating (blocks + 1) * 128 values.)

var b = new FastPriorityQueue(defaultcomparator);
for (var i = 0 ; i < 128  ; i++) {
  b.add(rand(i));
}
for (i = 128 ; i < 128 * (blocks + 1) ; i++) {
  b.add(rand(i));
  b.poll();
}

The sorting routine is somewhat simpler. Just construct a big array, sort it and then extract the first few values:

var a = new Array();
for (var i = 0 ; i < 128 * (blocks + 1)  ; i++) {
   a.push(rand(i));
}
a.sort(function(a, b) {
  return b - a; // in reverse
});
return a.slice(0,128);

How does it fare?

The approach using a binary heap can run about 37,000 times a second, whereas the sorting approach is limited to about 4,000 times a second. So a factor-of-ten difference.

In this instance, computer science wins out: using a binary heap pays handsomely.

Another way programmers might implement a top-K query is with a SQL ORDER BY/LIMIT query. My friend, professor Antonio Badia, checked how PostgreSQL solves these queries and he believes that they result in a full sort unless there is a tree index on the relevant attribute. Can other databases, such as Oracle or SQL Server, do better than PostgreSQL? It is possible, but PostgreSQL is hardly a naive database engine.

Interestingly, it might be quite challenging for a database user to somehow implement a binary heap solution on top of a relational database. Database engines rarely give you direct access to the underlying data structures.

So all your fancy computer science knowledge might be of limited use if your data is managed by an engine you cannot hack.

Science and Technology links (June 2nd, 2017)

Methylene blue rejuvenates old skin. You can buy methylene blue on Amazon and spray it on your face. It is reportedly safe, even in high concentration. Of course, you are likely to turn blue.

A school in the UK is offering a degree in eSport (video games).

There are 20 CRISPR (gene editing) clinical trials going on right now.

According to Bloom et al., ideas and in particular the exponential growth they imply are getting harder and harder to find.

Microgravity spurs genetic mutation.

The US pulled out of the Paris Agreement on climate change. What does that mean in concrete terms?

(…) since Paris was an entirely voluntary deal, Trump’s climate policies weren’t constrained in any meaningful way by our participation in it. Conversely, his climate policies won’t be substantively different now that we’re pulling out of Paris.

Samsung plans to make processors with 4nm features in 2020. As far as I know, this is half as small as any current processor.

Solar power is now cheaper than coal power in India.

The cortex is the part of the brain of mammals that is responsible for most higher-level brain functions. We sometimes lose some of the neurons in the cortex which, if this damage accumulates, can lead to serious cognitive problems. French researchers have shown that we can replenish these neurons using stem cells.

Japan is facing a decade-long period of stagnation, and it is running out of workers. So it is reportedly moving to legalize drone deliveries by 2020.

Some company claims that the blood plasma from young people can rejuvenate older people. This seems highly speculative and I would not recommend you try it. I would bet that the results are wrong. I am sure we can rejuvenate old human beings, but probably not through plasma transfusions.

Unsigned vs. signed integer arithmetic

Given any non-negative integer x and positive integer d, we can write uniquely x = q d + r where q (the quotient) and r (the remainder) are non-negative and r is less than d. We write r = x mod d.

Most modern processors represent signed and unsigned integers in the following manner:

  • Unsigned integers are represented simply as 32-bit or 64-bit binary numbers. E.g., we write the number 15 as 0b0...01111. Programmers write bits from right to left, starting with the least significant bit: the right-most bit has value 1, the second right-most bit has value 2, and so forth.

    We have to worry about overflows: e.g., when adding two 32-bit integers whose sum won’t fit in 32 bits. What happens if there is an overflow? How your programming language handles it is one thing, but processors often make it easy to compute the operations “modulo the word size”. This means that x + y is computed as (x + y) mod 2^w where w is the word size in bits (typically 32 or 64).

    Dividing by two can be achieved by “right-shifting” the bits by 1. Multiplying by two is equivalent to “left-shifting” the bits by 1.

  • Signed integers are implemented at the processor level in a manner similar to unsigned integers, using something called Two’s complement. We can think of Two’s complement as a way of mapping signed values to unsigned (binary) values. Positive values (in [0, 2^(w-1))) are mapped to themselves (m(x) = x) whereas negative values (in [-2^(w-1), -1]) are mapped to the complement (m(x) = 2^w + x, that is, 2^w - |x|).

    You can distinguish negative integers from positive integers since their left-most bit is set to true. That’s convenient.

    Programmers prefer to say that we can negate a number by flipping all its bits (sometimes called “one’s complement”) and adding the value one. You can verify that if you take a number, flip all its bits and add the original number, you get a binary value with all the bits set to 1, which is 2^w - 1. So the result follows by inspection, but I prefer the formulation “the negation of x is 2^w - x (modulo 2^w)” as it makes the mathematics clearer.

A fun problem with Two’s complement is that you have more negative numbers than you have positive (non-zero) numbers. This means that taking the absolute value of a signed integer is a bit tricky and may lead to an overflow. Thus we cannot do something like a/b = sign(b) * a / abs(b) or a * b = sign(a) * sign(b) * abs(a) * abs(b) since there is no safe absolute-value function.
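
As a minimal illustration (assuming 32-bit integers), the asymmetry looks like this:

#include <cstdint>
#include <cstdio>

int main() {
  // int32_t covers [-2147483648, 2147483647]: one more negative value
  // than there are positive values.
  printf("%d %d\n", INT32_MIN, INT32_MAX);
  // Consequently there is no int32_t equal to -INT32_MIN, so calling
  // abs() on INT32_MIN overflows (undefined behavior in C and C++).
  return 0;
}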

The nice thing with Two’s complement is that because the processor implements unsigned integer arithmetic modulo a power of two (e.g., (x + y) mod 2^w), you “almost” get the signed arithmetic for free. Indeed, suppose that I want to add x and a negative value, say -y with y positive. The map transforms -y into the unsigned value 2^w - y, and I end up with x + 2^w - y mod 2^w, which is just x - y mod 2^w, as one would expect.
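
Here is a minimal C++ sketch of that claim (assuming 32-bit integers): we subtract 3 from 100 by adding the unsigned image of -3.

#include <cstdint>
#include <cstdio>

int main() {
  uint32_t x = 100;
  int32_t minus_y = -3;
  // Converting -3 to uint32_t applies the map: it becomes 2^32 - 3.
  uint32_t mapped = (uint32_t)minus_y;   // 4294967293
  uint32_t sum = x + mapped;             // computed modulo 2^32
  printf("%u\n", sum);                   // prints 97, i.e., 100 - 3
  return 0;
}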

So we get that multiplications (including left shifts), additions, and subtractions are nearly identical operations whether you have signed or unsigned integers.

This means that a language like Java can avoid unsigned integers for the most part. Signed and unsigned integers more or less work the same.

When you program in assembly, this is not quite true. For example, on x86 processors, the MUL (unsigned multiplication) and IMUL (signed multiplication) instructions are quite different. However, as long as you only care for the multiplication modulo the word size, you can probably use them interchangeably: a critical difference is that the unsigned multiplication actually computes the full result of the multiplication, using another word-sized register to store the more significant bits. Other processors (e.g., ARM) work slightly differently, of course, but the need to distinguish between signed and unsigned integers still arises from time to time.

The division, remainder and right shifts are something else entirely.

Let us divide by two. The value -2 is mapped to 0b111...10. With unsigned arithmetic, we would simply shift all bits right by one, to get 0b0111...1, but that’s no longer a negative value in Two’s complement notation! So when we shift right signed integers, we typically replicate the left-most bit, so that we go from 0b111...10 to 0b111...11.

This works well as long as we only have negative even numbers, but let us consider a negative odd number like -1. Then doing the signed right shift trick, we go from 0b111...11 to 0b111...11, that is (-1>>1) is -1. It is not entirely unreasonable to define -1/2 as -1. That’s called “rounding toward -infinity”. However, in practice people seem to prefer to round toward zero so that -1/2 is 0.

The net result is that whereas division by two of unsigned integers is implemented as a single shift, the division by two of signed integers ends up taking up a handful of instructions.
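
The following C++ sketch shows the difference concretely. Note that right-shifting a negative value is implementation-defined in C and C++ (prior to C++20), although mainstream compilers perform an arithmetic shift:

#include <cstdint>
#include <cstdio>

int main() {
  uint32_t u = 0xFFFFFFFEu;
  printf("%u\n", u >> 1);   // unsigned: one logical shift, prints 2147483647

  int32_t m = -1;
  printf("%d\n", m >> 1);   // arithmetic shift on common compilers: prints -1
  printf("%d\n", m / 2);    // the language rounds toward zero: prints 0
  return 0;
}

This is why compilers emit extra instructions (typically adding the sign bit before an arithmetic shift) when dividing a signed integer by two.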

Science and Technology links (May 26th, 2017)

Linux, the operating system driving cloud computing and the web, was developed using an open source model. For a time, Linux was seen as a direct competitor to Microsoft, but things have changed and Microsoft is happy to see Linux as just a piece of technology. Because of how large and complicated the software got, the Linux project manager, Linus Torvalds, ended up writing his own source control tool, Git. It quickly became a standard. Today, Windows is built on Git. This shows the power of open source. “Open source” is a concept just as powerful and important for our civilization as “the scientific method”. Though both science and open source can wipe out business models, they are also engines of innovation making us all richer. Microsoft does not use Linux and Git because it gave up on having viable competitors, but rather because it understands that fighting open source is about as useful as fighting the scientific method. (As an aside, modern implementations of Git are accelerated with compressed bitmaps called EWAH, something I personally worked on.)

Organic food is better for you, right? Galgano et al. (2016) find no evidence of such benefits:

The organic food market is growing in response to an ever increasing demand for organic products. They are often considered more nutritious, healthier, and free from pesticides than conventional foods. However, the results of scientific studies do not show that organic products are more nutritious and safer than conventional foods.

Ok but organic food is better for the environment, right? Maybe not because organic farming requires more land and more animals:

Furthermore, higher on-farm acidification potential and global warming potential per kilogram organic milk implies that higher ammonia, methane, and nitrous oxide emissions occur on farm per kilogram organic milk than for conventional milk. Total acidification potential and global warming potential per kilogram milk did not differ between the selected conventional and organic farms.

A company called Warby Parker has a mobile app you can use to sidestep entirely optometrists and opticians in some cases. The main point seems to be that a lot of what opticians do can be computerized easily and that it is not hard to check your prescription.

Omega-3 fats rejuvenate the muscles of old people:

Omega-3 polyunsaturated fatty acids reduce mitochondrial oxidant emissions, increase postabsorptive muscle protein synthesis, and enhance anabolic responses to exercise in older adults.

You have heard that teenage acne was unrelated to diet, right? Not so fast:

We found a positive association between intake of skim milk and acne. This finding suggests that skim milk contains hormonal constituents, or factors that influence endogenous hormones, in sufficient quantities to have biological effects in consumers.

The epidemic incidence of adolescent acne in Western milk-consuming societies can be explained by the increased insulin- and IGF-1-stimulation of sebaceous glands mediated by milk consumption.

We found a positive association with acne for intake of total milk and skim milk. We hypothesize that the association with milk may be because of the presence of hormones and bioactive molecules in milk.

Keytruda is the first cancer drug that targets a genetic dysfunction rather than specific cancer type. This type of drug might open the door to cheap and effective genetic cancer therapies. Note that Keytruda is actually approved for use in the US, so this is not purely speculative.

Volvo makes self-driving garbage-collecting trucks. A report from Goldman Sachs suggests that self-driving cars could destroy 25,000 jobs per month in the US. That sounds like a lot, but I don’t think it is anything dramatic, if true. See also the Sector Disruption Report (May 2017) by James Arbib and Tony Seba, which sees the effect of self-driving cars as enormous:

This will keep an additional $1 trillion per year in Americans’ pockets by 2030, potentially generating the largest infusion of consumer spending in history

Arthritis is a terrible disease where some people end up living in near constant pain. Yet it could be prevented with good diet and exercise, according to an article in Nature.

Jeff Bezos, Amazon’s CEO, wants us to build permanent settlements on the Moon. This is the same man who wants to deliver us goods using drones, and help cure aging through the clearance of senescent cells.

Scientists have linked 52 genes to intelligence. Let me caution you: no we cannot build super smart babies by manipulating genes. Not yet at least.

Counting exactly the number of distinct elements: sorted arrays vs. hash sets?

Suppose that you have ever larger sets of 64-bit integers, and you want to quickly find out how many distinct integers there are. So given {10, 12, 10, 16}, you want an algorithm to output 3, as there are three distinct integers in the set. I choose 64-bit integers, but strings would do fine as well.

There are sensible algorithms to estimate this number, but you want an exact count.

Though there are many good ways to solve this problem, most programmers would first attempt to use one of these two techniques:

  • Create a hash set. Throw all the values in the hash set (implemented with a hash table). Then check how many values are found in the hash set in the end. In C++, you might implement it as such:
    size_t distinct_count_hash(const uint64_t * values, size_t howmany) {
      std::unordered_set<uint64_t> hash(values, values + howmany);
      return hash.size();
    }
    
  • Put all the values in an array, sort the array then run through it, deduplicating the values. In C++, you might implement it as follows:
    size_t distinct_count_sort(const uint64_t * values, size_t howmany) {
      std::vector<uint64_t> array(values, values + howmany);
      std::sort(array.begin(), array.end());
      return std::unique(array.begin(), array.end()) - array.begin();
    }
    

Which is best? Sorting has complexity O(n log n) whereas insertion in a hash set has expected constant time O(1). That would seem to predict that the hash set approach would always be best.

However, there are many hidden assumptions behind textbook naive big-O analysis, as is typical. So we should be careful.

Simple engineering considerations do ensure that as long as the number of distinct elements is small (say no larger than some fixed constant), then the hash set approach has to be best. Indeed, sorting and copying a large array with lots of repeated elements is clearly wasteful. There is no need for fancy mathematics to understand that scenario.

But that’s not the difficult problem that will give you engineering nightmares. The nasty problem is the one where the number of distinct elements can grow large. In that case, both the array and the hash set can become large.

Which is best in that difficult case? I wrote a small C++ benchmark which you can run yourself.

N            hash set (cycles/value)    array sort (cycles/value)
1,000        220                        161
10,000       260                        163
100,000      340                        200
1,000,000    820                        245
10,000,000   1,100                      282

So when there are many distinct values to be counted, sorting an array is an efficient approach whereas the hash table should be avoided.

How can we understand this problem? One issue is that as the hash table becomes large, it comes to reside in RAM (as it no longer fits in CPU cache). Because of how hash sets work, each operation risks incurring an expensive cache miss. A single retrieval from RAM can take dozens of CPU cycles. Meanwhile, sorting and scanning an array can be done while avoiding most cache misses. It may involve many more operations, but avoiding cache misses can be worth it.

What if I kept cranking up the data size (N)? Would the hash set ever catch up? It might not.

The problem is the underlying assumption that you can access all memory using a constant time. That’s not even close to true.

Science and Technology links (May 18th, 2017)

Google has announced at its annual conference (I/O 2017) that it has computing pods capable of 11.5 petaflops. They are made of 64 customized TPUs (processors specialized for deep learning/AI), each generating 180 teraflops. It is going to be available to other companies via Google cloud. Google has also announced Google.ai, a new website where Google presents its work on AI. It seems likely that Google wants to sell AI as a service. Interestingly, 11.5 petaflops is Ray Kurzweil’s estimate regarding the computing power needed to simulate the human brain. There are supercomputers exceeding 10 petaflops right now, like Sunway TaihuLight, but it takes a whole room to host them whereas Google’s computing pods look to be the size of a server rack. And, of course, you and I cannot have access to China’s Sunway TaihuLight whereas, for the right price, Google gives us access to its computing pods. So is Google capable of emulating the human brain yet? Some less optimistic people have estimated that the computing power of the human brain is around 1 exaflop. So take 100 computing pods at 11.5 petaflops each and, according to some, you have the computing power of the human brain. Of course, we do not really know. Maybe 1 pod is enough to match a brain, or maybe we would need thousands. However, it looks like Google is within striking distance of matching the human brain in raw computing power with a single rack of computers. I should add that not all petaflops are comparable: Google has designed specialized hardware. Its computing pods may not be great at simulating the weather, for example.

What if you are not Google and want to build your own computing pods? Nvidia just announced its Volta GPU. It delivers 120 teraflops. That’s less than one of Google’s TPUs (and far less than a pod), but Nvidia GPUs are available to the public. With 85 Nvidia Volta GPUs, you are hitting 10 petaflops and you too, according to some, are within striking distance of the computing capabilities of the brain. I don’t know how much Nvidia will charge for its GPUs, but a reasonable estimate might be $1000 each (the actual price might be less). So for roughly $100,000, you can build your own computing pod.

How does that compare to the Intel chips that we all rely upon? Many Intel chips deliver about 100 gigaflops, or 0.1 teraflops. So to get to 10 petaflops, you’d need 100,000 chips. Not very practical.

I should qualify these numbers once more: counting flops is just a way to get a rough estimate of what the hardware makes possible. It is quite possible to have 100 theoretical petaflops but be unable to make good use of them.

By the way, next time you are in a bar, here is a great pick-up line: “You must have a lot of petaflops”. You read it here first.

Google rolled out its smart reply feature. Way ahead of you, Google! My students have been getting only one of three answers for the last few years: “That’s great”, “Can you rephrase that?”, “Don’t worry about it”.

Scientists have figured out how to create stem cells that can regenerate blood cells:

When the researchers injected these stem cells into mice that had been treated with radiation to kill most of their blood and immune cells, the animals recovered. The stem cells regenerated the blood, including immune cells, and the mice went on to live a full life—more than 1.5 years in the lab.

Half of all jobs will be replaced by Artificial Intelligence in 10 years, according to Kai-Fu Lee. He is a smart and important man. He could still be very wrong. He makes some great points, such as the fact that China rose tremendously in artificial intelligence, going from almost nothing to levels comparable to the US in a few years.

Facebook’s CEO wants to cure all diseases by the end of the century. To this end, the Chan Zuckerberg Initiative (CZI) is giving grant money to researchers. There is a condition though:

The CZI (…) ask that all code developed in support of CZI-funded studies be published on public code repositories such as GitHub.

I have mixed feelings about this. Forcing people to publish their code does not necessarily have the expected effects. I have had colleagues who trumpeted that their research software was “open source”. And, sure enough, you could download the software online. As for building upon it or using it? Well. Good luck with that. You can’t mandate culture, you can only hack it.

Researchers describe in a Nature article how they were able to restore ovarian function in sterilized mice using 3D printing.

Tony Seba, an economist at Stanford University, predicts that fossil-fuel vehicles will disappear in 8 years, according to reports. They will be replaced by electrical self-driving vehicles.

Determining the truth is hard:

I show that fact-checkers rarely fact-check the same statement, and when they do, there is little agreement in their ratings.

Reminds me of science.

Macular degeneration is a terrible disease where older people become progressively blind. It looks like it has to do with eating too much sugar:

Changing the diet to a low-glycemic-index diet, even late in life, arrested the development of AMD, (…)

As an aside, it also looks like Type 2 diabetes and obesity are largely preventable through diet and exercise. Sadly, this is not enough to make these problems easy ones.

Scientists have bioengineered a synthetic pancreas that was transplanted into a patient. It cured the patient’s diabetes. Could it be that we are about to cure diabetes for good?

According to CNBC, Apple has engineers working on better ways to monitor blood sugar. Even though I am not diabetic (to my knowledge), I would love an Apple watch that monitors my blood glucose.

The initiative is far enough along that Apple has been conducting feasibility trials at clinical sites across the Bay Area and has hired consultants to help it figure out the regulatory pathways, the people said. One person said about 30 people were working in this group as of a year ago. But speculation has been flying around since the company snapped up about a dozen biomedical experts from companies like Vital Connect, Masimo, Sano, Medtronic and C8 Medisensors. Some of these people joined the secretive team dedicated to glucose, sources said, while others are on Apple Watch team. One of the people said that Apple is developing optical sensors, which involves shining a light through the skin to measure indications of glucose.

The world faced a major cyberattack in which attackers locked up computers and demanded payment to release them. The virus was called WannaCry and affected only the Windows operating system. English health organizations were hit hard. The attack was reportedly defused by a 22-year-old dropout who found a hidden kill switch in the ransomware.

I love the power of simple mathematics! On this note, Maurer provides us with a nice analysis of population growth. Currently, the world can be roughly divided in two: countries with low fertility but high longevity (e.g., Japan) and countries with high fertility and short lives (e.g., many countries in Africa). Which are more likely to exhibit overpopulation in the future? As an empirical observation, we should point out that many countries with high longevity, like Japan and Germany, are actually undergoing “de-population” in the sense that their population is falling. But what does the math tell us?

Assume an initial population of 1000 people. The fertility rate is 2, and the life expectancy is 80. Women give birth at 20. Now, let us consider two variations:

Case A: Death disappears. Nobody dies anymore!

Case B: The fertility rate slightly increases from 2 to 2.5.

Which of these two cases will lead to the greater population increase?

After 1,000 years, the population will be 51,000 in case A and at least 206,000,000 in case B: more than 4,000 times case A! The gap will be enormous.
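Here is a small simulation that reproduces these figures under one simple cohort model: 20-year generations, half of each cohort being women, and (in case B) everyone dying at 80. Maurer’s exact assumptions may differ in the details.

public class PopulationGrowth {
    // One period is 20 years. Each cohort of size n produces n * fertility / 2
    // children one period after its birth (half of each cohort are women).
    static double populationAfter(int periods, double fertility, int periodsLived) {
        double[] births = new double[periods + 1];
        births[0] = 1000; // the initial population of 1000 people
        for (int t = 1; t <= periods; t++) {
            births[t] = births[t - 1] * fertility / 2;
        }
        double population = 0;
        for (int t = 0; t <= periods; t++) {
            // a cohort born at time t is still alive if fewer than
            // periodsLived periods have elapsed since its birth
            if (periods - t < periodsLived) {
                population += births[t];
            }
        }
        return population;
    }

    public static void main(String[] args) {
        int periods = 50; // 1000 years divided into 20-year generations
        // Case A: fertility 2, nobody dies (unbounded lifespan).
        System.out.printf("Case A: %.0f%n", populationAfter(periods, 2.0, Integer.MAX_VALUE));
        // Case B: fertility 2.5, death at 80 (each cohort lives 4 periods).
        System.out.printf("Case B: %.0f%n", populationAfter(periods, 2.5, 4));
    }
}

This prints roughly 51,000 for case A and about 207 million for case B: the exponential growth of births in case B dwarfs the merely linear accumulation of immortals in case A.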

The conclusion is clear: if you are worried at all about overpopulation, you must be concerned about fertility. And we know, empirically, that fertility falls when women have well-paid jobs, education, contraceptives, and freedom. The solution is clear. We must opt for prevention of the diseases of old age (to diminish the burden of care, mostly affecting women) and ensure that women are well educated, free and well remunerated (so that they have low fertility). This simple strategy alone is very likely to prevent overpopulation and it has the side benefits of making people (starting with women) better off.

Educational backgrounds of the CEOs of the top corporations in the US

  • Apple is the most valuable company in the US. The CEO is Tim Cook who has a bachelor of science in industrial engineering from Auburn University. The chairman is Arthur D. Levinson who has a degree in molecular biology from the University of Washington.
  • Alphabet (Google’s parent company) is the second most valuable company. The CEO is Larry Page who has a bachelor of science in computer engineering from the University of Michigan. The CEO of Google itself is Sundar Pichai who has a degree in metallurgical engineering from Indian Institute of Technology Kharagpur. The chairman is Eric Schmidt who has a degree in electrical engineering from Princeton. Eric Schmidt wrote software that many of us rely upon to this day.
  • Microsoft’s CEO is Satya Nadella. He has a bachelor’s degree in electrical engineering from the Manipal Institute of Technology.
  • Amazon’s CEO is Jeff Bezos, who has bachelor of science degrees in electrical engineering and computer science from Princeton.
  • ExxonMobil’s CEO is Darren Woods, who has a bachelor’s degree in electrical engineering from Texas A&M University.
  • Johnson & Johnson’s CEO is Alex Gorsky who has a bachelor of science degree from the U.S. Military Academy at West Point.
  • Facebook’s CEO is Mark Zuckerberg, a college drop-out who was studying computer science at Harvard.

Of all degrees granted in 2014-2015 in the US, about 7% were in engineering or computer science, about as many as were granted in psychology. By far the most popular field of study is business (20% of all degrees).

Has the Internet killed real estate agents yet?

Back in 2002, when I was first interested in buying a house, I went on the Internet and found lots of houses for sale, directly from the sellers. I bought the house I own right now directly from the seller. At the time, I was convinced that the days of real estate agents were numbered. I remember telling a friend who wanted to go into real estate that the Internet would soon kill this industry.

It made sense. Idiots like myself could buy houses from other idiots, that is, people without any training in real estate, with no intermediary other than the Internet. How long could the real estate agents last?

Real estate agents don’t inspect houses, they do not have power of attorney, they do not provide deeds, they do not provide the financing, they do not provide the insurance. Real estate agents may take the pictures and post them on the Internet, but iPhones take decent pictures.

A home inspection (not covered by the agent’s fees) might cost you $300. A lawyer will charge you a flat fee to represent you in the transaction (maybe $1000, not covered by the agent’s fees). The bulk of the transaction costs are taken up by the real estate agent.

Yet real estate agents are still with us, charging 5% in commission. That’s a sweet deal: sell a single home and you can charge what many people make in half a year.

It hurts my ego to admit that I was badly wrong: the Internet has not affected real estate agents in the least.

You’d think people would be eager to keep the commission fee for themselves (it is tens of thousands of dollars!). The Washington Post tells us that nothing of the sort is happening:

And over the past decade, the Internet has disrupted almost every aspect of a transaction that sits at the core of the American Dream. Everyone now has free access to information that used to be impossible to find or required an agent’s help. But as a new home-buying season kicks off, one thing remains mostly unchanged: the traditional 5-to-6-percent commission paid to real estate agents when a home sells. While the Internet has pummeled the middlemen in many industries (decimating travel agents, stomping stock-trading fees, cracking open the heavily regulated taxi industry), the average commission paid to real estate agents has gone up slightly since 2005, according to Real Trends. In 2016, it stood at 5.12 percent. “There’s not a shred of evidence that the Internet is having an impact,” Murray said, sounding like he almost can’t believe it himself.

The article argues that the sale of a home is a complicated transaction. Oh! Come on! That’s a pathetic explanation: planning a trip abroad is complicated and, yet, we have no qualms about doing away with travel agents and using the Internet instead. Of course, the transaction cost is higher, which makes it worthwhile to pay someone to help. But 5% of the transaction is a lot. In Canada, that’s about $25,000 to sell a single house (5% of $500,000): the price of a brand new car. And selling and buying houses is really not that complicated. It is not $25,000-complicated.

Recall that real estate agents do not provide home inspection, insurance, financing, legal titles… all of these things are separate expenses provided by separate people.

What expenses real estate agents have, and how much of the 5% they pocket, is irrelevant. The fact is that this 5% has remained the same for decades. This means that, in real dollars, real estate agents cost at least as much today as they did decades ago.

To put it another way, the productivity of real estate agents has, if anything, decreased in recent decades despite all the technological progress. In comparison, across all industries, productivity grows by about 1% a year. That’s why Americans, on average, are much richer than they were decades ago.

On average, workers are at least 20% more productive today than they were 20 years ago. But not real estate agents.

Another way to describe a stagnation or decline in productivity is to say that real estate agents, despite all their new tools, are not getting any better over time, and are probably getting slightly worse since their cost is rising.

They have cheap mobile phones, the Internet, databases, fancy software… all of that has not, in the least, made them more productive.

How well do the real estate agents serve the interest of their clients? Maybe not so well:

Those selling without an estate agent were more satisfied and the gap between sales price and asking price was smaller than for those selling through a real estate broker. (Stamsø, 2015)

Our central finding is that, when listings are not tied to brokerage services, a seller’s use of a broker reduces the selling price of the typical home by 5.9% to 7.7%, which indicates that agency costs exceed the advantages of brokers’ knowledge and expertise by a wide margin. (Bernheim and Meer, 2012)

Many real estate agents recommend that sellers lower their asking price (thus making the agents’ job much easier) in the belief that buyers will bid up the house. Yet this is a terrible strategy for their clients:

While the (…) recommendations of real estate agents (…) favor underpricing, alluding to a potential herding effect, our market data do not provide any support for this strategy. (Bucchianeri and Minson, 2013)

Can you do better with a cheap, flat-fee broker? It seems you can:

Brokers with a flat-fee structure who charge an up-front fee (which is substantially lower than the average fee of traditional brokers) and leave the viewings to the seller sell faster and at – on average – 2.7 percent higher prices. (Gautier et al. 2017)

So knowing all this… why hasn’t the Internet at least forced the real estate agents to lower their commission fees? If Uber was able to break the cab driver’s back, why can’t we come up with the equivalent for real estate?

I have nothing against real estate agents, I am just curious. And please, don’t tell me it is the “human element”. People don’t go around hugging their real estate agents, not any more than they hugged their travel agents.

Update: A comment by Panos Ipeirotis suggests that travel booking sites also charge a large percentage (15%-20%) on hotel reservations while AirBnB charges 6% to 12%. This would mean that real estate agents might not be such outliers. I went looking for signs that travel agents had disappeared and it seems that there are still many of them, though their work has been transformed over time. This makes me question the belief that “the Internet has pummeled the middlemen in many industries”, as stated in the Washington Post.

My review of Change Agent: A Novel (by Daniel Suarez)

Change Agent is a sci-fi novel still hot off the presses. It is set in our near future (2049).

The genre has been captured by writers who love dystopian futures. Suarez can’t quite distance himself from this trend. We are in for massive climate change and millions of climate refugees. We have gene therapies that are owned and operated by organized crime, complete with children being experimented on and discarded. It is not a pleasant world.

However, there was enough in this novel to keep me interested. Some interesting bits:

  • The novel is clearly inspired by George Church’s work. In particular, Church’s book Regenesis is cited. The general thesis is that genetic engineering will soon supplant computers and electronics.
  • The novel pretty much embraces everything from Church’s work, save for, apparently, the possibility that we might stop the aging process. That makes for some intriguing effects. For example, in the novel, you can change someone’s genes dynamically and this results in a new appearance. However, even after changing all of your genes, and having your organs reconfigured, you somehow remain the same age.
  • The novel adopts the idea that once we can manipulate genes, parents will be eager to tweak the genes of their embryos, going as far as dealing with organized crime to get the desired results. This seems very unlikely. Parents do not tend to be eager to take risks with their children.
  • It seems that intelligence can be greatly improved using genetic updates. This seems unlikely, at least by 2049.
  • The center of the world is no longer Silicon Valley but rather Singapore. American engineers were prevented from participating in the bio-engineering revolution, but the Singapore government had no qualms about embracing the new technology.
  • Self-driving cars are ubiquitous and you can easily rent one. However, it is easy for troublemakers to stand in front of your rented self-driving car to prevent you from moving.
  • Augmented reality is ubiquitous. Some interesting applications are discussed, such as the possibility that you might browse a pharmacy without having to read the fine print.
  • Artificial intelligence is everywhere.

I think it is a fairly realistic depiction of a possible near-term future.

Science and Technology links (May 12th, 2017)

The Apple watch can be used to diagnose heart disease automatically. This is not marketing talk, but hard research. And, of course, there is no reason for this kind of work to be limited to Apple products. In the near future, many of us, beyond a certain age, will wear devices monitoring our health. It only makes sense.

Nvidia released new processors based on the Volta architecture. The new V100 card is made of 21 billion transistors and has hundreds of cores dedicated to deep learning (called tensor cores). It seems that Nvidia has no problem bumping up the performance of its chips year after year.

Could the industrial revolution have arisen in France? Howes points out that French scientists were said to be rather more concerned with abstract theorising than with applying their knowledge.

Apple is the first company to be worth $800 billion. Its main product is the iPhone, a product that did not exist 10 years ago.

Amazon is selling a $230 device (Echo Show) that you can talk to, use as a touchscreen and use to make video calls. Yes. $230.

Published in Nature: A chronic low dose of tetrahydrocannabinol (THC) restores cognitive function in old mice. In other words, cannabis could rejuvenate your brain. Now that cannabis is being decriminalized in North America, we might get to learn a lot more about its medical uses.

Kinsey reflects on the fact that current machine learning techniques require large budgets:

 many of the papers deemed most significant relied on massive compute resource that is usually unavailable to academics.

I would add that access to the data is also limited outside large corporations like Google. The fact that progress depends on large corporations is hardly new of course. It is not like academics can design new computer chips to compete with Intel.

North American Robotics Market Surges 32 Percent.

According to Nature: A fully organic retinal prosthesis restores vision in a rat model of degenerative blindness.

Again in Nature: Generation of inner ear organoids containing functional hair cells from human pluripotent stem cells.

Some researchers claimed to have figured out what causes our hair to turn gray with age: “our findings reveal the identities of hair matrix progenitors that regulate hair growth and pigmentation”. The interesting part of the story is that we do not yet know what causes hair to become gray, though it is assuredly the result of uncontrolled oxidation. Whether these researchers have cracked the problem is another story.

Signed integer division by a power of two can be expensive!

Remember when you learned the long division algorithm in school? It was painful, right?

It turns out that even on modern processors, divisions are expensive. So optimizing compilers try to avoid them whenever possible. An easy case is the division by a power of two. If a compiler sees x / 2 when x is an unsigned integer then it knows that it can simply “shift” the bits of the variable x by 1 because data is represented in binary. A shift is a relatively inexpensive operation: it completes in a single CPU cycle on most processors…

Some programmers cannot resist and will write x >> 1 instead of x / 2. That’s mostly wasted effort, however: optimizing compilers can be counted on to be smart enough to figure it out.

What if x is a signed integer? Sadly, a simple shift is no longer sufficient: integer division truncates toward zero whereas an arithmetic right shift rounds toward negative infinity (-3 / 2 is -1, but -3 >> 1 is -2), so the compiler must emit extra instructions to correct the result for negative inputs. Most compilers seem to generate about 4 to 5 instructions. Some versions of the Intel compiler generate three separate shifts.
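Here is a small Java sketch of the kind of correction the compiler applies; the name halfLikeACompiler is mine, and the exact instruction sequence varies by compiler and processor.

public class SignedHalf {
    // Roughly what compilers emit for a signed x / 2: add the sign bit
    // (1 for negative x, 0 otherwise), then do an arithmetic shift.
    static int halfLikeACompiler(int x) {
        return (x + (x >>> 31)) >> 1;
    }

    public static void main(String[] args) {
        for (int x : new int[] { 7, -7, -3, Integer.MIN_VALUE }) {
            System.out.println(x + ": x / 2 = " + (x / 2)
                + ", x >> 1 = " + (x >> 1)
                + ", compiler-style = " + halfLikeACompiler(x));
        }
    }
}

When x is unsigned, or when the compiler can prove that x is non-negative, the correction disappears and a single shift suffices.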

This means that, some of the time, if we use x >> 1 (or x >>> 1 in Java) instead of x / 2, we might get different performance even if the actual value stored in x is positive.

We might think that it is no big deal. It is still going to be super fast, right?

Let us consider a generally useful algorithm: the binary search. It finds the location of a value in a sorted array. At the core of the algorithm is the need to divide a length by two repeatedly. Looking at Java’s source code, we find how the Java engineers implemented the binary search in the standard library:

public static int BinarySearch(int[] array, int ikey) {
    int low = 0;
    int high = array.length - 1;
    while (low <= high) {
        final int middleIndex = (low + high) >>> 1;
        final int middleValue = array[middleIndex];

        if (middleValue < ikey)
            low = middleIndex + 1;
        else if (middleValue > ikey)
            high = middleIndex - 1;
        else
            return middleIndex;
    }
    return -(low + 1);
}

Notice the shift? Let us rewrite the function with an integer division instead:

public static int BinarySearch(int[] array, int ikey) {
    int low = 0;
    int high = array.length - 1;
    while (low <= high) {
        final int middleIndex = (low + high) / 2;
        final int middleValue = array[middleIndex];

        if (middleValue < ikey)
            low = middleIndex + 1;
        else if (middleValue > ikey)
            high = middleIndex - 1;
        else
            return middleIndex;
    }
    return -(low + 1);
}

It is nearly the same function. You may expect that it will have nearly the same performance.

Maybe surprisingly, that’s not true at all.

I wrote a little Java benchmark. It shows that the version with integer division runs at 2/3 the speed of the version with a shift. That’s a severe performance penalty.

The result is not specific to Java, it also holds in C.

The lesson?

When working with signed integers, do not assume that the compiler will turn a division by a power of two into code that runs nearly as efficiently as a single shift.

Credit: The observation is based on work by Owen Kaser.