Does software performance still matter?

This morning, a reader asked me about the real-world relevance of software performance:

I’m quite interested in your work on improving algorithm performance using techniques related to computer architecture. However, I think that this may only be of interest to academia. Do you think that there are jobs opportunities related with this profile, which is very specialized?

To paraphrase this reader, computers and software are fast enough. We may need people to implement new ideas, but performance is not important. And more critically, if you want to be gainfully employed, you do not need to worry about software performance.

To assess this question, we should first have a common understanding of what software performance is. Software performance is not just about how quickly you can crunch numbers. It is about how you manage memory, disks, networks, cores… it is also about architecture. It is not about rewriting your code in machine code: you can write fast applications in JavaScript and slow ones in C++. Software performance is related to algorithmic design, but distinct in one important way: you need to take your architecture into account. Many algorithms that look good on paper do really poorly in practice. And algorithms that appear naive and limited can sometimes be the best possible choice for performance. In some sense, being able to manage software performance requires a good understanding of how computer hardware, operating systems, and runtime libraries work.
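To make the point concrete, here is a small sketch of mine (not from the original discussion): two functions that do the same arithmetic on the same data, but one walks memory sequentially while the other jumps around with a large stride. Both are O(n²) and look equivalent on paper, yet on most machines the sequential version runs noticeably faster because it cooperates with CPU caches.

```javascript
// Illustrative sketch: cache-friendly versus cache-unfriendly traversal
// of the same flat n-by-n matrix. The functions compute the same sum.
const n = 1000;
const matrix = new Float64Array(n * n).fill(1);

function sumRowMajor() {
  // visits matrix[0], matrix[1], matrix[2], ... (sequential access)
  let sum = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) sum += matrix[i * n + j];
  }
  return sum;
}

function sumColumnMajor() {
  // visits matrix[0], matrix[n], matrix[2n], ... (strided access)
  let sum = 0;
  for (let j = 0; j < n; j++) {
    for (let i = 0; i < n; i++) sum += matrix[i * n + j];
  }
  return sum;
}

console.time("sequential");
sumRowMajor();
console.timeEnd("sequential");
console.time("strided");
sumColumnMajor();
console.timeEnd("strided");
```

Timings vary by machine, but the gap between the two loops is the kind of effect that never shows up in a big-O analysis.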

So, should we care about software performance in 2017?

Here is my short answer. There are two basic ways in which we can assess you as a programmer. Is your software correct? Is your software efficient? There are certainly other ways a programmer can bring value: some exploit their knowledge of some business domains, others will design marvelous user interfaces. However, when it comes down to hardcore programming, being correct and being efficient are the two main attributes.

Consider the great programmers. They are all good at producing software that is both correct and efficient. In fact, it is basically the definition of a great programmer. Programming is only challenging when you must be both correct and efficient. If you are allowed to sacrifice one or the other, you can trivialize most tasks.

In job ads, you probably won’t see many requests for programmers who write efficient code, nor are you going to see many requests for programmers who write correct code. But then, you do not see many ads for doctors who cure people, nor do you see many ads for lawyers who avoid expensive lawsuits. Producing efficient code that takes into account the system’s architecture is generally part of your job as a programmer.

Some of my thoughts in detail:

  • Software performance only matters for a very small fraction of all source code. But what matters is the absolute value of this code, not its relative size.

    Software performance is likely to be irrelevant if you have few users and little data. The more important the software is, the more important its performance can become.

    Given that over 90% of all software we write is rarely if ever used for real work, it is a safe bet to say that software performance is often irrelevant, but that’s only because, in these cases, the software brings little value.

    Let us make the statement precise: Most performance or memory optimizations are useless.

    That’s not a myth, it is actually true.

    The bulk of the software that gets written is not performance sensitive or worth the effort. Pareto’s law would tell you that 20% of the code accounts for 80% of the running time, but I think it is much worse than this. I think that 1% of the code accounts for 99% of the running time… The truth is maybe even more extreme.

    So a tiny fraction of all code will ever matter for performance, and only a small fraction of it brings business value.

    But what matters, if you are an employee, is how much value your optimized code brings to the business, not what fraction of the code you touch.

  • We can quantify the value of software performance and it is quite high.

    Suppose I go on Apple’s website and shop for a new MacBook Pro. The basic model costs $1,800. If I want a processor with a 10% faster clock speed, it is going to cost me $2,100, or about 17% more. An extra 10% in clock speed does not make the machine nearly 10% faster. Let us say that it is maybe 5% faster. So, to get a computer that runs 5% faster (if that), some people are willing to pay about 17% more. I could do the same analysis with smartphones.

    If constant factors related to performance did not matter, then a computer running at twice the speed would be worth the same. In practice, a computer running at twice the speed is worth multiple times the money.

    With cloud computing, companies are now often billed for the resources (memory, compute time) that they use. Conveniently, this allows them to measure (in dollars) the benefits of a given optimization. We can find the stories of small companies that save hundreds of thousands of dollars with some elementary optimization.

    We could also look at web browsers. For a long time, Microsoft had the lead with Internet Explorer. In many key markets, Google Chrome now dominates. There are many reasons for people to prefer Google Chrome, but speed is a key component. To test out my theory, I searched Google for guides to help me choose between Chrome and Internet Explorer, and the first recommendation I found was this:

    Chrome is best for speed – arguably, a web browser’s most crucial feature is its ability to quickly load up web pages. We put both Chrome and Internet Explorer 11 through a series of benchmark tests using Sunspider, Octave and HTML 5 test. In every event, Google’s Chrome was the clear winner.

    So yes, performance is worth a lot to some users.

  • Adding more hardware does not magically make performance issues disappear. It requires engineering to use more hardware.

    People object that we can always throw more machines, more cores at the problem if it is slow. However, even when Amdahl’s law does not limit you, you still have to contend with the fact that it can be hard to scale up your software to run well on many machines. Throwing more hardware at a problem is just a particular way to boost software performance. It is not necessarily an inexpensive approach.

    It should be said that nobody ever gets an asymptotically large number of processors in the real world. Moreover, when you do get many processors, coordination issues can make it difficult (even in principle) to use a very large number of processors on the same problem.

    What about our practice? What the past decades have taught us is that parallelizing problems is hard work. You end up with more complex code and non-trivial overhead. Testing and debugging get a lot more difficult. With many problems, you are lucky if you manage to double the performance with three cores. And if you want to double the performance again, you might need sixteen cores.

    This means that doubling the performance of your single-threaded code can be highly valuable. In other words, tuning hot code can be worth a lot… And adding more hardware does not make the performance problems go away magically, using this hardware requires extra work.
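The arithmetic behind these diminishing returns is Amdahl’s law. Here is a small sketch; the 80% parallel fraction is my own assumption, chosen purely for illustration because it matches the three-cores/sixteen-cores figures above:

```javascript
// Amdahl's law: the speedup on n cores when only a fraction p of the
// running time can be parallelized. p = 0.80 is an illustrative
// assumption, not a measurement of any particular program.
function amdahl(p, n) {
  return 1 / ((1 - p) + p / n);
}

const p = 0.8;
console.log(amdahl(p, 3).toFixed(2));   // 2.14: three cores roughly double performance
console.log(amdahl(p, 16).toFixed(2));  // 4.00: doubling again takes sixteen cores
console.log(amdahl(p, 1e9).toFixed(2)); // 5.00: the hard ceiling, no matter the core count
```

Meanwhile, doubling the speed of the serial hot code doubles the performance outright, with no ceiling imposed by the parallel fraction.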

  • We use higher-level programming languages, but an incredible amount of engineering is invested in recovering the traded-away performance.

    Today’s most popular programming language is JavaScript, a relatively slow programming language. Isn’t that a sign that performance is irrelevant? The performance of JavaScript was multiplied over the years through vast engineering investments. Moreover, we are moving forward with high-performance web programming techniques like Web Assembly. If performance did not matter, these initiatives would fall flat.

    It is true that, over time, people migrate to high-level languages. It is a good thing. These languages often trade performance for convenience, safety or simplicity.

    But the performance of JavaScript in the browser has been improved by two orders of magnitude in the last fifteen years. By some estimates, JavaScript is only about ten times slower than C++.

    I would argue that a strong component in the popularity of JavaScript is precisely its good performance. If JavaScript was still 1000 times slower than C++ at most tasks, it would not have the wide adoption we find today.

    Last year, a colleague faced a performance issue where simulations would run forever. When I asked what the software was written in… she admitted with shame that it was written in Python. Maybe to her surprise, I was not at all dismissive. I’d be depressed if, in 20 years, most of us were still programming in C, C++, and Java.

    One of the things you can buy with better performance is more productivity.

  • Computers are asked to do more with less, and there is a never-ending demand for better performance.

    Software performance has been regularly dismissed as irrelevant. That’s understandable under Moore’s law: processors get faster, we get faster disks… who cares if the software is slow? It will soon get faster. Let us focus on writing nice code with nice algorithms, and we can ignore the rest.

    It is true that if you manage to run Windows 3.1 on a recently purchased PC, it will be ridiculously fast. In fact, I bet you could run Windows 3.1 in your browser and make it fast.

    It is true that some of the hardware progress reduces the pressure to produce very fast code… To beat the best chess players in the 1990s, one probably needed the equivalent of hand-tuned assembly code whereas I am sure I can write a good chess player in sloppy JavaScript and get good enough performance to beat most human players, if not the masters.

    But computers are asked to do more with less. It was very impressive in 1990 to write a Chess program that could beat the best Chess players… but it would simply not be a great business to get into today. You’d need to write a program that plays Go, and it is a lot harder.

    Sure, smartphone hardware gets faster all the time… but there is pressure to run the same software on smaller and cheaper machines (such as watches). The mouse and the keyboard will look quaint in a few decades, having been replaced by computationally more expensive interfaces (speech, augmented reality…).

    And yes, soon, we will need devices the size of a smartwatch to be capable of autonomous advanced artificial intelligence. Think your sloppy unoptimized code will cut it?

  • We do want our processors to be idle most of the time. The fact that they are is not an indication that we could be sloppier.

    Aren’t most of our processors idle most of the time?

    True, but it is like saying that it is silly to own a fast car because it is going to spend most of its time parked.

    We have enormous overcapacity in computing, by design. Doing otherwise would be like going to a supermarket where all of the lines for the cashiers are always packed full, all day long.

    We have all experienced what it is to use a laptop that has its CPU running at 100%. The laptop becomes sluggish, unresponsive. All your requests are queued. It is unpleasant.

    Servers that are running at full capacity have to drop requests or make you wait. We hate it.

    A mobile phone stressed to its full capacity becomes hot and burns through its battery in no time.

    So we want our processors to be cold and to remain cold. Reducing their usage is good, even when they were already running cold most of the time. Fast code gets back to the users faster and burns less energy.

    I imagine that the future will be made of dark silicon. We will have lots of computing power, lots of circuits, but most of them will be unpowered. I would not be surprised if we were to soon start rating software based on how much power it uses.

Science and Technology links (March 17, 2017)

We live in a world where the most powerful companies in the world have super smart people working on trying to emulate human intelligence in machines. Yann LeCun, a Frenchman who directs Facebook’s artificial intelligence research group, tells us that, soon, computers could acquire what he calls “common sense”. I can’t resist pointing out that this would put machines ahead of most human beings. Kidding aside, it seems that the goal is to produce software that can act as if it understood the world we live in. There is a ball up in the air? It is going to come down. Someone dressed in dark is waiting in an alley? He might be waiting to mug someone. Instead of having to train machines for specific tasks, and having them perform well only on these specific tasks, the goal would be to start with software that can reason about the world, and then build on that. Want to build an automated vacuum cleaner? Want it to stop whenever someone enters the room? Maybe in the future, software will be smart enough to understand well enough what “someone enters the room” means without months of tweaking and training. Given that software can be copied inexpensively if one company succeeds at building one such general-purpose engine, it could quickly apply it to a wide range of problems, creating specialist engines like my automated vacuum cleaner. In 2017, that’s still science fiction but people like LeCun are not clowns: they got tangible results in the past. People use their ideas to solve real problems, in the real world. So who knows?

In Myths that will not die, Megan Scudellari reviews a few scientific myths that you may believe:

  • Screening saves lives for all types of cancer. The truth is that cancer screening often does more harm than good. The story usually goes that if they catch cancer early, your chance of survival goes up. The problem is that many cancers will kill you no matter what; knowing about them early just makes you miserable longer. And lots of cancers would not have killed you anyhow, so you are just going to go through stress and painful therapies for no good reason.
  • Antioxidants are good. You should eat your fruits and vegetables. Take your vitamins. They are full of antioxidants. And so on. The idea underneath is that your body is like an automobile and it is “oxidizing” (rusting). You can actually see this process: gray hair is a form of oxidation. This was one of the flawed theories of why we age: oxidation kills us. Take antioxidants and you will prevent this process. Not so fast! Your body has pretty good antioxidants. Taking more may not help, and it may even cause some slight harm. If anything, free radicals, in moderation, might be good for you.
  • Human beings have exceptionally large brains. We are pretty much in line with other primates. However, our brains are configured somewhat differently.
  • Individuals learn best when taught in their learning style. This is pure junk. There is no such thing as a “visual learner”. That people keep insisting there must be, despite all the scientific evidence against it… is really puzzling.
  • The human population is growing exponentially and we are doomed. The richer a country is, the slower its population grows. Advanced countries like Germany and Japan have a population decline. Because the world is getting richer, all continents but Africa will reach a population plateau within a few decades. Even Africa will catch up. Meanwhile, we produce, today, enough food to feed 10 or 12 billion people. So yes, there will be more human beings alive in 50 years, but there may actually be fewer in 100 years. Meanwhile, having more people means having more engineers and scientists, more entrepreneurs, more astronauts.

    By the way, by mere extrapolation, it is quite certain, I think, that there won’t be anything that we would recognize as “human” in a couple of centuries… if technology keeps getting better. Any long-term doomsday scenario that ignores technology as a factor is just ridiculous.

The diseases of old age are tremendously expensive. Dementia and the other diseases affecting cognition are particularly cruel and expensive:

The worldwide costs of dementia were estimated at United States (US) $818 billion in 2015, an increase of 35% since 2010 (…) The threshold of US $1 trillion will be crossed by 2018.

It is hard to wrap our heads around how much of an expense that is. Let me try. If we did not have to pay for dementia, we could afford five million well-paid researchers, engineers, and scientists at $200k a year each. Care to wonder what 5 million of them could achieve every single year? I don’t know if we will have cures for dementia in 10 or 20 years, but if we do, we will be immensely richer.

Regarding Parkinson’s… the terrible disease that makes people shake uncontrollably, lose the ability to speak normally… Ok. So right now, there is nothing we can do to reverse or slow this disease; at best we can help control the damage. We need to do better. Almost every day, there is a grand proclamation that a cure is around the corner, but it always disappoints. The latest one I have seen is nilotinib. It is a drug used to treat some cancers. It turns out that it has another great benefit: it boosts autophagy. Autophagy is the process by which your body “eats itself”. I know it sounds bad, but that’s actually a great thing if you are a multicellular organism. The idea is that repair is too hard, so what the body does is to eat the broken stuff and then use the byproduct to rebuild anew. If you could crank up autophagy, you would assuredly be healthier. So, they think that in cases of Parkinson’s, the body faces an accumulation of garbage that is so bad that it leads to cell death. If you could convince the body to eat up the garbage, you might be able to stop the progression of the disease. It would not, by itself, reverse Parkinson’s, but once the damage stops worsening, the body can be given a chance to route around the damage, and we could start thinking about damage-repair therapies. As it stands, with damage accumulating at a high rate, there is little we can do until we slow it down. Anyhow, in mice designed to suffer from Parkinson’s, nilotinib stops the disease. There are now large clinical trials to see if it works in human beings. Could a cure for Parkinson’s be around the corner?

Technology is very much about globalization. Many people are concerned, for example, with the fact that the United States is running trade deficits year after year. To put it in simple terms, Americans buy many goods from China, but China buys comparatively fewer goods from the USA. Could this go on forever? Economist Scott Sumner explains that it can:

A common mistake made by famous economists is to confuse statistical measures (…) with the theoretical concept (…). In terms of pure economic theory, the US sale of an LA house to a Chinese investor is just as much an “export” as the sale of a mobile home that is actually shipped overseas. But one is counted as an export and one is not.

It is a common problem in science that we measure something that differs from our model. The difference appears irrelevant until it is not.

Ever since I was a teenager, I have been told that cholesterol causes heart disease. Statins, a family of drugs that drastically lower cholesterol, are making pharmaceutical companies very rich. A controversial study published last year suggests that lowering cholesterol might be in vain:

What we found in our detailed systematic review was that older people with high LDL (low-density lipoprotein) levels, the so-called “bad” cholesterol, lived longer and had less heart disease.

Lowering cholesterol with medications for primary cardiovascular prevention in those aged over 60 is a total waste of time and resources, whereas altering your lifestyle is the single most important way to achieve a good quality of life

This does not mean that statins do not have benefits. They seem to work for men who have had a prior cardiovascular incident. However, do they work because they lower cholesterol? This being said, taking low-dose aspirin also has some benefits regarding cardiovascular health, and it is much cheaper. Both aspirin and statins have side effects, of course.

Going to space is bad for your health. It seems to cause some form of accelerated aging. The effect remains even if you keep astronauts active (through special exercises) and if you shield them as best as possible from radiation. (Spending a month in space does not expose you to more radiation than what flight crews experience in a lifetime, yet pilots do not suffer from accelerated aging.) There are many unknowns, in part due to the fact that there are very few astronauts. We know that all of them are closely monitored for health problems. Surprisingly, up until now, retired astronauts did not receive government-paid health care. Given that they are effectively guinea pigs and living science experiments, I would have thought that they were covered. This is about to change.

IBM claims that it is reaching human-like speech recognition levels on some very difficult tests.

Last year, IBM announced a major milestone in conversational speech recognition: a system that achieved a 6.9 percent word error rate. Since then, we have continued to push the boundaries of speech recognition, and today we’ve reached a new industry record of 5.5 percent.

It seems that human parity would be 5.1 percent. So we are really close to human parity. Maybe next year? I tend to be skeptical of such announcements, whether they come from IBM, Microsoft or Google. I much prefer to assess deployed technologies and it is clear that neither Google nor Apple is at human-level speech recognition. So let us take this with a grain of salt, for now.

I take most of the academic research with a large grain of salt. My experience is that, too often, I can’t reproduce the work of my peers. Sometimes they flat out lie about their methods, but most often, they were just careless. I am sure my own work has problems, but I have worked hard to ensure that all my papers are backed by freely available code and data. This way, at least, it is often easy to check my work. We already know that most psychology studies cannot be replicated. What about cancer research? Cancer research is the best-funded research, period. If Computer Science is a house in the suburbs, cancer is the Trump tower. If anything should be top notch, it is cancer research. Yet it does not look good:

Glenn Begley and Lee Ellis from Amgen said that the firm could only confirm the findings in 6 out of 53 landmark cancer papers—just 11 percent. Perhaps, they wrote, that might explain why “our ability to translate cancer research to clinical success has been remarkably low.”

If you asked 20 different labs to replicate a paper, you’d end up with 10 different methodologies that aren’t really comparable. (…) If people had deposited raw data and full protocols at the time of publication, we wouldn’t have to go back to the original authors (…)

Simply put, we are paying for research papers, and so we get them, but little else. To turn things around, scientists need to have skin in the game. If their work does not lead to a cure, and it won’t if it is not reproducible, they need to pay a price.

On the topic of cancer, some researchers claim that they have figured out how to block the spread of cancer by disabling Death Receptor 6 (in mice, at least). Who comes up with these names? Anyhow, cancer is mostly a dangerous disease because it spreads. If tumors remained where they are, they would either not cause much harm, or be easy to manage (or remove). Sadly, many cancers tend to spread uncontrollably through our blood stream. It seems that they punch through the walls of blood vessels and move on from there. Speculatively, we could maybe disable this ability in cancer and make it a much less dangerous disease. Maybe. But can we reproduce the Death Receptor 6 experiments?

The lack of reproducibility also plagues gerontology and developmental biology. Researchers have been studying the lifespan of short-lived worms for decades. At a high level, this is compelling research. Take a compound, bathe worms in it, and see whether they live shorter or longer. If biology had maximized lifespans, then it would be very hard to find compounds that prolong life, but it is not so hard in practice. What is hard is to get the same compound to extend the life of worms robustly across different species of worms and in a way that competing labs can reproduce. Lucanic et al. showed that one compound (and only one in their test) robustly extended the lifespan of a wide variety of worms: Thioflavin T. What is this magical compound and can you order it online? It is a fluorogenic dye that is routinely used in medical research. Don’t drink the stuff just yet. On a related note, methylene blue, another dye, has also been reported to have great health properties. Here is a vision for you: the world is filled with ageless centenarians, but they are either dyed in yellow, blue or green.

So maybe you want to remain healthy for a long time with something less colorful and easier to find. What about baking soda? Scientists who want to be taken seriously call the stuff NaHCO3. Wesson, in Is NaHCO3 an antiaging elixir?, tells us that it may be good for our health:

Emerging evidence supports that the largely acid-producing diets of developed societies induce a pathological tissue milieu that contributes to premature death, mediated in part through vascular injury, including vascular calcification.

Speaking for myself, before I start consuming large quantities of baking soda, I’ll expect scientists to make a very convincing case. The stuff does not taste very good.

To remain youthful longer, maybe you do not need to do anything at all. Stenzaltz tells us that we are aging slower as time passes, in the following sense:

the generation of American men who fought in World War II and are now in their 90s lived, on average, about eight years longer than their great-grandfathers who fought in the Civil War, once they reached adulthood.
According to the 1985 United States Health Interview Survey 23 percent of Americans aged 50 to 64 reported limitations on daily activities due to chronic illness; in 2014, this was down to 16 percent
Perhaps most striking, a new study has discovered that over the past two decades the incidence of new dementia cases has dropped by 20 percent. Men in the United Kingdom develop dementia today at the same rate as men five years younger in the 1990s (…)

It is fairly easy to see why people who fought in WW II would be healthier than previous generations: they had access to inexpensive antibiotics. Antibiotics are our first line of defense against infectious diseases. Yet it is much less clear why people in their fifties would be healthier today than they would have been in the 1980s. Why would dementia rates fall? It demands an explanation and I see none being offered.

DNA is used by biology as an information storage mechanism, not unlike our digital storage devices. It seems that we could back up the entire Internet on a gram of DNA. The real challenge, of course, would be to access the information quickly. Biology is often not in a hurry. Google users are. Pangloss has a nice review of the technology where he hammers the point that it is just too expensive to write data to DNA:

Although this is useful research, the fact remains that DNA data storage requires a reduction in relative synthesis cost of at least 6 orders of magnitude over the next decade to be competitive with conventional media, and that currently the relative write cost is increasing, not decreasing.

There is an emerging consensus that, as we age, our blood acquires a high level of harmful factors. These factors may not be harmful by themselves, but having too much of them for a long time could be detrimental to our bodies. We are seeing the first glimpses of technology to block or remove the bad stuff from old blood:

Yousef has found that the amount of a protein called VCAM1 in the blood increases with age. In people over the age of 65, the levels of this protein are 30 per cent higher than in under-25s.

To test the effect of VCAM1, Yousef injected young mice with blood plasma taken from older mice. Sure enough, they showed signs of aging (…) Blood plasma from old people had the same effect on mice. (…)

These effects were prevented when Yousef injected a compound that blocks VCAM1. When the mice were given this antibody before or at the same time as old blood, they were protected from its harmful effects.

So we can envision therapies that “rejuvenate” our blood. And no, it won’t involve drinking the blood of the young. Rather, you could imagine a device that would scan your blood, determine that you have too much of protein X, and decide to remove some of it to bring it back to a youthful level. Still science fiction, but I bet that we shall be trying it out on mice soon.

A nasty problem that affects some of us is retinal degeneration. Basically, you start to have a hard time seeing, you go to your doctor, and he tells you that your retina is going to shred, and it is only going to get worse. “Go buy a white cane now.” That would not be a good day. If your retina degenerates on its own, it is not like glasses will help you. But Yu et al. showed that you could use a nifty new gene therapy to prevent retinal degeneration. Basically, they reprogram the cells, in vivo, and all is well. It works in mice.

On the topic of the retina, three ladies went to a clinic in Florida for macular degeneration, a form of age-related retinal degeneration. They paid $5000 for an experimental stem-cell therapy and came out blind. The experimental therapy is not the problem per se; one has to try new things from time to time. If you are going blind and someone suggests you try something that might restore your eyesight… wouldn’t you want to try it? At some point, you have little to lose and much to gain. Moreover, some people might be motivated by the fact that they help advance science. The real problems arise because the therapy was not administered in a sensible fashion. For example, they treated both eyes at the same time. That’s obviously unnecessarily dangerous. They also charged the patients for something that is an experiment. Again, that’s less than ideal. For unproven therapies, you’d really prefer not to charge your guinea pigs for the experiment. Stem cells have great potential in medicine, but we are not yet at the point where we can do in vivo regeneration of organs. At least, we can’t do it safely.

So having children is stressful and a lot of work. It is hard. But did you know that parents live longer? Evidently, having to work hard is not a negative when it comes to health and longevity.

Is the Sun bad or good for your skin? It may depend on the wavelength. High-frequency light (UV-A and UV-B) seems to be bad (burns, cancer, wrinkles and so forth), hence the ubiquitous sunscreen, but low-frequency (infrared) light might be good for us:

In the last decade, it has been proposed that the sun’s IR-A wavelengths might be deleterious to human skin and that sunscreens, in addition to their desired effect to protect against UV-B and UV-A, should also protect against IR-A (…) IR-A might even precondition the skin–a process called photo prevention–from an evolutionary standpoint since exposure to early morning IR-A wavelengths in sunlight may ready the skin for the coming mid-day deleterious UVR. Consequently, IR-A appears to be the solution, not the problem. It does more good than bad for the skin.

So maybe you should get an infrared lamp to improve your skin? Sounds less painful than having to eat baking soda.

Stable Priority Queues?

A priority queue is a data structure that holds a set of elements and can quickly return the smallest (or, alternatively, the largest) element. It is usually implemented using a binary heap.

So you “add” elements to the priority queue, and then you can “poll” them out.

Suppose however that you insert elements that are equal. What happens? Because binary heaps are not stable, your elements may not come out in insertion order.

For example, suppose you add the following tuples to a priority queue:

[{ name: 'player', energy: 10},
 { name: 'monster1', energy: 10},
 { name: 'monster2', energy: 10},
 { name: 'monster3', energy: 10}]

You could poll them back out based on their “energy” value in a different order… even though they all have the same “energy”…

[{ name: 'player', energy: 10},
 { name: 'monster3', energy: 10},
 { name: 'monster2', energy: 10},
 { name: 'monster1', energy: 10}]

That’s not very elegant.
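To make the instability concrete, here is a bare-bones array-backed binary min-heap in JavaScript. It is a sketch for illustration only, not the actual StablePriorityQueue.js code. With four equal-energy elements, this particular implementation polls them back in a different order than they were inserted:

```javascript
// Minimal binary min-heap, ordered only by the comparison function "less".
class Heap {
  constructor(less) { this.a = []; this.less = less; }
  add(x) {
    const a = this.a;
    a.push(x);
    let i = a.length - 1;
    while (i > 0) {
      const p = (i - 1) >> 1; // parent index
      if (!this.less(a[i], a[p])) break;
      [a[i], a[p]] = [a[p], a[i]];
      i = p;
    }
  }
  poll() {
    const a = this.a;
    const top = a[0];
    const last = a.pop();
    if (a.length > 0) {
      a[0] = last; // move the last element to the root, then sift down
      let i = 0;
      for (;;) {
        let s = i;
        const l = 2 * i + 1, r = 2 * i + 2;
        if (l < a.length && this.less(a[l], a[s])) s = l;
        if (r < a.length && this.less(a[r], a[s])) s = r;
        if (s === i) break;
        [a[i], a[s]] = [a[s], a[i]];
        i = s;
      }
    }
    return top;
  }
}

const h = new Heap((x, y) => x.energy < y.energy);
for (const name of ['player', 'monster1', 'monster2', 'monster3']) {
  h.add({ name, energy: 10 });
}
const order = [h.poll().name, h.poll().name, h.poll().name, h.poll().name];
// with this implementation: player, monster3, monster2, monster1
```

A different heap implementation could produce yet another order; the only guarantee a plain binary heap gives you is that no smaller element comes out after a larger one.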

Thankfully, there is an almost trivial way to get a stable priority queue. Just add some kind of counter recording the insertion order, and when you insert elements in the binary heap, use the insertion order to differentiate elements. Thus, for a to be smaller than b, it is enough that the value of a be smaller than the value of b, or that the values be equal and a have the smaller insertion counter.

For example, we might store the following:

[{ value: { name: 'player', energy: 10 }, counter: 0 },
 { value: { name: 'monster1', energy: 10 }, counter: 1 },
 { value: { name: 'monster2', energy: 10 }, counter: 2 },
 { value: { name: 'monster3', energy: 10 }, counter: 3 }]

When comparing any two objects in this example, we not only compare them by their “energy” attribute, but also by their “counter” attribute.
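In code, the tie-breaking comparison might look like this. It is a sketch using the field names from the example above, not the actual StablePriorityQueue.js source:

```javascript
// Smaller energy wins; equal energies fall back to insertion order.
function stableLess(a, b) {
  if (a.value.energy !== b.value.energy) {
    return a.value.energy < b.value.energy;
  }
  return a.counter < b.counter; // tie-break on the insertion counter
}
```

Because no two elements ever share the same counter, this comparison never declares two distinct elements equal, which is exactly what makes the heap order deterministic.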

So I implemented it in JavaScript as a package called StablePriorityQueue.js.


I can’t promise that the performance will be as good as a speed-optimized priority queue, however.

This led me to a follow-up question: what is the best (most efficient) way to implement a stable priority queue?

Since the standard binary heap does not support tracking the insertion order, we chose to append an insertion counter. That’s reasonable, but is it the most efficient approach?

And, concretely, what would be the best way to implement it in a given language? (Java, JavaScript…)

The ultimate goal would be to get a stable priority queue that has nearly the same speed as a regular priority queue. How close can we get to this goal?

Credit: Thanks to David Ang for inspiring this question.

Science and Technology links (March 10, 2017)

In Mnemonic Training Reshapes Brain Networks to Support Superior Memory (published in Neuron, March 2017), we learned that 6 weeks of mnemonic training, at a rate of 30 minutes a day, led to a large-scale brain-network reorganization, making the brains of control subjects more like those of memory athletes.

In How worried should we be about artificial intelligence?, Andrew Ng, the chief scientist of the search engine giant Baidu is quoted as saying: “Worrying about evil-killer AI today is like worrying about overpopulation on the planet Mars.” Another famous computer scientist, Moshe Vardi, was quoted as saying “the superintelligence risk, which gets more headlines, is not an immediate risk. We can afford to take our time to assess it in depth”. The article includes lots of interesting quotes by other scientists and thinkers.

In adult mammals, damaged nerve cells do not normally regenerate and neurogenesis is barely noticeable. So a damaged brain can route around the damage, but it does not normally repair itself fully. In an October 2016 article published in the journal Neuron, Tedeschi et al. showed that treating mice with the drug Pregabalin caused damaged nerve connections to regenerate. Speculatively, this could help find cures for neurodegenerative diseases.

Yann Collet, a Facebook engineer famous for LZ4, a fast compression format, has released Zstandard, a new compression format with superior performance. Though the software is open source, it may be covered by patents, so check with your lawyers.

Mice age faster than human beings: they barely live a couple of years whereas human beings can live decades. We don’t know how cells know how fast they are supposed to develop and age. In vitro, human and mouse cells develop at different rates. What happens if you put human embryonic stem cells in mice? They still develop at a human rate. This suggests that the cells themselves are equipped with some kind of clock. What this clock might be is unknown. (Source: Barry et al., Species-specific developmental timing is maintained by pluripotent stem cells ex utero)

We know how to reset cells in live mice so that they become pluripotent stem cells (e.g., using the Yamanaka factors). That is, we rewind the clock of the cells to zero, in a live animal. See In Vivo Amelioration of Age-Associated Hallmarks by Partial Reprogramming by Ocampo et al. More recently, Marión et al. showed in Common Telomere Changes during In Vivo Reprogramming and Early Stages of Tumorigenesis that this is accompanied by what appears to be a reset on the length of the telomeres. The telomeres are a component of your chromosomes that get shorter with every division. Once the telomeres become too short, your cells stop dividing and may become senescent. Anyhow, it looks more and more certain that, yes, we can reset the age of cells in a living organism.

College and inequality

Most college professors are squarely on the left ideologically. They believe that part of their mandate is to reduce inequality, by helping to provide college degrees to all who qualify.

This has always felt strange to me. Higher education is highly subsidized, but the money goes overwhelmingly to people who are better off than average. So if you are a young adult, you either go find work, in which case you will probably end up paying taxes… or else you attend college, in which case you will receive net benefits.

You are much more likely to go to college, and thus receive government subsidies, if you are from a wealthier family. Moreover, you are more likely to go to college, and be subsidized, if you inherited characteristics that are likely to turn you into a sought-after professional.

It does not stop there. Subsidies overwhelmingly go to elite schools. For example, in 2008, Princeton and Harvard received $105,000 and $48,000 in tax benefits per student.

We find an insane wealth concentration among the elite universities. The top 10 schools in the USA account for 5 percent of the world’s 211,275 people worth $30 million or more.

The lowly community college receives much, much less than elite schools. It should be clear that through higher education, governments subsidize the privileged.

But, at least, some students from modest backgrounds make it to good universities. However, it is far from clear that they reap the same benefits as the well-off students. Elizabeth Armstrong and Laura Hamilton wrote a book, Paying for the Party, in which they examine how well college serves the poorest students. To the question “What happened to the working-class students you studied?”, they answer: “On our floor, not one graduated from the university within five years.” And what about the well-off students? “They were able to recreate their parents’ success. They all graduated.” But maybe college at least allows people from a modest background to rub shoulders with well-off students? The researchers found that “cross-class roommate relationships were extremely negative for the less privileged person”.

Where do you think well-off young people fall in love? In colleges. If you care at all about inequality, then you should care about assortative mating. One of the most important factors in determining your future earnings is whom you mate with. Elite colleges act as subsidized mating grounds for the privileged. So it is not just that the subsidies to elite colleges go to the privileged; they also help create and sustain assortative mating, a powerful driver of inequality.

And let us not forget age. Most elite colleges will actively discriminate against you if you are older and uneducated. If you are 40 and a truck driver, and you find yourself out of a job, don’t bother applying to Princeton, even if you had great grades in high school. Age is a mediating factor: people from more modest backgrounds tend to have false starts. If you discriminate against people because they had false starts, you are effectively discriminating against them because they come from a more modest background.

I am not concerned about inequality, but if I were and I thought that governments should subsidize higher education, then I would favor the following policies:

  • I would check that people from the lowest quintile receive benefits from higher-education subsidies that are larger than the benefits received by the top 1%. Given that children from the lowest quintile rarely attend college at all, and when they do they rarely graduate, while children from the top 1% mostly attend college and mostly graduate, this would be a harder requirement to meet than it might appear.
  • Assuming that higher education should be subsidized at all, it should be subsidized in reverse ranking: highly accessible community colleges should receive more from the state per student than elite schools. Students from the lowest quintile should receive the bulk of the support and funding.
  • Government subsidies should favor low-income students and the graduation of low-income students. I would never subsidize a school for catering to well-off students.
  • I would subsidize adult education more generously.
  • I would not subsidize Harvard or Princeton at all. Their endowments should be taxed and the income used to pay for better community colleges.

So you would think that activists on the left would have adopted this agenda already. If you are going to “Occupy Wall Street”, you should certainly be concerned that Harvard’s $35 billion endowment is allowed to profit tax-free, and that the benefits go mostly to the elite. Why is there no call to redistribute these endowments?

Let us look at the intellectuals on the left. We have David Graeber, who helped spur the Occupy Wall Street movement. Graeber was a professor at Yale and is now a professor at the London School of Economics: both schools receive students from well-off families and hand degrees to people who are going to join the top 1%. At the London School of Economics, we also find Thomas Piketty, who became famous for his socialist treatise, Capital in the Twenty-First Century. It is interesting that these intellectuals do not choose to work for schools where blue-collar workers are likely to send their children. They are nowhere to be seen at community colleges or vocational schools. How likely is it that Graeber and Piketty would be eager to teach at a community college or at an adult-education school?

It is not just the USA and England. In Canada, the most elite schools, the least likely to cater to children of minimum-wage parents, also receive the most government funding.

So the idea that subsidized colleges are a force for equality is flat-out wrong. At best, they do well by the middle class. Let us at least be honest about it.

The technology of Logan (2017 Wolverine movie set in 2029)

Last night I went to see Logan, the latest and maybe the last Wolverine movie with Hugh Jackman. The movie is set in 2029. The year 2029 is an interesting choice, as it is the year of the story “Ghost in the Shell”, made into a movie featuring Scarlett Johansson and coming out later this year. So we shall have two sci-fi movies representing the year 2029 this year.

Writing stories about the near future is always a bit daring because you are sure to get most things wrong, and people will soon find out how wrong you were. So how does Logan represent the year 2029?

  • To my eyes, fashion, cosmetic, clothes… all appear unchanged from today.
  • People still read newspapers printed on paper, comics printed on paper.
  • People carry pictures printed on paper.
  • Augmented and virtual reality are nowhere to be seen. There are no smart glasses, no enhanced hearing, no smart displays.
  • People still very much drive their own cars and trucks, which have not changed very much, though self-driving tractor trailers are ubiquitous on the highway.
  • Slot machines in casinos are unchanged.
  • TVs and remote controls look the same.
  • Inexpensive mobile phones look very much like they do today. We charge them up with cords, as we do today. They have cameras that are no better than what we have today. People send “texts” like they do today. A phone call works like it does today.
  • It seems that the Internet and computers work exactly like they do today, no progress or notable change.
  • Computerized artificial limbs appear to be common. So if you lose a hand, you can get a metallic replacement that looks to be as good as a “natural” hand.
  • Drones that look like the drones we have today are used for surveillance. It is unclear whether they are more advanced than the drones we have today.
  • Artificial human insemination using collected DNA (i.e., without sperm) is possible.
  • Large farms are automated, with giant metallic beasts taking care of the fields.
  • We grow genetically engineered corn that looks much larger than any corn I have ever seen.
  • It does not look like governments can track people using satellites.
  • We can grow entire human bodies (clones) in vats.
  • Convenience stores are still manned by human clerks.
  • In 2017, we have giant digital screens on the roadsides displaying ads. They are still around in 2029 but look brighter to my eyes.
  • In the movie, Professor X (Charles Francis Xavier) is suffering from some form of dementia, probably Alzheimer’s. In 2017, Alzheimer’s is basically incurable and untreatable. It looks like there is no cure in 2029. However, the main character (Wolverine) gets some medication that appears to help manage symptoms.

How many floating-point numbers are in the interval [0,1]?

Most commodity processors support single-precision IEEE 754 floating-point numbers. Though they are ubiquitous, they are often misunderstood.

One of my readers left a comment suggesting that picking an integer in [0, 2^32) at random and dividing it by 2^32 was equivalent to picking a number at random in [0,1). I am not assuming that the reader in question made this inference, but it is a tempting one.

That’s certainly “approximately true”, but we are making an error when doing so. How much of an error?

Floating-point numbers are represented with a sign bit, a mantissa, and an exponent, as follows:

  • There is a single sign bit. Because we only care about positive numbers, this bit is fixed and can be ignored.
  • The mantissa is straightforward: 23 bits. It is implicitly preceded by the digit 1.
  • There are eight bits dedicated to the exponent. It is pretty much straightforward, as long as the exponent ranges from -126 to 127. Otherwise, you get funny things like infinity, not-a-number, denormal numbers… and zero. To represent zero, you need an exponent value of -127 and a zero mantissa.

So how many “normal” non-zero numbers are there between 0 and 1? The negative exponents range from -1 all the way down to -126. In each case, we have 2^23 distinct floating-point numbers because the mantissa is made of 23 bits. So we have 126 × 2^23 normal floating-point numbers in [0,1). If you don’t have a calculator handy, that’s 1,056,964,608. If we want to add the number 1, that is 126 × 2^23 + 1 or 1,056,964,609.

Most people would consider zero to be a “normal number”, so maybe you want to add it too. Let us make it 1,056,964,610.
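The arithmetic is easy to check in a few lines of JavaScript:

```javascript
// 126 usable negative exponents (-1 down to -126), each paired with
// one of 2^23 possible mantissas, gives the normal floats in (0, 1).
const normalsBelowOne = 126 * 2 ** 23;
const inUnitInterval = normalsBelowOne + 2; // adding 1.0 and 0.0
console.log(normalsBelowOne, inUnitInterval); // 1056964608 1056964610
```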

There are 4,294,967,296 possible 32-bit words, so about a quarter of them are in the interval [0,1]. Isn’t that interesting? Of all the floating-point numbers your computer can represent, a quarter of them lie in [0,1]. By extension, half of the floating-point numbers are in the interval [-1,1].

So already we can see that we are likely in trouble. The number 2^32 is not divisible by 1,056,964,610, so we can’t take a 32-bit non-negative integer, divide it by 2^32 and hope that this will generate a number in [0,1] in an unbiased way.

How much of a bias is there? We have a unique way of generating the number zero. Meanwhile, there are 193 different ways to generate 0.5: any integer between 2,147,483,584 and 2,147,483,776 (inclusively) gives you 0.5 when converted to a float and divided by 2^32.

A ratio of 1 to 193 is a fairly sizeable bias. So chances are good that your standard library does not generate random numbers in [0,1] in this manner.
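We can verify the claim by brute force. Math.fround rounds a double to the nearest single-precision value, so it lets us emulate the 32-bit float pipeline in JavaScript:

```javascript
// Count the 32-bit integers n for which float(n) / 2^32 is exactly 0.5.
// The division by 2^32 is exact (a power of two), so rounding happens
// only in the int-to-float conversion, which Math.fround emulates.
let hits = 0;
for (let n = 2147483000; n <= 2147484000; n++) {
  if (Math.fround(n) / 4294967296 === 0.5) hits++;
}
console.log(hits);
```

The loop counts 193 integers — exactly the integers in the range quoted above.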

How could you get an unbiased map?

We can use the fact that the mantissa has 23 bits (24 significant bits, counting the implicit leading 1). This means, in particular, that if you pick any integer in [0, 2^24) and divide it by 2^24, then you can recover your original integer by multiplying the result by 2^24. This works with 2^24 but not with 2^25 or any larger number.
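A quick experiment illustrates the round trip, again using Math.fround to emulate single precision:

```javascript
// Send k through single precision and back: k -> k/scale -> k * scale.
// Math.fround rounds a double to the nearest 32-bit float.
function roundTrip(k, scale) {
  const x = Math.fround(k / scale); // the value as a 32-bit float
  return Math.fround(x * scale);
}
// With 2^24, every integer survives:
console.log(roundTrip(16777215, 16777216)); // 16777215
// With 2^25, some integers collide:
console.log(roundTrip(33554431, 33554432)); // 33554432, not 33554431
```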

So you can pick a random integer in [0, 2^24), divide it by 2^24, and you will get a random number in [0,1) without bias… meaning that each integer in [0, 2^24) maps to one and only one floating-point number in [0,1). Moreover, the distribution is uniform in the sense that the possible floating-point numbers are evenly spaced (the distance between them is a flat 2^-24).
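As a sketch, a mapping from a 32-bit random word to such a float might look like this (toUnitFloat is a name I made up for illustration):

```javascript
// Keep only the top 24 bits of the word: every output k / 2^24 with
// k < 2^24 is exactly representable in single precision.
function toUnitFloat(word32) {
  return (word32 >>> 8) / 16777216; // 16777216 = 2^24
}
console.log(toUnitFloat(0xFFFFFFFF)); // largest possible output, just below 1
```

Multiplying the result by 2^24 recovers the original 24-bit integer exactly, which is precisely the lossless property described above.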

So even though single-precision floating-point numbers use 32-bit words, and even though your computer can represent about 2^30 distinct and normal floating-point numbers in [0,1), chances are good that your random generator only produces 2^24 distinct floating-point numbers in the interval [0,1).

For most purposes, that is quite fine. But it could trip you up. A common way to generate random integers in an interval [0,N) is to first generate a random floating-point number and then multiply the result by N. That is going to be fine as long as N is small compared to 2^24, but should N exceed 2^24, then you have a significant bias as you are unable to generate all integers in the interval [0,N).

I did my analysis using 32-bit words, but it is not hard to repeat it for 64-bit words and come to similar conclusions. Instead of generating 2^24 distinct floating-point numbers in the interval [0,1), you would generate 2^53.

Further reading: The nextafter function and Uniform random floats: How to generate a double-precision floating-point number in [0, 1] uniformly at random given a uniform random source of bits.

Tech jobs are already largely automated

Lately, I have been reading a lot about the threat to computer jobs from automation. For example, we have AI systems that can write their own code. And billionaire Mark Cuban predicts that “software will soon begin writing itself, which will ultimately eliminate those lucrative software development jobs”.

I think that people like Cuban fundamentally miss the reality of the tech industry. How do they think that a couple of college graduates can build a tool that can be used by millions of people?

We work on layers upon layers of automation. We have been adding layers for decades.

Have you looked around lately? More of us than ever are computer programmers, IT specialists and so forth. More automation has led to more jobs.

Will software write its own code? It does so all the time. The optimizing compilers and interpreters we rely upon generate code for us all the time. And it is not trivial automation: very few human beings would be capable of taking modern-day JavaScript and writing efficient machine code to run it. I would certainly be incapable of doing such work in any reasonable manner.

Programmers work actively to make themselves obsolete. That is, they work hard to solve problems so that nobody else ever has to worry about them again. The solution to the problem becomes automated.

Naively, one might think that software automation makes computer jobs less important. This makes some kind of sense if you think that all computer jobs will get automated.

But that’s not how the world tends to work. As automation increases, more and more new jobs are created higher up the tree. We no longer need many people to write C data structures, but we have seen an explosion in the number of web developers, data scientists and so forth.

Thoughts on my new laptop (Dell XPS 13 with Windows 10)

I love computers. Unlike many people, who stick to one brand and one operating system, I like to use many different systems. I own several game consoles, several types of tablets and so forth.

My primary laptop for many years has been an inexpensive and light MacBook Air. It is nearly perfect.

I also have a Thinkpad. My first Thinkpad did not even have wifi. I wrote my Ph.D. thesis on a Thinkpad. The one I have by my desk looks like it was made in the 1990s. And using it feels quite retro too! Anyhow, it is not a serious machine for me.

Here are my general requirements:

  • I do not need powerful laptops. For compute-intensive tasks, I use remote servers. I do play games, but not on laptops. However, a laptop should be fast enough to run a modern IDE and compile C++ code without falling apart.
  • I expect my machines to be perfectly silent.
  • I carry around my laptops all the time. If I need a stationary machine, I will use a desktop. So I want my laptops to be lightweight.

I recently acquired a somewhat expensive Dell XPS 13. These are Dell’s flagship laptops, the best of the best. It is a serious machine.

On the positive side:


  • Windows 10 is a good operating system. The usability is as good as macOS. This wasn’t always true.
  • The Windows Subsystem for Linux really does work. You have “something” like a bash shell with the common Unix tools running within a hidden directory on your machine.
  • The Dell XPS looks good. The screen is gorgeous and the fonts look good. I must admit that it is much better than my aging MacBook Air.
  • I like that Microsoft is offering a decent and free version of Visual Studio. On weaker machines, Visual Studio 2015 works really poorly, but on the XPS 13, it is fast enough.
  • I really like the work that the GitHub crew has been doing. GitHub Desktop for Windows is great. Throw in the Atom editor and you are good.
  • Though it is not nearly as easy as it should be, you can generate Visual Studio “solution files” with CMake, making it possible to build your macOS/Linux C/C++ code on Windows without too much fuss.

Neither positive nor negative:

  • Once booted up, the XPS 13 feels quite fast. It has a significantly better (and more recent) CPU than my MacBook Air… but I would have a hard time telling them apart.
  • For what I use them for, Cortana and Siri appear to be on par. Both of them can give me the weather. It is rather amazing that we can take speech recognition for granted in 2017.


On the negative side:

  • Maybe I am crazy, or maybe my machine was hacked, but it feels like Microsoft is showing me ads.
  • On a MacBook, you just lift the lid and the machine awakes, instantly. On my XPS, I have to enter a passcode and wait several seconds for the machine to “boot up” before I can resume my work. It is annoying when you are used to better.
  • Windows 10 seems to suffer from random Wifi dropouts. I have had something like it on the new laptop: after rebooting, the Wifi might randomly fail to reconnect automatically. It is a mere nuisance, but I find it surprisingly amateurish. How hard can it be to connect to a Wifi network? Why do I have to manually do it?
  • Windows definitely feels like it needs to clean house. I have a “settings” application along with a “control panel” application. Both appear to have closely related functions, but the layout is drastically different. Windows 10 feels like it was layered on top of Windows 7.
  • I spend a lot of time copying and pasting. The ctrl-c and ctrl-v shortcuts are fine, but Dell put the left ctrl key to the extreme left of the keyboard, beyond the useless “Fn” key. I will probably have to remap the keys at some point, somehow.
  • I find it unacceptably hard to grab a window and move it, or to scroll down a web page. Thankfully, I have a touch screen: I can just put my finger on the window and slide it. It often works fine.
  • Though the Windows Subsystem for Linux works well, it is poorly integrated into Windows. You can’t (or shouldn’t) edit Linux files using your Windows text editor. The two systems run different file systems and there are (apparently) synchronization issues. This is probably fine.
  • The webcam is located at the bottom of the screen so that it will point to the underside of your nose during a video conference. It feels like it was designed by monkeys.
  • There is more plastic than I’d like on the Dell laptop, and I can already see some wear, but I suppose it is fine if you plan on replacing the machine regularly.
  • Soon after getting the machine, I had a bad case of screen flickering and tearing. I found out on posting boards and YouTube that it has been common for years with Dell XPS machines. For a time, I thought that my hardware was broken, but a common recommendation (turning off “Hyper-V”) solved the problem. It was simple enough, but how can it escape quality testing considering that the Dell support forums are filled with people complaining about screen flickering?
  • I can “hear” the machine read and write on disk. I did not know that SSDs could make noise!
  • There is “coil whining” (the computer makes high-pitched sounds). Most of the time (99%), it is inaudible (you have to put your ear to the machine to hear anything), but it can randomly become quite audible, whether the machine is under load or not.


I like the XPS 13 well enough as a secondary laptop. If I had to use it as a primary laptop, I’d invest in noise-cancelling headphones.

How fast can you count lines?

Inspired by earlier work by Llogiq, I decided to look at how fast I could count the number of lines in a string. By assuming that the string relies on ASCII, UTF-8 or other ASCII superset character encoding, the bulk of the work consists in counting the linefeed (‘\n’) characters.

Most programmers would proceed as follows…

cnt = 0;
for (i = 0; i < size; i++)
    if (buffer[i] == '\n') cnt++;
return cnt;

It runs at about 0.66 cycles per input character on a recent x64 processor.

Your compiler achieves this speed by automatically vectorizing the problem.

Can we do better?

Let us try a vectorized approach using AVX intrinsics. The idea is simple. We use a vectorized counter made of 32 individual counters, initialized at zero. Within a loop, we load 32 bytes, we compare them (using a single instruction) with the linefeed character. We add the result to the running counter.

__m256i newline = _mm256_set1_epi8('\n'); // 32 copies of '\n'
__m256i cnt = _mm256_setzero_si256();     // 32 counters
// ...
__m256i newdata = // load 32 bytes
__m256i cmp = _mm256_cmpeq_epi8(newline, newdata);
cnt = _mm256_add_epi8(cnt, cmp); // each match adds 0xFF (-1) to its byte counter

So per block of 32 input bytes, we use 3 instructions (one load, one comparison and one addition), plus some overhead. Could we do better? Maybe, but I suspect it would be very difficult to use far fewer than 3 instructions per 32 input bytes on current processors.

With appropriate loop unrolling, this vectorized approach runs about ten times faster than the naive approach (0.04 cycles per input byte) or about 1.3 cycles per 32 input bytes. Because x64 processors have a hard time sustaining more than 4 instructions per cycle, we can estimate that our vectorized implementation is probably within a factor of two of optimality.

How does it compare with Llogiq’s best approach? I’d be interested in knowing, but his code is written in Rust, so this makes a direct comparison with my C code somewhat more difficult.

My source code is available.

Note: I have updated my code following a comment by Turbo.