Never reason from averages

StackOverflow published its list of “top paying technologies”. Worldwide, the best-paid programmers, on average, work in Clojure and Rust. In the US, the top-paid programmers work in Go and Scala. In the UK, it is TypeScript and Ruby. Yes, these are all programming languages, but I will excuse you if you have never heard of them.

Here is the analysis we are being offered:

Globally, developers who use Clojure in their jobs have the highest average salary at $72,000. In the U.S., developers who use Go as well as developers who use Scala are highest paid with an average salary of $110,000. In the UK, it’s TypeScript at $53,763 (…)

The implication seems clear enough: go study little-known programming languages to improve your salary! Clojure, Go and TypeScript, here I come!

Sadly, for programmers looking to improve their financial outlook, this information may be worse than useless. It might be dangerously misleading. To be fair, at no point does StackOverflow recommend this list as career advice… but I think it might be perceived as such.

There is nothing wrong with learning Clojure, Go and TypeScript, of course… but will it improve your odds of earning the big bucks?

Suppose you have to choose one programming language. You can go for JavaScript or Go. Of course, you could actually learn both in a couple of weekends, but humor me. You want an income of at least $110,000.

There are 100 open Go jobs and half of them pay that much. So you get to apply to 50 jobs. Great!

There may be 10,000 open JavaScript jobs… but a much smaller percentage offering at least $110,000, say 10%… yet this leaves you with 1,000 jobs. Much better than with Go.

And, of course, the abundance of jobs paying less than what you think you are worth is a bonus. If your current JavaScript job does not pan out, you can find another one quickly. With a niche language? Not so much. You may actually end up unemployed.

These are made-up numbers, but my point is that the average salary is likely meaningful only if you are comparing sets of similar size.
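To make the arithmetic explicit, here is a minimal sketch in JavaScript using the same made-up numbers:

// Made-up numbers: what matters to a job seeker is the count of
// openings at the target salary, not the niche's average salary.
const markets = [
  { language: 'Go', openJobs: 100, fractionAtTarget: 0.5 },
  { language: 'JavaScript', openJobs: 10000, fractionAtTarget: 0.1 },
];

for (const market of markets) {
  const jobsAtTarget = market.openJobs * market.fractionAtTarget;
  console.log(`${market.language}: ${jobsAtTarget} jobs at $110,000 or more`);
}
// Go: 50 jobs at $110,000 or more
// JavaScript: 1000 jobs at $110,000 or more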

In practice, I strongly suspect that you maximize your odds of earning a large salary by focusing on “standard” programming languages.

I submit to you that it is no accident that the StackOverflow list of top-paying programming languages is made of obscure languages. They are comparing the average of a niche against the average of a large population.

We see a similar problem when doing international comparisons regarding academic achievements. It is not uncommon for the US to have a poor standing. Yet the US is a very large and diverse country. Just look at the last two presidents, Obama and Trump. Not very alike, are they?

If you divide the US into states, you find that their rankings are often all over the place… but residents of Massachusetts should not be worried in the least about these rankings…

The U.S. continues to trail its peers in global measures of academic excellence. Based on results from the latest Program for International Student Assessment (PISA) test, of the 65 countries ranked, the US ranks 31st in math, 24th in science, and 21st in reading. (…) if Massachusetts were allowed to report subject scores independently — much the way that, say, Shanghai is allowed to do so — the Bay State would rank 9th in the world in Math Proficiency, tied with Japan, and on the heels of 8th-ranked Switzerland. In reading, Massachusetts would rank fourth in the world, tied with Hong Kong, and not far behind third-ranked Finland.

Statisticians know all about this. They would never let you publish a paper that compares unqualified averages across vastly different populations. The rest of us need to be reminded periodically not to reason from averages.

The technology of Mass Effect Andromeda

Mass Effect Andromeda is the long-awaited sequel to the popular Mass Effect video game series. It is available on a game console near you. It was given a rough time by the critics and by many users.

The game is set in the far future of humanity. Human beings are starting to colonize a new galaxy. I don’t mean a new star system… no… literally a new galaxy. The action is set in 2819, but it took 634 years to get to the new galaxy, so the technology is something like 200 years in our future. But the story relies on the fact that we found artifacts on Mars that propelled us hundreds of years ahead technologically. There is also intense trade with advanced races. All in all, we should expect these people to be many centuries ahead of where we might be if we just developed technologically in isolation, without convenient artifacts on Mars.

Hundreds of years of technology is a lot. A hundred years ago, there were horses in New York City.

So what can we learn?

  • They have not yet invented the smartphone: you need to physically walk to a console to check your emails. Thankfully, consoles that look more or less like a modern-day PC are everywhere.
  • The main character has an implant that allows her or him to have augmented reality functions. She or he can see conduits behind walls, or point at an object and get an identification. Oddly, it seems that very few people have such implants. Technicians do not appear to be so equipped. So maybe it is very expensive?
  • Many people use fancy gauntlets as a user interface to access computers. It looks like a holographic keyboard on your arm. If you can project holographic interfaces, why would you project them on your arm specifically? Note that everyone, not just you, can see the holographic display, so it is not augmented reality.
  • The game is incredibly techno-optimistic regarding space travel. Human beings that are, as far as we can tell, nearly indistinguishable from us, make it safe and sound to another galaxy.
  • Our characters are constantly exposed to near-lethal levels of radiation, but they do not seem to worry about things like… cancer? So we can realistically think that radiation-related cancer has been cured somehow? Maybe there has been some genetic engineering of some kind?
  • For historical reasons, artificial intelligence (AI) is generally regarded with suspicion. There are automated AI-like user interfaces in the form of holograms, but they appear to have the intelligence of a Commodore 64. One program has a level of sentience, but it must be large or require a lot of energy, because it is stuck in a ship, occupying a very large room. That seems unrealistic. We have no reason to believe that the powerful computers of the future will be especially large.

    I find it very odd that people who develop video games can’t imagine the future of computing as more or less what we have today.

  • There are no robots. Or rather, the robots are part of exotic technologies. Or they live independently.

    Yeah. Right. We have, today, colonized Mars with robots. There is no way that the future will have fewer robots. There is a back story in Mass Effect to explain why they do not have sentient robots, but they don’t even have robots to clean the floors like we do in 2017!

  • Programmers centuries from now still debug code using line numbers. Scary thought.
  • Though we meet many people who survived major battles, nobody seems to be equipped with artificial organs or limbs. You would think we would see people with artificial legs or arms. Or maybe they are just very good at regrowing limbs and organs?
  • Starships rely on human pilots.
  • One medical doctor complains about arthritis and being unable to work on the battlefield due to his age. He hints that he recently reached 40 years of age and does not appear particularly frail. Other characters have noticeably old skin. Are they in short supply of anti-wrinkle cream? Yet we never meet people who are frail due to aging. So if our 40-something doctor complains of arthritis and of not being much good on the battlefield, it might be that he is simply too scared. That is, I could imagine that long after we have cured arthritis, people of a certain age will use it as an excuse to get out of some tasks. Also, characters could let their skin go to hell as a stylistic statement: looking older could be advantageous in some leadership positions. The game features people with advanced “biotic” powers that allow them, for example, to jump at an enemy from a distance. Clearly, these people benefited from some form of genetic engineering or the equivalent.

What did I think of the game? I like it. The game itself is nothing special. It is a run-of-the-mill RPG with solid shooter mechanics. Basically, it is yet another Mass Effect game. But it is really pretty! You get to drive a 6-wheel monster… shoot at things… it is great fun if you are patient enough.

Science and Technology links (March 24, 2017)

There are many claims that innovation is slowing down. In the 20th century, we went from horses to planes. What have we done lately? We have not cured cancer or old age. We did get the iPhone. There is that. But so what? There are many claims that Moore’s law, the observation that processors get twice as good every two years or so, is faltering if not failing entirely. Then there is Eroom’s law, the observation that new medical drugs are getting exponentially more expensive. I don’t think that anyone questions the fact that we are still on an exponential curve… but it matters whether we make progress at a rate of 1% a year or 5% a year. So what might be happening? Why would we slow down? Some believe that all of the low-hanging fruit has been picked: we invented the airplane, the car, and the iPhone, and that was easy, but whatever remains is too hard. There is also the theory that as we do more research, we start duplicating our efforts in vain. Knott looked at the data and found something else:

One thought is that if R&D has truly gotten harder, it should have gotten harder for everyone. (…) That’s not what I found when I examined 40 years of financial data for all publicly traded U.S. firms. I found instead that maximum RQ [R&D productivity] was actually increasing over time! (…) I restricted attention to a particular sector, e.g., manufacturing or services. I found that maximum RQ was increasing within sectors as well. I then looked at coarse definitions of industry, such as Measuring Equipment (Standard Industrial Classification 38), then successively more narrow definitions, such as Surgical, Medical, And Dental Instruments (SIC 384), then Dental Equipment (SIC 3843). What I found was that as I looked more narrowly, maximum RQ did decrease over time (…) What the pattern suggests is that while opportunities within industries decline over time, as they do, companies respond by creating new industries with greater technological opportunity.

The way I understand this finding is that once an industry reaches maturity, further optimizations will provide diminishing returns… until someone finds a different take on the problem and invents a new industry.

With time, animals accumulate senescent cells. These are cells that should die (by apoptosis) but somehow stick around. This happens very rarely, so no matter how old you are, you have very few senescent cells, to the point where a biologist would have a hard time finding them. But they cause trouble, a lot of trouble it seems. They might be responsible for a sizeable fraction of age-related health conditions. Senolytics are agents that help remove senescent cells from your tissues. There is a natural product (quercetin), found in apples and in health stores, that is a mild senolytic. (I do not recommend you take quercetin, though eating apples is fine.) A few years ago, I had not heard about senolytics. Judging by the Wikipedia page, the idea emerged around 2013. A quick search in Google Scholar seems to confirm that 2013 is roughly accurate. (Update: Josh Mitteldorf credits work by Jan van Deursen of Mayo Clinic dating back to 2011.) You may want to remember this term. Anyhow, the BBC reported on a recent trial in mice:

They have rejuvenated old mice to restore their stamina, coat of fur and even some organ function. The findings, published in the journal Cell, showed liver function was easily restored and the animals doubled the distance they would run in a wheel. Dr de Keizer said: “We weren’t planning to look at their hair, but it was too obvious to miss.” “In terms of mouse work we are pretty much done, we could look at specific age-related diseases eg osteoporosis, but we should now prepare for clinical translation.”

At this point, the evidence is very strong that removing senescent cells is both practical and beneficial. It seems very likely that, in the near future, older people will be healthier through senolytics. However, details matter. For example, senescent cells help your skin heal, so removing all of your senescent cells all the time would not be a good thing. Moreover, senolytics are likely slightly toxic, after all, they get some of your cells to die, so you would not want to overdose. You probably just want to keep the level of senescent cells low through periodic “cleansing”. How to best achieve this result is a matter of research.

Are professors going to move to YouTube and make a living there? Some are doing it now. Professor Steve Keen has gone to YouTube to ask people to fund his research. Professor Jordan Peterson claims that he makes something like 10k$ a month through donations to support his YouTube channel. I am not exactly sure who supports these people and what it all means.

We are inserting synthetic cartilage in people with arthritis.

It seems that the sugar industry paid scientists to dismiss the health concerns regarding sugar:

The article draws on internal documents to show that an industry group called the Sugar Research Foundation wanted to “refute” concerns about sugar’s possible role in heart disease. The SRF then sponsored research by Harvard scientists that did just that. The result was published in the New England Journal of Medicine in 1967, with no disclosure of the sugar industry funding.

I think we should all be aware that sugar in large quantities puts you at risk of obesity, heart disease, and diabetes. True dark chocolate is probably fine, however.

It seems that when it comes to fitness, high-intensity exercise (interval training) works really well, no matter your age: it improves muscle mitochondrial function and hypertrophy at all ages. [Translation: you have more energy (mitochondrial function) and larger muscles (hypertrophy).] So the treadmill and the long walks? They may help a bit, but if you want to get in shape, you had better crank up the intensity.

John P. A. Ioannidis has made a name for himself by criticizing modern-day science. His latest paper is Meta-assessment of bias in science, and the gist of it is:

we consistently observed that small, early, highly-cited studies published in peer-reviewed journals were likely to overestimate effects.

What does it mean in concrete terms? Whenever you hear about a breakthrough for the first time, take it with a grain of salt. Wait for the results to be confirmed independently. Also, we may consider established researchers as more reliable, as per the paper’s results.

Viagra not only helps with erectile dysfunction, it seems that it keeps heart disease at bay too. But Viagra is out of patent at this point, so pharmaceutical companies are unlikely to shell out millions to market it for other uses. Maybe the government or academics should do this kind of research?

When thinking about computer performance, we often think of the microprocessor. However, storage and memory are often just as important as processors for performance. The latest boost we got was solid-state drives (SSDs), and what a difference they make! Intel is now commercializing what might be the start of a new breakthrough (3D XPoint). Like a disk, the 3D XPoint memory is persistent, but it has latency closer to that of internal memory. Also, unlike our solid-state drives, this memory is byte addressable: you can modify individual bytes without having to rewrite entire pages of memory. In effect, Intel is blurring the distinction between storage and memory. For less than 2k$, you can now get a fancy drive with hundreds of gigabytes that works a bit like internal memory. The long-term picture is that we may get more and more computers with persistent memory that has nearly the performance of our current volatile memory, but without the need to be powered all the time. This would allow our computers to have a lot more memory. Of course, for this to happen, we need more than just 3D XPoint, but chances are good that competitors are hard at work building new types of persistent memory.

Leonardo da Vinci once invented a “self-supporting bridge”. Basically, given a few straight planks, you can quickly build a strong bridge without nails or rope. You just assemble the planks and you are done. It is quite impressive: I would really like to know how da Vinci’s mind worked. Whether it was ever practical, I do not know. But I came across a cute video of a dad and his son building it.

We have been told repeatedly that the sun was bad for us. Lindqvist et al. in Avoidance of sun exposure as a risk factor for major causes of death find contrary evidence. If you are to believe their results, it is true that if you spend more time in the sun, you are more likely to die of cancer. However, this is because you are less likely to die of other causes:

Women with active sun exposure habits were mainly at a lower risk of cardiovascular disease (CVD) and noncancer/non-CVD death as compared to those who avoided sun exposure. As a result of their increased survival, the relative contribution of cancer death increased in these women. Nonsmokers who avoided sun exposure had a life expectancy similar to smokers in the highest sun exposure group, indicating that avoidance of sun exposure is a risk factor for death of a similar magnitude as smoking. Compared to the highest sun exposure group, life expectancy of avoiders of sun exposure was reduced by 0.6–2.1 years.

Sun exposure is good for your health and makes you live longer. No, we do not know why.

Our DNA carries the genetic code that makes us what we are. Our cells use DNA as a set of recipes to make useful proteins. We know that as we age, our DNA does not change a lot. We know because if we take elderly identical twins, their genetic code is very similar. So the body is quite careful not to let our genes get corrupted. Random mutations do occur, but a single cell being defective is hardly cause for concern. Maybe you are not impressed to learn that your cells preserve their genetic code very accurately, but you should be. Each day, over 50 billion of your cells die through apoptosis and must be replaced. You need 2 million new red blood cells per second alone. Anyhow, DNA is not the only potential source of trouble. DNA is not used directly to make the proteins; our cells use RNA instead. So there is a whole complicated process to get from DNA to protein, and even if your DNA is intact, the produced protein could still be bad. A Korean team recently showed that something called “nonsense-mediated mRNA decay” (NMD), a quality-control process for RNA, could extend or shorten the lifespan of worms if it is tweaked. Thus, even if you have good genes, it is possible that your cells could start making junk instead of useful proteins as you grow older.

Our bodies are built and repaired by our stem cells. Though we have much to learn, we know that injecting stem cells into a damaged tissue may help make it healthier. In the future, it is conceivable that we may regenerate entire organs in vivo (in your body) by stem cell injections. But we need to produce the stem cells first. The latest trend in medicine is “autologous stem cell transplantation”. What this means is that we take your own cells, modify them as needed, and then reinject them as appropriate stem cells where they may help. This is simpler for obvious reasons than using donated stem cells. For one thing, these are your own cells, so they are not likely to be rejected as foreign. But a lot of sick people are quite old. Are the stem cells from old people still good enough? In Regenerative capacity of autologous stem cell transplantation in elderly, Gonzalez-Garza and Cruz-Vega tell us that it is indeed the case: stem cells from elderly donors are capable of self-renewal and differentiation in vitro. That’s true even though the gene expression of the stem cells taken from elderly donors differs from that of younger donors.

In a Nature article, researchers report being able to cause a tooth to regrow using stem cells. The subject (a dog) saw a whole new tooth grow and become fully functional. If this works, then we might soon be able to do the same in human beings. Can you imagine regrowing a whole new tooth as an adult? It seems that we can do it.

Back in 2010, researchers set up the ImageNet challenge. The idea was to take a large collection of images and to ask a computer what was in each image. For the first few years, the computers were far worse than human beings. Then they got better and better. And better. Today, machines have long surpassed human beings, to the point of making the challenge less relevant, in the same way Deep Blue defeating Kasparov made computer Chess programs less exciting. It seems the competition is closing down with a last workshop: “The workshop will mark the last of the ImageNet Challenge competitions, and focus on unanswered questions and directions for the future.” I don’t think that the researchers imagined, back in 2010, that the competition would be so quickly defeated. Predicting the future is hard.

A new company, Egenesis, wants to build genetically modified pigs that can be used as organ donors for human beings. George Church from Harvard is behind the company.

Intel, the company that makes the microprocessors in your PCs, is creating an Artificial Intelligence group. Artificial Intelligence is quickly reaching peak hype. Greg Linden, reacting to “TensorFlow [an AI library] is the new foundation for Computer Science”: “No. No, it’s not.”

Currently, if you want to stop an incoming rocket or a drone, you have to use a missile of your own. That’s expensive. It looks like Lockheed Martin has a laser powerful enough to stop a rocket or a drone. In time, this should be much more cost effective. Want to protect an airport from rockets and drones? Deploy lasers around it. Next question: can you build drones that are impervious to lasers?

Andy Pavlo is a computer science professor who tries to have real-world impact. How do you do such a thing? From his blog:

(…) the best way to have the most impact (…) is to build a system that solves real-world problems for people.

Andy is right, of course, but what is amazing is that this should even be a question in the first place. Simply put: it is really hard to have an impact on the world by writing academic papers. Very hard.

There is a new planet in our solar system: “it is almost 10 times heavier than the Earth”. Maybe.

Western Digital sells 14TB disks, with helium. This is huge.

Netflix is moving from a rating system based on 5 stars to a thumbs-up, thumbs-down model.

In a New Scientist article, we learn about new research regarding the “rejuvenation of old blood”. It is believed that older people have too much of some factors in their blood, and that simply normalizing these levels would have a rejuvenating effect. But, of course, it may also be the case that old blood is missing some “youthful factors” and that tissues other than the blood, such as the bone marrow, need them. This new research supports this view:

When Geiger’s team examined the bone marrow of mice, they found that older animals have much lower levels of a protein called osteopontin. To see if this protein has an effect on blood stem cells, the team injected stem cells into mice that lacked osteopontin and found that the cells rapidly aged.

But when older stem cells were mixed in a dish with osteopontin and a protein that activates it, they began to produce white blood cells just as young stem cells do. This suggests osteopontin makes stem cells behave more youthfully (EMBO Journal, doi.org/b4jp). “If we can translate this into a treatment, we can make old blood young again,” Geiger says.

Tech people often aggregate in specific locations, such as Silicon Valley, where there are jobs, good universities, great experts and a lot of capital. This leads to a rising cost of living and high real estate prices. Meanwhile, you can buy houses for next to nothing if you go elsewhere. It seems that the price differential keeps on rising. Will it go on forever? Tyler Cowen says that it won’t. He blames the high real estate prices on the fact that technology disproportionately benefits specific individuals. However, he says, technology invariably starts to benefit a wider share of the population, and when it does, real estate prices tend toward a fairer equilibrium.

Does software performance still matter?

This morning, a reader asked me about the real-world relevance of software performance:

I’m quite interested in your work on improving algorithm performance using techniques related to computer architecture. However, I think that this may only be of interest to academia. Do you think that there are jobs opportunities related with this profile, which is very specialized?

To paraphrase this reader, computers and software are fast enough. We may need people to implement new ideas, but performance is not important. And more critically, if you want to be gainfully employed, you do not need to worry about software performance.

To assess this question, we should first have a common understanding of what software performance is. Software performance is not about how quickly you can crunch numbers. It is how you manage memory, disks, networks, cores… it is also about architecture. It is not about rewriting your code in machine code: you can write fast applications in JavaScript and slow ones in C++. Software performance is related to algorithmic design, but distinct in one important way: you need to take into account your architecture. Many algorithms that look good on paper do really poorly in practice. And algorithms that appear naive and limited can sometimes be the best possible choice for performance. In some sense, being able to manage software performance requires you to have a good understanding of how computer hardware, operating systems, and runtime libraries work.

So, should we care about software performance in 2017?

Here is my short answer. There are two basic ways in which we can assess you as a programmer. Is your software correct? Is your software efficient? There are certainly other ways a programmer can bring value: some exploit their knowledge of some business domains, others will design marvelous user interfaces. However, when it comes down to hardcore programming, being correct and being efficient are the two main attributes.

Consider the great programmers. They are all good at producing software that is both correct and efficient. In fact, it is basically the definition of a great programmer. Programming is only challenging when you must be both correct and efficient. If you are allowed to sacrifice one or the other, you can trivialize most tasks.

In job ads, you probably won’t see many requests for programmers who write efficient code, nor are you going to see many requests for programmers who write correct code. But then, you do not see many ads for doctors who cure people, nor do you see many ads for lawyers who avoid expensive lawsuits. Producing efficient code that takes into account the system’s architecture is generally part of your job as a programmer.

Some of my thoughts in details:

  • Software performance only matters for a very small fraction of all source code. But what matters is the absolute value of this code, not its relative size.

    Software performance is likely to be irrelevant if you have few users and little data. The more important the software is, the more important its performance can become.

    Given that over 90% of all software we write is rarely if ever used for real work, it is a safe bet to say that software performance is often irrelevant, but that’s only because, in these cases, the software brings little value.

    Let us make the statement precise: Most performance or memory optimizations are useless.

    That’s not a myth, it is actually true.

    The bulk of the software that gets written is not performance sensitive or worth the effort. Pareto’s law would tell you that 20% of the code accounts for 80% of the running time, but I think it is much worse than this. I think that 1% of the code accounts for 99% of the running time… The truth is maybe even more extreme.

    So a tiny fraction of all code will ever matter for performance, and only a small fraction of it brings business value.

    But what matters, if you are an employee, is how much value your optimized code brings to the business, not what fraction of the code you touch.

  • We can quantify the value of software performance and it is quite high.

    Suppose I go on Apple’s website to shop for a new MacBook Pro. The basic one is worth $1,800. If I want a processor with a 10% faster clock speed, it is going to cost me $2,100, or about 17% more. An extra 10% in the clock speed does not make the machine nearly 10% faster. Let us say that it is maybe 5% faster. So to get a computer that runs 5% faster (if that), some people are willing to pay 17% more. I could do the same analysis with smartphones.

    If constant factors related to performance did not matter, then a computer running at twice the speed would be worth the same. In practice, a computer running at twice the speed is worth multiple times the money.

    With cloud computing, companies are now often billed for the resources (memory, compute time) that they use. Conveniently, this allows them to measure (in dollars) the benefits of a given optimization. We can find the stories of small companies that save hundreds of thousands of dollars with some elementary optimization.

    We could also look at web browsers. For a long time, Microsoft had the lead with Internet Explorer. In many key markets, Google Chrome now dominates. There are many reasons for people to prefer Google Chrome, but speed is a key component. To test out my theory, I searched Google for guides to help me choose between Chrome and Internet Explorer, and the first recommendation I found was this:

    Chrome is best for speed – arguably, a web browser’s most crucial feature is its ability to quickly load up web pages. We put both Chrome and Internet Explorer 11 through a series of benchmark tests using Sunspider, Octave and HTML 5 test. In every event, Google’s Chrome was the clear winner.

    So yes, performance is worth a lot to some users.

  • Adding more hardware does not magically make performance issues disappear. It requires engineering to use more hardware.

    People object that we can always throw more machines, more cores at a problem if it is slow. However, even when Amdahl’s law does not limit you, you still have to contend with the fact that it can be hard to scale up your software to run well on many machines. (A small sketch of Amdahl’s law appears after this list.) Throwing more hardware at a problem is just a particular way to boost software performance. It is not necessarily an inexpensive approach.

    It should be said that nobody ever gets an asymptotically large number of processors in the real world. Moreover, when you do get many processors, coordination issues can make it difficult (even in principle) to use a very large number of processors on the same problem.

    What about our practice? What the past decades have taught us is that parallelizing problems is hard work. You end up with more complex code and non-trivial overhead. Testing and debugging get a lot more difficult. With many problems, you are lucky if you manage to double the performance with three cores. And if you want to double the performance again, you might need sixteen cores.

    This means that doubling the performance of your single-threaded code can be highly valuable. In other words, tuning hot code can be worth a lot… And adding more hardware does not make the performance problems go away magically, using this hardware requires extra work.

  • We use higher-level programming languages, but an incredible amount of engineering is invested in recovering the traded-away performance.

    Today’s most popular programming language is JavaScript, a relatively slow programming language. Isn’t that a sign that performance is irrelevant? The performance of JavaScript was multiplied over the years through vast engineering investments. Moreover, we are moving forward with high-performance web programming techniques like Web Assembly (see video presentation). If performance did not matter, these initiatives would fall flat.

    It is true that, over time, people migrate to high-level languages. It is a good thing. These languages often trade performance for convenience, safety or simplicity.

    But the performance of JavaScript in the browser has been improved by two orders of magnitude in the last fifteen years. By some estimates, JavaScript is only about ten times slower than C++.

    I would argue that a strong component in the popularity of JavaScript is precisely its good performance. If JavaScript was still 1000 times slower than C++ at most tasks, it would not have the wide adoption we find today.

    Last year, a colleague faced a performance issue where simulations would run forever. When I asked what the software was written in… she admitted with shame that it was written in Python. Maybe to her surprise, I was not at all dismissive. I’d be depressed if, in 20 years, most of us were still programming in C, C++, and Java.

    One of the things you can buy with better performance is more productivity.

  • Computers are asked to do more with less, and there is a never ending demand for better performance.

    Software performance has been regularly dismissed as irrelevant. That’s understandable under Moore’s law: processors get faster, we get faster disks… who cares if the software is slow? It will soon get faster. Let us focus on writing nice code with nice algorithms, and we can ignore the rest.

    It is true that if you manage to run Windows 3.1 on a recently purchased PC, it will be ridiculously fast. In fact, I bet you could run Windows 3.1 in your browser and make it fast.

    It is true that some of the hardware progress reduces the pressure to produce very fast code… To beat the best chess players in the 1990s, one probably needed the equivalent of hand-tuned assembly code whereas I am sure I can write a good chess player in sloppy JavaScript and get good enough performance to beat most human players, if not the masters.

    But computers are asked to do more with less. It was very impressive in 1990 to write a Chess program that could beat the best Chess players… but it would simply not be a great business to get into today. You’d need to write a program that plays Go, and it is a lot harder.

    Sure, smartphone hardware gets faster all the time… but there is pressure to run the same software on smaller and cheaper machines (such as watches). The mouse and the keyboard will look quaint in a few decades, having been replaced by more computationally expensive interfaces (speech, augmented reality…).

    And yes, soon, we will need devices the size of a smartwatch to be capable of autonomous advanced artificial intelligence. Think your sloppy unoptimized code will cut it?

  • We do want our processors to be idle most of the time. The fact that they are is not an indication that we could be sloppier.

    Aren’t most of our processors idle most of the time?

    True, but it is like saying that it is silly to own a fast car because it is going to spend most of its time parked.

    We have enormous overcapacity in computing, by design. Doing otherwise would be like going to a supermarket where all of the lines for the cashiers are always packed full, all day long.

    We have all experienced what it is to use a laptop that has its CPU running at 100%. The laptop becomes sluggish, unresponsive. All your requests are queued. It is unpleasant.

    Servers that are running at full capacity have to drop requests or make you wait. We hate it.

    A mobile phone stressed to its full capacity becomes hot and burns through its battery in no time.

    So we want our processors to be cold and to remain cold. Reducing their usage is good, even when they were already running cold most of the time. Fast code gets back to the users faster and burns less energy.

    I imagine that the future will be made of dark silicon. We will have lots of computing power, lots of circuits, but most of them will be unpowered. I would not be surprised if we were to soon start rating software based on how much power it uses.
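
Here is the promised sketch of Amdahl’s law, in JavaScript. It is a minimal illustration: the 75% parallel fraction is an assumption chosen for the example, not a measurement.

// Amdahl's law: if a fraction p of the running time benefits from
// parallelization, the best possible speedup on n cores is
// 1 / ((1 - p) + p / n).
function amdahlSpeedup(p, n) {
  return 1 / ((1 - p) + p / n);
}

console.log(amdahlSpeedup(0.75, 3).toFixed(2));   // 2.00: three cores double the speed
console.log(amdahlSpeedup(0.75, 16).toFixed(2));  // 3.37: sixteen cores do not double it again
console.log(amdahlSpeedup(0.75, 1e9).toFixed(2)); // 4.00: the ceiling, no matter how many cores

Notice how the returns diminish: even before any coordination overhead, the serial fraction caps the benefit of extra hardware, which is why tuning the hot single-threaded path remains so valuable.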

Science and Technology links (March 17, 2017)

We live in a world where the most powerful companies have super smart people working on emulating human intelligence in machines. Yann LeCun, a Frenchman who directs Facebook’s artificial intelligence research group, tells us that, soon, computers could acquire what he calls “common sense”. I can’t resist pointing out that this would put machines ahead of most human beings. Kidding aside, it seems that the goal is to produce software that can act as if it understood the world we live in. There is a ball up in the air? It is going to come down. Someone dressed in dark clothes is waiting in an alley? He might be waiting to mug someone. Instead of having to train machines for specific tasks, and having them perform well only on these specific tasks, the goal would be to start with software that can reason about the world, and then build on that. Want to build an automated vacuum cleaner? Want it to stop whenever someone enters the room? Maybe in the future, software will be smart enough to understand what “someone enters the room” means without months of tweaking and training. Given that software can be copied inexpensively, if one company succeeds at building such a general-purpose engine, it could quickly apply it to a wide range of problems, creating specialist engines like my automated vacuum cleaner. In 2017, that’s still science fiction, but people like LeCun are not clowns: they got tangible results in the past. People use their ideas to solve real problems, in the real world. So who knows?

In Myths that will not die, Megan Scudellari reviews a few scientific myths that you may believe:

  • Screening saves lives for all types of cancer. The truth is that cancer screening often has a negative effect. The story usually goes that if they catch cancer early, your chances of survival go up. The problem is that many cancers will kill you no matter what; knowing about them early just makes you miserable longer. And lots of cancers would not have killed you anyhow, so you are just going to go through stress and painful therapies for no good reason.
  • Antioxidants are good. You should eat your fruits and vegetables. Take your vitamins. They are full of antioxidants. And so on. The idea underneath is that your body is like an automobile and it is “oxidizing” (rusting). You can actually see this process: gray hair is a form of oxidation. This was one of the flawed theories of why we age: oxidation kills us. Take antioxidants and you will prevent this process. Not so fast! Your body has pretty good antioxidants. Taking more may not help, it may even cause some slight harm. If anything, free radicals, in moderation, might be good for you.
  • Human beings have exceptionally large brains. We are pretty much in line with other primates. However, our brains are configured somewhat differently.
  • Individuals learn best when taught in their learning style. This is pure junk. There is no such thing as a “visual learner”. That people keep insisting there must be, despite all the scientific evidence against it… is really puzzling.
  • The human population is growing exponentially and we are doomed. The richer a country is, the slower its population grows. Advanced countries like Germany and Japan are in population decline. Because the world is getting richer, all continents but Africa will reach a population plateau within a few decades. Even Africa will catch up. Meanwhile, we produce, today, enough food to feed 10 or 12 billion people. So yes, there will be more human beings alive in 50 years, but there may actually be fewer in 100 years. And having more people means having more engineers and scientists, more entrepreneurs, more astronauts.

    By the way, by mere extrapolation, it is quite certain, I think, that there won’t be anything that we would recognize as “human” in a couple of centuries… if technology keeps getting better. Any long-term doomsday scenario that ignores technology as a factor is just ridiculous.

The diseases of old age are tremendously expensive. Dementia, the class of diseases affecting cognition, is particularly cruel and expensive:

The worldwide costs of dementia were estimated at United States (US) $818 billion in 2015, an increase of 35% since 2010 (…) The threshold of US $1 trillion will be crossed by 2018.

It is hard to wrap our heads around how much of an expense that is. Let me try. If we did not have to pay for dementia, we could afford five million well-paid researchers, engineers, and scientists at 200k$ a year each. Care to wonder what 5 million of them could achieve every single year? I don’t know if we will have cures for dementia in 10 or 20 years, but if we do, we will be immensely richer.

Regarding Parkinson’s… the terrible disease that makes people shake uncontrollably and lose the ability to speak normally… Ok. So right now, there is nothing we can do to reverse or slow this disease; at best we can help control the damage. We need to do better. Almost every day, there is a grand proclamation that a cure is around the corner, but it always disappoints. The latest one I have seen is nilotinib. It is a drug used to treat some cancers. It turns out that it has another great benefit: it boosts autophagy. Autophagy is the process by which your body “eats itself”. I know it sounds bad, but that’s actually a great thing if you are a multicellular organism. The idea is that repair is too hard, so what the body does is eat the broken stuff and then use the byproducts to rebuild anew. If you could crank up autophagy, you would assuredly be healthier. So, they think that in cases of Parkinson’s, the body faces an accumulation of garbage so bad that it leads to cell death. If you could convince the body to eat up the garbage, you might be able to stop the progression of the disease. It would not, by itself, reverse Parkinson’s, but once the damage stops worsening, the body can be given a chance to route around the damage, and we could start thinking about damage-repair therapies. As it stands, with damage accumulating at a high rate, there is little we can do until we slow it down. Anyhow, in mice designed to suffer from Parkinson’s, nilotinib stops the disease. There are now large clinical trials to see if it works in human beings. Could a cure for Parkinson’s be around the corner?

Technology is very much about globalization. Many people are concerned, for example, with the fact that the United States is running trade deficits year after year. To put it in simple terms, Americans buy many goods from China, but China buys comparatively fewer goods from the USA. Could this go on forever? Economist Scott Sumner explains that it can:

A common mistake made by famous economists is to confuse statistical measures (…) with the theoretical concept (…). In terms of pure economic theory, the US sale of an LA house to a Chinese investor is just as much an “export” as the sale of a mobile home that is actually shipped overseas. But one is counted as an export and one is not.

It is a common problem in science that we measure something that differs from our model. The difference appears irrelevant until it is not.

Ever since I was a teenager, I have been told that cholesterol causes heart disease. Statins, a family of drugs that drastically lower cholesterol, are making pharmaceutical companies very rich. A controversial study published last year suggests that lowering cholesterol might be in vain:

What we found in our detailed systematic review was that older people with high LDL (low-density lipoprotein) levels, the so-called “bad” cholesterol, lived longer and had less heart disease.

Lowering cholesterol with medications for primary cardiovascular prevention in those aged over 60 is a total waste of time and resources, whereas altering your lifestyle is the single most important way to achieve a good quality of life

This does not mean that statins have no benefits. They seem to work for men who have had a prior cardiovascular incident. However, do they work because they lower cholesterol? This being said, taking low-dose aspirin also has some benefits regarding cardiovascular health, and it is much cheaper. Both aspirin and statins have side effects, of course.

Going to space is bad for your health. It seems to cause some form of accelerated aging. The effect remains even if you keep astronauts active (through special exercises) and if you shield them as best as possible from radiation. (Spending a month in space does not expose you to more radiation than what flight crews experience in a lifetime, yet pilots do not suffer from accelerated aging.) There are many unknowns, in part because there are very few astronauts. We know that all of them are closely monitored for health problems. Surprisingly, up until now, retired astronauts did not receive government-paid health care. Given that they are effectively guinea pigs and living science experiments, I would have thought that they were covered. This is about to change.

IBM claims that it is reaching human-like speech recognition levels on some very difficult tests.

Last year, IBM announced a major milestone in conversational speech recognition: a system that achieved a 6.9 percent word error rate. Since then, we have continued to push the boundaries of speech recognition, and today we’ve reached a new industry record of 5.5 percent.

It seems that human parity would be 5.1 percent, so we are really close. Maybe next year? I tend to be skeptical of such announcements, whether they come from IBM, Microsoft or Google. I much prefer to assess deployed technologies, and it is clear that neither Google nor Apple is at human-level speech recognition. So let us take this with a grain of salt, for now.

I take most academic research with a large grain of salt. My experience is that, too often, I can’t reproduce the work of my peers. Sometimes they flat out lie about their methods, but most often, they were just careless. I am sure my own work has problems, but I have worked hard to ensure that all my papers are backed by freely available code and data. This way, at least, it is often easy to check my work. We already know that most psychology studies cannot be replicated. What about cancer research? Cancer research is the best-funded research, period. If Computer Science is a house in the suburbs, cancer research is the Trump tower. If anything should be top notch, it is cancer research. Yet it does not look good:

Glenn Begley and Lee Ellis from Amgen said that the firm could only confirm the findings in 6 out of 53 landmark cancer papers—just 11 percent. Perhaps, they wrote, that might explain why “our ability to translate cancer research to clinical success has been remarkably low.”

If you asked 20 different labs to replicate a paper, you’d end up with 10 different methodologies that aren’t really comparable. (…) If people had deposited raw data and full protocols at the time of publication, we wouldn’t have to go back to the original authors (…)

Simply put, we are paying for research papers, and so we get them, but little else. To turn things around, scientists need to have skin in the game. If their work does not lead to a cure, and it won’t if it is not reproducible, they need to pay a price.

On the topic of cancer, some researchers claim that they have figured out how to block the spread of cancer by disabling Death Receptor 6 (in mice, at least). Who comes up with these names? Anyhow, cancer is mostly a dangerous disease because it spreads. If tumors remained where they start, they would either not cause much harm, or be easy to manage (or remove). Sadly, many cancers tend to spread uncontrollably through our bloodstream. It seems that cancer punches through the walls of blood vessels and moves on from there. Speculatively, if we could disable this ability, we could make cancer a much less dangerous disease. Maybe. But can we reproduce the Death Receptor 6 experiments?

The lack of reproducibility also plagues gerontology and developmental biology. Researchers have been studying the lifespan of short-lived worms for decades. At a high level, this is compelling research. Take a compound, bathe worms in it, and see whether they live shorter or longer. If biology had maximized lifespans, then it would be very hard to find compounds that prolong life, but it is not so hard in practice. What is hard is to get the same compound to extend the life of worms robustly across different species of worms and in a way that competing labs can reproduce. Lucanic et al. showed that one compound (and only one in their test) robustly extended the lifespan of a wide variety of worms: Thioflavin T. What is this magical compound and can you order it online? It is a fluorogenic dye that is routinely used in medical research. Don’t drink the stuff just yet. On a related note, methylene blue, another dye, has also been reported to have great health properties. Here is a vision for you: the world is filled with ageless centenarians, but they are all dyed yellow, blue or green.

So maybe you want to remain healthy for a long time with something less colorful and easier to find. What about baking soda? Scientists who want to be taken seriously call the stuff NaHCO3. Wesson, in Is NaHCO3 an antiaging elixir?, tells us that it may be good for our health:

Emerging evidence supports that the largely acid-producing diets of developed societies induce a pathological tissue milieu that contributes to premature death, mediated in part through vascular injury, including vascular calcification.

Speaking for myself, before I start consuming large quantities of baking soda, I’ll expect scientists to make a very convincing case. The stuff does not taste very good.

To remain youthful longer, maybe you do not need to do anything at all. Stenzaltz tells us that we are aging slower as time passes, in the following sense:

the generation of American men who fought in World War II and are now in their 90s lived, on average, about eight years longer than their great-grandfathers who fought in the Civil War, once they reached adulthood.
(…)
According to the 1985 United States Health Interview Survey 23 percent of Americans aged 50 to 64 reported limitations on daily activities due to chronic illness; in 2014, this was down to 16 percent
(…)
Perhaps most striking, a new study has discovered that over the past two decades the incidence of new dementia cases has dropped by 20 percent. Men in the United Kingdom develop dementia today at the same rate as men five years younger in the 1990s (…)

It is fairly easy to see why people who fought in WW II would be healthier than previous generations: they had access to inexpensive antibiotics. Antibiotics are our first line of defense against infectious diseases. Yet it is much less clear why people in their fifties would be healthier today than they would have been in the 1980s. Why would dementia rates fall? It demands an explanation, and I see none being offered.

DNA is used by biology as an information storage mechanism, not unlike our digital storage devices. It seems that we could back up the entire Internet on a gram of DNA. The real challenge, of course, would be to access the information quickly. Biology is often not in a hurry. Google users are. Pangloss has a nice review of the technology where he hammers the point that it is just too expensive to write data to DNA:

Although this is useful research, the fact remains that DNA data storage requires a reduction in relative synthesis cost of at least 6 orders of magnitude over the next decade to be competitive with conventional media, and that currently the relative write cost is increasing, not decreasing.

There is an emerging consensus that as we age, our blood acquires a high level of harmful factors. These factors may not be harmful by themselves, but having too much of them for a long time could be detrimental to our bodies. We are seeing the first glimpses of technology to block or remove the bad stuff from old blood:

Yousef has found that the amount of a protein called VCAM1 in the blood increases with age. In people over the age of 65, the levels of this protein are 30 per cent higher than in under-25s.

To test the effect of VCAM1, Yousef injected young mice with blood plasma taken from older mice. Sure enough, they showed signs of aging (…) Blood plasma from old people had the same effect on mice. (…)

These effects were prevented when Yousef injected a compound that blocks VCAM1. When the mice were given this antibody before or at the same time as old blood, they were protected from its harmful effects.

So we can envision therapies that “rejuvenate” our blood. And no, it won’t involve drinking the blood of the young. Rather, you could imagine a device that would scan your blood, determine that you have too much of protein X, and decide to remove some of it to bring it back to a youthful level. Still science fiction, but I bet that we shall be trying it out on mice soon.

A nasty problem that affects some of us is retinal degeneration. Basically, you start to have a hard time seeing, you go to your doctor, and he tells you that your retina is going to shred, and it is only going to get worse. “Go buy a white cane now.” That would not be a good day. If your retina degenerates on its own, it is not like glasses will help you. But Yu et al. showed that you could use a nifty new gene therapy to prevent retinal degeneration. Basically, they reprogram the cells, in vivo, and all is well. It works in mice.

On the topic of the retina, three ladies went to a clinic in Florida for macular degeneration, a form of age-related retinal degeneration. They paid $5000 for an experimental stem-cell therapy and came out blind. The experimental therapy is not the problem per se; one has to try new things from time to time. If you are going blind and someone suggests you try something that might restore your eyesight… wouldn’t you want to try it? At some point, you have little to lose and much to gain. Moreover, some people might be motivated by the fact that they help advance science. The real problems arise because the therapy was not administered in a sensible fashion. For example, they treated both eyes at the same time. That’s obviously unnecessarily dangerous. They also charged the patients for something that is an experiment. Again, that’s less than ideal. For unproven therapies, you’d really prefer not to charge your guinea pigs for the experiment. Stem cells have great potential in medicine, but we are not yet at the point where we can do in vivo regeneration of organs. At least, we can’t do it safely.

So having children is stressful and a lot of work. It is hard. But did you know that parents live longer? Evidently, having to work hard is not a negative when it comes to health and longevity.

Is the Sun bad or good for your skin? It may depend on the wavelength. High-frequency light (UV-A and UV-B) seems to be bad (burns, cancer, wrinkles and so forth), hence the ubiquitous sunscreen, but low-frequency (infrared) light might be good for us:

In the last decade, it has been proposed that the sun’s IR-A wavelengths might be deleterious to human skin and that sunscreens, in addition to their desired effect to protect against UV-B and UV-A, should also protect against IR-A (…) IR-A might even precondition the skin–a process called photo prevention–from an evolutionary standpoint since exposure to early morning IR-A wavelengths in sunlight may ready the skin for the coming mid-day deleterious UVR. Consequently, IR-A appears to be the solution, not the problem. It does more good than bad for the skin.

So maybe you should get an infrared lamp to improve your skin? Sounds less painful than having to eat baking soda.

Stable Priority Queues?

A priority queue is a data structure that holds a set of elements and can quickly return the smallest (or, alternatively, the largest) element. It is usually implemented using a binary heap.

So you “add” elements to the priority queue, and then you can “poll” them out.

Suppose however that you insert elements that are equal. What happens? Because binary heaps are not stable, your elements may not come out in insertion order.

For example, suppose you add the following tuples to a priority queue:

[{ name: 'player', energy: 10},
 { name: 'monster1', energy: 10},
 { name: 'monster2', energy: 10},
 { name: 'monster3', energy: 10}
]

You could poll them back out based on their “energy” value in a different order… even though they all have the same “energy”…

[{ name: 'player', energy: 10},
 { name: 'monster3', energy: 10},
 { name: 'monster2', energy: 10},
 { name: 'monster1', energy: 10}
]

That’s not very elegant.

Thankfully, there is an almost trivial approach to get a stable priority queue. Just add some kind of counter recording the insertion order, and when you insert elements in the binary heap, use the insertion order to differentiate elements. Thus, for a to be smaller than b, it is enough for the value of a to be smaller than the value of b, or for a and b to have equal values with a having the smaller insertion counter.

For example, we might store the following:

[{ value: { name: 'player', energy: 10 }, counter: 0 },
 { value: { name: 'monster1', energy: 10 }, counter: 1 },
 { value: { name: 'monster2', energy: 10 }, counter: 2 },
 { value: { name: 'monster3', energy: 10 }, counter: 3 }]

When comparing any two objects in this example, we not only compare them by their “energy” attribute, but also by their “counter” attribute.
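
Here is what the comparison might look like in JavaScript (a minimal sketch: the wrap and compare functions are illustrative, not necessarily the code of the actual package):

let insertionCounter = 0;

// Wrap each element with the current value of the insertion counter.
function wrap(element) {
  return { value: element, counter: insertionCounter++ };
}

// Order by "energy"; break ties using the insertion counter.
function compare(a, b) {
  if (a.value.energy !== b.value.energy) {
    return a.value.energy - b.value.energy;
  }
  return a.counter - b.counter;
}

// Equal-energy elements keep their insertion order:
const queue = [wrap({ name: 'player', energy: 10 }),
               wrap({ name: 'monster1', energy: 10 })];
queue.sort(compare); // 'player' still comes first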

So I implemented it in JavaScript as a package called StablePriorityQueue.js.

Easy!

I can’t promise that the performance will be as good as a speed-optimized priority queue, however.

This led me to a follow-up question: what is the best (most efficient) way to implement a stable priority queue?

Since the standard binary heap does not support tracking the insertion order, we chose to append an insertion counter. That’s reasonable, but is it the most efficient approach?

And, concretely, what would be the best way to implement it in a given language? (Java, JavaScript…)

The ultimate goal would be to get a stable priority queue that has nearly the same speed as a regular priority queue. How close can we get to this goal?

Credit: Thanks to David Ang for inspiring this question.

Science and Technology links (March 10, 2017)

In Mnemonic Training Reshapes Brain Networks to Support Superior Memory (published in Neuron, March 2017), we learned that 6 weeks of mnemonic training, at a rate of 30 minutes a day, led to a large-scale reorganization of brain networks, making the brains of the trained subjects more like those of memory athletes.

In How worried should we be about artificial intelligence?, Andrew Ng, the chief scientist of the search engine giant Baidu is quoted as saying: “Worrying about evil-killer AI today is like worrying about overpopulation on the planet Mars.” Another famous computer scientist, Moshe Vardi, was quoted as saying “the superintelligence risk, which gets more headlines, is not an immediate risk. We can afford to take our time to assess it in depth”. The article includes lots of interesting quotes by other scientists and thinkers.

In adult mammals, damaged nerve cells do not normally regenerate and neurogenesis is barely noticeable. So a damaged brain can route around the damage, but it does not normally repair itself fully. In an October 2016 article published in the journal Neuron, Tedeschi et al. showed that treating mice with the drug Pregabalin caused damaged nerve connections to regenerate. Speculatively, this could help find cures for neurodegenerative diseases.

Yann Collet, a Facebook engineer famous for LZ4, a fast compression format, has released Zstandard, a new compression format with superior performance. Though the software is open source, it may be covered by patents, so check with your lawyers.

Mice age faster than human beings: they barely live a couple of years whereas human beings can live decades. We don’t know how cells know how fast they are supposed to develop and age. In vitro, human and mouse cells develop at different rates. What happens if you put human embryonic stem cells in mice? They still develop at a human rate. This suggests that the cells themselves are equipped with some kind of clock. What this clock might be is unknown. (Source: Barry et al., Species-specific developmental timing is maintained by pluripotent stem cells ex utero)

We know how to reset cells in live mice so that they become pluripotent stem cells (e.g., using the Yamanaka factors). That is, we rewind the clock of the cells to zero, in a live animal. See In Vivo Amelioration of Age-Associated Hallmarks by Partial Reprogramming by Ocampo et al. More recently, Marión et al. showed in Common Telomere Changes during In Vivo Reprogramming and Early Stages of Tumorigenesis that this is accompanied by what appears to be a reset on the length of the telomeres. The telomeres are a component of your chromosomes that get shorter with every division. Once the telomeres become too short, your cells stop dividing and may become senescent. Anyhow, it looks more and more certain that, yes, we can reset the age of cells in a living organism.

College and inequality

Most college professors are squarely on the left ideologically. They believe that part of their mandate is to reduce inequality, by helping to provide college degrees to all who qualify.

This has always seemed strange to me. Higher education is highly subsidized, but the money goes overwhelmingly to people who are better off than average. So if you are a young adult, you either go find work, in which case you will probably end up paying taxes… or else you attend college, in which case you will receive net benefits.

You are much more likely to go to college, and thus receive government subsidies, if you are from a wealthier family. Moreover, you are more likely to go to college, and be subsidized, if you inherited characteristics that are likely to turn you into a sought-after professional.

It does not stop there. Subsidies overwhelmingly go to elite schools. For example, in 2008, Princeton and Harvard received $105,000 and $48,000 in tax benefits per student.

We find an insane wealth concentration among the elite universities. The top 10 schools in the USA account for 5 percent of the world’s 211,275 people worth $30 million or more.

The lowly community college receives much, much less than the elite schools. It should be clear that, through higher education, governments subsidize the privileged.

But, at least, some students from modest backgrounds make it to good universities. However, it is far from clear that they reap the same benefits as the well-off students. In their book Paying for the Party, Elizabeth Armstrong and Laura Hamilton examine how well college serves the poorest students. To the question “What happened to the working-class students you studied?”, they answer: “On our floor, not one graduated from the university within five years.” And what about the well-off students? “They were able to recreate their parents’ success. They all graduated.” But maybe college at least allows people from a modest background to rub shoulders with well-off students? The researchers found that “cross-class roommate relationships were extremely negative for the less privileged person”.

Where do you think well-off young people fall in love? In college. If you care at all about inequality, then you should care about assortative mating. One of the most important factors in determining your future earnings is who you mate with. Elite colleges act as subsidized mating grounds for the privileged. So not only do the subsidies to elite colleges go to the privileged, they also help create and sustain assortative mating, a powerful driver of inequality.

And let us not forget age. Most elite colleges will actively discriminate against you if you are older and uneducated. If you are 40 and a truck driver, and you find yourself out of a job, don’t bother applying to Princeton, even if you had great grades in high school. Age is a mediating factor: people from more modest backgrounds tend to have false starts. If you discriminate against people because they had false starts, you are effectively discriminating against them because they come from a more modest background.

I am not concerned about inequality, but if I were and I thought that governments should subsidize higher education, then I would favor the following policies:

  • I would check that people from the lowest quintile receive benefits from higher-education subsidies that are larger than the benefits received by the top 1%. Given that children from the lowest quintile rarely attend college at all, and when they do they rarely graduate, while children from the top 1% mostly attend college and mostly graduate, this would be a harder requirement to meet than it might appear.
  • Assuming that higher education should be subsidized at all, it should be subsidized in reverse order of prestige. Highly accessible community colleges should receive more from the state per student than elite schools. Students from the lowest quintile should receive the bulk of the support and funding.
  • Government subsidies should favor low-income students and the graduation of low-income students. I would never subsidize a school for catering to well-off students.
  • I would subsidize adult education more generously.
  • I would not subsidize Harvard or Princeton at all. Their endowments should be taxed and the income used to pay for better community colleges.

So you would think that activists on the left would have this agenda already. If you are going to “Occupy Wall Street”, you should certainly be concerned that Harvard’s $35 billion endowment is allowed to profit tax-free, and that the benefits go mostly to the elite. Why is there no call to redistribute these endowments?

Let us look at the intellectuals on the left. We have David Graeber, who helped spur the Occupy Wall Street movement. Graeber was a professor at Yale and is now a professor at the London School of Economics. Both are schools that recruit students from well-off families and hand degrees to people who are going to join the top 1%. At the London School of Economics, we also find Thomas Piketty, who became famous for his socialist treatise, Capital in the Twenty-First Century. It is interesting that these intellectuals do not choose to work for schools where blue-collar workers are likely to send their children. They are nowhere to be seen at community colleges or vocational schools. How likely is it that Graeber and Piketty would be eager to teach at a community college or at an adult-education school?

It is not just the USA and England. In Canada, the most elite schools, the least likely to cater to the children of minimum-wage parents, also receive the most government funding.

So the idea that subsidized colleges are a force for equality is flat out wrong. At best, they do well by the middle class. Let us at least be honest about it.

The technology of Logan (2017 Wolverine movie set in 2029)

Last night I went to see Logan, the latest and maybe the last Wolverine movie with Hugh Jackman. The movie is set in 2029. The year 2029 is an interesting choice, as it is the year of the story “Ghost in the Shell”, made into a movie featuring Scarlett Johansson and coming out later this year. So we shall have two sci-fi movies depicting the year 2029 this year.

Writing stories about the near future is always a bit daring because you are sure to get most things wrong, and people will soon find out how wrong you were. So how does Logan represent the year 2029?

  • To my eyes, fashion, cosmetics, clothes… all appear unchanged from today.
  • People still read newspapers and comics printed on paper.
  • People carry pictures printed on paper.
  • Augmented and virtual reality are nowhere to be seen. There are no smart glasses, no enhanced hearing, no smart displays.
  • People still drive their own cars and trucks, which have not changed very much, but self-driving tractor trailers are ubiquitous on the highway.
  • Slot machines in casinos are unchanged.
  • TVs and remote controls look the same.
  • Inexpensive mobile phones look very much like they do today. We charge them up with cords, as we do today. They have cameras that are no better than what we have today. People send “texts” like they do today. A phone call works like it does today.
  • It seems that the Internet and computers work exactly like they do today, no progress or notable change.
  • Computerized artificial limbs appear to be common. So if you lose a hand, you can get a metallic replacement that looks to be as good as a “natural” hand.
  • Drones that look like the drones we have today are used for surveillance. It is unclear whether they are more advanced than the drones we have today.
  • Artificial human insemination using collected DNA (i.e., without sperm) is possible.
  • Large farms are automated, with giant metallic beasts taking care of the fields.
  • We grow genetically engineered corn that looks much larger than any corn I have ever seen.
  • It does not look like governments can track people using satellites.
  • We can grow entire human bodies (clones) in vats.
  • Convenience stores are still manned by human clerks.
  • In 2017, we have giant digital screens on the roadsides displaying ads. They are still around in 2029 but look brighter to my eyes.
  • In the movie, Professor X (Charles Francis Xavier) is suffering from some form of dementia, probably Alzheimer’s. In 2017, Alzheimer’s is basically incurable and untreatable. It looks like there is no cure in 2029. However, the main character (Wolverine) gets some medication that appears to help manage symptoms.

How many floating-point numbers are in the interval [0,1]?

Most commodity processors support single-precision IEEE 754 floating-point numbers. Though they are ubiquitous, they are often misunderstood.

One of my readers left a comment suggesting that picking an integer in [0,2^32) at random and dividing it by 2^32 was equivalent to picking a number at random in [0,1). I am not assuming that the reader in question made this inference, but it is a tempting one.

That’s certainly “approximately true”, but we are making an error when doing so. How much of an error?

Floating-point numbers are represented with a sign bit, a mantissa and an exponent, as follows:

  • There is a single sign bit. Because we only care about positive numbers, this bit is fixed and can be ignored.
  • The mantissa is straightforward: 23 bits. It is implicitly preceded by the number 1.
  • There are eight bits dedicated to the exponent. It is pretty much straightforward, as long as the exponent ranges from -126 to 127. Otherwise, you get funny things like infinity, not-a-number, denormal numbers… and zero. To represent zero, you need an exponent value of -127 and a zero mantissa.

So how many “normal” non-zero numbers are there between 0 and 1? The negative exponents range from -1 all the way to -126. In each case, we have 2^23 distinct floating-point numbers because the mantissa is made of 23 bits. So we have 126 x 2^23 normal floating-point numbers in [0,1). If you don’t have a calculator handy, that’s 1,056,964,608. If we want to add the number 1, that’s 126 x 2^23 + 1 or 1,056,964,609.
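
If you want to double-check this count, you can enumerate the bit patterns directly. Here is a quick sketch in JavaScript; it relies on the fact that the normal non-zero floats in (0,1) are exactly the 32-bit patterns from that of 2^-126 (the smallest normal float) up to, but excluding, that of 1.0:

// Reinterpret 32-bit patterns as floats using a shared buffer.
const buffer = new ArrayBuffer(4);
const asBits = new Uint32Array(buffer);
const asFloat = new Float32Array(buffer);

asFloat[0] = 1.0;
const bitsOfOne = asBits[0]; // 0x3F800000

asFloat[0] = Math.pow(2, -126); // smallest normal float
const bitsOfSmallestNormal = asBits[0]; // 0x00800000

// One normal non-zero float in (0,1) per bit pattern in between.
console.log(bitsOfOne - bitsOfSmallestNormal); // 1056964608, i.e., 126 x 2^23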

Most people would consider zero to be a “normal number”, so maybe you want to add it too. Let us make it 1,056,964,610.

There are 4,294,967,296 possible 32-bit words, so about a quarter of them are in the interval [0,1]. Isn’t that interesting? Of all the floating-point numbers your computer can represent, a quarter of them lie in [0,1]. By extension, half of the floating-point numbers are in the interval [-1,1].

So already we can see that we are likely in trouble. The number 2^32 is not divisible by 1,056,964,610, so we can’t take a 32-bit non-negative integer, divide it by 2^32 and hope that this will generate a number in [0,1] in an unbiased way.

How much of a bias is there? We have a unique way of generating the zero number. Meanwhile, there are 193 different ways to generate 0.5: any integer between 2,147,483,584 and 2,147,483,776 (inclusively) gives you 0.5 once divided by 2^32 and rounded to the nearest 32-bit float.
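
You can verify this count with Math.fround, which rounds its argument to the nearest 32-bit float (a quick sketch):

// Count the integers whose quotient by 2^32 rounds to the float 0.5.
let count = 0;
for (let i = 2147483000; i <= 2147484000; i++) {
  if (Math.fround(i / 4294967296) === 0.5) {
    count++;
  }
}
console.log(count); // 193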

A ratio of 1 to 193 is a fairly sizeable bias. So chances are good that your standard library does not generate random numbers in [0,1] in this manner.

How could you get an unbiased map?

We can use the fact that the mantissa uses 23 bits. This means in particular that if you pick any integer in [0,2^24) and divide it by 2^24, then you can recover your original integer by multiplying the result again by 2^24. This works with 2^24 but not with 2^25 or any other larger number.

So you can pick a random integer in [0,2^24), divide it by 2^24 and you will get a random number in [0,1) without bias… meaning that for every integer in [0,2^24), there is one and only one floating-point number in [0,1). Moreover, the distribution is uniform in the sense that the possible floating-point numbers are evenly spaced (the distance between them is a flat 2^-24).
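
Concretely, such a generator might look as follows (a minimal sketch; it assumes that crypto.getRandomValues is available as a source of random bits, as it is in browsers and in recent Node.js):

// Random float in [0,1) taking exactly 2^24 evenly spaced values.
function randomFloat() {
  const words = new Uint32Array(1);
  crypto.getRandomValues(words);
  const integer = words[0] >>> 8; // keep 24 random bits
  return integer / 16777216;      // divide by 2^24
}

// The round trip is exact: multiplying by 2^24 recovers the integer.
const x = randomFloat();
console.log(Number.isInteger(x * 16777216)); // true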

So even though single-precision floating-point numbers use 32-bit words, and even though your computer can represent about 2^30 distinct and normal floating-point numbers in [0,1), chances are good that your random generator only produces 2^24 distinct floating-point numbers in the interval [0,1).

For most purposes, that is quite fine. But it could trip you up. A common way to generate random integers in an interval [0,N) is to first generate a random floating-point number and then multiply the result by N. That is going to be fine as long as N is small compared to 2^24, but should N exceed 2^24, then you have a significant bias as you are unable to generate all integers in the interval [0,N).
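
For example, if your random floats are of the form k/2^24 and you take N to be 2^25, only even integers can come out, so half the values in [0,N) are never generated. A quick sketch:

// Multiplying a 24-bit-resolution float by 2^25 yields only even integers.
const N = 33554432; // 2^25
for (let k = 0; k < 5; k++) {
  console.log(Math.floor((k / 16777216) * N)); // prints 0, 2, 4, 6, 8
}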

I did my analysis using 32-bit words, but it is not hard to repeat it for 64-bit words and come to similar conclusions. Instead of generating 2^24 distinct floating-point numbers in the interval [0,1], you would generate 2^53.

Further reading: The nextafter function and Uniform random floats: How to generate a double-precision floating-point number in [0, 1] uniformly at random given a uniform random source of bits.