My most popular posts in 2015 (part II)


For several years now, I have grown more optimistic about the power of human innovation. Despite the barrage of bad news, the fact is that we are richer and healthier than we have ever been. Yes, I might not be rich or healthy compared to the luckiest among us… but on the whole, humanity has been doing well.

In Going beyond our limitations, I reflected on the “coming” end of Moore’s law. Our computers are using less power and they have chips that are ever smaller… but it is seemingly more and more difficult to improve matters. I argue that some of the pessimism is unwarranted: on the whole, we are still making nice progress… But it is true that, at the margin, we are facing challenges. I think we need to take them head on because I want robots in my blood hunting down diseases and chips in my brain helping me think more clearly.

In Could big data and wearables help the fight against diseases?, I argue that information technology could massively accelerate medical research.

In What a technology geek sees when traveling abroad, I reflect on how technology has evolved, by using a recent trip I made as a vantage point. In Amazing technologies from the year 2015…, I reflect on the technological progress we made in 2015.

This year, I read Rainbows End, a famous novel by Vernor Vinge set in 2025. The novel makes some precise predictions about 2025; one of them is that we shall cure Alzheimer’s by then (at least for some individuals). Interestingly, Hillary Clinton has announced a plan to do just that, should she be elected president of the USA. In Revisiting Vernor Vinge’s “predictions” for 2025, I have looked at the novel as a set of predictions, trying to sort them out into what is possible and what is less likely.


I like to stress how blind I can be to the obvious. I have a lot of education, probably too much education… but I am routinely wrong in very important ways. For example, I took biology in college and I got excellent grades. I attended really good schools. I studied hard. I have also been a gardener for most of my life. I have also kept green plants in my home for decades. Yet, until I reached my forties, I assumed that plants took most of their mass from the soil they were planted in. How could I ever think that? It is obviously deeply wrong. (In case you are also confused, plants are mostly made out of the carbon they extract from the CO2 in the air.) I just kept on assuming, even though I had all the facts at my disposal, and presumably all the required neurons, to know better.

And up till 2015, I assumed that aging was both unavoidable and irreversible. I guess I assumed that evolution had tuned bodies for an optimal lifespan and that whatever we got was the best we could get. After all, you buy a car and it lasts more or less ten years. You buy a computer and it lasts more or less five years. It makes sense, intuitively, that all biological organisms would have an expiry date based on wear and tear.

Yet this makes no sense. For example, I keep annual plants in my basement, making them perennial by pruning their flowers just before they bloom. If evolution drove living things to live as long as possible, these annual plants would be perennial. Yet we know that evolution did the opposite. We believe that plants were originally perennial and then became annual more recently in their evolution. In Canada, we have the Pacific salmon that dies horribly after procreation while the similar Atlantic salmon can reproduce many times. There is a worm, Strongyloides ratti, that can live 400 days in its asexual form but only 5 days in its sexual form. So the very same worm, with the same DNA, can “choose” to live eighty times longer, or not. Many animal species like whales, some fish (sturgeon and rougheye rockfish), some turtles and lobsters do not age like we do… and they sometimes even age in reverse… meaning that their risk of death decreases with time while their fertility might go up.

So, clearly, we age because the body does not do everything it should to keep us in good shape. There are some forms of damage that your body cannot repair, but it could do a whole lot more. There is some kind of clock ticking… it is either the case that your body “wants” you to age on a schedule (like annual plants), or else your genes are simply ill-suited for longevity (because evolution does not care about what happens to the old). Whatever the case might be, aging is mostly a genetic disease we all suffer from.

It is all nice but what does it matter to us, human beings? Ten years ago at Stanford, Irina Conboy and her collaborators showed that the blood of a young mouse could effectively “rejuvenate” an old mouse. This was a significant breakthrough but it got little traction… until recently. How does it work? Conboy knew that when we do organ transplantation, it is the age of the recipient that matters. If you put a young lung into an old body, it behaves like an old lung. And vice versa: an old heart in a young body will act young. We now know that tissues respond to signals: if told to be young, cells behave as if they were young. We have been able to take cells from centenarians and reset them so that they look like young cells. So can you tell your body that you are young again? Drinking young blood won’t work, of course… instead, we want to identify the signals and tweak them accordingly. Recently, the Conboy lab at UC Berkeley showed that oxytocin (a freely available hormone) could rejuvenate old muscles. There are ongoing clinical trials focusing on myostatin inhibition to allow old people to have normal muscle mass and strength. The race is currently on to identify these signals and find ways to modulate them: old signals should be silenced and young signals should be amplified. There are many ways to silence or amplify a signal, but because we do not have the cipher book, it is tricky business.

Harvard geneticist George Church has other angles of attack and he claims that “in just five or six years he will be able to reverse the aging process in human beings.” Church has been studying the genes of centenarians and he wants to identify protective alleles that we could then all receive through genetic engineering. Moreover, he has plans to up-regulate (and possibly down-regulate) certain genes. Indeed, as we age, many genes that were silent are activated, and a few that were active are down-regulated. This gene regulation is part of what we call “epigenetics”: though your genes might be set for life, which genes are expressed at any given time is a dynamic and reversible process. So cells know how to be old or young and this seems to depend a lot on which genes are expressed. The process is also fully reversible as far as we can tell. Will George Church cure aging by 2020?

As it turns out, there are many other rejuvenation therapies in the works.

As you get older, your immune system starts to fail and even turn against you. Part of this process is that you effectively lose your thymus (around age 40): it becomes atrophied. The thymus is the organ in charge of “training” your immune cells. With it gone, your immune system gradually becomes incompetent. There are many ways to restore the thymus. There is an ongoing clinical trial to make controlled use of hormones to regrow it. Gene therapies could also work, as well as various transplantation approaches. Setting the thymus aside, it is more and more common that we create immune cells in a laboratory and inject them. This could be used to boost the immune system of the very old.

Stem cell therapies are growing fast. We are able, in principle, to take some of your skin cells, turn them into stem cells and then inject them back into your body so that they go on to act as new stem cells to help repair your joints or your brain. There is an endless stream of therapies in clinical trials right now. Not everything works as expected… one particular problem is that stem cells signal and respond to signalling… this means that how a given stem cell will behave in a given environment is complicated. Just randomly dumping stem cells in your body is likely ineffective… but scientists are getting more sophisticated.

Your body produces a lot of garbage as you grow old. The garbage accumulates. In particular, amyloids clog your brain and your heart and eventually kill you. Also, some of your cells reach the end of their life but instead of self-destructing (apoptosis), they just sit around and emit toxic signals. We have already had clinical trials to clear some of this garbage… the results have not been overly positive. But what is more encouraging is that we have developed the technology… should we ever need it.

In any case, for me, 2015 was the year I realized that we are aging because we lack the technology to stay young. I have qualified aging as a software bug. We have the software of life, we can tweak it, and we think we can “upgrade” the software so that aging is no longer a feature. We don’t know how long this might take… could be centuries, could be decades… but I think we will get there as a species.

In 2015, the first clinical trial for an “anti-aging” pill was approved (metformin). This pill would, at best, slow down aging a little bit… but the trial is important as a matter of principle: the American government has agreed that we could test an anti-aging therapy.

I have written about how astronauts mysteriously age in space. As far as I can tell, this remains mostly unexplained. Radiation, gravity, aliens?


In Identifying influential citations, I reported on the launch of Semantic Scholar, an online tool to search for academic references that innovates by distinguishing meaningful citations from de rigueur ones.

I also reacted to the proposal by some ill-advised researchers to ban LaTeX in favor of Microsoft Word for the production of research articles.

Math education

I think that math education is still far from satisfactory. In On rote memorization and antiquated skills and Other useless school trivia: the quadratic formula, I attempted to document how we spend a lot of effort teaching useless mathematical trivia to millions of kids, for no good reason that I can see. (I have a PhD in Mathematics and I have published research papers in Mathematics.)

My most popular posts in 2015… (part I)


If you want the world to get progressively better, you have to do your part. Programmers can’t wait passively for hardware to get better. We need to do our part.

In particular, we need to better exploit our CPUs if we are to keep accelerating our software. Early in 2015, I wrote a blog post explaining how to accelerate intersections using SIMD instructions. I also wrote about how to accelerate hashing using the new instructions for carryless multiplications with a family called CLHash, showing how it could beat the fastest legacy techniques such as Google’s CityHash. I have since shown that CLHash is nearly twice as fast on the latest Intel processors (Skylake).

Some might point out that using fancy CPU instructions is hardly practical in all instances. But I have also been able to show that we could massively accelerate common data structures in JavaScript with a little bit of engineering.


In Secular stagnation: we are trimming down, I have argued that progress in a post-industrial society cannot be measured by counting “physical goods”. In a post-industrial society, we get richer by increasing our knowledge, not by acquiring “more stuff”. This makes measuring wealth difficult. Economists, meanwhile, keep on insisting that they can compare our standards of living across time without difficulty. Many people insist that if young people tend to own smartphones and not cars, that’s because they can’t afford cars and must be content with smartphones. My vision of the future is one where most people own far fewer physical goods. I hope to spend time in 2016 better expressing this vision.


Though programmers make up a small minority of the workforce (less than 1 in 20 employees at most places), their culture is having a deep impact on corporate cultures. I have argued that the “hacker culture” is winning. We get shorter cycles, more frequent updates, more flexibility.

In Theory lags practice, I have argued that we have to be willing to act before we have things figured out. We learned to read and write before we had formal grammars. Watt invented the engine before we had thermodynamics. The Wright brothers made an airplane without working the science out.

In Hackers vs. Academics: who is responsible for progress?, I have argued that if you want progress, you need people who are not specialists in self-promotion, but rather people who love to solve hard practical problems. In this sense, if we are ever going to cure Alzheimer’s or reach Mars, it is going to be thanks to hackers, not academics.

Your software should follow your hardware: the CLHash example

The new Intel Skylake processors released this year (2015) have been met with disappointment. It is widely reported that they improved over the two-year-old Haswell (2013) processors by a mere 5%. Intel claims that it is more like 10%.

Intel is able to cram many more transistors per unit area in Skylake. The Skylake die size is about 50% smaller. It also uses less power per instruction and can maybe execute more instructions per cycle.

Still. A 5% to 10% gain in two years is hardly exciting, is it?

A few weeks ago, I reported on a new family of hash functions (CLHash) designed to benefit from the new instructions available in recent processors to compute the “carryless multiplication” (or “polynomial multiplication”). This multiplication used to be many times slower than regular integer multiplication, and required sophisticated software libraries. It was something that only cryptographers cared about. Today, it is a single cheap CPU instruction that anyone who cares about high performance should become familiar with.
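
To make the operation concrete, here is a minimal Python sketch of what a carryless multiplication computes. It is only an illustration: the function name clmul is hypothetical, the loop is far slower than the single CPU instruction, and it is not how the clhash library is implemented. The idea is the schoolbook shift-and-add multiplication where the additions are replaced by XOR, so that no carry ever propagates from one bit to the next.

```python
def clmul(a: int, b: int) -> int:
    """Carryless product of two non-negative integers.

    Each integer is read as a polynomial with coefficients in {0, 1}
    (bit i is the coefficient of x**i), and the result is the product
    of the two polynomials with coefficients reduced modulo 2.
    """
    if a < 0 or b < 0:
        raise ValueError("clmul expects non-negative integers")
    result = 0
    while b:
        if b & 1:
            # "Add" the shifted copy of a, but with XOR: no carries.
            result ^= a
        a <<= 1
        b >>= 1
    return result


# With carries, 3 * 3 is 9. Without carries, (x + 1) * (x + 1) = x**2 + 1,
# because the two middle terms cancel out modulo 2.
assert clmul(3, 3) == 0b101
```

The regular product and the carryless product agree whenever no two partial products overlap in the same bit position, for example when one of the operands is a power of two.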

I had not had a chance to test it out with Skylake processors earlier, but today I did. A Haswell processor is able to execute one carryless multiplication every two cycles; the Skylake throughput is double: one instruction per cycle.

What does it mean? It means that hash functions based on carryless multiplications can “smoke” conventional hash functions. For example, the following table gives us the CPU cycles per input byte when hashing 4kB inputs:

            Haswell   Skylake   Progress (Skylake vs. Haswell)
CLHash      0.16      0.096     1.7× faster
CityHash    0.23      0.23      no faster
SipHash     2.1       2.0       5% faster
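
The last column can be recomputed from the first two. The short Python sketch below does so using only the cycles-per-byte figures from the table above; the variable and function names are illustrative.

```python
# CPU cycles per input byte when hashing 4kB inputs (figures from the table).
CYCLES_PER_BYTE = {
    "CLHash": {"Haswell": 0.16, "Skylake": 0.096},
    "CityHash": {"Haswell": 0.23, "Skylake": 0.23},
    "SipHash": {"Haswell": 2.1, "Skylake": 2.0},
}


def speedup(name: str) -> float:
    """How many times faster Skylake is than Haswell for one hash function."""
    cycles = CYCLES_PER_BYTE[name]
    # Fewer cycles per byte is better, so the older figure goes on top.
    return cycles["Haswell"] / cycles["Skylake"]


for name in CYCLES_PER_BYTE:
    print(f"{name}: {speedup(name):.2f}x")
```

CLHash comes out at roughly 1.67, which the table rounds to 1.7, while the two conventional hash functions barely move.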

So if you stick with your old generic code, there might not be anything exciting for you in the new hardware. Minute gains every few years. To really benefit from new hardware, we need to change our software.

Your software should follow your hardware.

Reference: The clhash C library.

The courage to face what we do not understand

Sadly, it is easy to forget that what we know is but a tiny fraction of all there is to know. Human beings naturally focus on what they understand. The more you learn, the stronger this phenomenon tends to be.

Irrespective of any biological mechanism, I believe that it is a form of “aging” in the sense that it makes you increasingly inflexible. The more you know, the less open you are to new experiences. In effect, the dumber you get.

Let us call this “cognitive rigidity”.

This almost seems unavoidable, doesn’t it?

I believe that there are at least three factors driving cognitive rigidity:

  • Economics often favors specialization. Suppose that you have been programming in Java for the last 5 years. You have an enormous advantage over anyone who starts out in Java. So you have every incentive to ignore new programming languages.
  • Skills and experience are often poorly transferable. For example, I speak French and English fluently. If I try to learn Chinese, it will take me years of hard work just to catch up with a 6-year-old child born in China.
  • The more and longer you focus, the more likely you are to hit diminishing returns. There is very little difference between spending 5 years programming Java and spending 20 years doing the same.

What is interesting is that cognitive rigidity is an entirely general process.

For example, there is no reason to believe that an artificial intelligence would not suffer from cognitive rigidity. Suppose that you train a machine for some task over many years. The machine has gotten very good at it, and any small change is likely to make it worse at its job. At some point, the machine will stop learning. It has fallen into a “local extremum”. Yet it is possible for a whole other piece of software to come in and surpass the old machine because it starts from new assumptions. The old machine could have explored new assumptions, but that would have likely provided no gain.
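
A toy example illustrates the point. The Python sketch below is a greedy hill climber on a made-up one-dimensional landscape; the numbers and names are purely illustrative and do not model any real learning system. Started in one place, the climber stops at a small peak because every small change makes things worse. Started elsewhere, the very same procedure reaches a higher peak.

```python
# A made-up "skill" landscape: a small peak at index 3, a taller one at index 12.
LANDSCAPE = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6, 5, 4]


def hill_climb(landscape, start):
    """Move to the better neighbor until no neighbor is strictly better."""
    position = start
    while True:
        neighbors = [
            p for p in (position - 1, position + 1) if 0 <= p < len(landscape)
        ]
        best = max(neighbors, key=lambda p: landscape[p])
        if landscape[best] <= landscape[position]:
            # Any small change makes things worse: we stop learning here.
            return position
        position = best


old_machine = hill_climb(LANDSCAPE, start=1)   # years of incremental tuning
newcomer = hill_climb(LANDSCAPE, start=8)      # starts from new assumptions

# The old machine is stuck on the small peak; the newcomer ends up higher.
assert LANDSCAPE[old_machine] < LANDSCAPE[newcomer]
```

Note that the old machine cannot reach the taller peak by small steps: it would first have to get worse, walking down through index 6, before it could get better.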

Organizations, communities, and even whole countries might also be subject to cognitive rigidity. For example, IBM famously missed the PC revolution… then Microsoft nearly missed the Internet revolution, and squarely missed the mobile revolution.

Where we should be really concerned is that I believe humanity as a whole can fall victim to cognitive rigidity. I have recently “reinvented” myself as an advocate for techno-optimism. I did so when I realized that even among people who should know better, there was a massive failure of imagination. Even young computer scientists fall for it. I find that people universally imagine the future as the present, with a few more gadgets. Though I do not share the pessimism, it is not what troubles me. What troubles me is that people assume we have reached worthwhile extrema. We haven’t!

  • We do not know what intelligence is. We can emulate some of the human intelligence in computers. We can “measure” intelligence using IQ tests… but we do not know what it is. Not really. We can’t even reproduce the visual recognition abilities of a rodent, despite the fact that we have more than enough CPU cycles to do it. In a very fundamental way, we do not know anything about intelligence.
  • We are nowhere close to figuring out the laws of the universe. At a high level, we have two systems, quantum mechanics and relativity, that we glue together somehow. It is an ugly hack.
  • We really do not know much about biology. We have had the (nearly) complete human genome for many years now. So we can touch the binary code of life. We can change it. We can tune it. But we have no idea how it works. Not really. We don’t know why we age. We don’t know why we get cancers while other animals don’t.

    To make matters worse, there is a hidden form of cognitive rigidity when people consider biology. There is a strong assumption that whatever natural evolution produced, it must be ideal… and so tinkering with it is dangerous. For example, I was telling my neighbor about the existence of genes that make people stronger, or more resilient to cancer. His first reaction was that these genes must come at a sinister cost, otherwise we would all have them. This is, of course, a fallacy. You could equally say that whatever product has not yet been marketed must not be profitable, otherwise someone else would have already marketed it.

  • Our best practices in politics and economics are based on debatable heuristics that work “ok”, but they are probably nowhere close to being optimal. Alastair Reynolds in his novel Revelation Space depicts a highly advanced human society where people have adopted radically different forms of politics and economics. His novel hypothesizes that this led to a surge of prosperity never seen before. Yet almost any debate that puts into question current politics is a non-starter. People simply assume that whatever they have must be the best that can be had.

So we have these giant gaps. What really worries me is how most people do not even see these huge gaps in our knowledge. And these are just the beginning of a long list. If you drill down on any given issue, you find that we know nothing, and we often know less than nothing.

I do not think that cognitive rigidity is unavoidable. I don’t think that there is a law that says that you have to fall prey to it. For example, while Intel has had every occasion to fall the way IBM and Microsoft did, it has time and time again been able to adopt new techniques. We still look at Intel today to determine whether Moore’s law is holding, 40 years after the law was written.

The key to progress is to have the courage to face what we do not understand. But it takes courage to face the unknown.

The virtuous circle of fantasy

It has long been observed that progress depends on the outliers among us. Shaw’s quote sounds as true today as it did in the past:

“The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.”

I have never heard anyone argue against this observation. Progress depends on individuals who break free from the status quo.

But another undeniable observation has been that progress is accelerating. You can see this effect everywhere. Changes come faster and faster, and they are ever more drastic. I was taken aback this year when I read that smartphones were ubiquitous among Syrian refugees. Smartphones went from something for the rich kids in California to a necessity for the poorest among us in more or less 7 years.

Logically, if progress is accelerating, and progress depends on unreasonable people who act as outliers, then we must have more and more unreasonable people over time.

Richard Florida made a name for himself some years ago with his “creative class” theory. He hypothesized that economic progress was driven by what he called the “creatives”. There is some anecdotal evidence that the world is becoming more tolerant, and especially more so where it is also becoming richer. The number of people who declare being outside an organized religion is growing. Homosexuality is increasingly accepted, at least in the West.

I think that there is a virtuous circle at play. The richer we get, the more easily we can afford to tolerate unreasonable people, the more we can afford fantasy. In turn, fantasy makes genuine progress possible.

Think about a world where starvation and misery are around the corner. You are likely to put a lot of pressure on your kids so that they will conform. Now, think about life in a wealthy continent like North America in 2015. I know that my kids are not going to grow up and starve no matter what they do. So I am going to be tolerant about their career choices. And that’s a good thing. Had Zuckerberg been my son and had I been poor, I might have been troubled to see him dropping out of Harvard to build a “facebook” site. Dropping out of Harvard to build Facebook was pure fantasy. No parent afraid that his son could starve would have tolerated it.

This blog is also fantasy. Instead of doing “serious research”, I write down whatever comes through my mind and post it online. My blog counts for nothing as far as getting me academic currency. I have been warned repeatedly that, should I seek employment, having a blog where I freely shared controversial views could be held against me… To make matters worse, you, my readers, are “wasting time” reading me instead of the Financial Times or an engineering textbook.

The more fantasy we allow, the more progress we enable, and that in turn enables more fantasy.

There are people who don’t like fantasy one bit, like the radical Islamists. I don’t think that they fear or hate the West so much as they are afraid of the increasing numbers of people who decide to be unreasonable. Unreasonable people are like dynamite: they can destroy your world view. They are disturbing.

There is one straightforward consequence of this analysis:

Fantasy is growing exponentially.

A lot of beliefs that appear insane today will soon be freely entertained. And the rate at which this happens will only accelerate to the point where the frontier between “that’s crazy” and “it might work” is going to disappear. Anyone who tries to hold the fort with a “that’s crazy” posture will soon be overrun. And, of course, this can only serve as encouragement for unreasonable people.

Amazing technologies from the year 2015…

I cannot predict the future, but I can look at the recent past. What happened in 2015 as far as technology is concerned? Many things happened that, had I predicted them in January 2015, people would have thought I was slightly mad. Let us review some of them.

  • Amazon sells a fully functional tablet for $50. It offers a decent resolution (600 x 1024 pixels) and a solid processor (1.3 GHz, 4 cores). It has a camera, Bluetooth, wifi…
  • The Raspberry Pi computer is down to $5. It includes a 1GHz processor and 512MB of RAM. You could literally cover your walls with working computers for a few hundred dollars.
  • The Tesla car company updated its car remotely with an autopilot feature. The cars are not really autonomous like Google’s self-driving cars… but they are learning from their users and they are being continuously improved remotely by Tesla. The online demos are amazing. Tesla was founded by Elon Musk.
  • When 2015 started, Ebola was a major source of worry. Companies invested millions in finding a vaccine and this was often considered a waste of money. Then we got what appears to be a robust Ebola vaccine. Hardly anyone worries about Ebola today.
  • We have landed a probe on a comet as part of the Rosetta mission. (Credit: Dimitrios Psychogios)
  • We got the first prosthetic hand that delivers touch sensations to the brain. (Credit: Christoph Nahr)
  • Though the military has used fancy drones for some time… we saw autonomous drones becoming good enough and cheap enough to be used by everybody. For example, Hexo+ sells a drone capable of following and filming you for much less than $2000. (See it in action.)
  • As far as I can tell, video games in 2015 did not make much progress compared to previous years… For example, virtual reality is still only in the laboratories. Nevertheless, massive fortunes and investments were made in the field in 2015. Candy Crush, a silly video game, sold for nearly 6 billion dollars. Minecraft, another silly video game, sold for 2.5 billion dollars. In comparison, the American National Science Foundation (NSF), the richest research funding body in science and engineering in the world, has a budget of 5.6 billion dollars.
  • Voice recognition is nothing new and has been usable for many years. But we saw qualitative progress in 2015 with Google announcing breakthrough progress soon after Apple had announced a major reduction in its error rate. There is no question that the error rates are still well above human levels, but the gap has been reduced drastically. Maybe more remarkable is that the entire world has had access to this improved technology.

    In artificial intelligence, there has been an explosion of interest in a technique called deep learning. Consequently, some have said that 2015 was a breakthrough year for artificial intelligence. What is clear is that many companies are betting the farm on machine learning with a particular emphasis on deep learning, from Facebook to Google. (Credit: Peter Turney)

    We knew that companies like Google and Facebook did a lot of research on machine learning. In 2015, we saw a drive to make this technology open source. Google published its TensorFlow library which is reportedly both state-of-the-art and just amazing. Facebook published its AI hardware design as open source. Elon Musk and others have funded OpenAI, an organization that aims to produce advanced AI technology and make it available as open source, to ensure that the technology is not locked away by some government or corporation. (Credit: Sidharth Kashyap)

  • Going to space is amazingly expensive. NASA shuttles were meant to bring the prices down but they failed to deliver the needed cost savings. Then the owner of an Internet bookstore funded a rocket (Blue Origin’s New Shepard) that could go to space, and come back, landing vertically, ready to be reused. The video of the landing is amazing. Another private company (SpaceX) attempted landing a rocket on an autonomous drone ship at sea. SpaceX was founded by Elon Musk, a billionaire you could make movies about. Hopefully, all this work will finally bring down costs and accelerate space exploration. (Credit: Isaac Kuo and Ade Oshineye)
  • Though there has been no specific breakthrough, the cost of solar power has continued to fall at an exponential rate in 2015. It is now predicted that, at the current rate, we will never have a chance of running out of oil… rather, renewable energy like solar will get so cheap that we will not need to burn oil. (Credit: Greg Linden)
  • Quantum computing is not a new idea: we have long thought that computers using quantum computing could outdo our electronic computers. For the first time in 2015, Google described a quantum computing experiment that is orders of magnitude faster than what a regular electronic computer can do. (Credit: Dominic Amann, Venkatesh Rao)
  • The first human clinical trials for an anti-aging pill were approved. The pill is unlikely to add decades to anybody’s life, but it marks an important first step in curing aging.
    As a more extreme case, an American company, BioViva, genetically modified a middle-aged human being in the hope of delaying or reversing aging. Yes, we have been at the point where we can modify the genes (the DNA) of a living person for quite some time. We have now reached the point where some people are ready to apply this technology to stop and reverse aging.
  • Though it was a small test, we got what might be the first successful clinical trial against Parkinson’s in the sense that we have a drug that can reverse (not just slow) cognitive impairment due to the disease. The good results come from a cancer drug marketed by Novartis. We also got a dog’s dementia reversed using stem cells. We also have an ongoing clinical trial to cure Alzheimer’s using the plasma of young donors.
  • Researchers made a kidney-like organ out of skin cells. In fact, it appears that we could regrow an entire body out of skin cells. (Credit: Juho Vepsäläinen)

Are we really testing an anti-aging pill? And what does it mean?

The U.S. Food and Drug Administration (FDA) has approved a clinical trial for “an anti-aging pill”. The pill is simply metformin. Metformin is a cheap drug that is safe and effective to treat type 2 diabetes, an old-age disease. While it has been a part of modern medicine for a few decades, metformin actually comes from the French lilac, a plant used in medicine for centuries.

The study is called TAME for “Targeting Aging with Metformin” or MILES for “Metformin in Longevity Study”. The trial has been driven by Nir Barzilai, a reputed professor of medicine.

What does it mean?

  • As far as I can tell, this is the first time the FDA allowed trials to treat aging as a medical condition. The people who will participate are not “sick” per se, they are just “old” and likely at risk of soon becoming sick because of aging. How this was approved by the bureaucrats of the FDA is beyond me.
  • You probably know people who take metformin. You probably don’t know anyone who is 120. So chances are that this clinical trial will not show that metformin can add decades to your life. However, it was observed that people who have diabetes and take metformin can live longer than otherwise healthy people who do not take metformin. Moreover, mice that take metformin live longer. So it is likely that metformin will have a small positive effect. The gain could be just a few extra months or weeks of health. Note that it is not expected that metformin would extend life by preventing death while letting aging continue. It is likely that if metformin has any effect at all, it will be by delaying diseases of old age.
  • Did I mention that metformin is cheap and safe? That means that if it works, it will be instantly available to everyone on the planet. Indeed, it is cheap enough to be affordable even in developing countries. (TAME is funded by a non-profit, the American Federation for Aging Research.)

If successful, this trial would have shocking implications. It is widely believed that aging is unavoidable and untreatable. Maybe you can lose weight and put on some cream to cover your wrinkles, but that’s about it. However, if it were shown that a cheap drug costing pennies a day (literally!) could delay the diseases of old age even just by a tiny bit… It would force people to rethink their assumptions.

Will it work? We shall know in a few years.

(For the record, I am not taking metformin nor am I planning to take any in the near future.)

Further reading: Description of the clinical trial: Metformin in Longevity Study (currently recruiting).

Related books: Mitteldorf’s Cracking the Aging Code, Fossel’s The Telomerase Revolution, de Grey’s Ending Aging, Farrelly’s Biologically Modified Justice and Wood’s The Abolition of Aging.

The mysterious aging of astronauts

When I took Physics courses in college, I learned about how astronauts should age a tiny bit slower than us. Of course, they would be exposed to a lot more radiation so they might develop more cancers. But all in all, I would have been excited about the prospect of living in space.

Then the astronauts came back and we saw them being barely able to walk. Yet these were young men selected among thousands for their physical fitness. That was explained away by saying that the lack of gravity meant a lack of exercise. All these astronauts needed was a good workout. And future astronauts would have a “space gym” so it would all be alright.

But then more results started coming back. Not only do astronauts come back with weak muscles and frail bones… they also suffer from skin thinning, atherosclerosis (stiffer arteries) and resistance to insulin, and they lose vision to cataracts many years earlier than expected given their chronological age. These symptoms look a lot like skin aging, cardiovascular aging, age-related diabetes and so forth. In fact, it is pretty accurate to say that astronauts age at an accelerated rate. This is despite the fact that the current generation of astronauts follows a rigorous exercise program. They are also followed medically more closely than just about anyone on Earth: they don’t indulge in regular fast food.

Trudel, one of the leading researchers on this front, appears to think that lack of sufficiently strenuous exercise is the problem. He observed that resting greatly accelerates aging:

“(…) after 60 days of bed rest, the marrow of the patients studied looked as if it had aged and grown by four years” (motherboard)

When not attributed to a lack of sufficient exercise, many of these effects seem to be attributed to an increased exposure to radiation. Indeed, astronauts in the International Space Station are exposed to about ten times as much ambient radiation as the rest of us. However, there is only so much you can explain away with a slight increase in radiation. For example, people exposed to radiation grow cancers, they don’t develop diabetes. And even cancer is not a given: a small increase in radiation exposure can actually make you healthier through a process called hormesis. In fact, that’s precisely what exercise does: it is a stress on your body that makes you healthier. In any case, we do not know whether astronauts are more likely to die from cancer. Certainly, they don’t all fall dead at 40 from cancer… If there is an increased rate of cancer, it is fairly modest because, otherwise, we would not be worrying about how their skin is getting thinner.

So it looks like despite short stays, and very attentive medical care, astronauts age at a drastically accelerated pace… not just in one or two ways but across a broad spectrum of symptoms.

I looked as hard as I could and I could not find any trace of medical scientists worrying about such a phenomenon a priori.

What is going on? Why does life in space accelerate aging so much?

Gravity seems to play a key role:

The more likely culprit is the relative lack of gravity in a microgravity environment, which scientists think has a negative influence on (…) metabolisms. (Peter Dockrill, Science Alert)

Being ever more productive… is a duty

As we work at something, we usually get better and better. Then we hit a plateau. For most of human history, people have been hitting this plateau, and they just kept working until death or retirement, whichever came first.

Today, if you ever reach the upper bound, chances are good that you should be moving to new work.

I do think, for example, that we should be investing more every year in health, education and research. And not just a bit more. However, the people working in these fields have to do their part and grow their productivity.

If you have been teaching kids for ten years, on the same budget, and getting the same results… you have been short-changing all of us.

If you are treating medical conditions for the same cost and getting the same results for the last few years, you are stealing from all of us.

You have an obligation to improve your productivity with each passing year. And only if all of us keep on improving our productivity can we afford to grow everyone’s budget year after year.

If your students’ scores don’t improve year after year, if the survival rates of your patients don’t improve year after year, you should be troubled. And, ultimately, you should feel shame.

Is peer review slowing down science and technology?

Ten years ago, a team led by Irina Conboy at the University of California at Berkeley showed something remarkable in a Nature paper: if you take old cells and put them in a young environment, you effectively rejuvenate them. This is remarkable work that was cited hundreds of times.

Their work shows that vampire stories have a grain of truth in them. It seems that old people could be made young again by using the blood of the young. But unlike vampire stories, this is serious science. So maybe there are “aging factors” in the blood of old people, or “youth factors” in the blood of young people. Normalizing these factors to a youthful level, if possible, could keep older people from getting sick. Even if that is not possible in the end, the science is certainly fascinating.

So whatever happened to this work? It was cited and it led to further academic research… There were a few press releases over the years…

But, on the whole, not much happened. Why?

One explanation could be that the findings were bogus. Yet they appear to be remarkably robust.

So why did we not see much progress in the last ten years? Conboy et al. have produced their own answer regarding this lack of practical progress:

If all this has been known for 10 years, why is there still no therapeutics?

One reason is that instead of reporting broad rejuvenation of aging in three germ layer derivatives, muscle, liver, and brain by the systemic milieu, the impact of the study published in 2005 became narrower. The review and editorial process forced the removal of the neurogenesis data from the original manuscript. Originally, some neurogenesis data were included in the manuscript but, while the findings were solid, it would require months to years to address the reviewer’s comments, and the brain data were removed from the 2005 paper as an editorial compromise. (…)

Another reason for the slow pace in developing therapies to broadly combat age-related tissue degenerative pathologies is that defined strategies (…) have been very difficult to publish in high impact journals; (…)

If you have not been subject to peer review, it might be hard to understand how peer comments can slow down researchers so much… and even discourage entire lines of research. To better understand the process… imagine that you have to convince four strangers of some result… and the burden is entirely on you to convince them… and if only just one of them refuses to accept your argument, for whatever reason, he may easily convince an editor to reject your work… The adversarial referee does not even have to admit he does not believe your result, he can simply say benign things like “they need to run larger or more complicated experiments”. In one project I did, one referee asked us to redo all the experiments in a more realistic setting. So we did. Then he complained that they were not extensive enough. We extended them. By that time I had invested months of research on purely mundane tasks like setting up servers and writing data management software… then the referee asked for a 100x extension of the data sizes… which would have implied a complete overhaul of all our work. I wrote a fifteen-page rebuttal arguing that no other work had been subjected to such levels of scrutiny in the recent past, and the editor ended up agreeing with us.

Your best strategy in such a case might be to simply “give up” and focus on producing “uncontroversial” results. So there are research projects that neither I nor many other researchers will touch…

I was reminded of what a great computer scientist, Edsger Dijkstra, wrote on this topic:

Not only does the mechanism of peer review fail to protect us from disasters, in a certain way it guarantees mediocrity (…) At the time, it is done, truly original work—which, in the scientific establishment, is as welcome as unwanted baby (…)

Dijkstra was a prototypical blogger: he wrote papers that he shared with his friends. Why can’t Conboy et al. do the same thing and “become independent” of peer review? Because they fear that people would dismiss their work as being “fringe” research with no credibility. They would not be funded. Without funding, they would quickly lose their laboratory, and so forth.

In any case, the Conboy et al. story reminds us that seemingly innocent cultural games, like peer review, can have a deep impact on what gets researched and how much progress we make over time. Ultimately, we have to allocate finite resources, if only the time of our trained researchers. How we do it matters very much.

Thankfully, since Conboy et al. published their 2005 paper, the world of academic publishing has changed. Of course, the underlying culture can only change so much: people are still tailoring their work so that it will get accepted in prestigious venues… even if it makes said work much less important and interesting… But I also think that the culture is being transformed. Initiatives like the Public Library of Science (PLoS), launched in 2003, have shown the world that you could produce high impact serious work without going through an elitist venue.

I think that, ultimately, it is the spirit of open source that is gaining ground. That’s where the true meaning of science thrived: it does not matter who you are, what matters is whether what you are proposing works. Good science is good science no matter what the publishing venue is… And there is more to science than publishing papers… Increasingly, researchers share their data and software… instead of trying to improve your impact through prestige, you can improve your impact by making life easier for people who want to use your work.

The evolution of how we research may end up accelerating research itself…

Identifying influential citations: it works live today!

Life has a way of giving me what I want. Back in 2009, I wrote that instead of following conferences or journals, I would rather follow individual researchers. At the time, there was simply no good way to do this, other than constantly visiting the web page of a particular researcher. A few years later, Google Scholar offered “author profiles” where you can subscribe to your favorite researchers and get an email when they publish new work. Whenever I encounter someone who does nice work, I make sure to subscribe to their Google profile.

This week, I got another of my wishes granted.

Unsurprisingly, most academics these days swear by Google Scholar. It is the single best search engine for research papers. It has multiplied my productivity when doing literature surveys.

There have been various attempts to compete with Google Scholar, but few have lasted very long or provided value. The Allen Institute for Artificial Intelligence has launched its own Google Scholar called Semantic Scholar. For the time being, they have only indexed computer science papers, and their coverage falls short of what Google Scholar can offer. Still, competition is always welcome!

I am very excited about one of the features they have added: the automatic identification of influential citations. That is something I have long wanted to see… and it is finally here! In time, this might come to play a very important role.

Let me explain.

Scientists play this game where they try to publish as many papers as possible, and to get as many people as possible to cite their papers. If you are any good, you will produce some papers, and a few people will read and cite your work.

So we have started counting references as a way to measure a scientist’s worth. And we also measure how important a given research paper is based on how often it has been cited.

That sounds reasonable… until you look at the problem more closely. Once you look at how and why people cite previous work, you realize that most citations are “shallow”. You might build your new paper on one or two influential papers, maybe 3 or 4… but you rarely build on 20, 30 or 50 previous research papers. In fact, you probably haven’t read half of the papers you are citing.

If we are going to use citation counting as a proxy for quality, we need a way to tell apart the meaningful references from the shallow ones. Surely machine learning can help!

Back in 2012, I asked why it wasn’t done. Encouraged by the reactions I got, we collected data from many helpful volunteers. The dataset with the list of volunteers who helped is still available.

The next year (2013), we wrote a paper about it: Measuring academic influence: Not all citations are equal (published by JASIST in 2015). The work shows that, yes, indeed, machine learning can identify influential references. There is no good reason to consider all citations as having equal weights. We also discussed many of the potential applications such as better rankings for researchers, better paper recommender systems and so forth. And then among others, Valenzuela et al. wrote a follow-up paper, using a lot more data: Identifying Meaningful Citations.

What gets me really excited is that the idea has now been put in practice: as of this week, Semantic Scholar allows you to browse the references of a paper ranked by how influential the reference was to this particular work. Just search for a paper and look for the references said to have “strongly influenced this paper”.

I hope that people will quickly build on the idea. Instead of stupidly counting how often someone has been cited, you should be tracking the work that he has influenced. If you have liked a paper, why not recommend papers that it has strongly influenced?

This is of course only the beginning. If you stop looking at research papers as silly bags of words having some shallow metadata… and instead throw the full might of machine learning at it… who knows what is possible!

Whenever people lament that technology is stalling, I just think that they are not very observant. Technology keeps on granting my wishes, one after the other!

Credit: I am grateful to Peter Turney for pointing out this feature to me.

Is artificial intelligence going to wipe us out in 30 years?

Many famous people have recently grown concerned that artificial intelligence is going to become a threat to humanity in the near future. The wealthy entrepreneur Elon Musk and the famous physicist Stephen Hawking are among them.

It is obvious that any technology, including artificial intelligence, can be used to cause harm. Information technology can be turned into a weapon.

But I do not think that it is what Hawking and Musk fear. Here is what Hawking said:

It [artificial intelligence] would take off on its own, and re-design itself at an ever increasing rate, Humans, who are limited by slow biological evolution, couldn’t compete, and would be superseded.

Here is what Musk said:

With artificial intelligence we are summoning the demon. In all those stories where there’s the guy with the pentagram and the holy water, it’s like – yeah, he’s sure he can control the demon. Doesn’t work out,(…)

So far from being merely worried about potential misuse of technology, Hawking and Musk consider a scenario more like this:

  • Machines increasingly acquire sophisticated cognitive skills. They learned to play Chess better than human beings. Soon, they will pass as human beings online. It is only a matter of time before someone can produce software that passes the American Scholastic Aptitude Test with scores higher than the average human being… and soon after, better than any human being.
  • Then, at some point, machines will wake up, become conscious, reach “general intelligence” and we are facing a new species that might decide to do away with us.

The first part of this scenario is coming true. We have cars, trains and airplanes without human pilots today… in 30 years, no insurer will be willing to cover you if you decide to drive your own car. It is all going to be automated. Factories are similarly going to be automated. Goods delivery will be automated. Medical diagnosis will be automated. In 30 years, we will have computing devices that are far more powerful than the naked brain. It is likely that they will use up a lot more energy and a lot more space than a human brain… but a computing device can have instant access to all of the Internet in ways that no human brain can.

What we often fail to realize is that “artificial intelligence” is not something that will happen suddenly. It is a process, and we are far along in this process. A vending machine is very much an automated store. An automated teller, as you can find anywhere in the world, is as the term implies… a teller that has been replaced by “artificial intelligence”.

There is a myth that machines will suddenly become qualitatively smarter. They will acquire a soul or something like it.

First, there is the notion that machines will acquire consciousness in a sudden fashion. So my laptop is not conscious, but maybe the next iteration or the one after that, with the right software, will “acquire” consciousness. The problem with consciousness is that it is a purely intuitive notion that appears to have no counterpart in the objective world. We “think” we are conscious because we think “I am going to open this door” and then we open the door. So the “voice” in our head is what is in charge. But that is an illusion, as demonstrated by Benjamin Libet. What is clearly established is that you can describe and announce your decisions only after the fact. By the time you are ready to announce that you will open the door, your brain has already taken its decision. We think that we are conscious of our surroundings in pure real time, but that is again an illusion: our brain spends a great deal of effort processing our senses and constructing a model. That is why there are perceptible delays when responding to stimuli. Your brain gets some input, updates a model and reacts accordingly, just like any computer would do. So it is not clear that my laptop is any less conscious than I am, though it is clear that it can react much faster than I can. Free will is also an illusion. There is simply no test, no objective measurement that you can use to establish how conscious or free a computing device or brain is.

Second, there is the notion of general intelligence. The belief is that computers are specialized devices that can do only what they have been programmed for, whereas biological beings can adapt to any new task. It is true that mammals can change their behavior when faced with new conditions. But you must first consider that this ability is quite variable in living beings. If the climate gets colder, frogs do not just decide to put on coats. Frogs do what frogs do in a very predictable manner. But even human beings also have limited abilities to adapt. The obesity epidemic is an obvious example: for thousands of years, human beings starved to death… all of a sudden, food is abundant… what do we do? We eat too much and shorten our lives in the process. We did away with most predators and live in wonderfully safe cities… Yet if you are threatened with being fired, even if you know that you are physically safe and will never go hungry… your body reacts as if you were being attacked by a predator, and makes you more prone to act on instinct… exactly the opposite of good sense… So the evidence is clear: we are slaves to our programming. It is true that human beings can learn to do String Theory. We can acquire new skills that were not programmed into us, to some extent. That sets us apart from frogs, as frogs are unlikely to acquire a new way to swim or to eat. But it does not set us apart from software. In fact, software is much easier to expand than the human mind. Again, this notion of general intelligence seems to be an ill-defined intuitive idea that comforts us but has little to do with objective reality.

Are computers going to become smarter than people? They already are in many ways that matter. They can plan trips across complex road networks better than we can. Computers will continue to catch up and surpass the human brain. We can make an educated guess as to how far this process will be in 30 years: very far. Nobody can know what effect this will have on humanity. But we can probably stop worrying about machines acquiring consciousness or “general” intelligence all of a sudden and turning against us as a consequence. Stop watching so many bad scifi movies!

Crazily fast hashing with carry-less multiplications

We all know the regular multiplication that we learn in school. To multiply a number by 3, you can multiply the number by two and add the original number to the result. Programmers write:

a * 3 = a + (a<<1)

where a<<1 means "shift the bit values by one to the left, filling in with a zero". That's a multiplication by two. So a multiplication can be described as a succession of shifts and additions.
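
As an illustration of "a succession of shifts and additions", here is a minimal Java sketch of a general multiplication. The class and method names (ShiftAddMultiply, multiply) are made up for this example, and real processors do not literally work this way: it only shows the idea. The result wraps around on overflow, just like Java's built-in int multiplication.

```java
// Illustrative sketch: multiplication built only from shifts and additions.
final class ShiftAddMultiply {
    // Multiplies a by b, wrapping around modulo 2^32 like Java's int multiplication.
    static int multiply(int a, int b) {
        int result = 0;
        while (b != 0) {
            if ((b & 1) != 0) {
                result += a; // this bit of b is set: add the current shifted copy of a
            }
            a <<= 1;  // a now stands for the next power of two times the original a
            b >>>= 1; // move on to the next bit of b (unsigned shift)
        }
        return result;
    }

    public static void main(String[] args) {
        // Both lines compute 7 * 3.
        System.out.println(multiply(7, 3));
        System.out.println(7 + (7 << 1));
    }
}
```

For b = 3, the loop performs exactly the two steps written above: it adds a, then adds a<<1.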

But there is another type of multiplication that you are only ever going to learn if you study advanced algebra or cryptography: carry-less multiplication (also called "polynomial multiplication"). When multiplying by powers of two, it works the same as regular multiplication. However, when you multiply by numbers that are not powers of two, you combine the shifted results with a bitwise exclusive OR (XOR) instead of an addition. Programmers like to write "XOR" as "^", so multiplying by 3 in carry-less mode becomes:

a "*" 3 = a ^ (a<<1)

where I put the multiplication symbol (*) used by programmers in quotes ("*") to indicate that I use the carry-less multiplication.
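
To make the definition concrete, here is a slow, bit-by-bit software sketch of a carry-less multiplication in Java. The names (CarrylessMultiply, clmul) are hypothetical, and this is only an illustration of the definition, not how a fast implementation would be written. It is the same loop as a shift-and-add multiplication, with the addition replaced by XOR.

```java
// Illustrative sketch: carry-less multiplication done one bit at a time.
final class CarrylessMultiply {
    // Carry-less product of two 32-bit values (treated as unsigned).
    // The product needs up to 63 bits, so it is returned as a long.
    static long clmul(int a, int b) {
        long x = a & 0xFFFFFFFFL; // widen a without sign extension
        long result = 0;
        for (int i = 0; i < 32; i++) {
            if (((b >>> i) & 1) != 0) {
                result ^= x << i; // XOR instead of addition: no carries propagate
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int a = 3;
        System.out.println(clmul(a, 3));  // carry-less: a ^ (a << 1)
        System.out.println(a * 3);        // regular:    a + (a << 1)
    }
}
```

With a = 3, the carry-less product is 3 ^ 6 = 5 whereas the regular product is 9: the two multiplications differ as soon as an addition would produce a carry.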

Why should you care about carry-less multiplications? It is actually handier than you might expect.

When you multiply two numbers by 3, you would assume that

a * 3 == b * 3

is only true if a has the same value as b. And this works in an actual computer using 64-bit or 32-bit arithmetic because 3 is coprime with any power of two (meaning that their greatest common divisor is 1).

The cool thing about this is that each odd integer has a multiplicative inverse. For example, in 32-bit arithmetic, we have that

0xAAAAAAAB * 3 == 1.

Sadly, troubles start if you multiply two numbers by an even number. In a computer, it is entirely possible to have

a * 4 == b * 4

without a being equal to b. And, of course, the number 4 has no inverse.
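
Both facts are easy to check in Java, where int arithmetic wraps around modulo 2^32. The sketch below is only an illustration: the class name OddInverse is made up, and the inverse is computed with Newton's iteration, a standard trick that is an assumption of this example and not something taken from the text above.

```java
// Illustrative sketch: inverses of odd integers in wrapping 32-bit arithmetic.
final class OddInverse {
    // Returns x such that odd * x == 1 in 32-bit arithmetic.
    // Assumes the argument is odd; even numbers have no inverse.
    static int inverse(int odd) {
        int x = odd; // odd * odd == 1 modulo 8, so 3 bits are already correct
        for (int i = 0; i < 4; i++) {
            x *= 2 - odd * x; // each Newton step doubles the number of correct bits
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(inverse(3)));
        System.out.println(0xAAAAAAAB * 3);
        // An even multiplier loses information: two different inputs, same product.
        int a = 0x40000000, b = 0;
        System.out.println(a * 4 == b * 4);
    }
}
```

The last check shows the trouble with even multipliers: 0x40000000 * 4 is 2^32, which wraps around to 0, the same value as 0 * 4.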

Recall that a good hash function is a function where different inputs are unlikely to produce the same value. So multiplying by an even number is troublesome.

We can "fix" this up by using the regular arithmetic modulo a prime number. For example, Euler found out that 2^31 - 1 (or 0x7FFFFFFF) is a prime number. This is called a Mersenne prime because its value is just one off from being a power of two. We can then define a new multiplication modulo a prime:

a '*' b = (a * b) % 0x7FFFFFFF.

With this modular arithmetic, everything is almost fine again. If you have

a '*' 4 == b '*' 4

then you know that a must be equal to b.

So problem solved, right? Well... You carry this messy prime number everywhere. It makes everything more complicated. For example, you cannot work with all 32-bit numbers. Both 0x7FFFFFFF and 0 are zero. We have 0x80000000 == 1 and so forth.
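
Here is a small Java sketch of this multiplication modulo 0x7FFFFFFF, showing the quirks just described. The names (MersenneMultiply, mulMod) are hypothetical, and treating the inputs as unsigned 32-bit values is an assumption of this example.

```java
// Illustrative sketch: multiplication modulo the Mersenne prime 2^31 - 1.
final class MersenneMultiply {
    static final long P = 0x7FFFFFFFL; // 2^31 - 1

    // Computes a '*' b = (a * b) % P, reading a and b as unsigned 32-bit values.
    static long mulMod(int a, int b) {
        long x = (a & 0xFFFFFFFFL) % P;
        long y = (b & 0xFFFFFFFFL) % P;
        return (x * y) % P; // x and y are below 2^31, so the product fits in a long
    }

    public static void main(String[] args) {
        System.out.println(mulMod(0x7FFFFFFF, 1)); // the prime itself acts as zero
        System.out.println(mulMod(0x80000000, 1)); // 2^31 acts as one
        // Multiplying by 4 no longer makes 0x40000000 collide with 0.
        System.out.println(mulMod(0x40000000, 4));
        System.out.println(mulMod(0, 4));
    }
}
```

The price is visible in the first two lines: several distinct 32-bit inputs now stand for the same value modulo the prime.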

What if there were prime numbers that are powers of two? There is no such thing... when using regular arithmetic... But there are "prime numbers" (called "irreducible polynomials" by mathematicians) that act a bit like they are powers of two when using carry-less multiplications.

With carry-less multiplications, it is possible to define a modulo operation such that

modulo(a "*" c) == modulo(b "*" c)

implies a == b. And it works with all 32-bit or 64-bit integers.
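
To see the property in action without 32-bit or 64-bit machinery, here is a scaled-down Java sketch using 8-bit values. It assumes the irreducible polynomial x^8 + x^4 + x^3 + x + 1 (0x11B), the one used by the AES cipher. That choice and the names (CarrylessModular, mulMod, isInjective) are assumptions made for this illustration, not the construction used by any particular hash function. The multiplier c must be non-zero.

```java
// Illustrative sketch: carry-less multiplication modulo an irreducible polynomial,
// scaled down to 8-bit values so that the property can be checked exhaustively.
final class CarrylessModular {
    static final int POLY = 0x11B; // x^8 + x^4 + x^3 + x + 1 (assumed, from AES)

    // Computes modulo(a "*" c) for 8-bit a and c.
    static int mulMod(int a, int c) {
        int product = 0;
        for (int i = 0; i < 8; i++) {
            if (((c >>> i) & 1) != 0) {
                product ^= a << i; // carry-less multiplication
            }
        }
        // The product has up to 15 bits: cancel bits 14 down to 8 with the polynomial.
        for (int bit = 14; bit >= 8; bit--) {
            if (((product >>> bit) & 1) != 0) {
                product ^= POLY << (bit - 8);
            }
        }
        return product;
    }

    // Returns true if no two distinct 8-bit values a and b give the same modulo(a "*" c).
    static boolean isInjective(int c) {
        boolean[] seen = new boolean[256];
        for (int a = 0; a < 256; a++) {
            int h = mulMod(a, c);
            if (seen[h]) {
                return false;
            }
            seen[h] = true;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isInjective(4)); // an "even" multiplier is no longer a problem
        System.out.println(isInjective(0)); // zero still maps everything to zero
    }
}
```

Unlike the Mersenne prime version, every 8-bit pattern is a distinct value here: nothing is wasted.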

That's very nice, isn't it?

Up until recently, however, this was mostly useful for cryptographers because computing the carry-less multiplications was slow and difficult.

However, something happened in the last few years. All major CPU vendors have introduced fast carry-less multiplications in their hardware (Intel, AMD, ARM, POWER). What is more, the latest Intel processors (Haswell and better) have a carry-less multiplication that is basically as fast as a regular multiplication, except maybe for a higher latency.

Cryptographers are happy: their fancy 128-bit hash functions are now much faster. But could this idea have applications outside cryptography?

To test the idea out, we created a non-cryptographic 64-bit hash function (CLHash). For good measure, we made it XOR universal: a strong property that ensures your algorithms will behave well, probabilistically speaking. Most of our function is a series of carry-less multiplications and bitwise XORs.

It is fast. How fast is it? Let us look at the next table...

CPU cycles per input byte:

Hash function   64B input   4kB input
CLHash          0.45        0.16
CityHash        0.48        0.23
SipHash         3.1         2.1

That's right: CLHash is much faster than competing alternatives as soon as your strings are a bit large. It can hash 8 input bytes per CPU cycle. You are more likely to run out of memory bandwidth than to wait for this hash function to complete. As far as we can tell, it might be the fastest 64-bit universal hash family on recent Intel processors, by quite a margin.

As usual, the software is available under a liberal open source license. There is even a clean C library with no fuss.

Faster hashing without effort

Modern software spends much time hashing objects. There are many fancy hash functions that are super fast. However, without getting fancy, we can easily double the speed of commonly used hash functions.

Java conveniently provides fast hash functions in its Arrays class. The Java engineers like to use a simple polynomial hash function:

for (int i = 0; i < len; i++) {
   h = 31 * h + val[i];
}

That function is very fast. Unfortunately, as it is written, it is not optimally fast for long arrays. The problem comes from the multiplication. To hash n elements, we need to execute n multiplications, and each multiplication relies on the result from the previous iteration. This introduces a data dependency. If your processor takes 3 cycles to complete the multiplication, then it might be idle half the time. (The compiler might use a shift followed by an addition to simulate the multiplication, but the idea is the same.) To compensate for the latency problem, you might unroll the loop as follows:

for (; i + 3 < len; i += 4) {
   h = 31 * 31 * 31 * 31 * h 
       + 31 * 31 * 31 * val[i] 
       + 31 * 31 * val[i + 1] 
       + 31 * val[i + 2] 
       + val[i + 3];
}
for (; i < len; i++) {
   h = 31 * h + val[i];
}

This new function breaks the data dependency. The four multiplications from the first loop can be done together. In the worst case, your processor can issue the multiplications one after the other, but without waiting for the previous one to complete. What is even better is that it can enter the next loop even before all the multiplications have time to finish, and begin new multiplications that do not depend on the variable h. For better effect, you can extend this process to blocks of 8 values, instead of blocks of 4 values.
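
For readers who want to try it, here is a self-contained Java sketch of both versions. The class and method names (UnrolledHash, hashSimple, hashUnrolled) are made up for this example, and the initial value of 1 is an assumption chosen to match what java.util.Arrays.hashCode does for a char array. Because int arithmetic wraps around, the two versions return exactly the same value.

```java
// Illustrative sketch: the simple polynomial hash and its 4-way unrolled variant.
final class UnrolledHash {
    // One multiplication per element, each waiting for the previous one.
    static int hashSimple(char[] val) {
        int h = 1; // assumed starting value, as in Arrays.hashCode(char[])
        for (int i = 0; i < val.length; i++) {
            h = 31 * h + val[i];
        }
        return h;
    }

    // Processes blocks of 4 values; only the first product depends on h.
    static int hashUnrolled(char[] val) {
        int h = 1;
        int i = 0;
        for (; i + 3 < val.length; i += 4) {
            h = 31 * 31 * 31 * 31 * h
                + 31 * 31 * 31 * val[i]
                + 31 * 31 * val[i + 1]
                + 31 * val[i + 2]
                + val[i + 3];
        }
        // Handle the last few values (fewer than 4) the simple way.
        for (; i < val.length; i++) {
            h = 31 * h + val[i];
        }
        return h;
    }

    public static void main(String[] args) {
        char[] data = "The quick brown fox jumps over the lazy dog".toCharArray();
        System.out.println(hashSimple(data) == hashUnrolled(data));
    }
}
```

The constant products such as 31 * 31 * 31 * 31 are computed once by the compiler, so the unrolled loop does not execute more multiplications per element than the simple one.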

So how much faster is the result? To hash a 64-byte char array on my machine…

  • the standard Java function takes 54 nanoseconds,
  • the version processing blocks of 4 values takes 36 nanoseconds,
  • and the version processing blocks of 8 values takes 32 nanoseconds.

So a little bit of easy unrolling almost doubles the execution speed for moderately long strings compared to the standard Java implementation.

You can check my source code.

Further reading:

See also Duff’s device for an entertaining and slightly related hack.

On the memory usage of maps in Java

Though we have plenty of memory in our computers, there are still cases where you want to minimize memory usage if only to avoid expensive cache faults.

To compare the memory usage of various standard map data structures, I wrote a small program where I create a map from the value k to the value k where k ranges from 1 to 100,000, using either a string or integer representation of the value k. As a special case, I also create an array that contains two strings, or two integers, for each value of k. This is “optimal” as far as memory usage is concerned since only the keys and values are stored (plus some small overhead for the one array). Since my test is in Java, I store integers using the Integer class, and strings using the String class.

Class           String, String (bytes)   Integer, Integer (bytes)
array           118.4                    40.0
fastutil        131.4                    21.0
Hashtable       150.3                    71.8
TreeMap         150.4                    72.0
HashMap         152.9                    74.5
LinkedHashMap   160.9                    82.5

The worst case is given by the LinkedHashMap which uses twice as much space as an array in the Integer, Integer scenario (82.5 bytes and 40 bytes respectively).

I have also added the fastutil library to the tests. Its hash maps use open addressing, which has reduced memory usage (at the expense of expecting good hash functions). The savings are modest in this test (10%). However, in the Integer-Integer test, I used the library’s ability to work with native ints, instead of Integer objects. The savings are much more significant in that instance: for each pair of 32-bit integers, we use only 21 bytes, compared to 74.5 bytes for the HashMap class.

Looking at these numbers, we must conclude that the relative overhead due to the map data structure is small. Of course, Java objects eat up a lot of memory. Each Integer object appears to take 16 bytes. Each String object appears to use at least 40 bytes. That’s for the objects themselves. To use them inside another data structure, you have to pay the price of a pointer to the object.
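
The array figure for integers can be reproduced with simple arithmetic. The sketch below assumes that each pointer to an object takes 4 bytes (as with compressed references on a 64-bit JVM): this is an assumption that happens to match the measured 40 bytes, not something that was measured directly. The class and method names are hypothetical.

```java
// Illustrative arithmetic: bytes per key-value pair when storing Integer objects in an array.
final class MapMemoryEstimate {
    static final int INTEGER_OBJECT_BYTES = 16; // per Integer object, from the measurements above
    static final int REFERENCE_BYTES = 4;       // ASSUMPTION: 4-byte (compressed) references

    // Two Integer objects per pair, plus one array slot pointing to each of them.
    static int arrayBytesPerPair() {
        return 2 * (INTEGER_OBJECT_BYTES + REFERENCE_BYTES);
    }

    // Extra bytes per pair that a map spends beyond the plain array.
    static double overheadBytes(double measuredBytesPerPair) {
        return measuredBytesPerPair - arrayBytesPerPair();
    }

    public static void main(String[] args) {
        System.out.println(arrayBytesPerPair());
        System.out.println(overheadBytes(74.5)); // HashMap with Integer keys and values
    }
}
```

Under these assumptions, the objects and the pointers to them account for 40 of the 74.5 bytes that a HashMap uses per pair of integers.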

In Java, the best way to save memory is to use a library, like fastutil, that works directly with native types.

Conclusion: Whether you use a TreeMap or HashMap seems to have very little effect on your memory usage.

Note: Please do not trust my numbers, review my code instead.

Where are all the search trees?

After arrays and linked lists, one of the first data structures computer-science students learn is the search tree. It usually starts with the binary search tree, and then students move on to B-trees for greater scalability.

Search trees are a common mechanism used to implement key-value maps (like a dictionary). Almost all databases have some form of B-tree underneath. In C++, up until recently, default map objects were search trees. In Java, you have the TreeMap class.

In contrast to the search tree, we have the hash map or hash table. Hash maps have faster single look-ups, but because the keys are not ordered physically, traversing the keys in sorted order can be much slower. And it might require fully sorting the keys as part of the iteration process, if you want to go through the keys in order.
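To make the trade-off concrete, here is a small sketch using JavaScript’s built-in Map (which behaves like a hash map as far as ordering is concerned). The keys and values are made up for the example.

```javascript
// A hash-map-like container: single look-ups are cheap...
const ages = new Map();
ages.set("carol", 41);
ages.set("alice", 30);
ages.set("bob", 25);

const aliceAge = ages.get("alice");
console.log(aliceAge); // 30

// ...but the container knows nothing about the natural order of the keys.
// To visit the keys in sorted order, we first copy them out and sort them,
// which costs O(n log n) on every such traversal.
const sortedKeys = Array.from(ages.keys()).sort();
for (const key of sortedKeys) {
  console.log(key, ages.get(key));
}
```

A search tree would keep the keys in order at all times, so the same traversal would need no extra sorting step.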

In any case, if technical interviews and computer-science classes make a big deal of search trees, you’d think they were ubiquitous. And yet, they are not. Hash maps are what is ubiquitous.

  • JavaScript, maybe the most widely used language in the world, does not have search trees as part of the language. The language provides an Object type that can be used as a key-value store, but the keys are not sorted in natural order. Because it is somewhat bug prone to rely on the Object type to provide a map functionality, the language recently acquired a Map type, but it is again a wrapper around what must be a hash map. Maps are “sorted” in insertion order, probably through a linked list so that, at least, the key order is not random.
  • Python, another popular language, is like JavaScript. It provides an all-purpose dict type that is effectively a map, but, in older versions of Python, if you were to store the keys ‘a’, ‘b’, ‘c’, it might give them back to you as ‘a’, ‘c’, ‘b’. (Try {'a': 0, 'b': 0, 'c': 0} for example.) That is, a dict is a hash map. (Since Python 3.7, a dict is guaranteed to preserve insertion order, but it remains a hash map.) Python has an OrderedDict class, but it merely remembers the order in which the keys were added (like JavaScript’s Map). So there is no search tree to be found!
  • The Go language (golang) provides a map type, but we can be sure there is no underlying search tree, since Go randomizes the iteration order of the keys by design! (To prevent users from relying on key order.)
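The JavaScript case is easy to check for yourself. The following sketch (with made-up keys) shows that a Map gives the keys back in insertion order, whereas a plain Object reorders keys that look like integers; neither gives you the natural order a search tree would provide.

```javascript
// Map: keys come back in the order in which they were inserted.
const m = new Map();
m.set("b", 1);
m.set("a", 2);
m.set("c", 3);
const mapOrder = [...m.keys()];
console.log(mapOrder); // [ 'b', 'a', 'c' ]

// Object: integer-like keys come first, in ascending numeric order,
// followed by the other string keys in insertion order.
const o = {};
o["b"] = 1;
o["10"] = 2;
o["2"] = 3;
o["a"] = 4;
const objectOrder = Object.keys(o);
console.log(objectOrder); // [ '2', '10', 'b', 'a' ]
```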

What does it mean? It means that millions of programmers program all day long without ever using a search tree, except maybe when they access their database.

Though key-value stores are essential to programmers, the functions offered by a search tree are much less important to them. This suggests that programmers access their data mostly in random order, or in order of insertion, not in natural order.

Further reading: Scott Meyers has an article showing that hash maps essentially outperform all other look-ups except for tiny ones where a sequential search is best.

Update: It is commonly stated that a hash map uses more memory. That might not be generally true however. In a test of hash maps against tree maps in Java, I found both to have comparable memory usage.

Secular stagnation: we are trimming down

Economists worry that we have entered a secular stagnation called the Great Stagnation. To summarize: whereas industrial productivity grew steadily for most of the XXth century, it started to flatten out in the 1970s. We have now entered an era where, on paper, we are not getting very much richer.

Houses are getting a bit larger. We can afford a few more clothes. But the gains from year to year are modest. Viewed from this angle, the stagnation looks evident.

Why is this happening? Economists have various explanations. Some believe that government regulations are to blame. Others point out that we have taken all the good ideas, and that the problems that remain are too hard to solve. Others yet blame inequality.

But there is another explanation that feels a lot more satisfying. We have entered the post-industrial era. We care less and less about producing “more stuff” and we are in a process of trimming down.

Young people today are less likely to own a car. Instead, they pay a few dozen dollars a month for a smartphone. They are not paying for the smartphone itself, they are paying for what it gives them access to.

Let us imagine the future, in 10, 20 or 30 years. What I imagine is that we are going to trim down, in every sense. People will own less stuff. Their houses won’t be much larger. They may even choose not to own cars anymore. They may choose to fly less often. If we are lucky, people will eat less. They may be less likely to be sick, and when sickness strikes, the remedy might be cheaper. They will use less power.

We are moving to a more abstract world. It is a world where it becomes harder to think about “productivity”, a concept that was invented to measure the output of factories. What is the “productivity” of a given Google engineer? The question is much less meaningful than if you had asked about the productivity of the average factory worker from 1950.

Suppose that, tomorrow, scientists discover that they have a cure for cancer. Just eat some kale daily and it will cure any cancer you have (say). This knowledge would greatly improve our lives… we would all be substantially richer. Yet how would economists see this gain? These scientists have just made a discovery that is almost without price… they have produced something of very great value… how is it reflected in the GDP? Would you see a huge bump? You would not. In fact, you might see a net decrease in the GDP!

We won’t cure cancer next year, at least not by eating kale… but our lives are made better year after year by thousands of small innovations of this sort. In many cases, these cannot be measured by economists. And that’s increasingly what progress will look like.

Measuring progress in a post-industrial world is going to get tricky.

Predicting the near future is a crazy, impossible game

Back in 1903, the Wright brothers flew for the first time, 20 feet above ground, for 12 seconds. Hardly anyone showed up. The event went largely unnoticed. It was not reported in the press. The Wright brothers did not become famous until many years later. Yet, barely a decade later, in 1914, we had war planes used for reconnaissance and dropping (ineffective) bombs. It was not long before we had dogfighting above the battleground.

Lord Kelvin, one of the most reputed and respected scientists at the time, wrote in 1902 that “No balloon and no aeroplane will ever be practically successful.”

If we could not see ten years in the future, back in 1903, what makes us think that we can see ten or twenty years in the future in 2015?

JavaScript and fast data structures: some initial experiments

Two of my favorite data structures are the bitset and the heap. The latter is typically used to implement a priority queue.

Both of these data structures come by default in Java. In JavaScript, there is a multitude of implementations, but few, if any, are focused on offering the best performance. That’s annoying because these data structures are routinely used to implement other fast algorithms. So I did what all programmers do, I started coding!

I first implemented a fast heap in JavaScript called FastPriorityQueue.js. As a programmer, I found that JavaScript was well suited to the task. My implementation feels clean.

How does it compare with Java’s PriorityQueue? To get some idea, I wrote a silly Java benchmark. The result? My JavaScript version can execute my target function over 27,000 times per second on Google’s V8 engine whereas Java can barely do it 13,000 times. So my JavaScript smokes Java in this case. Why? I am not exactly sure, but I believe that Java’s PriorityQueue implementation is at fault. I am sure that a heap implementation in Java optimized for the benchmark would fare much better. But I should point out that my JavaScript implementation uses far fewer lines of code. So bravo for JavaScript!
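For readers who have not written one, a heap-based priority queue can be sketched in a few dozen lines of JavaScript. This is a minimal illustration of the general technique (a binary heap stored in an array), not the code of FastPriorityQueue.js; the class and method names are chosen for the example only.

```javascript
// A minimal binary min-heap stored in a flat array.
// The children of the node at index i live at indexes 2i+1 and 2i+2.
class MinHeap {
  constructor(compare = (a, b) => a < b) {
    this.items = [];
    this.compare = compare; // returns true when a must come out before b
  }

  get size() {
    return this.items.length;
  }

  add(value) {
    const items = this.items;
    let i = items.length;
    items.push(value);
    // Sift up: move parents down until the new value finds its place.
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (!this.compare(value, items[parent])) break;
      items[i] = items[parent];
      i = parent;
    }
    items[i] = value;
  }

  poll() {
    const items = this.items;
    if (items.length === 0) return undefined;
    const top = items[0];
    const last = items.pop();
    const n = items.length;
    if (n > 0) {
      // Sift down: move the smaller child up until `last` finds its place.
      let i = 0;
      while (true) {
        let child = 2 * i + 1;
        if (child >= n) break;
        if (child + 1 < n && this.compare(items[child + 1], items[child])) {
          child++;
        }
        if (!this.compare(items[child], last)) break;
        items[i] = items[child];
        i = child;
      }
      items[i] = last;
    }
    return top;
  }
}

const heap = new MinHeap();
[5, 1, 4, 2, 3].forEach((v) => heap.add(v));
const drained = [];
while (heap.size > 0) drained.push(heap.poll());
console.log(drained); // [ 1, 2, 3, 4, 5 ]
```

Both add and poll touch at most one node per level of the tree, so they run in logarithmic time.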

I also wrote a fast bitset implementation in JavaScript. This was more difficult. JavaScript does not have any support for 64-bit integers as far as I can tell though it supports arrays of 32-bit integers (Uint32Array). I made do with what JavaScript had, and I published the FastBitSet.js library. How does it compare against Java? One benchmark of interest is the number of times you can compute the union between two bitsets (generating a new bitset in the process). In Java, I can do it nearly 3 million times a second. The JavaScript library appears limited to 1.1 million times per second. That’s not bad at all… especially if you consider that JavaScript is a very ill-suited language to implement a bitset (i.e., no 64-bit integers). When I tried to optimize the JavaScript version, to see if I could get it closer to the Java version, I hit a wall. At least with Google’s V8 engine, creating new arrays of integers (Uint32Array) is surprisingly expensive and seems to have nothing to do with just allocating memory and doing basic initialization. You might think that there would be some way to quickly copy a Uint32Array, but it seems to be much slower than I expect.

To illustrate my point, if I replace my bitset union code…

answer.words = new Uint32Array(answer.count);
for (var k = 0; k < answer.count; ++k) {
   answer.words[k] = t[k] | o[k];
}

by just the allocation…

answer.words = new Uint32Array(answer.count);

… the speed goes from 1.1 million times per second to 1.5 million times per second. This means that I have no chance to win against Java. Roughly speaking, JavaScript seems to allocate arrays about an order of magnitude slower than it should. That’s not all bad news. With further tests, I have convinced myself that if we can just reuse arrays, and avoid creating them, then we can reduce the gap between JavaScript and Java: Java is only twice as fast when working in-place (without creating new bitsets). I expected such a factor of two because JavaScript works with 32-bit integers whereas Java works with 64-bit integers.
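To make the distinction between the two approaches concrete, here is a minimal sketch of a word-based bitset with both kinds of union. The helper names (makeBitset, has, unionNew, unionInPlace) are hypothetical and are not the FastBitSet.js API; the sketch also assumes that both bitsets have the same number of words.

```javascript
// Each 32-bit word stores 32 consecutive values: value v lives in
// word (v >>> 5), at bit position (v & 31).
function makeBitset(values, wordCount) {
  const words = new Uint32Array(wordCount);
  for (const v of values) {
    words[v >>> 5] |= 1 << (v & 31);
  }
  return words;
}

function has(words, v) {
  return (words[v >>> 5] & (1 << (v & 31))) !== 0;
}

// Union that allocates a new array for the result (the costly case above).
function unionNew(a, b) {
  const answer = new Uint32Array(a.length);
  for (let k = 0; k < a.length; ++k) {
    answer[k] = a[k] | b[k];
  }
  return answer;
}

// Union that overwrites the first operand: no allocation at all.
function unionInPlace(a, b) {
  for (let k = 0; k < a.length; ++k) {
    a[k] |= b[k];
  }
  return a;
}

const a = makeBitset([1, 40], 2);
const b = makeBitset([2, 63], 2);

const u = unionNew(a, b);
console.log(has(u, 1), has(u, 2), has(u, 40), has(u, 63)); // true true true true
console.log(has(a, 2)); // false: a was left untouched

unionInPlace(a, b);
console.log(has(a, 2), has(a, 63)); // true true
```

The in-place version does the same bitwise work but skips the allocation, which is where the time seemed to go in my tests.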

What my experiments have suggested so far is that JavaScript’s single-threaded performance is quite close to Java’s. If Google’s V8 could gain support for 64-bit integers and faster array creation/copy, it would be smooth sailing.

Update: I ended up concluding that typed arrays (Uint32Array) should not be used. I switched to standard arrays for better all around performance.

The JavaScript libraries: FastPriorityQueue.js and FastBitSet.js.

Foolish enough to leave important tasks to a mere human brain?

To the ancient Greeks, the male reproductive organ was mysterious. They had this organ that can expand suddenly, then provide the seed of life itself. Today, much of biology remains uncatalogued and mysterious, but the male reproductive organ is now fairly boring. We know that it can be cut (by angry wives) and sewed back in place, apparently with no loss of function. As for providing the seed of life, artificial insemination is routine both in animals (e.g., cows) and human beings. In fact, by techniques such as cloning, we can create animals, and probably even human beings, with no male ever involved.

If we go back barely more than a century, flight was mysterious. Birds looked slightly magical. Then a couple of bicycle repairmen, who dropped out of high school, built the first airplane. Today, I would not think twice about embarking in a plane, with hundreds of other people, and flying over the ocean in a few hours… something no bird could ever do.

This is a recurring phenomenon: we view something as magical, and then it becomes a boring mechanism that students learn in textbooks. I call it the biological-supremacy myth: we tend to overestimate the complexity of anything biology can do… until we find a way to do it ourselves.

Though there is still much we do not know about even the simplest functions of our body, the grand mystery remains our brain. And just like before, people fall prey to the biological-supremacy myth. Our brains are viewed as mythical organs that are orders of magnitude more complex than anything human beings could create in this century or the next.

We spend a great deal of time studying the brain, benchmarking the brain, in almost obsessive ways. Our kids spend two decades being tested, retested, trained, retrained… often for the sole purpose of determining the value of the brain. Can’t learn calculus very well? Your brain must not be very smart. Can’t learn the names of the state capitals? Your brain must be slightly rotten.

In the last few years, troubles have arisen for those who benchmark the brain. I can go to Google and ask, in spoken English, for the names of the state capitals, and it will give them to me, faster than any human being could. If I ask Google “what is the derivative of sin x”, not only does it know the answer, it can also point to a complete derivation of the result. To make matters worse, the same tricks work anytime, anywhere, not just when I am at the library or at my desk. It works everywhere I have a smartphone, which is everywhere I might need calculus, for all practical purposes.

What is fascinating is that as we take down the brain from its pedestal, step by step, people remain eager to dismiss everything human-made as massively inferior:

  • “Sure, my phone can translate this Danish writing on the wall for me, but it got the second sentence completely wrong. Where’s your fantastic AI now?”
  • “Sure, I can go to any computer and ask Google, in spoken English, where Moldova is, and it will tell me better than a human being could… But when I ask it when my favorite rock band was playing again, it cannot figure out what my favorite rock band was. Ah! It is a joke!”

A general objection regarding the brain is that there is so much we do not know. As far as I can tell, we do not know how the brain transforms sounds into words, and words into ideas. We know which regions of the brain are activated, but we do not fully understand how even individual neurons work.

People assume that to surpass nature, we need to fully understand it and to further fully reproduce it. The Wright brothers would have been quite incapable of modeling bird flight, let alone reproducing it. And a Boeing looks like no bird I know… and that’s a good thing… I would hate to travel on top of a giant mechanical bird.

Any programmer will tell you that it can be orders of magnitude easier to reprogram something from scratch, rather than start from spaghetti code that was somehow made to work. We sometimes have a hard time matching nature, not because nature was so incredibly brilliant… but rather because, as an engineer, nature is a terrible hack: no documentation whatsoever, and an “if it works, it is good enough” attitude.

This same objection, “there is so much we do not know”, is used everywhere by pessimists. Academics are especially prone to fall back on this objection, because they like to understand… But, of course, all the time, we develop algorithms and medical therapies that work, without understanding everything about the problem. That’s the beautiful thing about the world we live in: we can act upon it in an efficient manner without understanding all of it.

Our puny brains may never understand themselves, but that does not make our brain wonderful and mysterious… it is more likely the case that our brains are a hack that works well enough, but that is far from the best way to achieve intelligence.

Another mistake people make is to assume that evolution is an optimization process that optimizes for what we care about as human beings. For centuries, people thought that if we were meant to fly, we would have wings. Evolution did not give us wings, not as a sign that we couldn’t fly… but simply because there was no evolutionary path leading to monkeys with wings.

Similarly, there is no reason to believe that evolution optimized human intelligence. It seems that other human species had larger brains. Our ancestors had larger brains. Several monkeys have photographic memory, much better strength/mass ratios and better reflexes. The human body is nothing special. We are not the strongest, fastest and smartest species to ever roam the Earth. It is likely that we came to dominate the animal kingdom because, as a species, we have a good mix of skills, and as long as we stay in a group, we can take down any other animal because we are expert at social coordination among mammals.

Yes, it is true that evolution benefited from a lot of time… But that’s like asking a programmer to tweak a piece of code randomly until it works. If you give it enough time, the result will work. It might even look good from the outside. But, inside, you have a big pile of spaghetti code. It is patiently tuned code, but still far from optimality from our point of view.

The Wright brothers were initially mocked. This reassured the skeptics who believed that mechanical flight was a heresy. But, soon after, airplanes flew in the first world war.

In 20 years, we will have machines that surpass the human brain in every way that matters to us. It will look nothing like a human brain… probably more like a Google data warehouse at first… And then we will be stuck with the realization that, from our reproductive organs all the way to our brains, we are nothing special.

Many people refuse to believe that we will ever have machines that are better than us in every way. And they are right to be scared because once you invent a machine that is smarter than you are, you have no choice: you have to put it in charge.

Human beings know that they are somehow irrelevant in the grand scheme of things. I write this blog post using a brain that consumes maybe 20 to 30 Watts, with the bulk of my neurons actually invested in controlling my body, not thinking abstractly. In a few decades, it will be trivial to outsmart me. And then I will be back to being an old, boring monkey… no longer a representative of the smartest species on Earth.

Of course, just because we do not need the male organ to procreate does not mean that people stop having sex. The birds did not stop flying when we invented the airplane. Television did not mean the end of radio. The Internet does not mean the end of the paper. Hopefully, my species will make use of its brains for many decades, many centuries… but soon enough, it will seem foolish to leave important decisions and tasks to a mere human brain.

Some of this future is already here.