The surprising cleverness of modern compilers

I wanted to know how a modern C compiler like clang would process the following C code:

#include <stdint.h>
int count(uint64_t x) {
  int v = 0;
  while(x != 0) {
    x &= x - 1; // clear the lowest set bit
    v++;        // one more set bit counted
  }
  return v;
}

Can you guess?

popcntq	%rdi, %rax

That is right. A fairly sophisticated C function, one that might puzzle many naive programmers, compiles down to a single instruction. (Tested with clang 3.8 using -O3 -march=native on a recent x64 processor.)

What does that mean? It means that C is a high-level language. It is not “down to the metal”. It might have been back when compilers were happy to just translate C into correct binary code… but those days are gone. One consequence of the cleverness of our compilers is that it gets hard to benchmark “algorithms”.

In any case, it is another example of externalized intelligence. Most people, most psychologists, assume that intelligence is what happens in our brain. We test people’s intelligence in a room, disconnected from the Internet, with only a pencil. But my tools should get as much or even more credit than my brain for most of my achievements. Left alone in a room with a pencil, I’d be a mediocre programmer, a mediocre scientist. I’d be no programmer at all. And this is good news. It is hard to expand or repair the brain, but we have a knack for building better tools.

We are passing the Turing test right on schedule

In 1950, the brilliant computing pioneer Alan Turing made the following prediction in his paper Computing Machinery and Intelligence:

I believe that in about fifty years’ time it will be possible, to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning. The original question, “Can machines think?” I believe to be too meaningless to deserve discussion. Nevertheless I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.

We are slightly over 50 years after Turing’s time, but I think it is fair to say, at least, that the storage capacity prediction has been reached. Turing used bits (or “binary digits”) as units of storage so 10^9 bits is on the order of 100MB. That must have sounded like an enormous storage capacity in 1950. But today, all smartphones have a lot more storage than that. In fact, it seems that Turing somewhat underestimated storage capacities. He reported the brain to have a storage capacity of less than 10^15 bits, or roughly 100TB, when today’s estimate puts the brain storage capacity at 10 times this amount. To be fair to Turing, storage bits were tremendously precious in 1950 so programmers used them with care. Today we waste bits without second thoughts. Many programs that require gigabytes of memory could make do with 100MB if they were carefully engineered. And even today, we know too little about the engineering of our brain to appreciate its memory usage.

As for the last part, where Turing says that people will accept that machines think, if you listen to people talk, they will routinely refer to software as “thinking”. Your mobile phone thinks you should turn left at the next corner. Netflix thinks I will like this movie. And so forth. Philosophers will still object that machines cannot think, but who listens to them?

We call our phones “smart”, don’t we?

What about fooling people into thinking that the machine is human? I think that Alan Turing, as an observer of our time, would have no doubt that this prediction has come to pass.

In 2014, a computer managed to pass for a 13-year-old boy and fool 33% of the judges. But it could be dismissed as an anecdote. However, recently, Ashok Goel, a professor of computer science, used IBM Watson’s technology to create a teaching assistant called Jill Watson. The assistant apparently fooled the students. Quoting from the New York Times:

One day in January, Eric Wilson dashed off a message to the teaching assistants for an online course at the Georgia Institute of Technology. “I really feel like I missed the mark in giving the correct amount of feedback,” he wrote, pleading to revise an assignment. Thirteen minutes later, the TA responded. “Unfortunately, there is not a way to edit submitted feedback,” wrote Jill Watson, one of nine assistants for the 300-plus students. Last week, Mr. Wilson found out he had been seeking guidance from a computer. “She was the person—well, the teaching assistant—who would remind us of due dates and post questions in the middle of the week to spark conversations,” said student Jennifer Gavin. “It seemed very much like a normal conversation with a human being,” Ms. Gavin said. Shreyas Vidyarthi, another student, ascribed human attributes to the TA—imagining her as a friendly Caucasian 20-something on her way to a Ph.D. Students were told of their guinea-pig status last month. “I was flabbergasted,” said Mr. Vidyarthi.

So Turing was right. We are about 15 years after his 50-year mark, but a 30% error margin when predicting the future is surely acceptable.

Let us reflect on how Turing concluded his 1950 article…

We may hope that machines will eventually compete with men in all purely intellectual fields. But which are the best ones to start with? Even this is a difficult decision. Many people think that a very abstract activity, like the playing of chess, would be best. It can also be maintained that it is best to provide the machine with the best sense organs that money can buy, and then teach it to understand and speak English.

Chess was settled in 1997, when a computer defeated the reigning world champion. Our brains are now obsolete as far as the game of Chess is concerned. Computers can speak English quite well. They understand us, to a point.

All in all, Turing was nearly prescient in how he imagined the beginning of the XXIst century.

Professors intentionally slow down science to make themselves look better

Recently, the president of the United States announced a big anti-cancer initiative, to be headed by his vice-president Joe Biden. Will it be fruitful? Maybe. But an important obstacle has become clear. Researchers are glad to receive funds from the government, but they do not want to share back data, software, and raw results. They are willing to publish only as a way to claim credit, but not as a way to open their laboratories to the world. This means that years can be lost while others try to reproduce research results from partial information.

Though American medical researchers are required to post results from clinical trials, it takes only a few minutes to find out that the information is not available: only about a tenth of completed studies report results a year after the study is completed. The results of most government-funded clinical trials are not shared, and when they are, the information is often limited. Chen et al. recently concluded:

Despite the ethical mandate and expressed values and mission of academic institutions, there is poor performance and noticeable variation in the dissemination of clinical trial results across leading academic medical centers.

This is far from limited to medicine. If you are trying to access some academic research results, and you are failing to get access to the data and the software, it is most likely by design.

If you ask representatives from the research community, they might tell you that there is too little funding, so academic researchers must compete fiercely for positions. If governments only gave a bit more money, people would be more open to share. But if this theory held, then established researchers who have jobs for life and a healthy track record of securing large grants should be more open to sharing than the post-docs who are hunting for a job. I have never heard of any such documented correlation. And given that the established researchers set the pace in science, it is unlikely that they are advocates of openness. In any case, according to this theory, we would see variations on how willing people are to share according to the level of government funding… and, again, no such correlation was ever established.

The other story is that if they were to share their data and software, others could benefit while they would get, at best, partial credit. That story is closer to the truth.

The real reason the Harvard medical researcher does not fully open his lab to the Stanford medical researcher is that he is afraid that the Stanford medical researcher will go back home and use the data to achieve a breakthrough.

To put it in clear terms, if you are an academic researcher, and you have data or software that could be useful to others… that could enable them to cure cancer, get trees to absorb CO2 faster, or crack AI… you’d rather that they do not do these things as they would cast a shadow on your own accomplishments.

But, surely, academic researchers are a lot more open than people from industry? Not so.

“trials by industry were 3 times more likely to report results than were trials funded by the NIH [government funding agency]” (Law et al.)

The problem is that academic researchers are overly obsessed with their own personal social status. It is a stronger pull than the desire to advance science, the public good or even commercial interests.

Some people then object that it is not an obsession with their own social status, but rather a lack of incentives. Reward the researchers with more social status if they share and they will, says this theory. Of course, if you think it through, this objection is nothing more than a rewording of my own explanation: researchers tend to only think about their own social status and everything else is secondary.

Engineering the system so that it gives more social status to people who share is hard and open to gaming.

Instead, I think it is time that we get busy mocking academic researchers. Someone needs to find all government-funded studies without reporting and openly shame the offenders.

And this would only be the start. We need to take down academia’s ego.

Next, someone needs to start issuing data, software or details requests to various government-funded laboratories. When the information is not forthcoming, we should openly report the status-obsessed professors.

Eventually, if there is enough shaming, the professors will have no choice but to share so as to protect their status.

Further reading: Michael Nielsen, Reinventing Discovery: The New Era of Networked Science, 2011.

Email: how to be polite and efficient

Email is an old platform, but it still represents the cornerstone of most of our work online. Surprisingly, many people seem to be using email poorly. Here are a few basic rules to keep us productive.

  • Long emails are inefficient because people do not read them.
  • Angry emails should be used with care as they can have devastating effects on the recipients. Long angry emails are almost never a good idea.
  • Passive-aggressive emails are just as dangerous as angry emails.
  • The subject line of an email should reflect its content. Most importantly, it needs to give the recipient a reason to read it.
  • Formalities (“Dear Sir, (…) Sincerely yours”) only make your emails longer and less efficient. Long signatures at the end of your email are also extraneous.
  • Bandwidth is cheap and documents such as Word or PDF are already compressed. Putting them into compressed archives (zip or RAR) is inefficient. If you put documents into archives to “pass the corporate anti-viral firewall”, you are telling the world that you have an idiotic security setup.
  • We have Internet protocols to reasonably ensure email delivery without adding to the cognitive load of the human users. Unsolicited automated emails are spam that burdens our lives. Automated emails are spam unless they were solicited. It is that simple. So your “I am away” or “I got your email” emails are spam.
  • It is ok to follow up with someone if you were expecting an answer. However, to require an immediate answer of the type “I got your email” from co-workers and professionals you interact with is abusive and impolite. Most people are busy with tasks besides email and if you expect them to put everything they are doing aside every time you send an email, then they are basically “on call” for you. Having someone “on call” is a luxury. Some people are indeed on call (e.g., tech support, emergency specialists, and so on) but they represent a minority and they should probably not use email in any case.

More reading: How to write me an email (and get a response) by Julian Togelius

Is software a neutral agent?

We face an embarrassing amount of information but when we feel overwhelmed, as Clay Shirky said, “It’s not information overload. It’s filter failure.” Unavoidably, we rely heavily on recommender systems as filters. Email clients increasingly help you differentiate the important emails from the routine ones, and they regularly hide from your sight what qualifies as junk. Netflix and YouTube work hard so that you are mostly presented with content you want to watch.

Unsurprisingly, YouTube, Facebook, Netflix, Amazon and most other big Internet players have heavily invested in their recommender systems. Though it is a vast field with many possible techniques, one key ingredient is collaborative filtering, a term first coined in 1992 by David Goldberg (now at eBay but then at Xerox PARC). It has become known through, in part, the work done at Amazon by Greg Linden on item-to-item collaborative filtering (“people who liked this book also liked these other books”), patented in 1998. The general principle underlying collaborative filtering is that if people who are like you like something, then you are more likely to like it too. Thus, we should not be mistaken and think that the recommender systems are sets of rules inputted by experts. They are in fact an instance of machine learning where the software learns to predict us by watching us.

But this also means that these filters, these algorithms, are in part a reflection of what we are, how we act. And these algorithms know us better than we may think. And that’s true even if you share nothing about yourself. For example, Jernigan and Mistree showed in 2009 that based solely on the profiles of the people who declared themselves to be your friends, an algorithm can determine your sexual orientation. Using minute traces that you unavoidably leave online, we can determine your sexual orientation, ethnicity, religious and political views, your age, and your gender. There is an entire data-science industry that is dedicated to tracking what we buy, what we watch… Whether they do it directly or not, intentionally or not, recommender systems in YouTube, Facebook, Netflix, Amazon take into account your personal and private attributes in selecting content for you.

We should not be surprised that we are tracked so easily. The overwhelming majority of the Internet players are effectively marketing agents, paid to provide you with relevant content. It is their core business to track you.

However, though polls are also a reflection of our opinions, it has long been known that they influence the vote, even when pollsters are as impartial as they can be. Recommender systems are therefore not neutral: they affect our behavior. For example, some researchers have observed that recommender systems tend to favor blockbusters over the long tail. This can be true even as, at the individual level, the system makes you discover new content… seemingly increasing your reach… while leaving the small content producers in the cold.

Some algorithms might be judged unfair or “biased”. For example, it has been shown that if you self-identify as a woman, you might see fewer online ads for high-paying jobs than if you are a man. This could be explained, maybe, by a natural tendency for men to click on ads for higher-paying jobs, compared to women. If the algorithm seeks to maximize content that it believes is interesting to you based on your recorded behavior, then there is no need to imagine a nefarious ad agency or employer.

In any case, we have to accept software as an active agent that helps shape our views and our consumption rather than a mere passive tool. And that has to be true even when the programmers are as impartial as they can be. Once we set aside the view of software as an impartial object, we can no longer remain oblivious to its effect on our behavior. At the same time, it may become increasingly difficult to tweak this software, even for its authors, as it grows in sophistication.

How do you check how the algorithms work? The software code is massive, ever-changing, on remote servers, and very sophisticated. For example, the YouTube recommender system relies on deep learning, the same technique that allowed Google to defeat the world champion at Go. It is a complex collection of weights that mimics our own brain. Even the best engineers might struggle to verify that the algorithm behaves as it should in all cases. And government agencies simply cannot read the code as if it were recipes, assuming that they can even legally access it. But can governments at least measure the results or enable the providers to give verifiable measures? Of course, if governments have complete access to our data, they can, but is that what we want?

The Canadian government has tried to regulate what kind of personal data companies can store and how they can store it (PIPEDA). In a globalized world, such laws are hard to enforce but even if they could be enforced, would they be effective? Recall that from minute traces, software can tell more about you than you might think… and, ultimately, people do want to receive personalized services. We do want Netflix to know which movies we really like.

Evidently, we cannot monitor Netflix the same way we monitor a TV station. We can study the news coverage that newspapers and TV shows provide, but what can we say about how Facebook paints the world for us?

We must realize that even if there is no conspiracy to change our views and behavior, software, even brutally boring statistics-based software, is having this effect. And the effect is going to get ever stronger and harder to comprehend.

Further reading:

  • Datta, A., Tschantz, M. C., & Datta, A. (2015). Automated experiments on Ad privacy settings. Proceedings on Privacy Enhancing Technologies, 2015(1), 92-112.
  • Goldberg, D., Nichols, D., Oki, B. M., & Terry, D. (1992). Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35(12), 61-70.
  • Fleder, D., & Hosanagar, K. (2009). Blockbuster culture’s next rise or fall: The impact of recommender systems on sales diversity. Management Science, 55(5), 697-712.
  • Jernigan, C., & Mistree, B. F. (2009). Gaydar: Facebook friendships expose sexual orientation. First Monday, 14(10).
  • Kosinski, M., Stillwell, D., & Graepel, T. (2013). Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15), 5802-5805.
  • Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 7(1), 76-80.
  • Statt, N., YouTube redesigns its mobile apps with improved recommendations using ‘deep neural networks’, April 26th, 2016.
  • Tutt, A., An FDA for Algorithms (March 15, 2016).

We know a lot less than we think, especially about the future.

The inventors of the airplane, the Wright brothers, had little formal education (3 and 4 years of high school respectively). They were not engineers. They were not scientists. They ran a bicycle repair shop.

At the time of their invention, there was quite a bit of doubt as to whether airplanes were possible. It is hard to imagine how people could doubt the possibility of an airplane, but many did slightly over a century ago.

Lord Kelvin famously said that “heavier-than-air flying machines are impossible” back in 1895.

But that is not all. The American government had nonetheless funded an illustrious Physics professor, Samuel Langley, with millions of dollars in today’s currency so that he would build an airplane. The man had written the textbook on aeronautics at the time.

Langley failed miserably. This led the illustrious New York Times to publish this prediction:

flying machine which will really fly might be evolved by the combined and continuous efforts of mathematicians and mechanicians in from one million to ten million years

It is likely at this point that many experts would have agreed with the New York Times. Flying was just not possible. We had given large sums to the best and smartest people. They could not make a dent in the problem. We had the greatest scientists in the world stating openly that flying was flat out impossible. Not just improbable, but impossible.

Yet only a few days later, with no government grant, no prestigious degree, no credential whatsoever, the Wright brothers flew a heavier-than-air machine. That was 1903.

In the First World War, which began in 1914, only about a decade later, both camps used war planes.

The story is worse than I make it sound because even after the Wright brothers did fly… it took years for the Americans to notice. That is, people did not immediately recognize the significance of what the Wright brothers demonstrated.

You think we are smarter now and such silliness would not happen.

Here is what Steve Ballmer, then Microsoft’s CEO, said about the iPhone when it came out…

it [the iPhone] doesn’t appeal to business customers because it doesn’t have a keyboard, which makes it not a very good email machine. Right now we’re selling millions and millions and millions of phones a year, Apple is selling zero phones a year.

That was 2007. Today Apple sells about 60 million iPhones per quarter. How many phones does Microsoft sell? How many Microsoft phones have you seen lately?

To be fair, it is true that most new ideas fail. We get a new cure for Alzheimer’s every week. The fact that we get a new one every week is a pretty good indication that it is all hype. But the real lesson is not that we cannot break through hard problems. The true lesson is that we know a lot less than we think, especially about the future.

Pessimism is the easy way out. Asked about any new idea, I can simply say that it is junk. And I will be right 99% of the time. We obsess about not being wrong when, in fact, if you are not regularly wrong, you are simply not trying hard enough. What matters is that you are somehow able to see the important things as they are happening. Pessimists tend to miss everything but the catastrophes.

How will you die? Cancer, Alzheimer’s, Stroke?

Before the 1950s, many of us suffered from poliomyelitis and too many ended up crippled. Then we developed a vaccine and eradicated the disease. Before the second world war, many people, even the richest, could die of a simple foot infection. Then we mass-produced antibiotics and got rid of the problem.

I have stated that it is basically a matter of time before we get the diseases of old age (cancer, stroke, dementia…) under control. It is impossible to tell when it will happen. Could be a couple of decades, could be 45 years, could be a century or a bit more. As a precaution, you should never trust anyone who says he can predict the future more than a couple of years in advance. However, progress that is not impossible in principle tends to reliably happen, on its own schedule.

Whenever we get the diseases of aging under control, we will end up with a drastically extended healthspan. Simply put, most of us end up sick or dead because of the diseases of old age. Without these diseases, we would remain healthy for much longer.

It comes down to the difference between having airplanes and not having them. Having electricity or not having it. Having the Internet or not having it. These are drastic differences.

Stating that the diseases of aging will come under control at some point in our future should not be controversial. And you would hope that people would see this as a positive outcome.

Not so.

The prospect that we may finally defeat aging is either rejected as being too improbable, or, more commonly, is rejected as being undesirable. Nick Bostrom even wrote a fable to illustrate how people commonly react.

The “improbable” part can always be argued. Anything that has never been done can always be much harder to achieve than we think. However, some progress is evident. Jimmy Carter, a 91-year-old man, was “cured” of a brain tumor recently. Not long ago, such feats were unthinkable. So it becomes increasingly difficult to argue that a few decades of research cannot result in substantial medical progress.

So we must accept, at least in principle, that the diseases of aging may “soon” come under control, where by “soon” I mean “this century”. This would unavoidably extend human life.

Recently, one of my readers had this very typical reaction:

As for extending human life, I’m not for it.

If you tend to agree with my reader, please think it through.

Aging does not, by itself, kill us. What kills us are the diseases that it brings, such as stroke, dementia and cancer. So if you are opposed to people living healthier, longer lives, then you are in favor of some of these diseases. I, for one, would rather that we get rid of stroke, cancer and dementia. I do not want to see these diseases in my family.

Medical research is a tiny fraction of our total spending. Medical spending is overwhelmingly directed toward palliative care. To put it bluntly, we spend billions, trillions, caring for people who are soon going to die of Alzheimer’s or cancer. This is quite aside from the terrible loss of productivity and experience caused by these diseases.

If we could get rid of these diseases, we would be enormously richer… we would spend much less on medical care and have people who are a lot more productive. The costs of aging are truly enormous and rising right now. Keeping people healthy is a lot cheaper than keeping sick people from dying.

Moreover, increased lifespans in modern human beings are inextricably linked with lower fertility and smaller populations. Lifespans are short in Africa and long in Europe… yet it is Africa that is going to suffer from overpopulation.

As people grow more confident that they will live long lives, they have fewer children and they have them later. Long-lived individuals tend to contribute more and use less support, relatively speaking.

If you are in favor of short human lifespans through aging, then you must be opposed to medical research on the diseases of aging such as dementia, stroke, and cancer. You should, in fact, oppose anything but palliative care since curing dementia or cancer is akin to extending lifespan. You should also welcome news that members of your family suffer from cancer, Parkinson’s and Alzheimer’s. They will soon leave their place and stop selfishly using our resources. Their diseases should be cause for celebration.

Of course, few people celebrate when they learn that they suffer from Alzheimer’s. Yet this disease is all too natural. Death is natural. So are infectious diseases. We could reject antibiotics because dying of an infection is “natural”. Of course, we do not.

Others object that defeating the diseases of aging (cancer, Alzheimer’s, stroke…) means that we become immortal and that’s clearly troubling and maybe unsustainable. But it is unfounded. Short of rebuilding our bodies with nanotechnology, the best we could probably do is make it so that people of all chronological ages have the mortality rate they had when they were thirty. That’s a very ambitious goal that I doubt we have any chance of reaching in this century. And yet, people in their thirties die all the time. They simply do not tend to die of aging.

Yet others fall prey to the Tithonus error and believe that if we somehow get the diseases of aging under control, we will remain alive while growing increasingly frail and vulnerable. But, of course, being vulnerable is the gateway to the diseases of old age. You cannot control the diseases of aging without making sure that people remain relatively strong.

Others fear that only the few will be able to afford medicine to keep the diseases of old age at bay… It is sensible to ask whether some people could have earlier access to technology, but from an ethical point of view, one should start with the observation that the poorest among us are the hardest hit by the diseases of aging. Bill Gates won’t be left alone to suffer in a dirty room with minimal care. Healthy poor people are immensely richer than sick “poor” people. Like vaccines, therapies to control the diseases of old age are likely to be viewed as public goods. Once more: controlling the diseases of old age will make us massively richer.

I am sure that, initially, some people expressed concerns regarding the use of antibiotics. Who will benefit most from vaccines? Can you imagine contemplating this question when the Americans decided to mass produce antibiotics for the first time?

When the Internet came of age, many people wrote long essays against it. There were big fears that the Internet would create a digital divide, where the rich would jump ahead while the poor would remain disconnected. Yet, today, the poorest kids in the world have access to the same Wikipedia as Bill Gates’ kids. There were fears that the rise of computers would isolate people. Yet today the Internet is our social gateway.

Now that we are starting to think about getting the diseases of aging under control, people object. But let me assure you that when it comes down to it, if there are cures for the diseases of aging, and you are old and sick, you will almost certainly accept the cure no matter what you are saying now. And the world will be better for it.

Please, let us just say no to dementia, stroke and cancer. They are monsters.

Further reading: Nick Bostrom, The Fable of the Dragon-Tyrant, Journal of Medical Ethics, 2005.

The powerful hacker culture

In my post the hacker culture is winning, I observed that the subculture developed in the software industry is infecting the wider world. One such visible culture shift is the concept of “version update”. In the industrial era, companies would design a phone, produce it and ship it. There might be a new type of phone the following year, but whatever you bought is what you got. In some sense, both politically and economically, the industrial era was inspired by the military model. “You have your orders!”

Yet, recently, a car company, Tesla, released an update so that all its existing cars acquired new functions (self-driving on highways). You simply could not even have imagined such an update in the industrial era.

It is an example of what I called innovation without permission, a feature of the hacker culture. It is an expression of the core defining hacker characteristic: playfulness and irreverence. Hackers will install Linux on the latest PlayStation even if Sony forbids it and tries to make it impossible. Why would any team invest months of work on such a futile project?

What is unique to hackers is that displays of expertise have surpassed mere functionality to take a life of their own. Though my colleagues in the “Arts” often roll their eyes when I point it out, the hackers are the true subversive artists of the post-industrial era.

The hacker culture has proven its strength. We got Chelsea Manning’s and Julian Assange’s WikiLeaks, the somewhat scary underground work by Anonymous, Edward Snowden’s leak, the Panama papers and so forth. Aaron Swartz scared the establishment so much that they sought to put him behind bars for life merely because he downloaded academic articles.

You might object that many of these high-profile cases ended with the hackers being exiled or taken down… but I think it is fair to say that people like Aaron Swartz won the culture war. As a whole, more people, not fewer, are siding with the hackers. Regarding the Panama Papers, there were some feeble attempts to depict the leak as a privacy violation, but it no longer carries weight as an argument. TV shows increasingly depict hackers as powerful (and often rightful) people (e.g., House of Cards, The Good Wife, and Homeland).

Who is gaining ground, do you think?

What makes the hacker culture strong?

  • Hackers control the tools. Google, Microsoft and Apple have powerful CEOs, but they need top-notch hackers to keep the smartphones running. Our entire culture is shaped by how these hackers think through our tools.

    The government might be building up fantastic cyberweapons, but what the Snowden incident proved is that this may only give more power to the hackers. You know who has access to all your emails? Software hackers.

    Our tools have come to reflect the hacker culture. They are more and more playful and irreverent. We now have CEOs posting on Twitter using 140 characters. No “sincerely yours”, no corporate logo.

  • Hackers are rich with time and resources. Most companies need hackers, but they can’t really tell what the best ones are up to. How do you think we ended up with Linux running most of our Internet infrastructure? It is not the result of central planning or a set of business decisions. It happened when hackers were toying with Linux while the boss was not looking. When you have employees stacking crates, it is easy for an industrial-age boss to direct them. How do you direct extremely smart people who are typing on keyboards?

    Apparently, Linus Torvalds works in his bathrobe at home. He spends a lot of time swearing at other people on posting boards. He can afford all of that because it is impossible to tell Linus what to do.

I don’t think it is mere coincidence that the powerful people are embracing the hacker culture. I could kid and point out that, though the true hackers may not represent many people and may not formally hold much wealth, they metaphorically control the voting machines and hold all the incriminating pictures. But rather, I think that smart people realize that the hacker culture might also be exactly what we need to prosper in the post-industrial era. The military approach is too crude. We don’t need more factories. We don’t need more tanks. But we sure can use smarter software. And that’s ultimately where the hackers get their power: they put results into your hands.

No more leaks with sanitize flags in gcc and clang

If you are programming in C and C++, you are probably wasting at least some of your time hunting down memory problems. Maybe you allocated memory and forgot to free it later.

A whole industry of tools has been built to help us trace and solve these problems. On Linux and macOS, the state of the art has been valgrind. Build your code as usual, then run it under valgrind and memory problems should be identified.

Tools are nice but a separate check breaks your workflow. If you are using recent versions of the GCC and clang compilers, there is a better option: sanitize flags.

Suppose you have the following C program:

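#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
   char * buffer = malloc(1024);  // allocated here, never freed
   sprintf(buffer, "%d", argc);   // write the argument count into the buffer
   printf("%s\n", buffer);        // print it
   return 0;                      // buffer is leaked when main returns
}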

Save this file as s.c. The program should simply print out how many arguments were entered on the command line (argc counts the program name itself, so the smallest value is 1). Notice the call to malloc that allocates a kilobyte of memory. There is no accompanying call to free and so the kilobyte of memory is “lost” and only recovered when the program ends.

Let us compile the program with the appropriate sanitize flags (-fsanitize=address -fno-omit-frame-pointer):

gcc -ggdb -o s s.c -fsanitize=address -fno-omit-frame-pointer

When you run the program, you get the following:

$ ./s

==3911==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1024 byte(s) in 1 object(s) allocated from:
    #0 0x7f55516b644a in malloc (/usr/lib/x86_64-linux-gnu/
    #1 0x40084e in main /home/dlemire/tmp/s.c:6
    #2 0x7f555127eec4 in __libc_start_main (/lib/x86_64-linux-gnu/

SUMMARY: AddressSanitizer: 1024 byte(s) leaked in 1 allocation(s).

Notice how it narrows down to the line of code where the memory leak came from?

It is even nicer: the return value of the command will be non-zero, meaning that if this code were run as part of software testing, you could automagically flag the code as being buggy.
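For comparison, here is a minimal sketch of what a leak-free version of the same logic might look like. The helper name print_count and the BUFFER_SIZE constant are my own, introduced only for illustration; they are not part of s.c. The only substantive change is the call to free.

```c
#include <stdio.h>
#include <stdlib.h>

enum { BUFFER_SIZE = 1024 };  // same kilobyte as in s.c

// Hypothetical helper (not in the original program): formats `count`
// into a heap buffer, prints it, and releases the buffer before returning.
// Returns the number of digits/characters formatted, or -1 if malloc fails.
int print_count(int count) {
    char *buffer = malloc(BUFFER_SIZE);
    if (buffer == NULL) {
        return -1;
    }
    int written = snprintf(buffer, BUFFER_SIZE, "%d", count);
    printf("%s\n", buffer);
    free(buffer);  // the call that s.c was missing
    return written;
}
```

If main simply called print_count(argc), a build with the same sanitize flags should run without a LeakSanitizer report, since every allocation is matched by a free.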

While you are at it, you can add other sanitize flags such as -fsanitize=undefined to your code. The undefined sanitizer will warn you if you are relying on undefined behavior as per the C or C++ specifications.
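To make this concrete: signed integer overflow is undefined behavior in C, so evaluating INT_MAX + 1 is the kind of thing the undefined sanitizer reports at runtime. Below is a small sketch of how one might avoid the overflow in the first place. The function checked_add is a hypothetical helper of my own, not something provided by the compiler or the sanitizers.

```c
#include <limits.h>
#include <stdbool.h>

// Hypothetical helper: stores a + b in *result and returns true when the
// sum fits in an int. Returns false, leaving *result untouched, when the
// addition would overflow (which would be undefined behavior).
bool checked_add(int a, int b, int *result) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;  // a + b is not representable as an int
    }
    *result = a + b;   // safe: the range was checked above
    return true;
}
```

The test is done before the addition on purpose: checking the sign of the result afterwards would already rely on the overflow having happened.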

These flags represent significant steps forward for people programming in C or C++ with gcc or clang. They make it a lot more likely that your code will be reliable.

Really, if you are using gcc or clang and you are not using these flags, you are not being serious.

How close are AI systems to human-level intelligence? The Allen AI challenge.

With respect to artificial intelligence, some people are squarely in the “optimist” camp, believing that we are “nearly there” as far as producing human-level intelligence. Microsoft co-founder Paul Allen has been somewhat more prudent:

While we have learned a great deal about how to build individual AI systems that do seemingly intelligent things, our systems have always remained brittle—their performance boundaries are rigidly set by their internal assumptions and defining algorithms, they cannot generalize, and they frequently give nonsensical answers outside of their specific focus areas.

So Allen does not believe that we will see human-level artificial intelligence in this century. But he nevertheless generously created a foundation aiming to develop such human-level intelligence, the Allen Institute for Artificial Intelligence.

The Institute is led by Oren Etzioni, who obviously shares some of Allen’s “pessimistic” views. Etzioni has made it clear that he feels that the recent breakthroughs of Google’s DeepMind (i.e., beating the best human beings at Go) should not be exaggerated. Etzioni gave as an example the fact that their research paper search engine (Semantic Scholar) can differentiate between the significant citations and the less significant ones. The way DeepMind’s engine works is that it looks at many, many examples and learns from these examples because they are clearly and objectively classified (we know who wins and who loses a given game of Go). But there is no win/lose label on the content of research papers. In other words, human beings become intelligent in an unsupervised manner, often working from few examples and few objective labels.

To try to assess how far off we are from human-level intelligence, the Allen Institute launched a game where people had to design an artificial intelligence capable of passing 8th-grade science tests. They gave generous prizes to the best three teams. The questions touch various scientific domains:

  • How many chromosomes does the human body cell contain?
  • How could city administrators encourage energy conservation?
  • What do earthquakes tell scientists about the history of the planet?
  • Describe a relationship between the distance from Earth and a characteristic of a star.

So how far are we from human-level intelligence? The Institute published the results in a short paper.

Interestingly, all three top scores were very close (within 1%). The first prize went to Chaim Linhart who scored 59%. My congratulations to him!

How good is 59%? That’s the glass half-full, glass half-empty problem. Possibly, the researchers from the Allen Institute do not think it qualifies as human-level intelligence. I do not think that they set a threshold ahead of time. They don’t tell us how many human beings can’t manage to get even 59%. But I think that they now set the threshold at 80%. Is this because that’s what human-level intelligence represents?

All three winners expressed that it was clear that applying a deeper, semantic level of reasoning with scientific knowledge to the questions and answers would be the key to achieving scores of 80% and beyond, and to demonstrating what might be considered true artificial intelligence.

It is also unclear whether 59% represents the best an AI could do right now. We only know that the participants in the game organized by the Institute could not do better at this point. What score are the researchers from the Allen Institute able to get on their own game? I could not find this information.

What is interesting however is that, for the most part, the teams threw lots of data in a search engine and used information retrieval techniques combined with basic machine learning algorithms to solve the problem. If you are keeping track, this is reminiscent of how DeepMind managed to beat the best human player at Go: use good indexes over lots of data coupled with unsurprising machine learning algorithms. Researchers from the Allen Institute appear to think that this outlines our current limitations:

In the end, each of the winning models found the most benefit in information retrieval based methods. This is indicative of the state of AI technology in this area of research; we can’t ace an 8th grade science exam because we do not currently have AI systems capable of going beyond the surface text to a deeper understanding of the meaning underlying each question, and then successfully using reasoning to find the appropriate answer.
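To give a feel for what “information retrieval based” answering means, here is a deliberately crude toy of my own. It is not the winners’ code and makes no claim about their actual methods. It assumes a passage has already been retrieved for the question, and it picks the multiple-choice answer sharing the most words with that passage. The function names overlap_score and best_answer are hypothetical.

```c
#include <stddef.h>
#include <string.h>

// Counts how many space-separated words of `candidate` occur somewhere in
// `passage`. Crude on purpose: case-sensitive substring matching, and
// candidates longer than 255 characters are truncated.
int overlap_score(const char *passage, const char *candidate) {
    char copy[256];
    strncpy(copy, candidate, sizeof copy - 1);
    copy[sizeof copy - 1] = '\0';  // strncpy may not terminate the string
    int score = 0;
    for (char *word = strtok(copy, " "); word != NULL; word = strtok(NULL, " ")) {
        if (strstr(passage, word) != NULL) {
            score++;
        }
    }
    return score;
}

// Returns the index of the candidate with the highest overlap score.
// The earliest candidate wins ties; returns 0 when n is 0.
size_t best_answer(const char *passage, const char *const candidates[], size_t n) {
    size_t best = 0;
    int best_score = -1;
    for (size_t i = 0; i < n; i++) {
        int score = overlap_score(passage, candidates[i]);
        if (score > best_score) {
            best_score = score;
            best = i;
        }
    }
    return best;
}
```

Given a retrieved passage such as “a human body cell contains 46 chromosomes”, the candidate “46 chromosomes” would outscore “23 chromosomes” simply because more of its words appear in the passage. No understanding of the question is involved, which is precisely the limitation the researchers describe.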

(The researchers from the Allen Institute invite us to go play with their own artificial intelligence called Aristo. So they do have a system capable of writing 8th grade tests. Where are the scores?)

So, how close are we to human-level artificial intelligence? My problem with this question is that it assumes we have an objective metric. When you try to land human beings on the Moon, there is an objective way to assess your results. By their own admission, the Allen Institute researchers tell us that computers can probably already pass Alan Turing’s test, but they (rightfully) dismiss the Turing test as flawed. Reasonably enough they propose passing 8th-grade science tests as a new metric. It does not seem far-fetched to me at all that people could, soon, build software that can ace 8th-grade science tests. Certainly, there is no need to wait until the end of this century. But what if I build an artificial intelligence that can ace these tests, would they then say that I have cracked human-level artificial intelligence? I suspect that they would not.

And then there is a slightly embarrassing fact: we can already achieve super-human intelligence. Go back to 1975 but bring the Google search engine with you. Put it in a box with flashy lights. Most people would agree that the search engine is nothing less than the equivalent of a very advanced artificial intelligence. There would be no doubt.

Moreover, unlike human intelligence, Google’s intelligence is beyond our biology. There are billions of human brains… it makes no practical sense to limit computers to what brains can do when it is obviously more profitable to build machines that can do what brains cannot do. We do not ask for cars that walk like we do or for planes that fly like birds… why would we want computers that think like we do?

Given our limited knowledge, the whole question of assessing how close we are to human-level intelligence looks dangerously close to a philosophical question… and I mean this in a pejorative sense. I think that many “optimists” looking at the 59% score would say that we are very close to human-level intelligence. Others would say that they only got 59% by using a massive database. But we should have learned one thing: science, not philosophy, is the engine of progress and prosperity. Until we can make it precise, asking whether we can achieve human-level intelligence with software is an endlessly debatable question akin to asking how many angels fit in a spoon.

Still, I think we should celebrate the work done by the Allen Institute. Not because we care necessarily about mimicking human-level intelligence, but because software that can pass science tests is likely to serve as an inspiration for software that can read our biology textbooks, look at experimental data, and maybe help us find cures for cancer or Alzheimer’s. The great thing about an objective competition, like passing 8th-grade science tests, is that it cuts through the fog. There is no need for marketing material and press releases. You get the questions and your software answers them. It does well or it does not.

And what about the future? It looks bright:

In 2016, AI2 plans to launch a new, $1 million challenge, inviting the wider world to take the next big steps in AI (…)