Synthesized hash functions in Swift 4.1 (and why Java programmers should be envious)

When programming, we often rely on maps from keys to values. This is most often implemented using a hash table, which implies that our keys must be “hashable”. That is, there must be a function from key objects to hash values. A hash value is a “random-looking integer” that is determined by the object value. Thus a given string (e.g., “232”) should always have the same hash value within a given context (typically within the program’s life). It should be “random” in the sense that it should be improbable that any two distinct keys have the same hash value (e.g., “232” and “231” should not hash to the same integer).
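For example, in Swift, strings are hashable out of the box, so you can check these two properties directly. Here is a tiny illustration (nothing more than the idea in code):

let a = "232".hashValue
let b = "232".hashValue
print(a == b)                              // true: same key, same hash, within one run of the program
print("232".hashValue == "231".hashValue)  // almost certainly false: distinct keys should not collide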

In many programming languages, like Java and C++, you are expected to design and implement your own hash functions for the classes or structs you create. In Java, if you do not provide a hash function (by overriding the hashCode method), then you inherit a default hashCode that is tied to the particular object instance. That's problematic. Let me illustrate the problem. Suppose I create my own user identifier class:

public class UserID {
  int x;
  public UserID(int X) {x = X;}
}

That sounds reasonable, right?

Let me store some information corresponding to one such user identifier:

HashMap<UserID,String> h = new HashMap<UserID,String>();
h.put(new UserID(32321),"Your name is John");

What happens if I try to retrieve it?

System.out.println(h.get(new UserID(32321)));

This will return null, as in "we don't have such a user identifier", because the default hashCode and equals are based on object identity and this is a brand new instance.

So you have to remember to override hashCode (and equals) properly or risk having buggy software. Annoying.

It is especially annoying because most people do not know how to design a hash function! You are lucky if you got a single class in college on the topic.

Swift, the new programming language designed by Apple, is in the same boat. It is slightly better in that it will not generate a potentially misleading hash function by default. For your class or struct to be "hashable", you need to provide a hash function.

However, Swift 4.1 fixes this annoyance by generating sensible default hash functions. These hash functions are probably not perfect but they are likely better than whatever bored programmers can conjure up.

Swift 4.1 has not yet been released, so I suppose the details could change, but here is the gist of it. Suppose that you define your own struct:

struct Point: Hashable {
    var x: Int
    var y: Int
    public init(_ x: Int, _ y: Int) {
        self.x = x
        self.y = y
    }
}

Notice that I declared that it is “Hashable” but I did not provide any hash-function implementation. On current Swift (e.g., Swift 4.0), this will not compile, but if you get early builds of Swift 4.1, it will work fine.

So what does it do? Well, the Swift compiler looks at the values stored in your class or struct, and it automagically hashes them together.
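Once the conformance is synthesized, your struct can serve as a set element or as a dictionary key with no further work. Here is a small sketch of what that buys you (assuming Swift 4.1's synthesized Hashable and Equatable conformances for the Point struct above):

var seen: Set<Point> = []
seen.insert(Point(1, 2))
print(seen.contains(Point(1, 2)))        // true: equal values get equal hash values

var labels: [Point: String] = [:]
labels[Point(3, 4)] = "my favorite point"
print(labels[Point(3, 4)] ?? "missing")  // prints "my favorite point"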

A point worth noting: the hash value only depends on the values being stored. So suppose you create a new struct called MyPoint (instead of Point):

struct MyPoint: Hashable {
    var x: Int
    var y: Int
    public init(_ x: Int, _ y: Int) {
        self.x = x
        self.y = y
    }
}

It will hash in the same manner so you can be sure that the following is always true:

Point(1,2).hashValue == MyPoint(1,2).hashValue

How does it compute the hash? It takes the hash value of each value in the struct or class (let me assume that they are hashable themselves) and it combines them using the function _mixForSynthesizedHashValue. This mysterious function could change but for the time being, it is just the following linear polynomial in simplified code:

func _mixForSynthesizedHashValue(_ x: Int, _ y: Int) -> Int {
    return 31 &* x &+ y  // overflow-wrapping multiply and add
}

It is often called a compression function because it takes two values and combines them into a single one (we go from two times 64 bits to 64 bits). For more "randomness", the Swift compiler uses a _mixInt function which takes a 64-bit value and returns another 64-bit value with the "bits mixed" (it is just a function that appears to generate random outputs). Thus the following should print the same value twice:

print(Point(1,1).hashValue);
print(_mixInt(_mixForSynthesizedHashValue(1,1)))

What if you have more than two values in your struct or class? Let me consider a tridimensional point:

struct Point3D: Hashable {
    var x: Int
    var y: Int
    var z: Int
    public init(_ x: Int, _ y: Int, _ z: Int) {
        self.x = x
        self.y = y
        self.z = z
    }
}

Then the result is similar except that we need to call the compression function twice, so that the following two lines will print the same value:

print(Point3D(32,45,66).hashValue);
print(_mixInt(_mixForSynthesizedHashValue(
  _mixForSynthesizedHashValue(32,45),66)))

I alluded to the fact that these automatically generated hash functions might not be perfect… In the current implementation, tridimensional points can trivially collide with bidimensional points; the following is always true:

Point3D(0,32,45).hashValue == Point(32,45).hashValue

If this ends up being a problem for your application, you can always roll your own hash functions, of course.
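For instance, here is a minimal sketch of a hand-rolled alternative for the tridimensional point (my own illustration, with arbitrary constants; I call it SaltedPoint3D to avoid clashing with the definition above). Because every component is folded into a running value that starts from a per-type seed, a point with a zero coordinate no longer has to collide with the corresponding bidimensional point.

struct SaltedPoint3D: Hashable {
    var x: Int
    var y: Int
    var z: Int
    public init(_ x: Int, _ y: Int, _ z: Int) {
        self.x = x
        self.y = y
        self.z = z
    }
    static func == (lhs: SaltedPoint3D, rhs: SaltedPoint3D) -> Bool {
        return lhs.x == rhs.x && lhs.y == rhs.y && lhs.z == rhs.z
    }
    // Hand-rolled hash: start from a per-type seed and fold in each component
    // with overflow-wrapping arithmetic.
    var hashValue: Int {
        var h = 0x3D  // arbitrary seed specific to this type
        for v in [x, y, z] {
            h = h &* 31 &+ v
        }
        return h
    }
}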

I did not elaborate on _mixInt, but one nice thing that I noticed in Swift's source code is that they are planning for it to be randomized; that is, you will not get the same hash value for the same object or struct value across different runs of your program. This is an important security feature which I alluded to in an older blog post.

Indeed, the problem with the current implementation is that I can trivially construct lots of values that will collide (have the same hash value), thus rendering the performance of hash tables and other algorithms quite bad. So it is good to randomize the hash functions by default, to make the job of an attacker more difficult.
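To see why, consider a sketch of such an attack against the Point struct above, assuming the synthesized hash described earlier (31 * x + y followed by _mixInt): every point on the line 31 * x + y = c gets the very same hash value.

// one thousand distinct keys, all on the line 31 * x + y = 1_000_000
let colliding = (0..<1000).map { i in Point(i, 1_000_000 - 31 * i) }
let hashes = Set(colliding.map { $0.hashValue })
print(hashes.count)  // 1: every key shares a single hash value

Insert a few thousand such keys into a dictionary and every lookup degenerates into a linear scan.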

Further reading: If you are interested in the science of hashing, you might like the following papers…

Note about cryptography:
There is a whole different field called cryptographic hashing which seeks to map values to hash values in such a way that it is very difficult for you to ever guess what the original value was given the hash value, and such that it is very hard to create a key that maps to a specific hash value. For example, passwords can be hashed so that if I only give you the hash value, you would need to work for years to find the matching password. Yet, as a system designer, it is enough for me to store just the hash value. But for what Swift does, this type of security is irrelevant because you are not trying to hide the information being stored, you are just trying to ensure that it is processed quickly.

Science and Technology links (October 27th, 2017)

A well-known tech company, Snapchat, has posted some art pieces throughout the world as augmented reality artifacts. You can only see the art through their software using mobile devices.

Older people frequently become frail. They lose so much muscle mass that they have difficulties moving around. There is currently no medical therapy for this crippling condition. Researchers at the University of Miami have completed a phase 2 trial showing that a stem-cell therapy is safe and can significantly reduce frailty.

The famous mathematician Timothy Gowers has written an opinion piece called The end of an error? He suggests that, in many fields, we would be better off ending the practice of formal peer review for research articles. I would add something to Gowers' excellent piece: if we were to abolish formal peer review, we would probably get fewer papers in many instances because the practice of "counting papers" would make no sense. I am a blogger. Nobody assesses my work as a blogger by how many blog posts I write. That would be stupid because I can easily write 10,000 blog posts per day if I want… We often get a lot of junk because people score points merely for publishing papers. Make it trivial to publish papers, and people will have to compete on quality instead of volume.

Last week, I reported on a New York Times article about a famous researcher called Cuddy who had promoted the idea that "power poses" could change how you think. We now know that the science behind it was wrong and unreproducible. I found it fascinating to re-read a 2014 article from the same New York Times:

Ms. Cuddy has more than 20 fieldwork studies and collaborations in development. At the New York University Polytechnic School of Engineering, computer scientists at the Game Innovation Lab are developing video games to see if power posing before exams reduces math anxiety. Ms. Cuddy’s team is collaborating with a Tufts Medical School professor to see whether the technique can prevent new surgeons from becoming too anxious and “choking” during ophthalmology procedures. An economist is assessing power posing as a tool to help impoverished women in Nairobi make better financial decisions.

Contrast this with a just-released research article. A randomized controlled study of power posing before public speaking exposure for social anxiety disorder:

No evidence for augmentative effects. (…) Power posing (compared to submissive posing/rest) did not result in changes in testosterone. (…) Power posing (compared to submissive posing/rest) did not result in changes in cortisol or fear during an exposure. (…) Power posing (compared to submissive posing/rest) did not facilitate exposure therapy outcomes for social anxiety disorder.

(Note: as I made clear earlier, I do not expect that Cuddy was dishonest. She probably fooled herself.)

There is some evidence that workers get more creative as they grow older. I certainly know a lot of elderly scholars who can give younger scholars a run for their money.

Is free-for-the-student higher education a good thing? Maybe not:

Our analysis shows that, since England’s move from a free higher education system to a high-fee, high-aid system, university enrolment has increased substantially.

CAPTCHAs were introduced at the turn of the century to distinguish human beings from automated scripts, in a kind of reverse Turing test where human beings must prove that they are human beings. According to a Science article, the text-based ones have been definitively broken so that machines can pass them.

Our bodies keep track of our age, but we don’t know how exactly. One theory states that the hypothalamus is a central clock. By rejuvenating the hypothalamus of old mice, researchers found that the mice lived longer:

In conclusion, ageing speed is substantially controlled by hypothalamic stem cells, partially through the release of exosomal miRNAs.

Wal-Mart launches shelf-scanning robots in about 40 stores.

Fast integer compression with Stream VByte on ARM Neon processors

Stream VByte is possibly the fastest byte-oriented integer compression scheme. I presented it briefly last month when our paper came out. Our C library has been ported to Rust and Go. Our code is used by the Tantivy search engine as well as by the Trinity Information Retrieval framework. Mark Papadakis reported excellent results with Stream VByte.
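To give a rough sense of the format before we get to the vectorized code (see our paper for the exact layout): each group of four 32-bit integers is described by a single control byte, made of four 2-bit fields giving the byte length (1 to 4) of each integer, and the data bytes are stored in a separate stream, in little-endian order. Here is a minimal scalar sketch in Swift (my own illustration, not code from the library; decodeGroup is a made-up name):

// Decode one group of four integers given its control byte and its data bytes.
func decodeGroup(control: UInt8, data: [UInt8]) -> (values: [UInt32], bytesConsumed: Int) {
    var values: [UInt32] = []
    var offset = 0
    for i in 0..<4 {
        let length = Int((control >> (2 * i)) & 0b11) + 1   // 1, 2, 3 or 4 bytes
        var value: UInt32 = 0
        for b in 0..<length {
            value |= UInt32(data[offset + b]) << (8 * b)    // little-endian reassembly
        }
        values.append(value)
        offset += length
    }
    return (values, offset)
}

The vectorized code does the same work for all four integers at once: the control byte selects a precomputed shuffle mask, and a single byte-shuffle instruction spreads the compressed bytes into their 32-bit slots.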

The x64 code had this super simple vectorized decoding pass:

uint8_t C = lengthTable[control]; // C is between 4 and 16
__m128i Data = _mm_loadu_si128((__m128i *) databytes);
__m128i Shuf = _mm_loadu_si128(shuffleTable[control]);
Data = _mm_shuffle_epi8(Data, Shuf); // final decoded data
datasource += C;

It looks complicated if you are not familiar with vector intrinsics, but this code generates very few instructions.

In my initial announcement, I alluded to the fact that it would be nice to vectorize the code for ARM processors. Indeed, the processors in your iPhone (and other mobile devices) also have vector instructions (called NEON). Could we get good performance there as well?

It turns out that we can. Kendall Willets worked hard not only to vectorize the encoding steps, something we had left for future work, but he also ported the vector code to ARM, with complete vectorization.

What I find beautiful is that the 64-bit ARM NEON code is equivalent, at an abstract level, to the x64 code:

uint8x16_t dec = vld1q_u8(table + key);         // load the shuffle mask selected by the control byte
uint8x16_t compressed = vld1q_u8(dataPtr);      // load 16 compressed bytes
uint8x16_t data = vqtbl1q_u8(compressed, dec);  // byte shuffle: final decoded data
dataPtr += length[key];                         // advance by the number of compressed bytes used
vst1q_u8(out, data);                            // store the four decoded 32-bit integers

It “looks” different because the naming conventions are not the same as Intel’s, but it ends up generating the same kind of instruction.

I should point out that this is the 64-bit version of ARM Neon (part of Aarch64). We also support 32-bit ARM systems, but the code is slightly less elegant. As a rule, you should expect most servers as well as most phones to be 64-bit systems, either now or in the near future. However, smartwatches and other tiny devices are probably going to remain 32-bit systems for some time.

So I tested the 64-bit code on a Softiron 1000 server. These machines have relatively weak (and cheap) AMD processors with A57 cores. You can copy memory at a rate of about 1.7 billion 32-bit integers per second on these cores.

Kendall produced a quick benchmark where we attempt to decompress random data. That is not a favorable case for any compression algorithm, but it gives us some idea of a typical "worst case" performance. So how fast is it? Without vectorization, we decode 260 million integers per second. That is not bad. But with vector (NEON) instructions, we reach 1.1 billion integers decoded per second, or about 4 times better.

It is still early days for ARM processors on servers, but it is getting easier to find them. On this note, I would like to thank Edward Vielmetti from Packet for helping us test with other server-class ARM processors to validate our work.

Science and Technology links (October 20th, 2017)

John Carmack is a famous game designer. In a recent interview, he invited programmers to shy away from trying to build more realistic games in virtual reality, because he expects that hardware capabilities will not keep up.

Fraction of Americans who are obese: 36.5%. And it keeps on rising.

Brian Wansink is a famous Cornell professor who works on nutrition and gets cited thousands of times per year in academic journals. Obviously, his fellow researchers think very highly of him. The American government is using his research to set policy. Is that warranted? He recently published a paper that might be used to justify interventions on kids, there is just one little problem… in his own words:

We made a mistake in the age group we described in the JAMA article. We mistakenly reported children ranging from 8 to 11 years old; however, the children were actually 3 to 5 years old

Doing good research, as any scientist will tell you, requires lots of attention to detail, even small details. We all make mistakes, but we also all have to be careful. Getting confused by such a large margin regarding the age of the participants does not seem like a small mistake, however. This is the second time that Wansink has made this same mistake; it also happened in 2012. There are currently 50 misconduct allegations against Wansink, and three of his papers have been retracted. A total of 150 statistical mistakes were found in just one of his papers. I covered the Wansink case earlier this year, predicting that Cornell would not fire him. His laboratory is still running. You can read extensive reports on the incredible amount of fraud involved. Wansink's Wikipedia page is rather dismissive of the problems. If you think I am being hard on him, go read Gelman's take on the issue.

Amy Cuddy was a professor at Harvard University who became famous for her "power poses" theory. She has been cited thousands and thousands of times in research articles. Sadly, her work could not be reproduced. Once under scrutiny, her work came apart: not only could it not be reproduced, but it also seems that the initial evidence, even when set in the best light, was flimsy. Her statistical analysis makes little sense. And she defended it. The New York Times has a piece on her that represents her as a victim. The narrative is that while it was once ok to publish work without any expectation that others could reproduce it, and without a strong statistical analysis, the rules changed suddenly, and she was a victim of this change. I have to disagree with this narrative. It has never been "ok" in science for others to fail to reproduce your work. Feynman wrote back in the 1970s:

We’ve learned from experience that the truth will come out. Other experimenters will repeat your experiment and find out whether you were wrong or right. (…) And, although you may gain some temporary fame and excitement, you will not gain a good reputation as a scientist if you haven’t tried to be very careful in this kind of work. And it’s this type of integrity, this kind of care not to fool yourself, that is missing to a large extent (…)

Though I would agree that anything resembling an ad hominem attack is outside the realm of science, gaining fame by taking down the results of someone else is entirely fair game. It is a good thing that we reward people who take a chance and try to be critical of the established theories. In fact, that is what science is all about. It is not a secondary pursuit that some ill-intentioned people choose to pursue. It is the very nature of science to take what you are told and to re-examine it. The comparison between Wansink and Cuddy raises questions, however. Why is Wansink allowed to go on with massive funding while Amy Cuddy had to drop her Ivy League professorship? Wansink is an all-out fraud whereas Cuddy appears to have simply fooled herself. In my view, Cuddy could go on, saying “we screwed up”. Wansink should be banned from science. He has damned himself.

While researching the Cuddy article, I ended up on Hacker News, and then on a blog post by Frank McSherry. It is a courageous blog post where McSherry takes apart some of the best papers from database research (including the best paper from SIGMOD 2017) and shows that you cannot trust their results, that they present their (often overengineered) approaches in a good light when a simple baseline would show that their results are not very good. Though McSherry chose to go after VLDB and SIGMOD (some of the best venues for database research), you should not conclude that things get better at lesser venues. Maybe it is a coincidence, but in the last week, I reviewed four submissions to journals, all of them proposing new, faster algorithms. Three of them did not even offer a benchmark. Because, apparently, it is good enough to argue that your algorithm is faster, and actually running it is just a waste of time… The fourth one did include a benchmark, but by working backward from their results, one has to conclude that it takes 1,000 CPU cycles to add two numbers together.

There has apparently been a very large decline in the number of flying insects.

“Across several metrics, organic agriculture actually proves to be more harmful to the world’s environment than conventional agriculture.”

Oddly, it appears that using small needles on your scalp can help you grow more hair. Microneedling is apparently a thing to help diminish wrinkles, and it seems that it also works on your scalp, helping you grow more hair. I am somewhat skeptical.

Bloomberg has a nice article on Fanuc, the leading producer of industrial robots. They are crazily secretive but also massively successful. Indirectly, they make a lot of what we all buy, through the robots that they sell to China or Tesla.

Not satisfied to have shown that they could beat the best human beings at Go, the DeepMind engineers now report that they can do so with software that teaches itself the game without any help. It plays against itself. This is a big deal because it opens the door for general solutions. Down this path is a machine that could learn to play just about any game, as long as it can play against itself.

Dyslexia is a common condition where people experience difficulty reading. It appears that it may be caused by a defect in the eyes, as opposed to a problem in the brain. This reminds us that we don’t know what causes dyslexia.

Young blood appears to rejuvenate old kidneys, in mice. This suggests that it would make sense for young people to receive organs donated by old people. It might also suggest that your blood keeps track of how old you are. Certainly, there is something in our blood that can tell our age, but if we were to change our blood, could we become more youthful? We just do not know, but we will, hopefully, find out within a decade or so.

It appears that there are serious methodological errors in Piketty’s famous book “Capital in the Twenty-First Century”. It seems, in fact, that the data is pretty much useless. So maybe the book is best read as a very long opinion piece that tells us that economic inequality is rising fast.

Our neurons are covered and protected by a myelin layer, basically a layer of fat. When it erodes, we are in trouble. It appears that a common allergy drug could repair our myelin in some cases.

Cancer cells are characterized by the fact that they produce energy using fermentation, which turns sugar into energy somewhat inefficiently. It now seems that an abundance of sugar promotes fermentation and thus, maybe, cancer.

The CEO of the largest software hosting site (GitHub) thinks that programmers may be replaced by robots in the foreseeable future. I presume that he thinks his company will supervise the software that replaces programmers?

Caloric restriction extends lifespan in most species. It appears that ketone bodies would mimic this effect. Ketone bodies are normally produced in our bodies when we go a long time without eating sugar. It seems that ingesting ketone bodies from external sources might be an alternative to adopting a sugar-free diet.

Why virtual reality (VR) might matter more than you think….

I have heard it claimed that the famous novelist William Gibson uttered his famous quote, “the future is already here — it’s just not very evenly distributed”, for the first time after experiencing virtual reality, decades ago.

We are fast arriving at a point where virtual-reality will be dirt cheap, and it will work really well.

A core issue right now, and this might surprise you, is that most people, including those who have tried virtual-reality goggles, cannot really say what virtual reality is.

The naïve answer is that virtual reality provides an immersive three-dimensional world view. That is, when thinking about virtual reality, people think about the display. And they could be excused for doing so, given that the physical devices appear to focus so much on displaying pixels. We have goggles with embedded screens, and so forth.

But, actually, I submit to you that the display is not entirely essential. Of course, you need perception for an experience to make sense, but you could have virtual reality without any light whatsoever. You would probably have to focus on sounds, touch, and smell.

Virtual reality also does not need to be realistic. It is not at all obvious that the more realistic the representation, the better it is. You could have great experiences with a cartoonish worldview. That would side-step the uncanny-valley issue. I actually suspect that some of the best applications of virtual reality will not involve photo-realistic worldviews.

What actually matters with virtual reality is that it engages your whole body. That’s the crucial point. When you use a computer, your fingers (mostly three of them on each hand) do most of the work. I can sit in my campus office working, and because the lights are automated, it might go dark just as I am finishing off a sentence… because I am hardly moving at all when I work in a traditional manner with my computer.

If you were paralyzed, virtual reality would not help you in the least. At a minimum, for virtual reality to make any kind of sense, you must be able to move your head around. It is not so with traditional computing, where, as long as you can move your arms and use your fingers, your head can remain mostly stationary.

I believe that this explains in part how virtual reality affects our perception of the flow of time. Virtual reality is somewhat tiring compared with sitting at a desk, so fifteen minutes of interaction in virtual reality feels (as far as the body is concerned) as tiring as hours sitting at a desk. Thus, time is somewhat accelerated in virtual reality.

But I also theorize that virtual reality affects how you think in a less trivial manner. It favors embodied cognition. An athlete or a chef has a particular type of intelligence where the space around them becomes an extension of their own mind.

It is easy to dismiss such ideas as verging on mysticism. Yet it is undeniable that we think differently when our bodies are involved. I have now reached a point where I draw a clear separating line between in visu meetings and videoconferences. They are drastically different experiences, resulting in very different cognitive outcomes. For example, I believe that it makes no sense to conduct job interviews using video conferencing. And I say this as a nerd who avoids social interactions whenever possible.

That is, the view that we are brains in a jar is hopelessly naïve and wrong. The idea that we “think with our brains” is, in my view, only true as a first approximation. There is a continuum between our brain cells and the objects around us. A spider without a web is a useless animal. The spider uses its web as an extension of itself, to measure distances, track directions, and even as a perception device. Human beings do not have physical webs coming out of their hands, but we are simply much more advanced spiders, with the ability to create our own webs, like the world-wide-web.

I believe that many of the paradigm shifts that we have encountered as intellectuals come about through changes that have little to do with pure reason and a lot to do with our bodies and their perception:

  • Museums often present very little textual information. Mostly, you get to see, and often touch, artifacts. It is through the presentation of inanimate objects that people acquire a feeling for how things were many centuries ago. Try, as an experiment, to view a three-dimensional representation of the same object on a screen. It is not the same! The idea that you should collect and display objects to convey information is not entirely trivial, and yet we take it for granted today.
  • Though we might credit much of the rise of statistics to the formal mathematical results introduced by famous mathematicians… I believe that we should rather credit authors such as Playfair for introducing the modern-day line graph (in 1786!). If that's all you had, you could still effectively study inequality and climate change. But plots are much less rational than they appear: if you were to present line charts to people and ask them to describe what they see, they would have a hard time elaborating beyond a first-level interpretation. And the description they provide would not allow others to understand what was in the graph. There is more in a graph than we can tell. In some sense, it is also easier to lie with statistics than with a plot: try plotting your own weight over the last few months… and compare the result with whatever statistical rationalization you might come up with. Lying with a plot requires a more deliberate attitude. I believe that there is a deeper story to be told about the relationship between the emergence of science and the scientific method: it seems clear that the line graph preceded science. I believe that it might have played an important role.
  • The industrial revolution came about after we got to experience automatons, these popular toys from the Victorian era (and earlier) where one could see gears moving underneath. The physical reality of these devices and the fact that you could, as a kid, look at them and eventually hold the gears in your hands, probably made a huge difference.
  • The early computers were programmed using plugs and cards… but soon we imported the keyboard into computing… the keyboard is an obvious cognitive extension first created to help us make music more precisely. Without the keyboard we would not have modern-day programming, that much is certain. Isn’t it amazing how we went from musical instrument to software programming?

All of these examples illustrate how altering our environment even in a minute way allowed us to think better.

My theory is that there are entire threads of thought that we cannot have yet, that we cannot even imagine, but that virtual reality will enable.

There are still massive challenges, however. One of them is affordance. For example, many virtual-reality games and systems use the concept of “teleportation” to move you from one point to another. In my view, this is deeply wrong: it uses your hand as a pointing device, just as you would do in conventional computing. Grabbing and moving objects, interacting with objects in general, is awkward in virtual reality. I don’t think we know how to enter text in virtual reality. There is also a bandwidth issue. The screens of current virtual-reality goggles have a relatively low resolution which makes reading small fonts difficult, and reading in general is unpleasant. Interactions are also at a relatively large scale: you cannot use fine motor control to flip a small switch. Everything has to be large and clunky.

Still. I think that chances are good that new world-changing paradigms will be made possible by virtual reality. It should allow us to build better webs, as the spiders that we are.

The Harvey-Weinstein scientific model

It is widely believed that science is the process by which experts collectively decide on the truth and post it up in “peer-reviewed journals”. At that point, once you have “peer-reviewed research articles” then the truth is known.

Less naïve people raise the bar somewhat. They are aware that individual research articles could be wrong, but they believe that science is inherently self-correcting. That is, if the scientific community reports a common belief, then this belief must be correct.

Up until at least 1955, well-regarded biology textbooks reported that we had 24 pairs of chromosomes. The consensus was that we had 24 pairs. It had to be right! It took a young outsider to contradict the consensus. In case you are wondering, we have 23 pairs of chromosomes, a fact that is not difficult to verify.

The theory of continental drift was ridiculed for decades. Surely, continents could not move. But they do. It took until 1968 before the theory could be accepted.

Surely, that’s not how science works. Surely, correct theories win out quickly.

But that’s not what science is. In Feynman’s words:

Our freedom to doubt was born out of a struggle against authority in the early days of science. It was a very deep and strong struggle: permit us to question — to doubt — to not be sure. I think that it is important that we do not forget this struggle and thus perhaps lose what we have gained.

That is, science is about doubting everything, especially the experts.

Many people struggle with this idea… that truth should be considered to be an ideal that we can’t ever quite reach… that we have to accept that constant doubt, about pretty much everything, is how we do serious work… it is hard…

Let me take recent events as an illustration. Harvey Weinstein was a movie producer who, for decades, abused women. Even though it was widely known, it was never openly reported and denounced.

How could this be? How could nobody come forward for years and years and years… when the evidence is overwhelming?

What is quite clear is that it happens, all the time. It takes a lot of courage to face your peers and tell them that they are wrong. And you should expect their first reaction to be rejection. Rejection is costly.

The scientific model is not what you are taught in school… it is what Feynman describes… an ability to reject "what everyone knows"… to speak a different truth than the one "everyone knows".

The public consensus was that Harvey Weinstein was a respected businessman. The consensus is not about truth. It is a social construction.

Science is not about reaching a consensus, it is about doubting the consensus. Anyone who speaks of a scientific consensus badly misunderstands how science works.

Science and Technology links (October 13th, 2017)

Rodney Brooks, who commercialized robots that can vacuum your apartment, has written a great essay on artificial intelligence. It is worth reading.

There is some concern that the computers necessary to control a self-driving car will use so much power that they will significantly increase the energy usage of our cars.

Facebook will commercialize the Oculus Go, a $200 autonomous virtual-reality headset.

Bee-level intelligence

How close are we to having software that can emulate human intelligence? It is hard to tell. One problem with human beings is that we have large brains, with an almost uncountable number of synapses. We have about 86 billion neurons. That does not seem far from the 4.3 billion transistors found in the latest iPhone… but a neuron cannot be compared with a transistor, which is a very simple thing.

We could more fairly compare transistors with synapses (connections between neurons). Human males have about 150 trillion synapses and human females about 100 trillion synapses.

We do not have any computer that approaches 100 trillion transistors. This means that even if we thought we had the algorithms to match human intelligence, we could still fall short simply because our computers are not powerful enough.

But what about bees? Bees can fly, avoid obstacles, find flowers, come back, tell other bees where to find the flowers, and so forth. Bees can specialize, they can communicate, they can adapt. They can create amazing patterns. They can defeat invaders. They can fight for their life.

I don’t know about you, but I don’t know any piece of software that exhibits this kind of intelligence.

The honey bee has less than a million neurons and it has about a billion synapses. And all of that is determined by only about 15,000 genes (and most of them have probably nothing to do with the brain). I don’t know how much power a bee’s brain requires, but it is far, far less than the least powerful computer you have ever seen.

Bees don’t routinely get confused. We don’t have to debug bees. They tend to survive even as their environment changes. So while they may be well tuned, they are not relying on very specific and narrow programs.

Our most powerful computers have 100s of billions of transistors. This is clearly not far from the computing power of a bee, no matter how you add things up. I have a strong suspicion that most of our computers have far more computing power than a bee’s brain.

What about training? Worker bees only live a few weeks. Within hours after birth, they can spring into action, all of that from very little information.

What I am really looking forward to, as the next step, is not human-level intelligence but bee-level intelligence. We are not there yet, I think.

Post-Blade-Runner trauma: From Deep Learning to SQL and back

Just after posting my review of the movie Blade Runner 2049, I went to attend the Montreal Deep Learning summit. Deep Learning is this “new” artificial-intelligence paradigm that has taken the software industry by storm. Everything, image recognition, voice recognition, and even translation, has been improved by deep learning. Folks who were working on these problems have often been displaced by deep learning.

There has been a lot of “bull shit” in artificial intelligence, things that were supposed to help but did not really help. Deep learning does work, at least when it is applicable. It can read labels on pictures. It can identify a cat in a picture. Some of the time, at least.

How do we know it works for real? It works for real because we can try it out every day. For example, Microsoft has a free app for iPhones called “Seeing AI” that lets you take arbitrary pictures. It can tell you what is on the picture with remarkable accuracy. You can also go to deepl.com and get great translations, presumably based on deep-learning techniques. The standard advice I provide is not to trust the academic work. It is too easy to publish remarkable results that do not hold up in practice. However, when Apple, Google and Facebook start to put a technique in their products, you know that there is something of a good idea… because engineers who expose users to broken techniques get instant feedback.

Besides lots of wealthy corporations, the event featured talks by three highly regarded professors in the field: Yoshua Bengio (Université de Montréal), Geoffrey Hinton (University of Toronto) and Yann LeCun (New York University). Some described it as a historical event to see these three in Montreal, the city that saw some of the first contemporary work on deep learning. Yes, deep learning has Canadian roots.

For some context, here is what I wrote in my Blade Runner 2049 review:

Losing data can be tragic, akin to losing a part of yourself.

Data matters a lot. A key feature that makes deep learning work in 2017 is that we have lots of labeled data with the computers to process this data at an affordable cost.

Yoshua Bengio spoke first. Then, as I was listening to Yoshua Bengio, I randomly went to my blog… only to discover that the blog was gone! No more data.

My blog engine (WordPress) makes it difficult to find out what happened. It complained about not being able to connect to the database, which sent me on a wild hunt to find out why it could not connect. It turns out that the database access was fine. Why was my blog dead?

I carried with me to the event my smartphone and an iPad. A tablet with a pen is a much better supporting tool when attending a talk. Holding a laptop on your lap is awkward.

Next, Geoffrey Hinton gave a superb talk, though I am sure non-academics will think less of him than I do. He presented recent, hands-on results. Though LeCun, Bengio and Hinton supposedly agree on most things, I felt that Hinton presented things differently. He is clearly not very happy about deep learning as it stands. One gets the impression that he feels that whatever they have “works”, but it is not because it “works” that it is the right approach.

Did I mention that Hinton predicted that computers would have common-sense reasoning within 5 years? He did not mention this prediction at the event I was at, though he did hint that major breakthroughs in artificial intelligence could happen as early as next week. He is an optimistic fellow.

Well. The smartest students are flocking to deep learning labs if only because that is where the money is. So people like Hinton can throw graduate students at problems faster than I can write blog posts.

What is the problem with deep learning? For the most part, it is a brute force approach. Throw in lots of data, lots of parameters, lots of engineering and lots of CPU cycles, and out comes good results. But don’t even ask why it works. That is not clear.

“It is supervised gradient descent.” Right. So is Newton’s method.

I once gave a talk about the Slope One algorithm at the University of Montreal. It is an algorithm that I designed and that has been widely used in e-commerce systems. In the paper describing it, we set forth the following requirements:

  • easy to implement and maintain: all aggregated data should be easily interpreted by the average engineer and algorithms should be easy to implement and test;
  • updateable on the fly;
  • efficient at query time: queries should be fast.

I don’t know if Bengio was present when I gave this talk, but it was not well received. Every point of motivation I put forward contradicts deep learning.

It sure seems that I am on the losing side of history on this one, if you are an artificial intelligence person. But I do not do artificial intelligence, I do data engineering. I am the janitor that gets you the data you need at the right time. If I do my job right, artificial intelligence folks won’t even know I exist. But you should not make the mistake of thinking that data engineering does not matter. That would be about as bad as assuming that there is no plumbing in your building.

Back to deep learning. In practical terms, even if you throw deep learning behind your voice assistant (e.g., Siri), it will still not be able to “understand” you. It may be able to answer correctly to common queries, but anything that is unique will throw it off entirely. And your self-driving car? It relies on very precise maps, and it is likely to get confused at anything “unique”.

There is an implicit assumption in the field that deep learning has finally captured how the brain works. But that does not seem to be quite right. I submit to you that no matter how “deep” your deep learning gets, you will not pass the Turing test.

The way the leading deep-learning researchers describe it is by saying that they have not achieved “common sense”. Common sense can be described as the ability to interpolate or predict from what you know.

How close is deep learning to common sense? I don’t think we know, but I think Hinton believes that common sense might require quite different ideas.

I pulled out my iPad, and I realized after several precious minutes that the database had been wiped clean. I am unsure what happened… maybe a faulty piece of code?

Because I am old, I have seen these things happen before: I destroyed the original files of my Ph.D. thesis despite having several backup copies. So I have multiple independent backups of my blog data. I had never needed this backup data before now.

Meanwhile, I heard Yoshua Bengio tell us that there is no question now that we are going to reach human-level intelligence, as a segue into his social concerns regarding how artificial intelligence could end up in the wrong hands. In the "we are going to reach human-level intelligence", I heard the clear indication that he included himself as a researcher, that he means to say that we are within striking distance of having software that can match human beings at most tasks.

Because it is 2017, I was always watching my Twitter feed and noticed that someone I follow had tweeted about one of the talks, so I knew he was around. I tweeted at him, suggesting we meet. He tweeted back, suggesting we meet for drinks upstairs. I replied that I was doing surgery on a web application using an iPad.

It was the end of the day by now, everyone was gone. Well. The Quebec finance minister was giving a talk, telling us about how his government was acutely aware of the importance of artificial intelligence. He was telling us about how they mean to use artificial intelligence to help fight tax fraud.

Anyhow, I copied a blog backup file up to the blog server. I googled the right command to load up a backup file into my database. I was a bit nervous at this point. Sweating it as they say.

You see, even though I taught database courses for years, and wrote research papers about it, even designed my own engines, I still have to look up most commands whenever I actually work on a database… because I just so rarely need to do it. Database engines in 2017 are like gasoline engines… we know that they are there, but rarely have to interact directly with them.

The minister finished his talk. Lots of investment coming. I cannot help thinking about how billions have already been invested in deep learning worldwide. Honestly, at this point, throwing more money in the pot won’t help.

After a painful minute, the command I had entered returned. I loaded up my blog and there it was. Though as I paid more attention, I noticed that the last entry, my Blade Runner 2049 post, was gone. This makes sense because my backups are made on a daily basis, so my database was probably wiped out before my script could grab a copy.

What do you do when the data is gone?

Ah. Google keeps cached copies of my posts to serve them to you faster. So I went to Twitter, looked up the tweet where I shared my post, followed the link and, sure enough, Google served me the cached copy. I grabbed the text, copied it over and recreated the post manually.

My whole system is somewhat fragile. Securing a blog and doing backups ought to be a full-time occupation. But I am doing ok so far.

So I go meet up with my friend for drinks, relaxed. I snap a picture or two of the Montreal landscape while I am at it. Did I mention that I took the pictures on my phone and immediately shared them with my wife, who is an hour away? It is all instantaneous, you know.

He suggests that I could use artificial intelligence in my own work, you know, to optimize software performance.

I answer with some skepticism. The problems we face with data engineering are often architectural problems. That is, it is not the case that we have millions of labeled instances from which to learn. And, often, the challenge is to come up with a whole new category, a whole new concept, a whole new architecture.

As I walk back home, I listen to a podcast where people discuss the manner in which artificial intelligence can exhibit creativity. The case is clear that there is nothing magical in human creativity. Computers can write poems, songs. One day, maybe next week, they will do data engineering better than us. By then, I will be attending research talks prepared by software agents.

As I get close to home, my wife texts me. “Where are you?” I text her. She says that she is 50 meters away. I see in the distance, it is kind of dark, a lady with a dog. It is my wife with her smartphone. No word was spoken, but we walk back home together.

On Blade Runner 2049

Back in 1982, an incredible movie came out, Blade Runner. It told the story of “artificial human beings” (replicants) that could pass as human beings, but had to be hunted down. The movie was derived from a novel by Philip Dick.

It took many years for people to “get” Blade Runner. The esthetic of the movie was like nothing else we had seen at the time. It presented a credible and dystopian futuristic Los Angeles.

As a kid, I was so deeply engaged in the movie that I quietly wrote my own fan fiction, on my personal typewriter. Kids like me did not own computers at the time. To be fair, most of the characters in Blade Runner also do not own computers, even though the movie is set in 2019. Like in the Blade Runner universe, I could not distribute my prose other than on paper. It has now been lost.

Denis Villeneuve made a follow-up called Blade Runner 2049. You should go see it.

One of the core points that Dick made in his original novel was that human beings could be like machines while machines could be like human beings. Villeneuve's Blade Runner 2049 embraces this observation… to the point where we can reasonably wonder whether any one of the characters is actually human. Conversely, it could be argued that they are all human.

Like all good science fiction, Blade Runner 2049 is a commentary about the present. There is no smartphone, because Blade Runner has its own technology… like improbable dense cities and flying cars… but the authors could not avoid our present even if they wanted to.

What we find in Blade Runner 2049 are companies that manage memories, as pictures and short films. And, in turn, we find that selection of these memories has the ability to change us… hopefully for the better. Yet we find that truth can be difficult to ascertain. Did this event really happen, or is it “fake news”? Whoever is in charge of managing our memories can trick us.

Blade Runner 2049 has voice assistants. They help us choose music. They can inform us. They can be interrupted and upgraded. They come from major corporations.

In Blade Runner 2049, there is a cloud (as in "cloud computing") that makes software persistent and robust. Working outside the cloud remains possible if you do not want to be tracked, with the caveat that the information can easily be permanently destroyed.

Losing data can be tragic, akin to losing a part of yourself.

Death, life, it all comes down to data. That is, while it was easy prior to the scientific revolution to view human beings as special, as having a soul… the distinction between that which has a soul and that which does not becomes increasingly arbitrary in the scientific (and information) age. I am reminded of Feynman's observation:

To note that the thing I call my individuality is only a pattern or dance, that is what it means when one discovers how long it takes for the atoms of the brain to be replaced by other atoms. The atoms come into my brain, dance a dance, and then go out – there are always new atoms, but always doing the same dance, remembering what the dance was yesterday. (Richard Feynman, The value of science, 1955)