Memory-level parallelism: Intel Skylake versus Apple A12/A12X

Modern processors execute instructions in parallel in many different ways: multi-core parallelism is just one of them. In particular, processor cores can have several outstanding memory access requests “in flight”. This is often described as “memory-level parallelism”. You can measure the level of memory-level parallelism your processor has by traversing an array randomly, either by following a single path or by following several different “lanes”. We find that recent Intel processors have about “10 lanes” of memory-level parallelism.
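
To make the single-lane case concrete, here is a minimal sketch of such a traversal (the function and the hash below are illustrative stand-ins, not the code used for the measurements): each access depends on the previous one, so the processor cannot start the next load before the current one completes.

  #include <cstddef>

  // A stand-in mixing function; the real experiment uses its own hash.
  size_t hash(size_t x) { return x * 0x9E3779B97F4A7C15ULL; }

  // One lane: a data-dependent chain of random accesses. The processor cannot
  // compute the next index before the current load has completed.
  size_t traverse_one_lane(const size_t *bigarray, size_t arraysize,
                           size_t howmanyhits) {
    size_t sum = 0;
    size_t index = 0;
    for (size_t counter = 0; counter < howmanyhits; counter++) {
      index = hash(sum + index) % arraysize;
      sum += bigarray[index];
    }
    return sum;
  }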

It has been reported that Apple’s mobile processors are competitive (in raw power) with Intel processors. So a natural question is to ask whether Apple’s processors have more or less memory-level parallelism.

The kind of memory-level parallelism I am interested in has to do with out-of-cache memory accesses. Thus I use a 256MB block of memory. This is large enough not to fit into a processor cache. However, because it is so large, we are likely to suffer from virtual-memory overhead (address-translation misses and page walks). This overhead can significantly limit memory-level parallelism if the pages are too small. By default on the Linux distributions I use, the pages span 4kB (whether on 64-bit ARM or x64). Empirically, that is too small. Thankfully, it is easy to reconfigure the pages so that they span 2MB or more (“huge pages”). On Apple’s devices, whether it be an iPhone or an iPad Pro, I believe that the pages always span 16kB and that this cannot be easily reconfigured.
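
For reference, on Linux one way to request huge pages without changing system-wide settings is to hint the kernel through transparent huge pages. The sketch below only illustrates that idea (the helper is hypothetical, and the kernel is free to ignore the hint):

  #include <cstddef>
  #include <cstdlib>
  #include <sys/mman.h>

  // Allocate a buffer aligned to 2MB and hint to the kernel that it should be
  // backed by (transparent) huge pages. 'bytes' must be a multiple of 2MB.
  void *allocate_with_huge_pages(std::size_t bytes) {
    void *buffer = std::aligned_alloc(1 << 21, bytes);
    if (buffer != nullptr) {
      madvise(buffer, bytes, MADV_HUGEPAGE); // a hint only; may be ignored
    }
    return buffer;
  }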

Before I continue, let me present the absolute timings (in seconds) using a single lane (thus no memory-level parallelism). Apple makes two versions of its most recent processor: the A12 (in the iPhone) and the A12X (in the iPad Pro).

Intel Skylake (4kB pages): 0.73 s
Intel Skylake (2MB pages): 0.61 s
Apple A12 (16kB pages): 0.96 s
Apple A12X (16kB pages): 0.97 s
Apple A10X (16kB pages): 1.15 s

According to these numbers, the Intel server has the upper hand over the Apple mobile devices. But that’s only part of the story. What happens as you increase the number of lanes (while keeping the code single threaded) is interesting. As you increase the number of lanes, Apple processors start to beat the Intel Skylake in absolute, raw speed.

Another way to look at the problem is to measure the “speedup” due to the memory-level parallelism: we divide the time it takes to traverse the array using 1 lane by the time it takes to do so using X lanes. We see that the Intel Skylake processor is limited to about a 10x or 11x speedup whereas the Apple processors go much higher.

Thoughts:

  1. I’d be very interested in knowing how Qualcomm and Samsung processors compare.
  2. It goes without saying that my server-class Skylake machine uses a lot more power than the iPhone.
  3. If I could increase the page size on iOS, we would get even better numbers for the Apple devices.
  4. The fact that the A12 has higher timings when using a single lane suggests that its memory subsystem has higher latency than a Skylake-based PC. Why is that? Could Apple just crank up the frequency of the DRAM memory and beat Intel across the board?
  5. Why is Intel limited to 10x memory-level parallelism? Why can’t they do what Apple does?

Credit: I owe much of the design of the experiment and C++ code to Travis Downs, with help from Nathan Kurz. The initial mobile app for Apple devices was provided by Benoît Maison; you can find it on GitHub along with the raw results and a “console” version that runs under macOS and Linux. I owe the A12X numbers to Stuart Carnie and the A12 numbers to Victor Stewart.

Further reading: Memory Latency Components

Science and Technology links (November 10th, 2018)

  1. It already takes more energy to operate Bitcoin than to mine actual gold. Cryptocurrencies are responsible for millions of tons of CO2 emissions. (Source: Nature)
  2. Half of countries have fertility rates below the replacement level, so if nothing happens the populations will decline in those countries (source: BBC)
  3. According to Dickenson et al., 8.6% of us (7.0% of women and 10.3% of men) have difficulty controlling sexual urges and behaviors.
  4. A frequently prescribed drug family (statins) can increase your risk of suffering from ALS by a factor of 10 or 100.
  5. Countries where people are expected to live longest in 2040 are Spain, Japan, Singapore, Switzerland, Portugal, Italy, Israel, France, Luxembourg, and Australia. Not included in this list is the USA.
  6. Smart mirrors could monitor your mood, fitness, anxiety levels, heart rate, skin condition, and so forth.
  7. When you are trying to determine whether a drug is effective, it is tempting to look at published papers and see whether they all agree on the efficacy of the drug. This may be quite wrong: Turner et al. show a strong bias whereby negative results tend not to be published.

    Studies viewed by the FDA as having negative or questionable results were, with 3 exceptions, either not published (22 studies) or published in a way that, in our opinion, conveyed a positive outcome (11 studies). According to the published literature, it appeared that 94% of the trials conducted were positive. By contrast, the FDA analysis showed that 51% were positive. Separate meta-analyses of the FDA and journal data sets showed that the increase in effect size ranged from 11 to 69% for individual drugs and was 32% overall.

    Simply put, it is far easier and more profitable to publish positive results, so that’s what you get.

    This means that, by default, you should always downgrade the optimism of the literature.

    Simply put: don’t be too quick to believe what you read, even if it comes in the form of a large set of peer-reviewed research papers.

  8. Richard Jones writes “Motivations for some of the most significant innovations weren’t economic“.
  9. Cable and satellite TV is going away.
  10. “What if what students really want is not to be learners, but alumni?” People will prefer an academically useless program from Harvard to a complete graduate program from a lowly school because they badly want to say that they went to Harvard.
  11. Drinking coffee abundantly protects from neurodegenerative diseases.

Measuring the memory-level parallelism of a system using a small C++ program?

Our processors can issue several memory requests at the same time. In a multicore processor, each core has an upper limit on the number of outstanding memory requests, which is reported to be 10 on recent Intel processors. In this sense, we would like to say that the level of memory-level parallelism of an Intel processor is 10.

To my knowledge, there is no portable tool to measure memory-level parallelism so I took fifteen minutes to throw together a C++ program. The idea is simple: we visit N random locations in a big array. We make sure that the processor cannot tell which location we will visit next before the previous location has been visited. There is a data dependency between memory accesses. We can break this memory dependency by dividing up the task between different “lanes”. Each lane is independent (a bit like a thread). The total number of data accesses is fixed. Up to some point, having more lanes should speed things up due to memory-level parallelism. I used the term “lane” so that there is no confusion with “threads” and multicore processing: my code is entirely single-threaded.

  // Divide the fixed number of memory accesses evenly across the lanes.
  size_t howmanyhits_perlane = howmanyhits / howmanylanes;
  for (size_t counter = 0; counter < howmanyhits_perlane; counter++) {
    for (size_t i = 0; i < howmanylanes; i++) {
      // The next index in a lane depends on that lane's running sum, so accesses
      // within a lane are serialized, while the lanes remain independent.
      size_t laneindexes = hash(lanesums[i] + i);
      lanesums[i] += bigarray[laneindexes];
    }
  }

Methodologically, I increase the number of lanes until adding one more benefits the overall speed by less than 5%. Why 5%? No particular reason: I needed a threshold of some kind. I suspect that I slightly underestimate the maximal amount of memory-level parallelism: it would take a finer analysis to make a more precise measure.
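
Here is a minimal sketch of that stopping rule, assuming a hypothetical measure_time(lanes) helper that runs the traversal with a given number of lanes and returns the elapsed time:

  #include <cstddef>

  // Hypothetical helper, defined elsewhere: runs the traversal with the given
  // number of lanes and returns the elapsed time in seconds.
  double measure_time(size_t lanes);

  // Keep adding lanes as long as one extra lane improves the time by at least 5%.
  size_t estimate_parallelism() {
    size_t lanes = 1;
    double best = measure_time(lanes);
    while (true) {
      double candidate = measure_time(lanes + 1);
      if (candidate >= 0.95 * best) {
        break; // less than a 5% gain: stop
      }
      best = candidate;
      lanes += 1;
    }
    return lanes;
  }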

I run the test three times and check that it gives the same integer value each time. Here are my (preliminary) results:

Intel Haswell: 7
Intel Skylake: 9
ARM Cortex A57: 5

My code is available.

On a multicore system, there is more memory-level parallelism, so a multithreaded version of this test could deliver higher numbers.

Credit: The general idea was inspired by an email from Travis Downs, though I take all of the blame for how crude the implementation is.

Science and Technology links (November 3rd, 2018)

  1. Bitcoin, the cryptocurrency, could greatly accelerate climate change, should it succeed beyond its current speculative state.
  2. Crows can solve novel problems very quickly with tools they have never seen before.
  3. The new video game Red Dead Redemption 2 made $725 million in three days.
  4. Tesla, the electric car company, is outselling Mercedes Benz and BMW while making a profit.
  5. Three paralyzed men are able to walk again thanks to spinal implants (source: New York Times). There are nice pictures.
  6. Human beings live longer today than ever. In the developed world, between 1960 and 2010, life expectancy at birth went up by nearly 20 years. It consistently goes up by about 0.12 years per year. However, it is not yet clear how aging and death have evolved over time. Some believe that there is a “compression” effect: more and more of us reach a maximum, and then we suddenly all die at around the same age. This would be consistent with a hard limit on human lifespan and I think it is the scenario most biologists would expect. There is also the opposite model: while most of us die at around the same age, some lucky ones survive much longer. According to Zuo et al. (PNAS) both models are incorrect statistically. Instead, the curve is advancing as a wave front. This means that as far as death is concerned, being 68 today is much like being 65 a generation ago. This is surprising.

    (…) we find no support for an approaching limit to human lifespan. Nor do our results suggest that endowments, biological or other, are a principal determinant of old-age survival.

    Assuming that Zuo et al. are correct, I do not think we have a biological model at the ready to explain this statistical phenomenon.

  7. Suppose that you gave a cocktail of drugs approved for human consumption to worms. By how much do you think you could extend their lifespan? The answer is: by at least a factor of two. They tried their best cocktails with fruit flies and showed benefits there as well. It is much harder to manipulate the lifespan of large mammals like human beings, but these results support the theory that drug cocktails could increase human lifespans. They may already be doing so.
  8. Amazon is hiring fewer workers, maybe because it is getting better at automation. (speculative) It seems that Amazon is mostly denying the story, hinting that they are still creating more and more jobs.
  9. No primate, except for human beings, undergoes menopause. Very few animals have menopause: primarily some whales and human beings. I don’t think we know why menopause evolved.
  10. Total direct greenhouse gas emissions from U.S. livestock have declined 11.3 percent since 1961, while production of livestock meat has more than doubled.
  11. Male and female animals respond very differently to anti-aging strategies and they age very differently:

    One particularly odd thing in humans is that though women live longer, they are nonetheless more prone to miserable but non-deadly ailments such as arthritis (…) Lethal illnesses such as heart disease and cancer strike men more often. Although Alzheimer’s strikes women more than men, for unknown reasons.

    We do not know why there is such a sharp difference between males and females regarding health and longevity. However, some believe that the current historical fact that women live many years longer than men is due to the fact that antibiotics disproportionately helped the health of women.

  12. Vegans more frequently suffer from bone fractures.
  13. Teaching by presenting worked examples seems to be most efficient. Students get the best grades with the least work. This appears self-evident to me. It is curious that worked examples are not more prevalent in teaching.
  14. A company called Grifols claims to have a drug that can measurably slow down the progression of Alzheimer’s. For context, we currently have no therapy to slow or reverse Alzheimer’s, so even a small positive effect would be a tremendous breakthrough. However, there have been many, many false reports regarding Alzheimer’s and this report appears quite preliminary.

Science and Technology links (October 28th, 2018)

  1. If you take kids born in the 1980s, who do you think did better, the rich kids or the poor kids? The answer might surprise you:

    The children from the poorest families ended up twice as well-off as their parents when they became adults. The children from the poorest families had the largest absolute gains as well. Children raised in the top quintile did no better or worse than their parents once those children became adults.

  2. Some of our cells become senescent: they are dysfunctional and create trouble. We believe that they contribute to age-related diseases. Fisetin is a drug (available as a supplement) that kills senescent cells and extends (median and maximal) lifespan in mice. I do not recommend taking fisetin at this time, unless you are a mouse.
  3. Vegetarians report lower self-esteem, lower psychological adjustment, less meaning in life, and more negative moods. I have no idea what to make of this apparently robust finding. I was a vegetarian in my 20s and I was also subject to depression. I would never think that I was depressed because I ate no meat.
  4. The sea rises at a rate of 3 mm per year. It has been rising for thousands of years. Taking into account the acceleration that we anticipate due to climate change, we can expect the sea to have risen by 65 cm in 2100. Does that mean that islands will go under? Maybe not: in a study, only 14% of islands exhibited a reduction in area whereas 43% increased in size.
  5. Most processors today, outside the tiny embedded ones, use a 64-bit architecture, which means that they can process data in chunks of 64 bits very quickly. This has all sorts of benefits. A 32-bit processor, for example, has trouble counting to 5 billion. It is difficult, if not impossible, for a 32-bit software application to use more than 4GB of memory. Microsoft still publishes Windows in two editions, the 32-bit edition and the 64-bit edition. The purpose of the 32-bit edition is to support legacy applications. The two major graphics card makers (AMD and NVIDIA) have now stopped producing drivers for 32-bit operating systems. Thus, at least as far as gaming is concerned, 32-bit Windows is dying. Microsoft has promoted 64-bit Windows by default on new computers since at least 2009.
  6. It seems that 70% of the American soldiers are “overweight”. I find it hard to believe that 60% of all American marines are overweight. Because this was determined using the body-mass-index approach, it is also possible that American soldiers are simply very muscular. Yet another statistic tells us that nearly 40% of all soldiers have a chronic medical condition and 8.6% take sleeping pills. So maybe American soldiers are not as fit as I would expect.
  7. It is often believed that men who have more testosterone have an easier time building muscle mass. It turns out that this is false, the amount of testosterone is not relevant in healthy young men.
  8. In the USA, health care costs are predicted to continue to grow at a rate of over 4%. The economy as a whole is predicted to grow at a rate between 1.4% and 2% a year over the long term. The net result is a gap of about 2% a year. If sustained over many decades, this gap would lead to the bulk of the American economy being invested in health spending. People who are 65 years old or older account for a third of all health spending, while young females (19 to 44) spend twice as much as their male counterparts.
  9. Cheese and yogurt are correlated with fewer cardiovascular diseases.
  10. The Haruhi Problem seeks the smallest string containing all permutations of a set of n elements. The first known solution to this problem was published anonymously on an anime posting board. A formal analysis is being written up.
  11. Cardiorespiratory fitness is associated with longevity:

    In this cohort study of 122 007 consecutive patients undergoing exercise treadmill testing, cardiorespiratory fitness was inversely associated with all-cause mortality without an observed upper limit of benefit. Extreme cardiorespiratory fitness (≥2 SDs above the mean for age and sex) was associated with the lowest risk-adjusted all-cause mortality compared with all other performance groups.

Is WebAssembly faster than JavaScript?

Most programs running on web sites are written in JavaScript. There are still a few Java applets and other plugins hanging around, but they are considered obsolete at this point.

While JavaScript is superbly fast, some people feel that we ought to do better. That’s where WebAssembly comes in. It is a binary (“pre-compiled”) format that is made to load quickly. It still needs to get compiled or interpreted, but, at least, you do not need to parse JavaScript source code.

The general idea is that you write your code in C, C++ or Rust, then you compile it to WebAssembly. In this manner, you can port existing C or C++ programs so that they run on Web pages. That’s obviously useful if you already have the C and C++ code, but less appealing if you are starting a new project from scratch. It is far easier to find JavaScript front-end developers in almost any industry, except maybe gaming.

I think it is almost surely going to be more labor intensive to program web applications using WebAssembly.

In any case, I like speed, so I was interested and asked a student of mine (M. Fall) to work on the problem. We picked small problems with hand-crafted code in C and JavaScript.

Here are the preliminary conclusions:

  1. In all cases we considered, the total WebAssembly files were larger than the corresponding JavaScript source code, even without taking into account that the JavaScript source code can be served in compressed form. This means that if you are on a slow network connection, JavaScript programs will start faster.

    The story may change if you build large projects. Moreover, we compared against human-written JavaScript, and not automatically generated JavaScript.

  2. Once the WebAssembly files are in the cache of the browser, they load faster than the corresponding JavaScript source code, but the difference is small. Thus if you are frequently using the same application, or if the web application resides on your machine, WebAssembly will start faster. However, the gain is small. One reason why the gain is small is that JavaScript loads and starts very quickly.
  3. WebAssembly (compiled with full optimization) is often slower than JavaScript during execution, and when WebAssembly is faster, the gain is small. Browser support is also problematic: while Firefox and Chrome have relatively fast WebAssembly execution (with Firefox being better), we found Microsoft Edge to be quite terrible. WebAssembly on Edge is really slow.

    Our preliminary results contradict several reports, so you should take them with a grain of salt. However, benchmarking is ridiculously hard especially when a language like JavaScript is involved. Thus anyone reporting systematically better results with WebAssembly should look into how well optimized the JavaScript really is.

While WebAssembly might be a compelling platform if you have a C++ game you need to port to the Web, I would bet good money that WebAssembly is not about to replace JavaScript for most tasks. Simply put, JavaScript is fast and convenient. It is going to be quite difficult to do better in the short run.

It still deserves attention since the uptake of WebAssembly has been fantastic. For online games, it surely has a bright future.

More content: WebAssembly and the Death of JavaScript (video) by Colin Eberhardt

Further reading: Egorov’s Maybe you don’t need Rust and WASM to speed up your JS; Haas et al., Bringing the Web up to Speed with WebAssembly; Herrera et al., WebAssembly and JavaScript Challenge: Numerical program performance using modern browser technologies and devices.

Science and Technology links (October 20th, 2018)

  1. Should we stop eating meat to combat climate change? Maybe not. White and Hall worked out what happened if the US stopped using farm animals:

    The modeled system without animals (…) only reduced total US greenhouse gas emissions by 2.6 percentage units. Compared with systems with animals, diets formulated for the US population in the plants-only systems (…) resulted in a greater number of deficiencies in essential nutrients. (source: PNAS)

    Of concern when considering farm animals are methane emissions. Methane is a potent greenhouse gas, with the caveat that it is short-lived in the atmosphere, unlike CO2. Should we be worried about methane despite its short life? According to the American EPA (Environmental Protection Agency), total methane emissions have been falling consistently for the last 20 years. That should not surprise us: greenhouse gas emissions in most developed countries (including the US) peaked some time ago. Not emissions per capita, but total emissions.

    So beef, at least in the US, is not a major contributor to climate change. But we could do even better. Several studies like Stanley et al. report that well managed grazing can lead to carbon sequestration in the grassland. Farming in general could be more environmentally effective.

    Of course, if people consume less they will have a smaller environmental footprint, but going vegan does not imply that one consumes less. If you save in meat but reinvest in exotic fruits and trips to foreign locations, you could keep your environmental footprint the same.

    There are certainly countries where animal grazing is an environmental disaster. Many industries throughout the world are a disaster and we should definitely put pressure on the guilty parties. But, in case you were wondering, if you live in a country like Canada, McDonald’s not only serves locally produced beef, but it also requires that it be produced in a sustainable manner.

    In any case, there are good reasons to stop eating meat, but in the developed countries like the US and Canada, climate change seems like a bogus one.

    There are also good reasons to keep farm animals. For example, it is difficult to raise an infant without cow milk and in most countries, it is illegal to sell human milk. Several parents have effectively killed their children by trying to raise them vegan (1, 2). It is relatively easy to match protein and calories with a vegan diet, but meat and milk are nutrient-dense foods: it requires some expertise to do away with them.

    Further reading: No, giving up burgers won’t actually save the planet (New York Post).

    (Special thanks to professor Leroy for providing many useful pointers.)

  2. News agencies reported this week that climate change could bring back the plague and the black death that wiped out Europe. The widely reported prediction was made by Professor Peter Frankopan while at the Cheltenham Literary Festival. Frankopan is a history professor at Oxford.
  3. There is an inverse correlation between funding and scientific output, meaning that beyond a certain point, you start getting less science for your dollars.

    (…) prestigious institutions had on average 65% higher grant application success rates and 50% larger award sizes, whereas less-prestigious institutions produced 65% more publications and had a 35% higher citation impact per dollar of funding. These findings suggest that implicit biases and social prestige mechanisms (…) have a powerful impact on where (…) grant dollars go and the net return on taxpayers investments.

    It is well documented that there are diminishing returns in research funding. Concentrating your research dollars into too few individuals is wasteful. My own explanation for this phenomenon is that, Elon Musk aside, we all have cognitive bottlenecks. One researcher might fruitfully carry two or three major projects at the same time, but once they supervise too many students and assistants, they become a “negative manager”, meaning that they make other researchers no more productive, and often less productive. They spend less and less time optimizing the tools and instruments.

    If you talk with graduate students who work in lavishly funded laboratories, you will often hear (when the door is closed) about how poorly managed the projects are. People are forced into stupid directions, they do boring and useless work to satisfy project objectives that no longer make sense. Currently, “success” is often defined by how quickly you can acquire and spend money.

    But how do you optimally distribute research dollars? It is tricky because, almost by definition, almost all research is worthless. You are mining for rare events. So it is akin to venture capital investing. You want to invest in many start-ups that have high potential.

  4. A Nature column tries to define what makes a good PhD student:

    the key attributes needed to produce a worthy PhD thesis are a readiness to accept failure; resilience; persistence; the ability to troubleshoot; dedication; independence; and a willingness to commit to very hard work — together with curiosity and a passion for research. The two most common causes of hardship in PhD students are an inability to accept failure and choosing this career path for the prestige, rather than out of any real interest in research.

Validating UTF-8 bytes using only 0.45 cycles per byte (AVX edition)

When receiving bytes from the network, we often assume that they are unicode strings, encoded using something called UTF-8. Sadly, not all streams of bytes are valid UTF-8. So we need to check the strings. It is probably a good idea to optimize this problem as much as possible.

In earlier work, we showed that you could validate a string using as little as 0.7 cycles per byte, using commonly available 128-bit SIMD registers (in C). SIMD stands for Single-Instruction-Multiple-Data; it is a way to parallelize processing within a single core.

What if we use 256-bit registers instead?

Reference naive function: 10 cycles per byte
fast SIMD version (128-bit): 0.7 cycles per byte
new SIMD version (256-bit): 0.45 cycles per byte

That’s good, almost twice as fast.

A common scenario is that the input is made entirely of ASCII characters. It is much faster to check that a string is made of ASCII characters than to check that it is made of valid UTF-8 characters. Indeed, to check that it is made of ASCII characters, you only have to check that one bit per byte is zero (since ASCII uses only 7 bits per byte).
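
As a rough illustration of that bit trick (not the benchmarked code), a scalar version can OR all the bytes together and test the top bit once at the end:

  #include <cstddef>
  #include <cstdint>

  // A byte is ASCII exactly when its most significant bit is zero, so the whole
  // buffer is ASCII exactly when the OR of all its bytes has bit 7 clear.
  bool is_ascii(const uint8_t *data, size_t length) {
    uint8_t bits = 0;
    for (size_t i = 0; i < length; i++) {
      bits |= data[i];
    }
    return (bits & 0x80) == 0;
  }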

It turns out that only about 0.05 cycles per byte are needed to check that a string is made of ASCII characters. Maybe up to 0.08 cycles per byte. That makes us look bad.

You could start checking the file for ASCII characters and then switch to our function when non-ASCII characters are found, but this has a problem: what if the string starts with a non-ASCII character followed by a long stream of ASCII characters?

A quick solution is to add an ASCII path. Each time we read a block of 32 bytes, we check whether it is made of 32 ASCII characters, and if so, we take a different (fast) path. Thus if it happens frequently that we have long streams of ASCII characters, we will be quite fast.
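
Here is a sketch of how such a 32-byte ASCII test might look with AVX2 intrinsics; it illustrates the idea and is not the exact code from my repository:

  #include <cstdint>
  #include <immintrin.h>

  // Returns true if all 32 bytes starting at 'data' have their top bit clear,
  // i.e., the block is pure ASCII and the fast path can be taken.
  bool is_ascii_block(const uint8_t *data) {
    __m256i block =
        _mm256_loadu_si256(reinterpret_cast<const __m256i *>(data));
    // movemask gathers the most significant bit of every byte into a 32-bit mask.
    return _mm256_movemask_epi8(block) == 0;
  }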

The new numbers are quite appealing when running benchmarks on ASCII characters:

new SIMD version (256-bit): 0.45 cycles per byte
new SIMD version (256-bit), w. ASCII path: 0.088 cycles per byte
ASCII check (SIMD + 256-bit): 0.051 cycles per byte

My code is available.

Validating UTF-8 bytes (Java edition)

Strings are just made of bytes. We send and receive bytes over the network all the time. If you know that the bytes you are receiving form a string, then chances are good that it is encoded as UTF-8. Sadly, not all streams of bytes are valid UTF-8 strings. Thus you should check that your bytes can be safely parsed as strings.

In earlier work, I showed that you needed as little as 0.7 cycles per byte to do just that validation. That was in C using fancy instructions.

What about Java?

Designing a good benchmark is difficult. I keep things simple. I generate a 1002-byte UTF-8 string made of random (non-ASCII) characters. Then I try to check how quickly different functions validate it.

Here are the contenders:

  1. You can use the standard Java API to entirely decode the bytes, and report false when there is an error:

    CharsetDecoder decoder =
        StandardCharsets.UTF_8.newDecoder();
    try {
      decoder.decode(ByteBuffer.wrap(mydata));
    } catch (CharacterCodingException ex) {
      return false;
    }
    return true;
    
  2. You can try a branchless finite-state-machine approach:

    boolean isUTF8(byte[] b) {
        int length = b.length;
        int s = 0;
        for (int i = 0; i < length; i++) {
          s = utf8d_transition[
                 (s + (utf8d_toclass[b[i] & 0xFF])) 
                 & 0xFF];
        }
        return s == 0;
    }
    

    … where utf8d_transition and utf8d_toclass are some arrays where the finite-state machine is coded.

  3. Finally, you can use the isWellFormed method from the Guava library. It simply tries to find the first non-ASCII character and then falls back to a straightforward series of if/then/else checks.
  4. Here are the timings, converted to cycles per byte from nanoseconds per 1002-byte string. I estimate that my processor runs at about 3.4 GHz on average during the test (verified with perf).

    Java API: 6.7 cycles per byte
    branchless: 6.0 cycles per byte
    Guava’s if/then/else: 2.6 cycles per byte

    My code is available.

    The most obvious limitation in my benchmark is that Guava’s if/then/else approach is sensitive to branch mispredictions while my benchmark might not be rich enough to trigger difficult-to-predict branches.

    Credit: The finite-state-machine code was improved by Travis Downs, shaving 1.5 cycles per byte.

    Update: Travis Downs has shown that, indeed, Guava’s approach is much worse than my benchmark implies. The reason it does so well here is that the processor learns to predict all branches perfectly. If you increase the size of the string, or if you use many more strings, then its performance becomes worse. Meanwhile, the finite-state machine can be accelerated by processing the strings in two halves, effectively doubling the processing speed. Yet Guava might still be the right choice in practice because, when you expect the input to be mostly ASCII characters, it does well.

Nobel-prize winner Romer on innovation and higher education

Romer was one of the winners of the Nobel prize in economics this year (2018). He wrote about higher education and innovation. One of his proposals is the introduction of more generous scholarships for students in science and engineering.

His starting point is that to innovate and get richer, we need more and more people doing R&D. Investing more in R&D is not sufficient. You might think that, in the long run, it could be. If firms spend more on R&D, then the wages of scientists and engineers will go up, and this will attract more scientists. However, students lack this information.

On the one hand, giving money to undergraduate students who want to pursue science and engineering is a good way to signal to these students that society wants more scientists and engineers. At the graduate level, though the current funding might be considered adequate, it is very much in the hands of (older) professors, who can be quite directive. The general intuition, I believe, is that Romer wants to give back to young and bright students some freedom and power in deciding what they want to do. In particular, I think Romer believes that some of these students might be interested in doing work that is less “conventional” (i.e., maybe harder to publish in conventional venues) but more “useful”. In effect, he wants to fight against stagnation in higher education.

In his own words:

Unfortunately, in the last 20 years, innovation policy in the United States has almost entirely ignored the structure of our institutions of higher education. As a result, government programs that were intended to speed up the rate of technological progress may in fact have had little positive effect.

This pattern of outcomes, increased numbers of Ph.D. recipients and steadily worsening academic job prospects, can be explained by increased subsidies for Ph.D. training.

The picture that emerges from this evidence is one dominated by undergraduate institutions that are a critical bottleneck in the training of scientists and engineers, and by graduate schools that produce people trained only for employment in academic institutions (…)

I am not exactly sure why all student unions are not pushing his ideas to governments. It sure sounds attractive: give engineering students more money and get better growth.

I love universities. I have spent almost all my life in them. But, like Romer, I fear that there is a bit too much stagnation. There is insufficient pressure to innovate and too many incentives favouring excess conservatism. Giving back power, not to younger professors, but to actual students, is probably a wise move.

Yet I am somewhat skeptical of Romer’s overall view that increasing the supply of engineers is key. A great many students trained in engineering do nothing even vaguely resembling R&D, and when they do, they do not do it for long. Simply put, graduates go where the jobs and the money are. It is not the case that we live in a Spiderman world where smart engineering students graduate to go work in an industrial lab developing new robotics.

We need to make the Spiderman world possible: make it so that young engineers can start a small company building prototype exoskeletons that help weaker people walk again. Do that, and you won’t have a shortage of young engineers for long.