Two German psychology professors, Knauff and Nejasmic, recently published a paper on the benefits of Microsoft Word over LaTeX. The paper was reported in Nature.

They show that you can copy a simple document (containing little mathematics) faster and with fewer errors when using Word. Of course, those of us using LaTeX are aware of these shortcomings. LaTeX is not suited for quick-and-dirty jobs.

I should stress that Knauff and Nejasmic did not have authors compose a document, let alone craft an actual research article, let alone a sophisticated scientific article. They do not even try to assess the tools in the scientific workflow (data generation, analysis, processing, figure generation, and so on). They compare Word and LaTeX on a data entry job akin to what you might ask from a secretary. They also make no attempt to measure how much of this type of purely secretarial work scientists do… or whether it is representative of what scientists do.

It is clear that Knauff and Nejasmic have been frustrated by their collaboration with computer scientists who expected them to use LaTeX. In this story, they are not objective observers. Their frustration runs deep as they urge publishers to restrict or even ban the use of LaTeX. Their main message is in the conclusion:

We therefore suggest that leading scientific journals should consider accepting submissions in LaTeX only if this is justified by the level of mathematics presented in the paper. In all other cases, we think that scholarly journals should request authors to submit their documents in Word (…) preventing researchers from producing documents in LaTeX would save time and money to maximize the benefit of research and development for both the research team and the public.

According to Knauff and Nejasmic, scientists who use LaTeX suffer from cognitive dissonance. To help them and improve science, we should force them to use Microsoft Word.

Given the obvious methodological gaps in their manuscript… how do Knauff and Nejasmic answer? Basically by describing LaTeX users as an irrational sect:

It is astonishing how some commentators ignore the basic principles of scientific decision-making that is, collecting facts, control over variables, using systematic methods, careful measurement, connecting causes and effects, and making rational evidence-based decisions, instead of generalizing personal impressions or opinions. (…) Why do so many people disagree with our conclusions? (…) from the beginning on we were aware that the issue is a highly emotional issue for many LaTeX users. (…) we think that the passion is a special habitus of the LaTeX community.

In the answers to the comments, their main objection regarding LaTeX is collaboration. They write: “Word offers the helpful track changes tool, which makes collaboration very easy and efficient.” In comparison, LaTeX produces text files. Text files are silly things that do nothing on their own. Except that we know a lot about how to collaborate on text files. It is called version control. Version control allows many people to work at the same time on the same files. There are built-in conflict resolution mechanisms. Everyone has instant access to the latest version of the file and there can be no ambiguity about the revision history of the document. You should be using version control in any case, to store your data and your software. You are not relying on data that has been saved on one of your students’ hard drives, are you?

I applaud Knauff and Nejasmic for trying to improve the productivity of scientists. But I give them a failing grade because there is an unacceptable gap between their conclusion and their actual experiments. It might be that your productivity as a scientist depends critically on whether you use Word or LaTeX for producing research articles. But their experiments tell us nothing about this question. Their paper is an opinion piece, not science.

Computers store numbers in binary form using a fixed number of bits. For example, Java will store integers using 32 bits (when using the int type). This can be wasteful when you expect most integers to be small. For example, maybe most of your integers are smaller than 8. In such a case, a unary encoding can be preferable. In one form of unary encoding, to store the integer x, we write x zeroes followed by a single bit with value 1. The following table illustrates the idea:

Number   8-bit binary   unary
0        00000000       1
1        00000001       01
2        00000010       001

Thus, to code the sequence 0, 0, 1, 2, 0, we might use 1-1-01-001-1 stored as the byte value 11010011. To recover the sequence from a unary coded stream, it suffices to seek the bits with value 1. A naive decoder will simply examine each bit value, in sequence. (See code sample.) We should not expect this approach to be fast.
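To make the encoding and the naive decoder concrete, here is a minimal Python sketch. It is not the author's code sample: the function names are made up for illustration, and it assumes the stream is packed most-significant-bit first (matching the byte 11010011 above) with the last byte padded with zeroes.

```python
def unary_encode(values):
    """Pack non-negative integers as x zeroes followed by a 1, MSB-first.

    Hypothetical helper: the last byte is padded with zeroes, which the
    decoder ignores because a run of zeroes without a closing 1 is not
    a complete code.
    """
    bits = []
    for x in values:
        bits.extend([0] * x)
        bits.append(1)
    while len(bits) % 8:
        bits.append(0)
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def unary_decode_naive(data):
    """Examine every bit in sequence; emit the zero-run length at each 1."""
    values = []
    run = 0  # number of zeroes seen since the previous 1
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            if (byte >> shift) & 1:
                values.append(run)
                run = 0
            else:
                run += 1
    return values
```

The decoder carries the zero count across byte boundaries, so codes longer than eight bits are handled, but it pays for one test per bit.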

A common optimization is to use a table look-up. Indeed, we can construct a table with 256 entries where, for each byte value, we store the number of 1s and their positions. (See code sample.) This can be several times faster.
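A table-driven decoder might look like the following Python sketch. Again, this is an illustration under assumptions rather than the author's code: the names TABLE and unary_decode_table are hypothetical, bits are read most-significant-bit first, and each table entry is a tuple of positions whose length gives the number of 1s.

```python
def build_table():
    """For each byte value, list the positions (0 = leftmost bit) of its 1s."""
    table = []
    for byte in range(256):
        positions = tuple(i for i in range(8) if (byte >> (7 - i)) & 1)
        table.append(positions)
    return table


TABLE = build_table()  # 256 entries; len(TABLE[b]) is the number of 1s in b


def unary_decode_table(data):
    """Decode one byte at a time using the precomputed positions."""
    values = []
    last = -1  # absolute bit index of the previous 1 (-1: before the stream)
    for index, byte in enumerate(data):
        base = index * 8
        for pos in TABLE[byte]:
            current = base + pos
            # The zeroes between two consecutive 1s form the decoded integer.
            values.append(current - last - 1)
            last = current
    return values
```

Working with absolute bit positions means a byte containing only zeroes costs a single look-up, and long runs of zeroes spanning several bytes need no special case.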

A possibly faster alternative is to use the fact that all modern processors have Hamming weight instructions: fast instructions telling you how many bits with value 1 are present in a 64-bit word (popcnt on Intel processors supporting SSE4.2). Similarly, one can use an instruction that counts the number of trailing zeroes (tzcnt or bsf on Intel processors). We can put such instructions to good use for unary decoding. (See popcnt code sample and see tzcnt/bsf code sample.)
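The idea can be sketched in Python, with the caveat that Python does not expose these instructions directly: the helpers popcount and trailing_zeros below are stand-ins written with ordinary integer operations, and all names are hypothetical. Unlike the byte example earlier, this sketch assumes the stream is packed into 64-bit words starting from the least significant bit, so that a trailing-zero count finds the next code.

```python
def popcount(word):
    """Stand-in for a Hamming weight instruction: number of 1 bits."""
    return bin(word).count("1")


def trailing_zeros(word):
    """Stand-in for a trailing-zero count; word must be nonzero."""
    return (word & -word).bit_length() - 1


def unary_decode_words(words):
    """Decode unary codes from 64-bit words packed least significant bit first."""
    values = []
    last = -1  # absolute bit index of the previous 1 (-1: before the stream)
    for index, word in enumerate(words):
        base = index * 64
        # popcount tells us up front how many integers this word terminates.
        for _ in range(popcount(word)):
            current = base + trailing_zeros(word)
            values.append(current - last - 1)
            last = current
            word &= word - 1  # clear the lowest set bit
    return values
```

With this bit order, the sequence 0, 0, 1, 2, 0 has its 1s at bit positions 0, 1, 3, 6 and 7, which is the word value 203. The loop does work proportional to the number of integers decoded rather than the number of bits scanned.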

According to my tests, on recent Intel processors, the latter approach is much faster, decoding integers at speeds of over 800 million integers per second, roughly twice as fast as a table-based approach. See the next figure.

As usual, my source code is freely available.

Schools teach us theory so that we can be more productive workers. You learn grammar so that you can be a better writer. You learn about computer science, so that you may be a better programmer. You learn about electromagnetism so you can do electronics. You learn about thermodynamics so you can design engines.

It is tempting therefore to believe that theory precedes practice. We invented grammar first, and then the written language. We came up with the thermodynamics and then we invented engines. We discovered electromagnetism and then we could build electric circuits. We studied computer science and then we started programming.

But this is backward. Watt was certainly a knowledgeable technician when he designed the engine. Flowers was a skilled electrical engineer when he computerized the post office in the 1930s. However, clean principles only emerge after the fact.

There is a big difference between putting theory into practice and trying something out while lacking the theory. My claim is that most inventions are of the latter sort, even if academics much prefer to imagine that the former plays a central role.

I believe that Torvalds put it best:

Don’t ever make the mistake [of thinking] that you can design something better than what you get from ruthless massively parallel trial-and-error with a feedback cycle. That’s giving your intelligence much too much credit. (Linus Torvalds)

As you invent something new, the existing theory is lacking. The principles do not quite apply. It is only after the fact that someone smart can finally derive clear principles.

The real world is a complicated place. Our minds are limited to relatively simple models. We evolved as tinkerers. Sure, we are smart tinkerers in that we can trade ideas and designs, but our fundamental R&D strategy remains tinkering.

That is why we invented the written language long before scholars wrote grammars. That is why Watt invented the engine long before scientists conceived thermodynamics. We built electric circuits before scientists founded electromagnetism. We hacked computers together and then founded computer science. We toyed with uranium (and got sick) long before we could build an atomic bomb.

There is an important practical implication to my claim: if you want to invent something new, you need to be willing to go where our theory no longer applies. Sadly, this means you will get it wrong at times through no fault of your own.

Further reading: Problem solvers and theory builders by John D. Cook, quoting from Mathematics without Apologies by Michael Harris.

I am convinced that much of the gap between the best college students and the worst is explained by study habits. Frankly, most students study poorly. To make matters worse, most teachers are incapable of teaching good study habits.

Learning is proportional to effort

Sitting in a classroom listening to a professor feels like learning… Reading a book on a new topic feels like learning… but because they are overwhelmingly passive activities, they are inefficient. It is even worse than inefficient: it is counterproductive because it gives you the false impression that you know the material. You can sit through lecture after lecture on quantum mechanics. At some point you will become familiar with the topics and the terminology. Alas, you are fooling yourself, which is worse than not learning anything.

Instead, you should always seek to challenge yourself. If some learning activity feels easy, it means that it is too easy. You should be constantly reminded of how little you know. Great lectures make it feel like the material is easy: it probably is not. Test yourself constantly: you will find that you know less than you think.

Some students blame the instructors when they feel confused. They are insistent that a course should be structured in such a way that it is always easy, so that they rarely make mistakes. The opposite is true: a good course is one where you always feel that you will barely make it. It might not be a pleasant course, but it is one where you are learning. It is by struggling that we learn.

On this note, Learning Style theory is junk: while it is true that some students have an easier time doing things a certain way, having it easier is not the goal.

There are many ways to challenge yourself and learn more efficiently:

  • Seek the most difficult problems, the most difficult questions and try to address them. It is useless to read page after page of textbook material, but it becomes meaningful if you are doing it to solve a hard problem. This is not news to Physics students who have always learned by solving problems. Always work on the toughest problems you can address.
  • Reflect on what you have supposedly learned. As an undergraduate student, I found that writing a summary of everything I had learned in a class was one of the best ways to study for an exam. I would just sit down with a blank piece of paper and try to summarize everything as precisely as possible. Ultimately, writing your own textbook would be a very effective way to learn the material. Teaching is a great way to learn, because it challenges you.
  • Avoid learning from a single source. Studying from a single textbook is counterproductive. Instead, seek multiple sources. Yes, it is confusing to pick up a different textbook where the terminology might be different, but this confusion is good for you.

If sitting docilely in a classroom is inefficient and even counterproductive, then why is it so common a practice? Why indeed!

Interleaved study trumps mass study

When studying, many people do not want to mix topics “so as not to get confused”. So if they need to learn to apply one particular idea, they study it to the exclusion of everything else. That is called mass (or block) practice.

Course material and textbooks do not help: they are often neatly organized into distinct chapters, distinct sections… each one covering one specific topic.

What researchers have found is that interleaved practice is far superior. In interleaved practice, you intentionally mix up topics. Want to become a better mathematician? Do not spend one month studying combinatorics, one month studying calculus and so on. Instead, work on various mathematical topics, mixing them randomly.

Interleaved practice feels much harder (e.g., “you feel confused”), and it feels discouraging because progress appears to be slow. However, this confusion you feel… that is your brain learning.

Interleaved practice is exactly what a real project forces you to do. This means that real-world experience where you get to solve hard problems is probably a much more efficient learning strategy than college. Given a choice between doing challenging real work, and taking classes, you should always take the challenging work instead.

Further reading: Make It Stick: The Science of Successful Learning by Peter C. Brown et al. and Improving Students’ Learning With Effective Learning Techniques by Dunlosky et al.

Colleges and universities, left and right, are launching massive open online courses (MOOCs). Colleges failing to follow are “behind the times”.

Do not be fooled by how savvy MOOC advocates sound. They do not understand what they are doing.

Let us start with how they do not even understand what a MOOC is, or should be. MOOCs are supposed to be open platforms. It is right there in the name. Downes’ original MOOCs were indeed open. Yet the actual MOOCs that colleges publish are closed platforms, as per Wikipedia’s definition:

A closed platform is a software system where the carrier or service provider has control over applications, content, and media, and restricts convenient access to non-approved applications or content. This is in contrast to an open platform, where consumers have unrestricted access to applications, content, and much more.

The word “open” has been perverted beyond belief, but let us be clear: Facebook is not an open platform. It is public, certainly, in the sense that everyone can join… but it is a closed platform. The content is locked up. If search engines cannot index the content, then it is closed. It is that simple. If your course requires that prospective students “register” to access the content, then it is not an open course. It might be an online course, it might even be massive, but it is not open.

There is nothing wrong with closed platforms per se. The ancient Greek philosophers made a living by selling their lectures to paying customers. But most modern college campuses are remarkably open in contrast. In all likelihood, I can just show up for class on campus in most colleges in North America and attend lectures, for free. I do not need to provide an email address or a password. If there is room in the class, I can generally sneak in. Nobody will care. Why is that? Because we have learned that selling lectures is a tough business. It was different for the Greeks because so little was written down… but we live in an era where Amazon can deliver a textbook on any topic directly to your door within 48 hours. In this era, it is much better to sell diplomas and degrees. Unlike lectures, they have tangible financial value for the students. Some colleges also serve as meeting places, others provide an experience.

What colleges do not do, at least on campus, is make money off course content. As it is, you can easily order all the textbooks you could possibly read on Amazon. You can join discussion groups about them. You can sneak into lectures, or find tons of them online. There is simply little value in the course content.

Do not believe me? Run the following experiment. Make all courses tuition free. Students can enrol for free and if they pass the exam, they get the credit. However, they must pay $20 for each hour of lecture they choose to attend. You know what is going to happen? Nobody but the instructor will show up. How do I know? Because, as it is, with free lectures once you have enrolled in a class, most students never show up for class unless they are compelled to do so. Why would anyone think that it is going to be somehow different with pre-recorded lectures online? You know, the lectures colleges like so much? The truth is that there is only value at the margin for course content.

It is probably harder to make a living selling lectures than it is as a journalist, and it has become nearly impossible to live off journalism. The volume of great free stuff is just too high.

Colleges that try to lock down course content, let alone the content of their MOOCs, are signalling that they have no clue about the business that they are in.
