Reflecting on 2012

The new year (2013) is here. So, it is time to reflect on what I have done and seen in 2012.

As a researcher, I found that one of the most interesting innovations in 2012 was the emergence of Google Scholar profiles. They are pages where Google aggregates the work of a given researcher. I have long advocated that we should pursue an author-centric model and I think we are finally getting there: Google Scholar allows you to subscribe to the new publications of an author. Unfortunately, Google has focused the profile pages on citations: this encourages people to game the system by artificially boosting their citation counts. We will need better metrics if we want to openly assess researchers. To improve matters, I have started a small project described in my post From counting citations to measuring usage.

What is rather remarkable is that Google is actually disrupting the academic publication business… even though it is almost certainly not profitable for Google to do so. They are definitely creating value: it was far easier to find scientific references in 2012 than it was before, thanks to the progress Google is making with Google Scholar.

Similarly, Google has kept pushing the state of the art in database research with papers such as Processing a Trillion Cells per Mouse Click. This is remarkable, especially if you consider that in academic circles, the physical design of databases has long been considered a solved problem. Clearly, academic wisdom was wrong.

Academics like myself often like to pretend that innovation starts with academic research which then migrates to industry where the ideas are implemented. Google puts a dent in this theory, as they are clearly driving research in Computer Science. Over the Christmas break I finally got around to reading Kealey’s Sex, Science And Profits, which argues convincingly that this is true in general: the magical knowledge transfer from academic research to industry would account for only about 10% of all industrial innovations, and not the most valuable 10%.

I think that 2012 was a year when many academics like myself thought about the future. We have seen the emergence of massive open online courses from prestigious American universities and a few start-ups. It is becoming clear that governments are under increasing financial stress while universities have failed to become more efficient. It is hard to imagine that universities will remain undisrupted for another 20 years.

Though blogging was supposed to be dying in 2012, I am still writing and reading blog posts almost every day. One of the interesting innovations on my blog has been the use of the social coding platform GitHub for hosting the code related to my software posts. I really like it because it allows people to not only comment on my posts, but also easily review and change my code. This has led to more interesting discussions.

I have also started using GitHub for some of my research projects. And I have been lucky enough to get feedback from people who were interested in my code for their own reasons. The fact that GitHub makes it easy to contribute to a stranger’s code is a blessing for me as a researcher. In fact, GitHub has encouraged me to focus even more of my research on programming by making it easier to have an impact through software.

Though I want to resist making predictions, I think that in 2013, more of my time will be spent programming openly. I will probably move closer to an ideal of open scholarship where research papers are only one of the visible outputs of my research. I also hope to move closer to an ideal where much of my research is actually useful. It is maybe worth repeating here that most of what researchers write is never read by anyone (even when it is cited). We have focused on publishing so much that we ended up believing that it is an end in itself (it is not!).

I no longer measure the popularity of my blog using statistics. I feel that it is rather pointless given how many bots and dead subscriptions there are. However, I still assess the value of the blog based on the interactions it generates. Subjectively, 2012 was a good year: I got a lot of very interesting feedback.

Several of my blog posts were tied to my ongoing research. In an ideal world, I’d be able to decompose much of my research into short blog posts: I think I am getting closer to this model. In this respect, I am inspired by the famous Edsger Dijkstra who published little in journals and conferences: he thought that the formal peer review system was counterproductive. I am no Dijkstra, of course, but I find his model very compelling. I still value the feedback I get from a good journal review, and it certainly helps me improve my work (and the work of my graduate students), but I now see it as just one tool among many.

My blog publication rate has come down. This is part of a long-term trend caused by the increased usage of social platforms such as Google+ and Twitter and the collapse of RSS as a platform. That is, much of what I would have published on this blog back in 2005, I now publish on Google+ or Twitter. I tend to stick with my blog for the more substantial pieces.

Here are some popular blog posts in 2012:

  1. Do we need patents?
  2. What happens when you get more Ph.D.s?
  3. I’m an introvert. And that’s ok.
  4. Computer scientists need to learn about significant digits
  5. Data alignment for speed: myth or reality?
  6. On the quality of academic software
  7. Is C++ worth it?
  8. To improve your intellectual productivity
  9. Fast integer compression: decoding billions of integers per second
  10. When is a bitmap faster than an integer list?
  11. Should you follow the experts?
  12. A simple trick to get things done even when you are busy

Of course, my life is not just about writing software, grading papers and blogging. In 2012, I kept making things by hand:

  • I made 5 tables that are nice enough that my wife put them in the living room. There is a piece of furniture I made myself in almost every room of our house. Two years ago I would have been unable to build a bird house (if only because I lacked the tools and material).
  • I started square foot gardening and had generally good luck.
  • I got much better at making bread. I now use a couple of variations on Lahey’s recipe. It is ridiculously easy and it works every time! I get beautiful results and everyone likes my bread. I have also much improved my pizza bread. So I am now making my own yogourt, wine, port, beer and bread, all from scratch.

Maybe I should conclude with some of the mistakes I made in 2012:

  • I designed an RC crawler almost from scratch. It probably cost 5 times what it should have cost because I did not know what I was doing. It was fun in the end, but I am happy my wife does not know how much it cost.
  • I tried fixing an iPad with broken glass by myself. I ordered the pieces from Hong Kong and did almost everything correctly, but when I put the iPad back together, half of the screen was darker so the iPad became unusable. I also started fixing the broken glass on an Asus tablet but I got discouraged and I stopped. I wasted a lot of money on parts.
  • I overcommitted professionally. A friend told me that what he likes about my academic record is that I do few things, but I do them well. But 2012 was a bad year for me in this respect. I just agreed to do too many things. Maybe it is my introverted personality, but I just do poorly under pressure: I tend to get more and more exhausted and I achieve less and less as the pressure builds up. I find that I need to focus on 3 or 4 important things at most. Now that the Christmas break is ending, I feel a bit anxious about everything I need to complete. I hope that in 2013, I will refuse to overcommit and just focus on doing things well. It is harder than it sounds.

Daniel Lemire, "Reflecting on 2012," in Daniel Lemire's blog, January 1, 2013.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

15 thoughts on “Reflecting on 2012”

  1. What is rather remarkable is that Google is actually disrupting the academic publication business… even though it is almost certainly not profitable for Google to do so.

    Sometimes I wonder if Google, being in the position of influence and resources it is in, does things just to see what will happen and what new opportunities such disruption will create for it.

    Thanks for the book recommendation!

  2. Just wanted to say I enjoy reading your blog and it has also motivated me. I have recently started placing some of my own research code on GitHub. With the majority of work accessible online, it only makes sense to provide working code when applicable. If nothing else, it should at least make other researchers more likely to cite your work as it is easy to use and compare against.

  3. Now that I have access to Scopus, I can see that Scopus probably has a worse scope than Google Scholar (for computer science). Microsoft Academic Search suffers from the same problem as Scopus: its assortment of journals and conferences is very limited. It is true that Google does index technical reports, which some people argue is bad, but it is actually useful as well.

  4. Enjoy hearing about things academic and otherwise. Again, I recommend you try sourdough; Tartine Bread by Chad Robertson is a good start. I don’t like dealing with a hot pot out of the oven, so I shape the dough into boules in lined baskets (California Baking Institute), and put them onto a thick stone in the oven with a pizza peel.

    As for woodworking you should take a look at a fellow Canadian at
    He, like me, is really into hand tool work.

    Thanks for keeping me alert. At 72 I need it.


  5. I agree that Google Scholar is great, but I’m curious what you mean by “disrupt.” The basic definition from Christensen doesn’t apply in any strict sense. What is being disrupted and how?

  6. @John Regehr

    You are right that I don’t use the word disrupt in the pure Christensen sense, but I don’t think I am far off. Your school probably pays to grant you access to major indexing systems such as Scopus, Web of Science, the ACM Digital Library and so on. These tools are professionally curated and considered “high quality”. Meanwhile, Google provides a cheaper alternative that is, formally, not nearly as good… but ends up being, in practice, much better.

    Would you seriously consider using Scopus to do a lit. survey?

  7. @Itman

    From an information retrieval point of view, what the Web taught us is that coverage is more important than precision, at least in open world settings. I prefer to have some junk in the results, if it means I’ll get more of the important results.

  8. @Don Boys

    I do have an oven stone, and it works great for pizzas, but for bread, I have found that cooking the bread in a pot gives me a better crust and more oven rise. I see very little downside to using a hot pot. Sure, you could burn yourself, but I’ll gladly sacrifice my health for good bread.

    Thanks for the woodworking reference. I do use simple power tools, but I would love to do away with them. For now though, in all honesty, I mostly do simple and practical things. I do want to get into more sophisticated things, but there are so many simple things that I need to do first… for example, I still need to build a bread box.

  9. Hi Daniel, of course Scopus would be an absurd choice for a CS literature search. In fact I just gave it a try and it was terrible.

    I’ll buy that Google Scholar can help disrupt the scholarly indexing business.

    Mostly, however, I just use Google for literature searches, not Google Scholar. The latter is, as you point out, a nice way to track authors.

    Actually all I really want is an RSS feed for every researcher. This feed should show one entry every time the author publishes a paper.

  10. My stone is 1″ thick and has sides. I have no trouble with oven rise and crust. I know that most pizza stones are rather thin.
    I usually make 2 kg of dough and bake in two loaves, one after the other. Use a wood peel to place loaves into oven, metal to handle once in the oven.

    Bread Baking: An Artisan’s Perspective by DiMuzio is a great book if you want a good understanding of the process from a more scientific view. It really is a textbook, but it helps you understand how the parts go together.

    The New Traditional Woodworker by Jim Tolpin is a great start for getting into hand tools.

  11. @John Regehr

    Google integrates the results from Google Scholar… I just tested it by googling one of your highly cited papers (HLS: A framework for composing soft real-time schedulers), and the regular Google result set tells me how often it was cited. So I would argue that if you are using the regular Google, you still benefit from Google Scholar.

    I think it is fantastic that Google is able to out-compete librarians and publishers with respect to science indexing. For one thing, there is no money in it for Google… For another, publishers are almost certainly resisting Google as much as they can.

    It is my understanding that Google can’t provide RSS feeds for researchers due to licensing issues. Maybe Google would not offer RSS feeds even if they could, but I am quite sure they would offer something better than email notifications.

  12. Daniel, agreed, the Google / Google Scholar integration is cool.

    Let’s face it, almost everything about the academic publication industry and process sucks. Pretty much any kind of disruption is good.

    “Why is Google doing this?” is an interesting question. My guess is they simply have so many PhDs that this makes sense as a low-resource pet project.
