A common answer to my post on the reliability of science was that fraud is marginal and that, ultimately, science is self-correcting. That is true on one condition: that the science in question is bona fide science. Otherwise, I disagree that institutional science is self-correcting. It is self-correcting about as much as human beings are rational. That is, not often. A lot of what passes for science is actually cargo cult science. What looks like rigorous science may not be, no matter what the experts tell you. Don’t fool yourself: science is not the process of getting published in prestigious journals or a tool to get a tenured job. Richard Feynman defined science as the belief in the ignorance of experts.

Institutional science can be wrong or not even wrong for decades without any remorse:

  • Economists failed to predict or explain the last financial crisis. Yet they cannot call their models into question. Philip Mirowski explains why: “The range in which dissent happens is so narrow. (…) The field got rid of methodological self-criticism.”
  • A large fraction of AI researchers have convinced themselves that intelligence must emerge from Prolog-like reasoning engines. This gave us twenty years of predictions that the future was in expert systems, and the last ten years spent predicting the rise of the Semantic Web. This ever-growing community of AI researchers is oblivious to its own failure to produce any useful result.
  • Like Fred Brooks, I’m amazed that in 2010, the waterfall method is taught in software engineering schools as the reference model. There is no evidence that it is beneficial and, in fact, much evidence that it is harmful. That is, students would be better off learning nothing rather than learning to use the waterfall method. Yet, entire Ph.D. theses are still built on the assumption that the waterfall method is sound. Accordingly, criticizing the waterfall method on campus is a risky business.
  • The dominant paradigm of modern Theoretical Physics is String theory, which is not even a scientific theory.

We should not trust that self-correction will happen on its own: biases are often self-reinforcing. Rather, we must ask how self-correction can happen. I think that all science must be verified by independently designed and reproduced experiments. For example, it is insufficient to verify the speed of light with one reproducible experiment. It must be possible for different researchers to come up independently with different experiments, which are all reproduced independently several times. If everyone is working from the same data, the limitations of the data may never be revealed. And if there is no experiment, you are doing Mathematics or art, not science.

Peer review does not lead to self-correction. Peer review increases quality, but it can also reinforce biases. In Information Retrieval, we often talk about the trade-off between precision and recall. Peer review improves precision, but degrades recall. If your primary goal is to please your peers, you won’t be tempted to point out the flaws in their research!
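To make the precision/recall analogy concrete, here is a minimal sketch in Python. The paper labels and the acceptance decisions are invented purely for illustration; they are not data from any real venue, and `precision_recall` is just a helper name I picked.

```python
def precision_recall(accepted, sound):
    """Return (precision, recall) of an acceptance decision.

    accepted: set of papers that the filter let through
    sound:    set of papers that are actually sound
    """
    true_positives = len(accepted & sound)
    precision = true_positives / len(accepted) if accepted else 0.0
    recall = true_positives / len(sound) if sound else 0.0
    return precision, recall


# Hypothetical pool: six sound papers and four flawed ones.
sound = {"s1", "s2", "s3", "s4", "s5", "s6"}
flawed = {"f1", "f2", "f3", "f4"}

# No filter at all: everything gets published.
no_review = sound | flawed

# A strict, conservative filter: less junk gets in, but some
# sound work (s4, s5, s6) is rejected as well.
strict_review = {"s1", "s2", "s3", "f1"}

print(precision_recall(no_review, sound))      # (0.6, 1.0)
print(precision_recall(strict_review, sound))  # (0.75, 0.5)
```

In this toy case, the filter raises precision from 0.6 to 0.75 while recall drops from 1.0 to 0.5: a larger share of what gets published is sound, but half of the sound work never appears.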

However, I am optimistic for the future. The rise of Open Scholarship will allow outsiders to participate in the research process and keep it more honest.

Last week, the Register announced that Google moved “away from MapReduce.” Given that several companies adopted MapReduce (hence copying Google), is Google moving a step ahead of its copycats? Moreover, Tony Bain is asking today whether Stonebraker was right in stating that MapReduce was “a giant step backward.” Is MapReduce itself any good?

As reported by the Register, one problem with MapReduce is that it is essentially batch-processing oriented. Once you start the process, you can’t easily update the input data and expect the output to be sane. Thus, MapReduce is poor at real-time processing. Yet, it will remain fine for latency-oblivious applications such as Extract-Transform-Load or number crunching.
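The batch limitation can be sketched with a toy word count in Python. This is an in-memory imitation of the map/reduce pattern, not Google's implementation nor the API of any real framework; the names `map_phase`, `reduce_phase`, `batch_word_count` and `IncrementalWordCount` are made up for this illustration.

```python
from collections import Counter, defaultdict


def map_phase(documents):
    """Emit a (word, 1) pair for every word of every document."""
    for text in documents:
        for word in text.lower().split():
            yield word, 1


def reduce_phase(pairs):
    """Group the pairs by key and sum the values of each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}


def batch_word_count(documents):
    """Batch style: the whole input is reprocessed on every run."""
    return reduce_phase(map_phase(documents))


class IncrementalWordCount:
    """Streaming style: each new document updates the running result."""

    def __init__(self):
        self.counts = Counter()

    def add(self, text):
        self.counts.update(text.lower().split())


docs = ["the web is a library", "the web is a stream"]

# Batch: one more document means running the job over everything again.
before = batch_word_count(docs)
after = batch_word_count(docs + ["a live stream"])

# Incremental: only the new document is touched.
live = IncrementalWordCount()
for d in docs:
    live.add(d)
live.add("a live stream")

print(after == dict(live.counts))  # both approaches agree on the counts
```

Both approaches reach the same counts. The difference is the cost of an update: the batch version rescans all the input, whereas the incremental version does work proportional to the new document only.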

We now expect Google to index my blog posts within minutes after I post them. Google had to update its batch-oriented architecture for a real-time indexing approach. However, it is unclear whether this puts Google technologically ahead of, say, Microsoft Bing.

The big picture is maybe more interesting. We used to view the Web as a large collection of documents—as a library. Indexes updated daily were just fine. We now view the Web as an endless stream of data—like a live meeting between billions of people.

Further reading: Julian Hyde, Data in Flight, ACM Queue, 2009.

It is not difficult to find instances of fraud in science:

  • Ranjit Chandra faked medical research results. He pocketed the money meant for running the experiments.
  • Woo-suk Hwang faked human cloning, among other terrible things.
  • Jan Hendrik Schön faked a transistor at the molecular level.

How did these people fare after being caught?

  • Ranjit Chandra still holds the Order of Canada, as far as I can tell. According to Scopus, his 272 research papers were cited over 3000 times. As for his university? Let me quote Wikipedia: University officials claimed that the university was unable to make a case for research fraud because the raw data on which a proper evaluation could be made had gone missing. Because the accusation was that the data did not exist, this was a puzzling rationale.
  • According to Scopus, Woo-suk Hwang has been cited over 2000 times. Despite having faked research results and having committed major ethics violations, he has kept his job and… he is still publishing.
  • Despite all the retracted papers, Jan Hendrik Schön still has 1,200 citations according to Scopus. He lost his research job, but found an engineering position in Germany.

Conclusion: Scientific fraud is a low-risk, high-reward activity.

What is more critical is that we still equate peer review with correctness. The argument usually goes as follows: if it is important work, work that people rely upon, and it has been peer reviewed, then it must be correct. In sum, we think that conventional peer review + citations means validation. I think we are wrong:

  • Conventional peer review is shallow. Chandra, Hwang and Schön published faked results for many years in the most prestigious venues. The truth is that reviewers do not reproduce results. They usually do not have access to the raw data and software. And even if they did, they are unlikely to be motivated to redo all of the work to verify it.
  • Citations are not validations. Chandra, Hwang and Schön were generously cited. It is hardly surprising: impressive results are more likely to be cited. And doctored results are usually more impressive. Yet, scientists do not reproduce earlier work. Even if you do try to reproduce someone’s result, and fail, you probably won’t publish it. Indeed, publishing negative results is hard: journals are not interested. Moreover, there is a risk that it may backfire: the authors could go on the offensive. They could question your own competence.
  • There are many small frauds. Even without making up data, you can cheat by misleading the reader, by omission. You can present the data in creative ways, e.g. turn meaningless averages into hard facts by omitting the variance (see the fallacy of absolute numbers). These small frauds increase the likelihood that your paper will be accepted and then generously cited.
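Here is a small Python sketch of how an average can mislead once the variance is omitted. The numbers are hypothetical measurements (say, running times in milliseconds) that I invented for the illustration.

```python
from statistics import mean, stdev

# Hypothetical measurements for two methods (invented numbers).
baseline = [100, 101, 99, 100, 100]
proposed = [60, 140, 50, 130, 95]

for name, sample in [("baseline", baseline), ("proposed", proposed)]:
    # Reporting only the mean hides how scattered the sample is.
    print(f"{name}: mean = {mean(sample):.1f}, stdev = {stdev(sample):.1f}")
```

Reporting only the averages (95 versus 100) suggests a 5% improvement. Once the standard deviation is printed next to it, the proposed method turns out to vary far more than the gap between the two averages, so the claimed improvement is not supported by this sample.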

How do we solve the problem? (1) By trusting unimpressive results more than impressive ones. (2) By being suspicious of popular trends. (3) By running our own experiments.

Further reading: Become independent of peer review, The purpose of peer review and Peer review is an honor-based system.

Source: Seth Roberts.
