- There is a pre-existing problem P.
- There are a few relatively simple but effective solutions to problem P. Among them is solution X.
- We came up with a new solution X+ which is a clever variation on X. It looks good on paper.
- We ran some experiments and tweaked our results until X+ looked good. We found a clever way to avoid comparing X+ and X directly and fairly, as it might then become obvious that the gains are small, or even negative! We would gladly report negative results, but then our paper could not be published.
It is a very convenient story for reviewers: the story is simple and easy to assess superficially. The problem is that sometimes, especially if the authors are famous and the idea is compelling, the results will spread. People will adopt X+ and cite it in their work. And the more they cite it, the more enticing it is to use X+ as every citation becomes further validation for X+. And why bother with algorithm X given that it is older and X+ is the state-of-the-art?
Occasionally, someone might try both X and X+, and they may report results showing that the gains due to X+ are small, or negative. But they have no incentive to make a big deal of it because they are trying to propose yet another better algorithm (X++).
This process is called citogenesis. It is what happens when the truth is determined solely by the literature, not by independent experiments. Everyone assumes, implicitly, that X+ is better than X. The beauty of it is that you do not even need anyone to have claimed so. You simply need to say that X+ is currently considered the best technique.
Some claim that science is self-correcting. People will stop using X+, or someone will try to make a name for himself by proving that X+ is no better, and maybe worse, than X. But in a business of science driven by publications, it is not clear why this should happen. Publishing that X+ is no better than X is an unimpressive negative result, and those are rarely presented in prestigious venues.
John Regehr made a similar point about our inability to address mistakes in the literature:
in many cases an honest retrospective would need to be a bit brutal, for example to indicate which papers really just were not good ideas (of course some of these will have won “best paper” awards). In the old days, these retrospectives would have required a venue willing to publish them, (…), but today they could be uploaded to arXiv. I would totally read and cite these papers if they existed (…)
But there is hope! If problem P is a real problem, for example, a problem that engineers are trying to solve, then you can get actual and reliable validation. Good software engineers do not trust research papers: they run experiments. Is this algorithm faster, really? They verify.
We can actually see this effect. Talk to any computer scientist and he will tell you of clever algorithms that have never been adopted by industry. Most often, there is an implication that industry is backward and that it should pay more attention to academic results. However, I suspect that in a lot of cases, the engineers have voted against X+ and in favor of X after assessing them, fairly and directly. That is what you do when you are working on real problems and really need good results.
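To make "fairly and directly" concrete, here is a minimal sketch of what such an assessment might look like in Python. The two functions `algorithm_x` and `algorithm_x_plus` are hypothetical stand-ins invented for illustration (they both count distinct values), and the `compare` helper is an assumption of mine, not a standard tool. The point is only that both candidates see the same data, their answers are checked against each other, and they are timed under identical conditions.

```python
import timeit


def algorithm_x(data):
    """Hypothetical baseline X: count distinct values with a set."""
    return len(set(data))


def algorithm_x_plus(data):
    """Hypothetical variant X+: count distinct values by sorting first."""
    ordered = sorted(data)
    return sum(
        1 for i, value in enumerate(ordered)
        if i == 0 or value != ordered[i - 1]
    )


def compare(candidates, data, repeats=5, number=20):
    """Run every candidate on the same data.

    Returns a dict mapping each name to its best time per call, in seconds.
    """
    # A speed comparison is meaningless if the candidates disagree.
    results = {name: func(data) for name, func in candidates.items()}
    baseline = next(iter(results.values()))
    for name, result in results.items():
        if result != baseline:
            raise ValueError(f"{name} does not agree with the other candidates")

    timings = {}
    for name, func in candidates.items():
        # Take the minimum over several repeats to reduce timing noise.
        runs = timeit.repeat(lambda: func(data), repeat=repeats, number=number)
        timings[name] = min(runs) / number
    return timings


if __name__ == "__main__":
    import random

    random.seed(1234)  # fixed seed so both candidates see reproducible input
    data = [random.randrange(1000) for _ in range(10_000)]
    timings = compare({"X": algorithm_x, "X+": algorithm_x_plus}, data)
    for name, seconds in timings.items():
        print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
```

A real assessment would of course use the inputs that matter to you, not random integers, and would probably look at more than running time. But even a crude harness like this one forces X and X+ to face each other on equal terms.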
Credit: This blog post was inspired by a comment made by Phil Jones on Google+.