David Donoho was among the first researchers to promote reproducible research through software publication (see Buckheit and Donoho, 1995). Fifteen years later, Donoho and his collaborators are even more insistent:
Scientific computation is emerging as absolutely central to the scientific method. Unfortunately, it’s error-prone and currently immature--traditional scientific publication is incapable of finding and rooting out errors in scientific computation--which must be recognized as a crisis. An important recent development and a necessary response to the crisis is reproducible computational research in which researchers publish the article along with the full computational environment that produces the results. (Donoho et al., 2009)
Their 2009 paper on reproducibility is insightful and well worth reading. I agree that sharing software is good for science, and for scientists.
Unfortunately, I fear we might lose sight of why we must publish our software.
- In theory, scientists should be constantly checking each other’s results. But that is not how science is done. You are rewarded for finding something new, not for checking someone’s results. So hardly anyone will ever download your code to check whether you cheated.
- Reproducibility and repeatability are not the same thing. It is great that I can rerun your code. But it does not follow that your code and results are right or useful.
Share your source code to spread your ideas:
- Keep your packages simple. People need a few key pieces of code that they can integrate into their own software.
- Use popular languages. Remember that repeatability is not enough: people are likely to tear apart your software to reconstruct their own.
- Go beyond academia. Why assume academic researchers are the people who matter? Spreading your ideas among engineers is important as well.
The reproducibility that matters is getting people to use your ideas. Merely proving you are honest falls short of your potential!