Almost all the software I write for my research is open source. A fellow researcher argued today that by sharing it, I risk reducing the gap between me and my pursuers. By the same logic, I should keep my data to myself (and avoid listing good sources of research data).
Here is my take on this issue.
- Sharing can’t hurt the small fish. Almost nobody sets out to beat Daniel Lemire at some conference next year. I have no pursuer. And guess what? You probably don’t. But if you do, you are probably doing quite well already, so stop worrying. Yes, yes, they will give you a grant even if you don’t actively sabotage your competitors. Relax already!
- Sharing your code makes you more convincing. By making your work easier to reproduce, you are instantly more credible. Trust is important in science. Why would anyone trust that I actually wrote the code and ran the experiments? Because I published my code, that’s why!
- Source code helps spread your ideas faster. In the long run, you should not care about getting papers accepted at some hot conference. What matters is the impact you have had. Make it easy for me to use your ideas! Help yourself!
- Sharing raises your profile in industry. Having open source software makes you more attractive to software engineers.
- You write better software if you share it. While not all code I publish is bug-free, documented or even usable, I care slightly more about my code because I publish it.
Finally, does sharing code work? Do people download and use my software? Here are download statistics for my latest source-code publications:
|Software|Downloads|
|---|---|
|A compressed alternative to the Java BitSet class|>280|
|Rolling Hash C++ Library|>200|
|Lemur Bitmap Index C++ Library|>2000|
|Fast Nearest-Neighbor Retrieval under the Dynamic Time Warping|>1400|
Update: Joachim Wuttke pointed out another potential benefit: your users will debug your code.
Update 2: This post appeared on Slashdot.