Bryan Caplan, an economist, raised an interesting question on Twitter: why aren’t people celebrating the fact that tools like GPT might soon allow us to produce many more research papers at the same cost, without sacrificing quality?
The American government funds research to the tune of 138 billion dollars a year. This includes 45 billion dollars for the National Institutes of Health alone. There are governmental laboratories, special projects, the National Science Foundation, and so forth. And, of course, there is military research. For comparison, a large corporation like Google, with hundreds of thousands of employees, makes about 20 billion dollars in profit each year. We pour a significant effort into scientific research.
The number of research articles published per year follows an exponential curve. We publish millions of research articles a year. We have never published so many research papers, and we will publish even more next year. Thousands of researchers publish at least one research paper a week.
At a glance, the objective of much of this research is to produce research articles. It seems certain that if we could produce more of them for the same cost, the national research productivity would go up.
The problem, as Caplan points out, is that nobody cares about these research articles. In fact, if you asked engineers to pay the authors for access to these articles, they almost certainly would decline.
To be clear, Caplan does not mean ‘all of it’. Some of the work is widely read and influential: he estimates that worthwhile publications make up about 2%. Nevertheless, most published articles contribute nothing at all, so our ability to write more of them will not help us in any way.
As people like Stonebraker have remarked, we have lost our way: people publish to get jobs and promotions, no longer to advance science. The focus on peer-reviewed publications has more to do with a status competition between researchers and academics than with a genuine desire to advance science. The metrics have been gamed. Researchers, generally, no longer have customers: nobody uses the results of their work, except maybe other researchers. Peer-reviewed papers are getting increasingly boring.
It does not mean that actual scientific progress does not happen. Nor does it mean that we should stop writing papers and books. The research done at OpenAI is changing the world, and they are writing research papers. However, if you go on OpenAI’s web site, under Research, you find a list of links to arXiv, a freely accessible repository of papers that are often not formally peer reviewed. In other words, the researchers at OpenAI seem more concerned with advancing the science than with playing a “publication game”.
Effectively, we have a severe misalignment problem: what gets you a job, a promotion, or a research grant no longer has much to do with whether you are advancing science.
Sometimes, when I sit on a recruiting committee for new professors, I ask the candidates to tell us about their most important scientific contribution. I then ask them why it is an important contribution. Almost invariably, some candidates will answer: “it was published in this important conference”. A large fraction of academia is thoroughly confused about what constitutes progress.
What will tools like GPT do to the business of science? It is hard to tell. However, the net outcome could be greatly positive. If “publishing a paper” is no longer a significant achievement thanks to artificial intelligence, it may stop being a status-seeking game. This would leave the field to people who seek to contribute real advances.
I think it is conceivable that this new breakthrough in artificial intelligence could bring us to a new golden age of scientific research. Of course, people will still seek to game prestige signals, but a fast-changing technological landscape could reset the metrics and re-align the incentives.
2 thoughts on “Will emerging artificial intelligence boost research productivity?”
I largely agree with what you said, even though it is hard to share the optimistic outlook at this point in time. We will have to struggle through a lot of watered-down articles before the research community shifts its values.
I speculate that there are more pragmatic reasons why OpenAI publishes on arXiv rather than, say, NIPS. Partly, it is a way to avoid having to adhere to the conference or journal’s timelines, which is a big deal if they want to outpace other big tech companies. Partly, as a consequence of the above, it minimizes the risk of rejection and the need to re-submit and wait again. Some of their work (e.g., Whisper) is not super rigorous, and it could harm their image to see reviewers pushing back over non-reproducibility and a lack of relevant details. Partly, it lets them keep control over the publicity that accompanies their releases.
I have no doubt the researchers at OpenAI are interested in advancing the science. I am not sure that, given the choice, they would prefer arXiv over prestigious journals in 100% of cases.
Course correction takes a long time, but the criticisms of our model started a long time ago, and they have been intensifying with every passing year.
Regarding your disagreement, I bring you back to my statement that the researchers at OpenAI seem more concerned with advancing the science than with playing a “publication game”. I think that I am likely correct.