Since the Second World War, science has relied on what I call traditional peer review. In this form of peer review, researchers send their manuscript to a journal. An editor then reviews the manuscript and, if it is judged suitable, sends it to reviewers who must make a recommendation. If the recommendations are favourable, the editor might ask the authors to revise their work. The revised manuscript can then be sent back to the reviewers. Typically, the authors do not know who the reviewers are.

In computer science and engineering, we also often rely on peer-reviewed conferences: they work the same way, except that the review process is much shorter and typically involves only one round. That is, the manuscript is either accepted as is or rejected definitively.

Governments award research grants through the same peer review process. However, in this case, a research proposal is reviewed instead of a manuscript, and there is typically only one round of review. You can, in theory, appeal the decisions, but by the time you do, the funds have been allocated and the review committees may have been disbanded.

Researchers trust peer review. There is a widely held belief that this process can select the best manuscripts reliably, and that it can detect and reject nonsense.

However, most researchers have never looked at the evidence.

So how well does peer review fare? We should first stress that a large fraction (say 94%) of all rejected manuscripts are simply taken to another journal and accepted there. The editor-in-chief of a major computer science journal once told me: “You know, Daniel, all papers are eventually accepted, don’t forget that.” That is, even if you find that some work is flawed, you can only temporarily sink it. You cannot denounce the work publicly in the current system.

Another way to assess the reliability of peer review is to look at inter-reviewer agreement. We find that as soon as one reviewer feels that the manuscript is acceptable, the inter-reviewer agreement falls to between 44% and 66% (Cicchetti, 1991). That is, consensus at the frontiers of science is as elusive as in other forms of human judgment (Wessely, 1998). Who reviews you is often the determining factor: the degree of disagreement within the population of eligible reviewers is such that whether or not a proposal is funded depends, in a large proportion of cases, upon which reviewers happen to be selected for it (Cole et al., 1981).

12 Comments

  1. I once saw the review scores of papers submitted to a major engineering conference. They looked almost random: out of three reviewers, one would give the lowest score, one the highest, and one the middle score. And those were not isolated incidents, far from it.

    As a reviewer for journal papers, I found that it can be nearly impossible to have a bad paper rejected if just one other reviewer finds it OK (with revisions). The editor goes with the average decision (accept with revisions), and I end up reviewing the next revision of a paper that should have been rejected outright. What to do then?

    Comment by Benoit — 19/9/2012 @ 13:43

  2. The 94% figure that you mention caught my attention, so I downloaded the paper you linked for it (after manually fixing the broken link).

    The paper points out that, although agreement among reviewers is somewhat low, the final decision (accept/reject) is a good predictor of future citation counts. That is, the papers that are accepted get many more citations than the 94% that get submitted elsewhere.

    The 94% that get submitted elsewhere end up in less prestigious venues, so perhaps their lower citation counts are caused by their less prestigious venues. That is, perhaps they are less cited because they were rejected by the prestigious journal; it is not necessarily true that lower quality caused both the original rejection and the eventual low citation rate.

    Nonetheless, the paper actually argues against your position.

    In my own experience with journals (but not with conferences), I believe that the revisions suggested by the reviewers are much more important than the binary accept/reject decision. Reviewers’ comments have helped me to greatly improve my papers. Also, their comments help me improve my writing and my research skills.

    I think the comments are the most important and helpful part of reviewing. The accept/reject decision is only one bit of information. The comments are kilobytes of information.

    Comment by Peter Turney — 19/9/2012 @ 15:35

  3. @Peter

    Sorry about the broken link. I fixed it.

    “Nonetheless, the paper actually argues against your position.”

    My point was that, as a reviewer, even if you sink a paper because it fabricated its data or used a flawed methodology, chances are good that it will come back elsewhere if the flaw is not immediately obvious. The fact that the system is not transparent is a problem, in my mind.

    “I think the comments are the most important and helpful part of reviewing. The accept/reject decision is only one bit of information. The comments are kilobytes of information.”

    Agreed. None of what I wrote goes against this.

    Comment by Daniel Lemire — 19/9/2012 @ 16:12

  4. I think the system works pretty well. We all know (in our own fields) which are the good journals. If I read a paper from a poor journal I realise that I shouldn’t trust it in the same way I would one from a good journal. The same goes when I am assessing someone’s publications as a grant reviewer. People who publish in the poor journals are just fooling themselves.

    Comment by David Taylor — 20/9/2012 @ 3:24

  5. “My point was that, as a reviewer, even if you sink a paper because it fabricated its data or used a flawed methodology, chances are good that it will come back elsewhere if the flaw is not immediately obvious. The fact that the system is not transparent is a problem, in my mind.”

    I didn’t realize that your point was about transparency.

    One solution would be for every journal and conference to ask the authors if their paper (or an earlier version of their paper) was submitted for review elsewhere in the past. If the answer is positive, then the journal/conference could require that the authors provide a copy of the past reviews and a list of how the paper was revised in the light of the past reviews.

    To encourage honesty, there could be a central database for each field (e.g., computer science), where authors, titles, and abstracts are recorded for all submitted papers, whether accepted or rejected. The database might be public or it might be restricted to journal and conference editors. It might be open for browsing or it might only permit limited queries. It might even contain copies of the actual reviews for each paper, which would be released to the journal or conference editor with the permission of the authors.
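
    A minimal sketch of what one record in such a database might contain (this is purely hypothetical; the field names are only illustrative):

        # Hypothetical record for a field-wide submission registry (a sketch, not a real system).
        from dataclasses import dataclass, field
        from typing import List, Optional

        @dataclass
        class SubmissionRecord:
            authors: List[str]
            title: str
            abstract: str
            venue: str                            # journal or conference the paper was submitted to
            decision: str                         # e.g., "accepted", "rejected", "revise and resubmit"
            reviews: Optional[List[str]] = None   # released to editors only with the authors' permission
            revision_notes: List[str] = field(default_factory=list)  # how the paper changed after past reviews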

    Third-rate venues might not want to do this, but first- and second-rate venues would likely be proud to say that they enforce this policy.

    It would also cut down on work for reviewers, because there would be fewer cases where the exact same paper is reviewed by six, nine, or twelve reviewers at two, three, or four different venues.

    Comment by Peter Turney — 20/9/2012 @ 9:40

  6. @Peter great point.

    I also agree that peer review is, indeed, useful, but some of its problems can be fixed.

    For instance, consider the problem with conferences. I cannot understand why poorly reviewed papers (and we all know that conference-level peer review is not especially good) should be valued more than journal papers. Note that this is not universal; it is specific to CS and engineering.

    One may argue that your publication should improve as you resubmit from conference to conference. In practice, however, people just keep resubmitting, with only minor changes in content, until the paper is accepted. Note that they don’t resubmit to lower-tier conferences! They resubmit to comparable venues. At some point, if your work is not completely flawed, the stars will align and you will get accepted.

    Comment by Itman — 20/9/2012 @ 11:21

  7. What is the alternative to peer review? At the frontiers of science, who could be better qualified to review research than peers?

    Comment by Diane Oyen — 20/9/2012 @ 12:02

  8. @Diane

    We have a peer review process. People think it is reliable and they typically do not question it. I think that the evidence shows that it has faults. There could be other, better systems.

    The healthy thing is to look at the facts and ask: how can we make things better?

    Please see Peter Turney’s comment for one proposal.

    Comment by Daniel Lemire — 20/9/2012 @ 12:31

  9. The problem as I see it is that there is little or no cost to trying over and over again. That is related to your transparency argument.

    Keep submitting the same bad paper until you get lucky with less knowledgeable, less thorough, or less picky reviewers.

    In other fields:

    Keep submitting the same structured finance security to the rating agency and tweak it (based on the very model used to assess it) until you obtain a AAA rating.

    Keep submitting the same patent, making minor changes or narrowing the claims based on the examiner’s responses, until said examiner runs out of arguments or patience and your patent is granted.

    Comment by Benoit — 20/9/2012 @ 13:00

  10. There have been many complaints lately about the state of the peer review process. One major complaint is the randomness of the acceptance or rejection of papers, which you addressed here; the other is that truly novel research that considers new problems often gets dismissed. The second problem can also be stated as follows: only papers that incrementally push the frontier on established problems get published. Many computer science conferences have been tweaking their review processes in an attempt to remedy one problem or the other, with no clear improvement in any direction.

    Perhaps the problem is not with the peer review process, but the publishing process. In particular, what is the motivation for publishing? Are we publishing because we have groundbreaking ideas that we feel the rest of the research community should know about? Or are we publishing to impress tenure committees, hiring committees, funding agencies, and so on?

    The main argument I’ve heard for the conference peer review process that currently exists is to get new ideas out fast. Of course, if the goal is speed, then quality will likely be traded off. But what is the motivation behind getting published fast? For the author, the motivation is obvious. What about for the field in general? If an author wants to publish fast for fear of getting scooped, then how novel or brilliant can the idea really be? The idea will get into the research literature one way or another. If a researcher truly has something new, exciting and groundbreaking, it is likely to be controversial. It is worth taking the time to make sure that it is evaluated thoroughly. But the current system for publishing would not reward such an approach. Instead it would reward a series of short papers presenting incremental work.

    Comment by Diane Oyen — 21/9/2012 @ 21:45

  11. @Diane

    “Perhaps the problem is not with the peer review process, but the publishing process. In particular, what is the motivation for publishing? Are we publishing because we have groundbreaking ideas that we feel the rest of the research community should know about? Or are we publishing to impress tenure committees, hiring committees, funding agencies, and so on?”

    Obviously, we are mostly publishing for the second reason: to impress. And that’s founded on the fact that publishing is made artificially hard.

    “The main argument I’ve heard for the conference peer review process that currently exists is to get new ideas out fast.”

    It is hard to beat the speed of posting the content online. Anyone can do it in minutes. For better archival, you can post it on arXiv. You can even prepare a nice YouTube video. You cannot beat the speed, cost and reach of online distribution.

    So I would say that the goal of conferences is to impress people quickly.

    “Instead it would reward a series of short papers presenting incremental work.”

    The problem in the current system is that the increments do not add up. If every paper improved some technique by 5%, then twenty years of such yearly improvements would compound to a 2.6x gain. There are many fields where people have improved the same technique for 20 or 30 years. Yet, often, you are nowhere near twice as good as the original technique. Where did the increments go?
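
    Here is the back-of-the-envelope arithmetic behind that figure, assuming one 5% improvement per year (a sketch, nothing more):

        # Compound twenty yearly 5% improvements; the claimed gains should multiply.
        gain = 1.0
        for year in range(1, 21):
            gain *= 1.05
        print(f"after 20 years: {gain:.2f}x")  # prints roughly 2.65x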

    The problem is that the increments often do not stand under scrutiny.

    Comment by Daniel Lemire — 23/9/2012 @ 16:31

  12. As the criteria posted on journals’ websites state explicitly, the decision to accept or reject a paper is normally based on both quality (rigour, completeness, lack of errors, quality of writing, etc.) and relevance (originality, perceived importance).

    I don’t know the fraction of all rejected papers for which rejection is based on the latter criterion, which is certainly more subjective and prone to disagreement between reviewers. But one should consider the middle road chosen by some journals, notably PLoS One, which drop it altogether, with the result of much higher acceptance rates (70% for PLoS One, compared to around 10% for so-called top-tier journals like Nature or even other PLoS journals).

    Comment by Marc Couture — 28/9/2012 @ 17:54
