Double-blind peer review is a bad idea

When you submit a manuscript to a journal or to a conference, you do not know who reviews your manuscript. Increasingly, due to concerns with biases and homophily, journals and conferences are moving to double-blind peer review, where you have to submit your paper without disclosing your identity. There is also a competing move toward more openness where everyone’s identity is disclosed.

The intuition behind double-blind review is that it is harder to discriminate against people if you do not know their name and affiliation. Of course, editors and chairs still get to know your identity. The intuition behind open peer review is that if your reviews are published, you will be kept in check and may get punished if you are too biased. But people are concerned about their reviews or the reviews of their papers being published.

There are many undesirable biases involved in a professional setting. Of course, there are undesirable biases against some minorities and women. There are other biases as well. There are indications that the prestige of the author can be a determining factor when judging a piece of work. People generally tend to rate work by people who are like themselves more highly. There are undesirable orthodoxy biases as well: uncommon ideas are far more difficult to defend even when the most common ideas have not been revisited lately. Conventional affiliations are more highly rated than unconventional affiliations.

Yet we should not immediately accept that hiding the identity of the author is the solution. The mere fact that we recognize a problem, and that there is some action related to the problem, does not imply that we must proceed with that action. Our tendency to do so relies on a fallacy known as the politician’s syllogism.

The Australian government, motivated by a study that claimed blind auditions helped women, conducted an extensive evaluation of blind recruitment and found the following:

This study assessed whether women and minorities are discriminated against in the early stages of the recruitment process for senior positions in the Australian Public Service (APS). It also tested the impact of implementing a ‘blind’ or de-identified approach to reviewing candidates. Over 2,100 public servants from 15 agencies participated in the trial. They completed an exercise in which they shortlisted applicants for a hypothetical senior role in their agency. Participants were randomly assigned to receive application materials for candidates in standard form or in de-identified form (with information about candidate gender, race and ethnicity removed). Overall, the results indicate the need for caution when moving towards ’blind’ recruitment processes in the APS, as de-identification may frustrate efforts aimed at promoting diversity.

To be clear, what they found was the reverse of what they were expecting: de-identifying the applications made things slightly worse for women.

Ersoy and Pate find that the current non-blind peer review process favours women:

Our results suggest that male economists at top institutions benefit the most from non-blind evaluations, followed by female economists (regardless of their institution).

They find a bias against males at non-elite institutions.

And the study that showed that blind auditions helped women get hired by orchestras? Its statistical analysis does not stand up to scrutiny. And the left-leaning New York Times recently published an essay arguing that blind auditions make orchestras less diverse.

Clearly, we believe that we can effectively combat undesirable prejudices in hiring since most employers do not hire based on a double-blind process. PhD students submit their thesis for review without hiding their name. Nobody is advocating that research papers be published anonymously as a rule. Nobody is advocating that we stop broadcasting the name of our employers, where we got our degrees and so forth. Nobody is advocating that when we report on a research result, we hide the name of the journal… Yet if we wanted to present pure research results, that is what we would do: hide affiliations, journal names, author names.

So why would we not want to hide the identity of the researchers during peer review despite the apparent advantages?

Firstly, the evidence for the benefits of double-blind peer reviews is a set of anecdotes. Double-blind experiments can bring biases to light the same way a microscope can show you a bacterium: they are great inquiry tools, but not necessarily cures. It is a scientific fact that people have biases and homophily, and that you can, up to a point, anonymize content. However, the evidence for benefits is mixed. It is not clear that it helps women, for example. Do we get more participation from people outside the major universities over time under double-blind peer review? We do not know. Major conferences that did switch to double-blind peer review, like NeurIPS, are heavily dominated by a few elite institutions with almost no outsiders.

Secondly, telling someone from a poorly known organization, from a poor or non-English country or of a non-dominant gender identity that they need to hide who they are to be treated fairly is not entirely a positive message. I certainly want to live in a world where a woman can publish her work as a woman. Stressing biases without properly addressing them can render fields unattractive to those who might suffer from these biases.

Another concern is that double-blind renders open scholarship difficult. I have been posting most of my papers online, prior to peer review, on arXiv or other servers, sometimes years before they are even submitted. I write all my software openly, engaging freely with multiple engineers and researchers. I practice what I call open scholarship. Obviously, it means I cannot reasonably take part in double-blind venues. Making open scholarship more difficult seems like a step backward. You can argue that you can still anonymize your contributions, in a bureaucratic manner, for the few days that the review lasts. But such a proposal dismisses the fact that open scholarship is primarily a cultural practice founded on the idea that the research happens in free and open networks.

And what happens after the work has been accepted? When the referees are biased, why would the readers not be biased as well? What is more important, the readers or the reviewers? Do we write papers to be published or to be read? I vote for the latter without hesitation. Yet, at best, double-blind peer review might help with getting papers accepted, but it does nothing for post-publication assessment. It is almost as if we thought that the end goal of the game was to get the research published in prestigious venues. Are we all about maximizing the impact factor or do we care to produce impactful research? If you are to be consistent with your beliefs, then if you promote double-blind peer review, you should also demand that we stop cataloguing and broadcasting affiliations. At a minimum, we should downplay the names of the authors: if we include them at all, they should be at the end of the paper, in small characters. If you are consistent with your beliefs, you should never, ever, give lists of names with affiliations. It seems logically incoherent for someone from an elite institution to be arguing for double-blind peer review while visibly broadcasting their elite institution. In part, I believe that they end up with such an illogical result because they start from a fallacy, the politician’s syllogism.

The San Francisco Declaration on Research Assessment tells us: “When involved in committees making decisions about funding, hiring, tenure, or promotion, make assessments based on scientific content rather than publication metrics.” Focusing on how papers get accepted misses the point of what we want to value. Yet a direct consequence of double-blind peer review is to make highly selective paper acceptance socially and politically more sustainable.

There is no free lunch. Double-blind peer review is not without cost.

Blank reported that authors from outside academia have a lower acceptance rate under double-blind peer review presumably because reviewers, when they can, tend to give a chance to outsiders despite the fact that outsiders do not conform to the field’s orthodoxy as well as insiders may. Moreover, Blank indicates that double-blind peer review is overall harsher.

This “harsh” nature has been replicated and quantified. Manuscripts under double-blind peer review are less likely to be successful than manuscripts under single-blind peer review.

So there are unintended consequences to double-blind peer review. Having harsher reviews and lower acceptance rates may not be a positive. A student may think: “Why continue to seek approval, when you can leave science and do something else where you’ll be appreciated?”

And is the harsh nature entirely a side-effect? The introduction of double-blind peer review is partly justified by the mission we give the reviewers: select only the very best work. Once we relax this constraint on reviewers, double-blind peer review becomes much less necessary. In some sense, double-blind peer review is a way to make socially acceptable an elitist system.

If we want, for example, to increase the representation of women, there are potentially other means that are less intrusive and more positive, such as including more women in the peer review process as reviewers, editors and so forth. The same applies to other biases. For example, we should ensure that people from small colleges, or from poorer or non-English countries, are represented. And what about including people who have less orthodox ideas? What about including more outsiders? What about what Stonebraker might call “consumers of the research”? Look at the most desirable conferences in computer science that have adopted double-blind peer review. How many are chaired by people from non-elite institutions? When they organize plenary talks, how many are from non-elite institutions?

At a minimum, if we want to get more constructive reviews, we should give serious consideration to the demand that pre-publication peer reviews be published. Transparency is a good, practical strategy to fight undesirable biases and get people to be more constructive. We should be mindful that blinding a process, everything else being equal, makes it less transparent. In an open system, if I give raving reviews to my friends, and harsh reviews to ideas that I hate, I risk being exposed. In a fully blinded process, I can always claim impartiality. But if everyone is blinded bureaucratically, people with unacceptable biases can maintain plausible deniability should they ever be caught.

And here is another idea. Do we need the crazy low acceptance rates? In computer science, it is common that fewer than 15% of all papers are accepted. Do we realize that the outcome is unavoidably a power hierarchy controlled by a select few who pick the winners? By accepting more papers, we would necessarily make biases in peer review less harmful. We would reduce the power of the select few. Open-access journals like PLOS ONE have shown that you can turn peer review away from a selection of the winners to a pruning of the bad research, with good results. The argument used to be that the conference was to be held in a hotel with only so many rooms, but Zoom and YouTube have millions of rooms. Of course, the downside then is that hiring and promotion committees cannot simply count the number of papers at prestigious venues; they must read the papers and discuss them. It is hard work. And the candidates can no longer just offer a list of papers; they have to explain why their work matters in a way that we can understand.

I do not think that the initial submission is the right time to judge the importance of a piece of work. If you look at even the best venues, most of the accepted papers are not impactful. That’s not the authors’ fault. It is just that really impactful work is rare and unpredictable. And it often takes time before we can recognize it. And different people will value different papers. By insisting that referees can reliably select the very best work, we fail to take into account the thoroughly documented limitations of pre-publication peer review. In some sense, by making it look more objective, we make things worse. We should just acknowledge that pre-publication reviews are intrinsically limited and build the system with these limitations in mind.

Though the problems that double-blind peer review seeks to address are real and significant, double-blind peer review is itself a rather crude and pessimistic solution that has several undesirable consequences. We can do better.

(Presented at the ACM Publications Board Meeting, November 19th 2020)

Further reading: Gender and peer review

Update: I love Peer Review: Implementing a “publish, then review” model of publishing

Appendix: Some selected reactions from twitter…

Published by Daniel Lemire, a computer science professor at the University of Quebec (TELUQ).

31 thoughts on “Double-blind peer review is a bad idea”

  1. I will try to summarise your argument against double-blind:

    there is no proven evidence
    Telling people they have to hide who they are is not a positive message
    double-blind renders open scholarship difficult (e.g. arXiv)
    Even if the reviewers are not biased, the readers can be
    double-blind seems to lead to harsher reviews and higher rejection rates

    I lived in and conducted research in South America. This made me keenly aware of the privilege that we enjoy in North America, Europe and East Asia in terms of publishing and attending conferences. I remember that when I was there, the condescension that I would get from reviewers would be really bad. This led me to think that as bad as double-blind is, it is certainly an improvement over single-blind.

    What I do now is kindly ask a colleague to remove the names of the authors, their affiliations, acknowledgments of funding, the code repository or any kind of information that could lead to the identity of the authors. That way I have no idea whether the paper comes from a high-income country or a well-established lab. Regarding your arguments:

    While I agree there’s no evidence, if the reviewers make a bona fide effort to not determine who the authors are, I think it will lead to a fairer reviewing process for authors from low-income countries or smaller labs.
    While I wish we lived in a world where Slovenia had the same scientific reputation as Switzerland and the same paper from either country had the same chance of being published, this is not going to happen tomorrow. We have to acknowledge that biases do exist and will keep existing for a while and double-blind is one way to go about it.
    That is true, open scholarship is difficult. But that is why I refuse to review any paper for which I know the identity of the authors because I saw the article on arXiv. That said, not everyone might do this. So yes, that is a problem.
    That is also true, readers will be biased as well, but there is absolutely no reviewing system where that will not happen. But at least double-blind lessens the biases inherent in the reviewing process.
    Double-blind has led to harsher reviews. Perhaps but the nuance is that they are harsh on everyone.

    While I will concede that double-blind is not a panacea, I think it is a step above single-blind. Another possibility is to have the reviews fully open, public and signed, and have a publication model like PLoS ONE. I would probably be more in favor of the latter; I would need to think about this. But I would still pick double-blind over single-blind.

    1. Thank you Gabriel for your excellent comment.

      While I agree there’s no evidence, if the reviewers make a bona fide effort to not determine who the authors are, I think it will lead to a fairer reviewing process for authors from low-income countries or smaller labs.

      That is a falsifiable hypothesis. Remember that Blank and others have reported that outsiders actually have a lower acceptance rate under a double-blind review process.

      But I submit to you that even if it is true that people from Slovenia do better under the new system, and we don’t know whether it is true, it does not follow that we should adopt double-blind peer review because double-blind has negative consequences of its own.

      Let me summarize some other counterpoints that you do not address.

      Why would you care about the biases at the acceptance stage, but not about the biases after the acceptance? If done right, acceptance should only have to do with whether the science is correct. If venue A makes a mistake, you have venue B and venue C, and so forth. Acceptance is a minor issue, unless you disregard the San Francisco declaration and accept that we should assess researchers by the prestige of their venues as opposed to their work itself.

      Now, if you are consistent with your belief that people from Slovenia are suffering large prejudices (and they may, to be clear), then you should be very concerned about what happens to them after the double-blind peer review has completed.

      So I submit to you that you should demand that the published work itself be either anonymized, or that, at least, we hide from view the affiliation of the authors.

      It makes no sense to me to argue that people from Slovenia are being disregarded and then to turn around and broadcast everywhere the affiliation of the authors of the accepted paper.

      That is, it makes no sense unless we consider that acceptance is the end game, the prize to win. I strongly object to this view. Regarding my own work, I do not much care about getting it accepted, I care about whether it is good. Getting my work accepted, if it is good, is not a big challenge. There are hundreds of venues. Of course, getting it accepted at a highly prestigious venue could be a problem… but then, again, I turn back to the San Francisco declaration: do we assess people based on the prestige of the venues? Many of us think we should not. It turns research into a numbers game. I reject that.

      We have to acknowledge that biases do exist and will keep existing for a while and double-blind is one way to go about it.

      Biases exist. This is a scientific fact. How strong they are and in which direction they go and what the trends are… it is a different story. I submit to you that prejudices against women, black authors, Chinese authors… are far less significant today than they were 50 years ago.

      That is also true, readers will be biased as well, but there is absolutely no reviewing system where that will not happen. But at least double-blind lessens the biases inherent in the reviewing process.

      In some sense, double-blind peer review gives even more power to the reviewers. You say that you can count on the honor of the reviewers… that they won’t hunt down the identity of the authors. This is certainly true in general, but it also gives them plausible deniability if they want to be bad actors. I spot this paper which I recognize is from my friends. I argue strongly in favor of it, something I would not be allowed to do under the previous system.

      Furthermore, it may prevent people from controlling their biases. Is this paper just poorly written by a lazy grad student, or are the authors reasonable people from a non-English country?

      Double-blind has led to harsher reviews. Perhaps but the nuance is that they are harsh on everyone.

      Please consider the long term effect. Our current system chases away some personality types. If you dislike politics and harsh reviews, you are more likely to leave the field. People like myself, who are hard to discourage, are favoured under the current system, but I am not entirely sure I like that. You end up getting a field dominated by people with very strong egos. In turn, this reinforces the field as a highly competitive one.

      There is some irony in trying to use double-blind peer review along with the words diversity and inclusion. That is, I see no evidence that double-blind peer review does anything but anchor the usual strongholds. In my experience, it is not, generally, the people from Slovenia clamoring for double-blind peer review. It is the people from the top research schools.

      Given that people are naturally self-interested, and assuming that the hypothesis holds (people from big schools are being displaced by double-blind peer review), would you not expect some resistance from them? You may argue that they are saints that somehow want to do good… but I submit to you that double-blind peer review is not, in the least, harmful to the conventional power hierarchy. It does not displace it. People from Slovenia are not moving in.

      Remember: we have reasons to believe that double-blind peer review is harmful to outsiders.

      Another possibility is to have the reviews fully open, public and signed, and have a publication model like PLoS ONE. I would probably be more in favor of the latter; I would need to think about this. But I would still pick double-blind over single-blind.

      I strongly favor a PLoS ONE model. I submit to you that it is the real threat to the established power hierarchy, not double-blind peer review.

      Let us put all good research on a level playing field.

      1. I do not think that you and I disagree by much; what I am arguing for is that double-blind is a step above the ubiquitous single-blind. Neither, however, takes care of:

        1) biases post-acceptance in citations, exposure (how many tweets, articles, blog posts about it) etc.

        2) Vitriolic reviews.

        1) is problematic under any system. There is no fixing that unless you have anonymous papers, which would make claiming authorship a quagmire.

        Whether, under double-blind, 2) would lead to a significant increase in narcissists in academia remains to be demonstrated. Double-blind is not impervious to nefarious reviewers who hunt down the identity of the authors.

        I really like the arXiv system. However, this system merely invites comments and people do not have an incentive to review. What would be interesting is to have a reward for reviewers on arXiv, which would lead to many versions of the article à la f1000. I am not arguing double-blind vs fully open; I favor a fully open model. I am arguing for double-blind against single-blind.

      2. Why would you care about the biases at the acceptance stage, but not about the biases after the acceptance?

        We should obviously care about both – but why should “A is bad” imply “don’t even try to fix B”? Also, I don’t think these two are independent of each other. A bias is often grounded in what you are used to seeing. You never see papers by women from South America? You will not approach such a paper in the same way when reviewing it, compared to a paper with five male authors from North America. So – to counter this bias (in reviewers and in readers), we should try to have equal representation publication-wise. But for that, we first need to fix the reviewing bias.

        Acceptance is a minor issue, unless you disregard the San Francisco declaration and accept that we should assess researchers by the prestige of their venues as opposed to their work itself.

        I don’t accept that we should assess researchers by the prestige of the venues, but I think it is unrealistic to assume that it doesn’t happen. I just completed my PhD and thought about staying in science (which I won’t), and the pressure to publish at highly regarded venues was definitely a factor that contributed to this decision. And I’m a white male at a European university. I’ve seen how professors are hired at my institution, and “where did that person publish” is definitely one of the major contributing factors.

        Maybe we’ll be at some point where “acceptance is a minor issue” will be true for virtually all researchers, but I don’t think we’re at that point yet.

        If done right, acceptance should only have to do with whether the science is correct.

        This is a very weird sentence in a comment arguing that there should be names attached to papers under review. 😉 I fully agree, though. This also counters your argument that outsiders have a better chance in single-blind review – whether their papers are accepted or not should be decided based on the science, not on their outsider status.

        1. Thank you for your great comment Lukas.

          We should obviously care about both – but why should “A is bad” imply “don’t even try to fix B”?

          The same people who, rightly, observe that there are flaws in the review process never talk about the other flaws, which should matter much more. That is, it is much more important how readers react than how a given publication reacts. So why are we concerned solely about the former, minor problem? Implicitly, it is because we want peer review to pick winners. That’s what I am calling out.

          I am saying that we have the wrong focus.

          Suppose you have a mole on your arm and you are going blind. You go to the doctor and he totally ignores the fact that you are going blind. Wouldn’t you be curious as to why he focuses on the secondary problem instead of the primary one?

          Evidently, we think that paper acceptance is what matters.

          This goes against my deeply held values.

          I don’t accept that we should assess researchers by the prestige of the venues, but I think it is unrealistic to assume that it doesn’t happen. I just completed my PhD and thought about staying in science (which I won’t), and the pressure to publish at highly regarded venues was definitely a factor that contributed to this decision. And I’m a white male at a European university. I’ve seen how professors are hired at my institution, and “where did that person publish” is definitely one of the major contributing factors. (…) Maybe we’ll be at some point where “acceptance is a minor issue” will be true for virtually all researchers, but I don’t think we’re at that point yet.

          You have to distinguish between what people think is true from what is actually true. To have impact as a scientist, I do not think you need to be from a prestigious institution, to publish in highly selective venues or to publish many papers.

          I understand that it is a commonly held belief, but it does not make it true.

          This is a very weird sentence in a comment arguing that there should be names attached to papers under review. 😉 I fully agree, though. This also counters your argument that outsiders have a better chance in single-blind review – whether their papers are accepted or not should be decided based on the science, not on their outsider status.

          I do not think it is weird. If you stop picking winners, then you do not have to worry so much about biases.

          If instead of tasking the reviewers with picking this year’s top 10 papers, you just ask them “is this good science”… then you do not have to worry so much about biases. I am not dismissing the existence of unwanted prejudice, but I am saying that you can lower drastically the effect.

  2. Very interesting discussion! Before reading this blog post, I strongly supported double-blind reviewing. Now I’m not so sure.

    I recently (2019) published a paper in PLOS ONE, and it was a pleasant experience. The reviewers gave helpful comments and the time from submission to publication was short.

    PLOS ONE publication policy is:

    https://journals.plos.org/plosone/static/publish

    “We evaluate research on scientific validity, strong methodology, and high ethical standards—not perceived significance.”

    This makes a lot of sense to me. “Perceived significance” is extremely subjective. I don’t think anybody can reliably predict the future impact (significance) of a paper. This is not something we should be asking reviewers to do. Reviewers tend to err on the side of conservatism, so truly novel work will tend to be rejected.

    If we drop “perceived significance” as a requirement for acceptance, suddenly many issues go away. Reviewers are forced to focus on (objective) correctness instead of (subjective) significance. There is no reason for secrecy. Why should a reviewer want to be anonymous when they are merely making a statement of fact (not a judgment of significance) about an error in a paper? Why should an author want to be anonymous when they know their paper will be accepted, after all factual errors are corrected?

    Do I, as a reader, want a reviewer to tell me whether a paper is significant, in their opinion? No. I can make that decision on my own.

    Do I, as a reader, want a reviewer to point out an error in a paper? Yes.

    1. As a devil’s advocate — let me cite our post on the topic :

      Functions of conferences.
      For the author, conferences provide:

      Knowledge dissemination: “I want people to know about the new knowledge I discovered.” The conference promises a certain minimum level of attention one’s work gets. In other words, if you don’t publish at top conferences, nobody would read your work.
      Feedback: I would like people to check my results through the review process and discussion.
      Formal goodies: checking boxes, required to defend Ph.D., for tenure package, performance review, to put into grant report, etc.
      Certification. Publishing at CVPR is hard, therefore valuable.
      Reputation-building: Listing certain conferences on C.V. as a way of building one’s name as a scientist.
      Networking: meeting with peers, potential employers, etc.
      Out of the six functions, only 2.5 (dissemination, feedback, and part of certification) are related to science as a knowledge mining process, or the first definition of science. We will get back to it shortly; now, the functions for the audience:

      For the audience:

      Prefiltering: time is limited, so we outsource the selection of what we are reading to the reviewers.
      Certification: time is limited, so we outsource the quality control and result check to the reviewers. We create a basic classifier: “If the paper is published by a top-conference, it is true.”
      Special case of certification: for people outside the field without the basic qualifications to select work that meets basic quality guarantees.
      Authors promise to answer our questions (symmetrical to “attention for the author from audience,” and audience gets the guarantee that questions about the work will be answered at the talk or poster session).

      Partly “certification” serves science as a knowledge mining process, reducing a barrier to build on top of others’ work. The rest of the functions serve the science-as-implemented current model of professional scientific work and help the community to cope with resources (time, money, attention) scarcity.

      END OF QUOTE.

      The problems with reviews arise because of those “Prefiltering” and “Formal goodies” functions of the conference. On the one hand, they are there for a reason. On the other hand — they are not related to “science as a science”, but the “business of science”.

      The current community “solution” to the problem is arXiv, as it allows the reader to do the pre-filtering or no filtering, or whatever they want.

      https://amytabb.com/ts/2020_08_21/

    1. I debated Claire so yes, I know her points. I agree with her facts, but I come to a different conclusion. Note that she conceded that the evidence regarding the benefits to women was mixed, something that her blog post may not reflect.

  3. There is also a competing move toward more openness where everyone’s identity is disclosed.

    This seems to hint at a common misconception about open peer reviewing, which is repeated in your comment: “I think you cannot have both double-blind and open.”

    Open peer reviewing in fact means that peer reviews are posted online for everyone to see, and everyone (official reviewers, authors, readers) can intervene in the discussion about a submission. But it is actually not incompatible with double-blind peer reviewing: the official reviewers can be pseudonymous, and the identity of paper authors hidden from them, even if the discussion is otherwise held in the open. For instance, the OpenReview.net platform does open peer reviewing, but it features venues that are completely public, single-blind (the identity of the official reviewers remains hidden if they wish), or even double-blind (single-blind plus the identity of paper authors is hidden from official reviewers during the reviewing process).

    Clearly, we believe that we can effectively combat undesirable prejudices in hiring since most employers do not hire based on a double-blind process.

    It does not seem clear to me at all, to put it mildly, that undesirable prejudices are effectively avoided when hiring nowadays. And searching on the web, there seems to be a trend towards blinding resumes, for instance I found https://medium.com/@sprintcv/anonymize-cv-how-to-do-it-efficiently-using-sprintcv-1cc03d94a23b or https://www.beapplied.com/post/how-to-anonymise-cvs. See also https://en.wikipedia.org/wiki/Blind_audition for an example in the musical domain. As less ambitious measures, resumes are often required not to include unnecessary info about the candidate, e.g., not have a photo, or sometimes not indicate the first name to avoid gender biases, which I hope we can all agree is a good idea (though it does not prevent introducing bias in favor of underrepresented minorities independently from resume evaluation).

    The main argument against ubiquitous resume blinding seems to be that in many areas it is challenging to do it while making it possible to evaluate the applicant. By contrast, in academia, the identity of paper authors is trivial to hide from submissions and completely irrelevant to the merits of the paper.

    Firstly, the evidence for the benefits of double-blind peer reviews is a set of anecdotes

    The biases of a single-blind conference in terms of author fame, nationalities, gender (from first names), etc., have in fact been quantified during reviewing for the WSDM’17 conference. Here is the study: https://www.pnas.org/content/early/2017/11/13/1707323114.full.

    This does not mean that double-blind reviewing eliminates these biases (it only shows biases at a single-blind conference), but clearly indicates that showing author information to reviewers is at least dangerous.

    Telling someone from a poorly known organization, from a poor or non-English country or of a non-dominant gender identity that they need to hide who they are to be treated fairly is not entirely a positive message

    The simple way to apply this is to require double-blind for all submissions at a given venue (not have optional double-blinding); and as far as I know this is in fact how many venues work.

    I certainly want to live in a world where a woman can publish her work as a woman.

    I certainly do too, but just because double-blind reviewing doesn’t solve the deeper problem doesn’t mean that it isn’t a valuable interim solution.

    I practice what I call open scholarship. Obviously, it means I cannot reasonably take part in double-blind venues.

    This may refute strong versions of double-blind reviewing where authors are required to ensure that reviewers cannot unblind them even if they try (e.g., there should be no arXiv preprint, etc.); like in Dmytro Mishkin’s comment. I agree that these implementations are impractical and an obstacle to open scholarship (and I believe that they are undesirable).

    However, open scholarship is not an obstacle at all against lighter double-blind reviewing where authors are just required to omit author information and anonymize self-citations in the submitted article. Reviewers are expected not to try to unblind them (but it’s OK if they accidentally do, e.g., if they remember the work from an earlier preprint). This does not avoid all biases, but it is already very helpful as it works most of the time. This light double-blind reviewing is in fact very common, it is what is done at STACS’21 among many other venues.

    Yet, at best, double-blind peer review might help with getting papers accepted, but it does nothing for post-publication assessment.

    Just because double-blind reviewing doesn’t solve all problems, doesn’t mean it isn’t the right thing to do to solve part of the problem, right?

    Blank reported that authors from outside academia have a lower acceptance rate under double-blind peer review presumably because reviewers, when they can, tend to give a chance to outsiders despite the fact that outsiders do not conform to the field’s orthodoxy as well as insiders may.

    I wasn’t aware of this study, and I agree it may be a valid argument. That said, relying on a biased system just to profit from some of the favorable biases doesn’t seem ideal. If there is a goal to judge outsider papers more favorably, having deliberate efforts in this direction (special tracks, quotas, or adding this information as input at a later reviewing stage) would seem like a better idea.

    Moreover, Blank indicates that double-blind peer review is overall harsher. This “harsh” nature has been replicated and quantified. Manuscripts under double-blind peer review are less likely to be successful than manuscripts under single-blind peer review.

    The solution is simple: impose double-blind reviewing to all submissions at a venue to ensure that they are treated equally.

    Having harsher reviews and lower acceptance rates may not be a positive.

    I agree that there is a huge problem of the reviewing culture being harsh and unwelcoming, especially to newcomers, and especially to people from underrepresented groups who do not feel legitimate in academia. It is urgent to fix this problem, but it has nothing to do with double-blind reviewing; even if double-blind reviewing removes the mitigating factor where reviewers will be kinder with people that they know.

    The introduction of double-blind peer review is partly justified by the mission we give the reviewers: select only the very best work.

    This is far from being the only justification. Fighting bias is a much better justification for double-blind reviewing. Even in a system which wants to publish all minimally interesting papers, biases can always mean, e.g., that famous authors will always get a free pass because reviewers trust them, or outsider authors will attract more scrutiny.

    [points about having more diverse PCs and about the stupidity of low acceptance rates]

    I completely agree. On the second point, see also: https://games-automata-play.github.io/blog/confVSjournal/.

    double-blind peer review is itself a rather crude and pessimistic solution that has several undesirable consequences. We can do better.

    The way I see it, light double-blind reviewing across an entire venue is a very simple solution, which we can expect to help avoid measurable biases, and is trivial to implement. Of course, it doesn’t solve all of academia’s numerous other problems, but that’s not a valid argument against it, I believe. Also, there are many people who mean something different and less convenient when they say “double-blind reviewing”, so indeed one has to be careful to distinguish the different possible implementations.

    For the kind of double-blind reviewing I’m defending, the only argument that I personally found convincing in your post is the one about outsiders faring less well. But let’s turn the system around: if the standard were for all venues to practice double-blind reviewing, and we wanted to bias the system towards accepting more papers by outsiders (which may indeed be a very reasonable kind of bias to introduce), would the right solution be to completely disclose the full author names and affiliations to reviewers on papers from the very beginning, and hope that they’d implicitly factor it in, precisely how we’d like? This wouldn’t be what we’d do, right?

    I understand the argument that, in fully open scholarship, double-blind peer reviewing may be impractical and no longer desirable. But even if we’re very optimistic about academic practices evolving, the process of reviewing papers for “acceptance” in some sense is still going to stay for a long time I believe: even with open reviews à la OpenReview.net, even with epijournals, even with more welcoming reviews and more reasonable acceptance rates, etc. Even if we finally move to a fully open system with open platforms having completely overthrown traditional conferences and journals, you’ll probably want to keep a system to have people vouch for the correctness and interest of papers, or to give awards to the very best papers. And for such systems, which is what reviewing does nowadays (admittedly in an imperfect, harsh, and excessively selective fashion), it probably makes sense to hide the identity of reviewers and of authors — to avoid bias and because it’s completely irrelevant information that’s really not complicated to remove.

    1. Thanks for the comment.

      In the context of my post, open is in opposition to double-blind and single-blind.

      I do not view double-blind as a partial fix. I view it as a way to justify the continued existence of an elitist system. A conference like NeurIPS has a double-blind peer review system. The bulk of the papers are from a small set of elite institutions. Even if you go toward the very end of the list, to the institutions with very few papers, you are still in elite territory. You will not find anyone from Senegal or from a small college. The board is made up almost entirely of people from elite institutions. But they are very inclusive, aren’t they, because they use double-blind peer review? No, they are not inclusive. They are elitist.

      I believe you will find that biases under a system like PLoS One are much less of a concern. Once you lower the stakes for peer review, you have fewer concerns about biases.

      The WSDM 2017 study is one data point and not representative of the literature. We still agree that homophily and prestige biases are very real: the evidence is overwhelming, but it does not follow that blinding will make things better.

      1. Let me make this comment more precise. I do not mean that the people publishing at these venues are elitist. I mean that the venue itself is elitist. That is, it is an elitist institution. The people in it may not be.

        My point is that it is very hard to break in if you are not already an insider (part of an associated elite institution).

        (I chose NeurIPS deliberately because I have no relation with it. I should disclose that I have published papers at similar venues.)

        1. Thanks for your answers. I completely agree that venues like NeurIPS and others are elitist in the way you describe. That said, I’m afraid they will remain a defining part of our work and of research for the years to come, with researchers like you and me perpetuating this system by submitting our work there, reviewing there, organizing them, etc., and being evaluated based on this.

          Knowing this, and while campaigning to fix the broader problem that this system is elitist and generally broken, I do think it makes sense to wonder what’s the best way to make the system work better (or probably better) with trivial adjustments. I doubt people seriously believe that NeurIPS and others are not elitist at all just because they are double-blind — and if some people think that, this is the problem, not double-blind reviewing itself.

          Choosing single-blind or double-blind is one such simple adjustment. And it doesn’t seem plausible to me that reverting to single-blind reviewing and adding back unfiltered author information to NeurIPS reviewing would give a better outcome, as opposed to other more deliberate solutions like quotas or using author information after scientific reviewing has been done. And of course single-blind venues comparable to NeurIPS are also elitist.

          The title of your post is “Double-blind peer review is a bad idea”. I’d agree that it’s not a perfect idea, or not going far enough, or maybe not so useful, or not addressing the right problem. But I fail to see why existing venues shouldn’t take the trivial step of switching to it right now, or why existing double-blind venues would benefit from reverting to single-blind.

          1. But I fail to see why existing venues shouldn’t take the trivial step of switching to it right now

            The mere fact that we recognize a problem, and that there is some action related to the problem, does not imply that we must proceed with that action. Our tendency to do so relies on a fallacy known as the politician’s syllogism. 

  4. I believe you will find that biases under a system like PLoS One are much less of a concern. Once you lower the stakes for peer review, you have fewer concerns about biases.

    This comment made me think about the publication fees, which can be a barrier for some authors. I looked on the PLoS One site, and they have addressed this issue very thoroughly:

    https://plos.org/publish/fees/

    Another reason to support this model of publication.

  5. To be fair, since I was comparing NeurIPS to PLoS One, it seems like PLoS One is much more expensive. NeurIPS 2020 was very cheap ($100).

    Isn’t that due to Covid, forcing conferences to go virtual?

    1. Yes. The cost of attending a conference in person is typically much higher than publishing in a journal.

      Of course, nothing beats posting the paper online but I want people paid to maintain good journals.

      1. It needn’t be the authors who pay to maintain good journals. Some open-access journals are free to both authors and readers, with the hosting costs paid by universities, public research institutes, or sponsors. LMCS in my field is one such example, and there are of course other free services for related things, like arXiv. The hosting cost of a platform running a Web app to manage reviewing plus serve some static PDFs is simply not that high.

        I don’t think the author-pays model, with APCs over $1000 like PLOS does, is the right model, even when trying to plug the gaps with exemptions for underrepresented countries. (Compare this to LIPIcs’s 60 EUR per paper fee.)

        But indeed this criticism about publication costs also applies to essentially all pre-COVID conferences.

  6. This is one of the worst-argued essays I’ve ever read.

    The bias within peer review tends to be on the basis of institutional prestige and fame – i.e. a paper from DeepMind is probably better than a paper from a school in Nova Scotia. I think that it would be very rare for a reviewer to find the name of an author, search around to figure out their race/gender identity, and then have such indiscriminate rage at a stranger that they’d try to get their paper rejected. Some fraction of the time, a person’s name tells you their ethnicity, so maybe some dumb Hong Kong supremacist protestor would try to get a Chinese person’s paper rejected. However most of these dumb Hong Kong protestors are too focused on terrorism for research.

    The institutional bias is a serious issue, because a lot of junior reviewers will be biased by seeing a famous name.

    1. The institutional bias is a serious issue, because a lot of junior reviewers will be biased by seeing a famous name.

      The prestige bias is absolutely real. It does not follow that the cure is double-blind peer review.
