Monday, September 20, 2010

Should we crowd-source peer review?


Peer review has been the gold standard for judging the quality of scientific work since World War II. However, it is a time-consuming and error-prone process. Now, both lay and academic writers are questioning whether the peer review system should be ditched in favor of a crowd-sourced model.

Currently, a typical situation from an author’s perspective is to send out a paper and receive three reviews about three months later. Typically, the reviewers will not completely agree with one another, and it is up to the editor to decide what to do with, for example, two mostly positive reviews and one scathingly negative one. How can the objective merit of a piece of work be judged accurately from such limited, noisy data? Were all of the reviewers close experts in the field? Were they rushed into doing a sloppy job? Did they feel the need for revenge against an author who unfairly judged one of their papers? Did they feel they were in competition with the authors of the paper? Did they feel irrationally positive or negative toward the author’s institution or gender?

And from the reviewer’s point of view, reviewing is a thankless and time-consuming job. It is often a full day’s work to read, think about, and write a full and fair review of a paper. It requires accurate judgment on everything from grammar and statistics to the paper’s likely future importance to the field. And the larger a paper’s problems, the more time is spent describing them and prescribing fixes. So, at the end of the day, you send your review and feel 30 seconds of gratitude that it’s over and you can move on to the rest of your to-do list. In a couple of months, you’ll be copied on the editor’s decision, but you almost never get any feedback from the editor about the quality of your review, and you receive very little professional recognition for your efforts.

The peer review process is indeed noisy. A study of reviewer agreement on conference presentations found that agreement between reviewers was no better than chance. In a study described here, women’s publications in law reviews were shown to receive more citations than men’s. A possible interpretation of this result is that women are treated more harshly in the peer review process and, as a consequence, publish (when they can publish) higher-quality articles than men, who do not face the same level of scrutiny.
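To make "no better than chance" concrete: one standard way to quantify this is a chance-corrected agreement statistic such as Cohen's kappa, which is roughly zero when two reviewers agree no more often than independent coin-flippers with the same acceptance rates would. The short Python sketch below shows the calculation for two hypothetical reviewers; the cited study may well have used a different statistic, so treat this only as an illustration of the idea.

def cohens_kappa(ratings_a, ratings_b):
    # ratings are 1 (accept) or 0 (reject) for the same set of papers
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    p_accept_a = sum(ratings_a) / n
    p_accept_b = sum(ratings_b) / n
    # agreement expected if the two reviewers decided independently
    expected = p_accept_a * p_accept_b + (1 - p_accept_a) * (1 - p_accept_b)
    return (observed - expected) / (1 - expected)

# Two hypothetical reviewers who each accept half the papers but agree only half the time:
a = [1, 0, 1, 0, 1, 0, 1, 0]
b = [1, 1, 0, 0, 1, 1, 0, 0]
print(cohens_kappa(a, b))  # 0.0 -> no agreement beyond chance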

In peer review, one must also worry about competition and jealousy. In fact, a perfectly "rational" (Machiavellian) reviewer might reject all work that is better than his own in order to advance his career. In a simple computational model of the peer review process, it was found that the fraction of either "rational" or random reviewers needed to be kept below 30% for the system to beat chance. The model also suggests that the refereeing system works best when only the best papers are published. One can easily see how the “publish or perish” system hurts science.
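I don't have the authors' code, but the flavor of such a model is easy to reproduce. The Python sketch below is my own toy version with made-up parameters, not the published model: papers have a latent quality, honest reviewers judge it with noise, "rational" reviewers reject anything better than their own work, random reviewers flip a coin, and a three-reviewer majority vote decides acceptance. Running it shows the system's accuracy drifting toward chance as the fraction of bad actors grows.

import random

def simulate(bad_fraction, bad_kind="rational", n_papers=10000, noise=0.1):
    # Toy model: the "correct" decision is to accept papers with quality > 0.5.
    random.seed(0)
    correct = 0
    for _ in range(n_papers):
        quality = random.random()
        should_accept = quality > 0.5
        votes = []
        for _ in range(3):
            if random.random() < bad_fraction:
                if bad_kind == "random":
                    votes.append(random.random() < 0.5)  # coin-flip reviewer
                else:
                    # "rational" reviewer: reject anything better than their own work
                    own_quality = random.random()
                    votes.append(quality <= own_quality)
            else:
                # honest but noisy reviewer
                votes.append(quality + random.gauss(0, noise) > 0.5)
        decision = sum(votes) >= 2  # majority of three reviewers
        correct += (decision == should_accept)
    return correct / n_papers

for frac in (0.0, 0.1, 0.3, 0.5, 0.7):
    print(frac, round(simulate(frac), 3))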

It is a statistical fact that averaging over many noisy measurements gives a more accurate answer than any single measurement. Francis Galton discovered this when he asked individuals in a crowd to estimate the weight of an ox. Pooling noisy estimates works whether you ask one measurement of many people or ask the same person to estimate multiple times. A salient modern example of the power of crowd-sourcing is, of course, Wikipedia.
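Here is a quick Python illustration of why averaging helps; the true weight and the noise level are made up for the example (only the ballpark of Galton's ox is borrowed). A single noisy guess is typically far off, while the mean of many guesses lands close to the truth, with the error shrinking roughly as one over the square root of the number of guesses.

import random

random.seed(0)
true_weight = 1200          # illustrative value, roughly ox-sized
guesses = [true_weight + random.gauss(0, 100) for _ in range(800)]

single_error = abs(guesses[0] - true_weight)
crowd_error = abs(sum(guesses) / len(guesses) - true_weight)
print(f"one guess off by  {single_error:.0f} lbs")
print(f"crowd mean off by {crowd_error:.0f} lbs")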

In a completely crowd-sourced model of publication, everything that is submitted gets published, and everyone who wants to can read and comment. Academic publishing would be quite similar to the blogosphere, in other words. The merits of a paper could then be judged by citations, trackbacks, page views, and so on.

On one hand, there are highly selective journals such as Nature, which rejects more than half of submitted papers before they even reach peer review and ultimately publishes about 7% of submissions. In this system, too many good papers are getting rejected. On the other hand, a completely crowd-sourced model means that there are too many papers for any scientist in the field to keep up with, and too many good papers won’t be read because it isn’t worth one’s time to find diamonds in the rough. Furthermore, although the academy is far from settled on how to rate professors for hiring and tenure decisions, it is even less clear what a “good” paper would be in this system, since more controversial topics would attract more attention.

The one real issue I see is that, without editors seeking out reviewers to do the job, the only people reviewing a given paper may be the friends, colleagues, and enemies of the authors, which could turn publication into a popularity contest. Some data bear out this worry. In 2006, Nature conducted an experiment in which open comments were added to the normal peer review process. Of the 71 papers that took part in the experiment, just under half received no comments at all, and half of the total comments were concentrated on only eight papers!

So, at the end of the day, I do believe that, with good editorial control over comments, a more open peer-review system would be of tremendous benefit to authors, reviewers, and science.

