
I don’t often go to empirical talks, but when I do, I fall asleep. Recently, while so engaged, I dreamt of the `replicability crisis’ in Economics (see Chang and Li (2015)). The penultimate line of their abstract is the following bleak assessment:

`Because we are able to replicate less than half of the papers in our sample even with help from the authors, we assert that economics research is usually not replicable.’

Eager to help my empirical colleagues snatch victory from the jaws of defeat, I did what all theorists do. Build a model. Here it is.

The journal editor is the principal and the agent is an author. Agent has a paper characterized by two numbers {(v, p)}. The first is the value of the findings in the paper assuming they are replicable. The second is the probability that the findings are indeed replicable. The expected benefit of the paper is {pv}. Assume that {v} is common knowledge but {p} is the private information of agent. The probability that agent is of type {(v,p)} is {\pi(v,p)}.

Given a paper, the principal can at a cost {K} inspect the paper. With probability {p} the inspection process will replicate the findings of the paper. Principal proposes an incentive compatible direct mechanism. Agent reports their type, {(v, p)}. Let {a(v, p)} denote the interim probability that agent’s paper is provisionally accepted. Let {c(v, p)} be the interim probability of agent’s paper not being inspected given it has been provisionally accepted. If a provisionally accepted paper is not inspected, it is published. If a paper subject to inspection is successfully replicated, the paper is published. Otherwise it is rejected and, per custom, the outcome is kept private. Agent cares only about the paper being accepted. Hence, agent cares only about

\displaystyle a(v, p)c(v,p) + a(v, p)(1-c(v,p))p.

The principal cares about replicability of papers and suffers a penalty of {R > K} for publishing a paper that is not replicable. Principal also cares about the cost of inspection. Therefore she maximizes

\displaystyle \sum_{v,p}\pi(v,p)[pv - (1-p)c(v,p)R]a(v,p) - K \sum_{v,p}\pi(v,p)a(v,p)(1-c(v,p))

\displaystyle = \sum_{v,p}\pi(v,p)[pv-K]a(v,p) + \sum_{v,p}\pi(v,p)a(v,p)c(v,p)[K - (1-p)R].
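For readers who like to check algebra numerically, here is a minimal Python sketch that evaluates both forms of the objective on a toy example. The type distribution, the acceptance and no-inspection probabilities, the values of R and K, and the function names are all made-up assumptions for illustration; nothing here comes from Chang and Li.

```python
# Toy check that the two expressions for the principal's objective coincide.
# All numbers below are illustrative assumptions, not taken from the post.

def objective_direct(pi, a, c, R, K):
    """Expected value net of the non-replication penalty, minus inspection cost."""
    total = 0.0
    for t, prob in pi.items():
        v, p = t
        total += prob * (p * v - (1 - p) * c[t] * R) * a[t]
        total -= K * prob * a[t] * (1 - c[t])
    return total


def objective_rearranged(pi, a, c, R, K):
    """The same objective after collecting the terms in K."""
    total = 0.0
    for t, prob in pi.items():
        v, p = t
        total += prob * (p * v - K) * a[t]
        total += prob * a[t] * c[t] * (K - (1 - p) * R)
    return total


# Hypothetical types (v, p) with probabilities pi(v, p).
pi = {(1.0, 0.3): 0.25, (1.0, 0.8): 0.25, (5.0, 0.3): 0.25, (5.0, 0.8): 0.25}
# a(v, p): provisional acceptance; c(v, p): probability of skipping inspection.
a = {(1.0, 0.3): 0.2, (1.0, 0.8): 0.2, (5.0, 0.3): 1.0, (5.0, 0.8): 1.0}
c = {(1.0, 0.3): 0.0, (1.0, 0.8): 0.0, (5.0, 0.3): 0.6, (5.0, 0.8): 0.6}
R, K = 4.0, 1.0

direct = objective_direct(pi, a, c, R, K)
rearranged = objective_rearranged(pi, a, c, R, K)
print(direct, rearranged)  # the two values should agree up to rounding
```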

The incentive compatibility constraint is
\displaystyle a(v, p)c(v,p) + a(v, p)(1-c(v,p))p \geq a(v, p')c(v,p') + a(v, p')(1-c(v,p'))p.
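To see the constraint at work, here is a small Python sketch that checks it for a fixed {v}. A mechanism is represented, by assumption, as a dictionary from the reported {p} to the pair {(a, c)}; the two example mechanisms and their numbers are hypothetical.

```python
# Check the incentive compatibility constraint for a fixed value v.
# A mechanism is a dict: reported p -> (a, c). Numbers are illustrative.

def agent_payoff(a, c, true_p):
    """Probability of publication: skip inspection, or be inspected and replicate."""
    return a * c + a * (1 - c) * true_p


def is_incentive_compatible(mechanism, tol=1e-12):
    """True if no type p gains by reporting some other p'."""
    for true_p, (a, c) in mechanism.items():
        truthful = agent_payoff(a, c, true_p)
        for a_dev, c_dev in mechanism.values():
            if agent_payoff(a_dev, c_dev, true_p) > truthful + tol:
                return False
    return True


# Same treatment for every report: trivially incentive compatible.
pooling = {0.2: (1.0, 0.5), 0.9: (1.0, 0.5)}
# Inspection waived only for a high reported p: the low type would lie.
waive_for_high = {0.2: (1.0, 0.0), 0.9: (1.0, 1.0)}

print(is_incentive_compatible(pooling))         # True
print(is_incentive_compatible(waive_for_high))  # False
```

In the second mechanism the type with {p = 0.2} is published with probability 0.2 when truthful and with probability 1 when claiming {p = 0.9}, so the constraint fails.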

Recall, an agent cannot lie about the value component of the type.
We cannot screen on {p}, so all that matters is the distribution of {p} conditional on {v}. Let {p_v = E(p|v)}. For a given {v} there are only 3 possibilities: accept always, reject always, inspect and accept. The first possibility has an expected payoff of

\displaystyle vp_v - (1-p_v) R = (v+R) p_v - R

for the principal. The second possibility has value zero. The third has value { vp_v -K }.
The principal prefers to accept immediately over inspection if

\displaystyle (v+R) p_v - R > vp_v - K \Rightarrow p_v > (R-K)/R.

The principal will prefer inspection to rejection if { vp_v \geq K}. The principal prefers to accept rather than reject if {p_v \geq R/(v+R).}
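The three-way comparison can be sketched in a few lines of Python. The function names and the numerical values of R and K are assumptions for illustration, and so is the tie-breaking order (accept, then inspect, then reject), which the comparisons above do not pin down.

```python
# Compare the principal's three options for a paper of value v.
# R, K and the example inputs are illustrative assumptions.

def option_payoffs(v, p_v, R, K):
    """Expected payoff of each option, given p_v = E(p | v)."""
    return {
        "accept": v * p_v - (1 - p_v) * R,  # publish without inspection
        "inspect": v * p_v - K,             # publish only if replicated
        "reject": 0.0,
    }


def best_option(v, p_v, R, K):
    """Option with the highest payoff.

    max() returns the first maximal key in dict order, so ties are broken
    in the order accept, inspect, reject (an assumption of this sketch).
    """
    payoffs = option_payoffs(v, p_v, R, K)
    return max(payoffs, key=payoffs.get)


R, K = 4.0, 1.0
print(best_option(10.0, 0.9, R, K))  # accept
print(best_option(3.0, 0.6, R, K))   # inspect
print(best_option(1.0, 0.1, R, K))   # reject
```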
Under a suitable condition on {p_v} as a function of {v}, the optimal mechanism can be characterized by two cutoffs {\tau_2 > \tau_1}. Choose {\tau_2} to be the smallest {v} such that

\displaystyle p_v \geq \max( R/(v+R), (R-K)/R ).

Choose {\tau_1} to be the largest {v} such that {p_v \leq \min (K/v, R/(v+R))}.
A paper with {v \geq \tau_2} will be accepted without inspection. A paper with {v \leq \tau_1} will be rejected. A paper with {v \in (\tau_1, \tau_2)} will be provisionally accepted and then inspected.
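Here is a rough Python sketch of the two cutoffs on a finite grid of values. Everything specific in it is an assumption made for illustration: the integer grid, the numbers R = 4 and K = 1, and above all the increasing function {p_v = v/(v+2)}, which stands in for the unspecified "suitable condition". Exact fractions are used so that the boundary cases are not blurred by rounding.

```python
from fractions import Fraction

# Cutoffs tau_1 and tau_2 on a grid of integer values v.
# The grid, R, K and p_of_v are illustrative assumptions.

def cutoffs(values, p_of_v, R, K):
    """Return (tau_1, tau_2) following the definitions in the text."""
    accept_ok = [v for v in values
                 if p_of_v(v) >= max(Fraction(R, v + R), Fraction(R - K, R))]
    reject_ok = [v for v in values
                 if p_of_v(v) <= min(Fraction(K, v), Fraction(R, v + R))]
    tau_2 = min(accept_ok) if accept_ok else None  # smallest v accepted outright
    tau_1 = max(reject_ok) if reject_ok else None  # largest v rejected outright
    return tau_1, tau_2


def decision(v, tau_1, tau_2):
    """Accept at or above tau_2, reject at or below tau_1, inspect in between."""
    if v >= tau_2:
        return "accept"
    if v <= tau_1:
        return "reject"
    return "inspect"


def p_of_v(v):
    """Hypothetical E(p | v), increasing in v."""
    return Fraction(v, v + 2)


values = range(1, 11)
tau_1, tau_2 = cutoffs(values, p_of_v, R=4, K=1)
print(tau_1, tau_2)  # 2 6
print([decision(v, tau_1, tau_2) for v in (1, 4, 8)])
```

With these numbers, papers with {v \leq 2} are rejected, those with {v \geq 6} are accepted without inspection, and those in between are inspected.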

For empiricists the advice would be to shoot for high {v} and damn the {p}!

More seriously, the model points out that even a journal that cares about replicability and bears the cost of verifying this will publish papers that have a low probability of being replicable. Hence, the presence of published papers that are not replicable is not, by itself, a sign of something rotten in Denmark.

One could improve outcomes by making authors bear the costs of a paper not being replicated. This points to a larger question. Replication is costly. How should the cost of replication be apportioned? In my model, the journal bore the entire cost. One could pass it on to the authors but this may have the effect of discouraging empirical research. One could rely on third parties (voluntary, like civic associations, or professionals supported by subscription). Or, one could rely on competing partisan groups pursuing their agendas to keep the claims of each side in check. The last seems at odds with the romantic ideal of disinterested scientists but could be efficient. The risk is partisan capture of journals which would shut down cross-checking.


There is not a single paper I published that I wouldn’t have changed in retrospect. Usually it’s because I regret giving in too quickly to bad “suggestions” from referees. But even when I was happy with the final version of the paper when I submitted it, I see things differently after a couple of months. Luckily, the journal system, with all its faults, at least doesn’t let us keep rewriting our papers. Otherwise, we might have got this

 

I enjoyed reading Graham Cormode’s guide for the adversarial reviewer in computer science (pdf)

h/t Haris

Peer review is a painful process both to authors and to referees, and I gather from Eilon’s post that editors are not thrilled with it either. This, I believe, is in part because we game theorists expect the referee not only to judge the paper, but also to discuss the paper more than is necessary to justify the judgment, and, even worse, to suggest ways to improve the paper.

This is why we produce long reports, sometimes of low quality because we force ourselves to write something even when we have nothing valuable to say. Sure, the referee’s feedback can be useful; as Eilon says, some authors believe that it is actually worth the delay in publication even when the paper is bound to be rejected. But even in these cases I think the effort to write the report is not worth the small audience that the report will receive. If the referee has some interesting observation or criticism about the paper, why hide it from the rest of the community?

This is why I suggest that the profession adopt the guidelines of The Annals of Statistics, which to my knowledge are common in math and physics journals:

you are not expected to rewrite the paper or to suggest major revisions or avenues for further research. Your role is simply to recommend whether or not the paper should be published.

I am getting too many requests to referee `quantum games’ papers, and while I am very enthusiastic about the interface of game theory and quantum physics, there is a certain strand of this literature which in my view misses the point of game theory. Since I find myself copy-pasting from previous reports I wrote, I thought I should make my stance public. This post is a generic referee report. If you think my criticism shows that I am too narrow-minded to understand your paper then I recommend that you ask the editor not to send it to me. If you have already read this post in a rejection letter then I hope we can still be friends. I am mostly going to rely on EWL’s paper, which is a seminal paper in this literature (350 citations in Google Scholar) and the most mathematically coherent that I know.


When I come across a theory in physics or biology or philosophy that is discredited by the leading figures in its field, I am not automatically dismissive. But, without opening Nico Benschop’s book in which he provides elementary proofs of FLT and Goldbach’s Conjecture, I am certain that the proofs are incorrect. Sorry dude, I admire your audacity and I would love to see a story about a misunderstood genius in mathematics come true, but I am a Bayesianist, and my prior of you being right in this matter is precisely zero. Double standard? You bet.
