I don’t often go to empirical talks, but when I do, I fall asleep. Recently, while so engaged, I dreamt of the ‘replicability crisis’ in Economics (see Chang and Li (2015)). The penultimate line of their abstract is the following bleak assessment:
‘Because we are able to replicate less than half of the papers in our sample even with help from the authors, we assert that economics research is usually not replicable.’
Eager to help my empirical colleagues snatch victory from the jaws of defeat, I did what all theorists do: build a model. Here it is.
The journal editor is the principal and the agent is an author. Agent has a paper characterized by two numbers $(v, p)$. The first, $v$, is the value of the findings in the paper assuming they are replicable. The second, $p$, is the probability that the findings are indeed replicable. The expected benefit of the paper is $pv$. Assume that $v$ is common knowledge but $p$ is the private information of agent. The probability that agent is of type $(v, p)$ is $\pi(v, p)$.
Given a paper, the principal can at a cost $K$ inspect the paper. With probability $p$ the inspection process will replicate the findings of the paper. Principal proposes an incentive compatible direct mechanism. Agent reports their type, $(v, p)$. Let $a(v, p)$ denote the interim probability that agent’s paper is provisionally accepted. Let $c(v, p)$ be the interim probability of agent’s paper not being inspected given it has been provisionally accepted. If a provisionally accepted paper is not inspected, it is published. If a paper subject to inspection is successfully replicated, the paper is published. Otherwise it is rejected and, per custom, the outcome is kept private. Agent cares only about the paper being accepted. Hence, agent cares only about
$$a(v, p)\,c(v, p) + a(v, p)\,\bigl(1 - c(v, p)\bigr)\,p.$$
The principal cares about replicability of papers and suffers a penalty of $R$ for publishing a paper that is not replicable. Principal also cares about the cost of inspection. Therefore she maximizes
$$\sum_{v, p} \pi(v, p)\, a(v, p) \Bigl[ pv - c(v, p)\,(1 - p)\,R - \bigl(1 - c(v, p)\bigr) K \Bigr].$$
The incentive compatibility constraint is
$$a(v, p)\,c(v, p) + a(v, p)\,\bigl(1 - c(v, p)\bigr)\,p \;\geq\; a(v, p')\,c(v, p') + a(v, p')\,\bigl(1 - c(v, p')\bigr)\,p \quad \text{for all } v, p, p'.$$
Recall, an agent cannot lie about the value component of the type.
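To see what the incentive compatibility constraint rules out, here is a minimal Python sketch for a single fixed value of the paper. It assumes a finite set of possible replication probabilities and represents a mechanism as a dictionary from each reportable probability to a pair (provisional acceptance probability, probability of skipping inspection). The function names and the dictionary layout are illustrative choices, not part of the model.

```python
def publication_prob(a, c, p):
    """Chance of publication for an agent whose true replication
    probability is p, facing acceptance prob. a and no-inspection prob. c."""
    return a * c + a * (1 - c) * p


def is_incentive_compatible(mechanism, tol=1e-12):
    """Check truth-telling for one fixed value of the paper.

    mechanism: dict mapping each reportable replication probability
    to its (a, c) pair (hypothetical representation).
    """
    for p_true, (a, c) in mechanism.items():
        truthful = publication_prob(a, c, p_true)
        for a_dev, c_dev in mechanism.values():
            # A misreport changes (a, c) but not the true probability p_true.
            if publication_prob(a_dev, c_dev, p_true) > truthful + tol:
                return False
    return True


# Treating both types alike (always inspect) is incentive compatible.
pooled = {0.2: (1.0, 0.0), 0.9: (1.0, 0.0)}
# Waiving inspection only for the high type invites the low type to lie.
separating = {0.2: (1.0, 0.0), 0.9: (1.0, 1.0)}
```

Under these assumptions `is_incentive_compatible(pooled)` holds while `is_incentive_compatible(separating)` fails: the low type would report the high probability to skip inspection.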
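We cannot screen on the replication probability $p$, so all that matters is the distribution of $p$ conditional on the value $v$. Let $p_v = E[p \mid v]$. For a given $v$ there are only 3 possibilities: accept always, reject always, inspect and accept. Writing $K$ for the inspection cost and $R$ for the penalty for publishing a paper that is not replicable, the first possibility has an expected payoff of
$$p_v v - (1 - p_v) R$$
for the principal. The second possibility has value zero. The third has value $p_v v - K$.
The principal prefers to accept immediately over inspection if
$$p_v v - (1 - p_v) R > p_v v - K \iff p_v > 1 - \frac{K}{R}.$$
The principal will prefer inspection to rejection if $p_v v \geq K$, that is, if $p_v \geq K/v$. The principal prefers to accept rather than reject if
$$p_v v - (1 - p_v) R \geq 0 \iff p_v \geq \frac{R}{v + R}.$$
Under a suitable condition on $p_v$ as a function of $v$, the optimal mechanism can be characterized by two cutoffs $\tau_1 < \tau_2$. Choose $\tau_2$ to be the smallest $v$ such that
$$p_v \geq \max\left\{ \frac{R}{v + R},\; 1 - \frac{K}{R} \right\}.$$
Choose $\tau_1$ to be the largest $v$ such that $p_v \leq \min\left\{ \frac{R}{v + R},\; \frac{K}{v} \right\}$.
A paper with $v \geq \tau_2$ will be accepted without inspection. A paper with $v \leq \tau_1$ will be rejected. A paper with $\tau_1 < v < \tau_2$ will be provisionally accepted and then inspected.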
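The three-way comparison can be sketched in a few lines of Python. This is only an illustration: it takes the value of the paper, the expected replication probability at that value, the inspection cost and the penalty for publishing a non-replicable paper as given numbers, and the function name and parameter values are made up for the example.

```python
def best_action(v, p_v, K, R):
    """Return the principal's payoff-maximizing option for a paper of value v.

    v   : value of the findings if replicable
    p_v : expected replication probability given v
    K   : cost of inspection
    R   : penalty for publishing a paper that is not replicable
    """
    payoffs = {
        "accept": p_v * v - (1 - p_v) * R,  # publish without inspection
        "inspect": p_v * v - K,             # pay K, publish only if replicated
        "reject": 0.0,                      # never publish
    }
    # Exact ties resolve to the option listed first.
    return max(payoffs, key=payoffs.get)


# Hypothetical numbers: costly inspection (K = 0.8) relative to the penalty (R = 1).
high_value_shaky = best_action(v=10, p_v=0.25, K=0.8, R=1.0)
high_value_shakier = best_action(v=10, p_v=0.10, K=0.8, R=1.0)
low_value = best_action(v=1, p_v=0.40, K=0.8, R=1.0)
```

With these made-up numbers, the first paper is accepted without inspection even though it replicates only a quarter of the time, the second is inspected, and the third is rejected.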
For empiricists the advice would be to shoot for high $v$ and damn the $p$!
More seriously, the model points out that even a journal that cares about replicability and bears the cost of verifying this will publish papers that have a low probability of being replicable. Hence, the presence of published papers that are not replicable is not, by itself, a sign of something rotten in Denmark.
One could improve outcomes by making authors bear the costs of a paper not being replicated. This points to a larger question. Replication is costly. How should the cost of replication be apportioned? In my model, the journal bore the entire cost. One could pass it on to the authors but this may have the effect of discouraging empirical research. One could rely on third parties (voluntary, like civic associations, or professionals supported by subscription). Or, one could rely on competing partisan groups pursuing their agendas to keep the claims of each side in check. The last seems at odds with the romantic ideal of disinterested scientists but could be efficient. The risk is partisan capture of journals which would shut down cross-checking.