I hear from Ehud Kalai that the NSF intends to pilot a new mechanism design approach to peer review.

It is based on a proposal by Merrifield and Saari. This is not the only proposal for revamping peer review based on mechanism design ideas. See Robinson. Solomon provides a brief summary of other proposals to upend peer review and how they fared.

Merrifield & Saari assert that an ideal peer review process must solve the following problems (quoted from the paper):

a) Some incentive should be put in place to reduce the pressure toward ever more applications.
b) The workload of assessment should be shared equitably around the community.
c) The burden on each individual should be maintained at a reasonable level so that it is physically possible to do a fair and thorough job.
d) There should be some positive incentive to do the task well.
e) The ultimate ranked list of applications should be as objective a representation of the community’s perception of their relative scientific merits as possible.

Their proposed mechanism (with extraneous but important practical details omitted) is as follows:
1) There are N agents each of whom submits a proposal.
2) Each agent receives m < N proposals to review (not their own).
3) Each agent compiles a ranked list of the m she receives, placing them in the order in which she thinks the community would rank them, not her personal preferences.
4) The N ranked lists are combined to produce an optimized global list ranking all N applications.
5) Failure to carry out this refereeing duty by a set deadline leads to automatic rejection of one's proposal.
6) Individual rankings are compared to the positions of the same m applications in the globally-optimized list. If both lists appear in approximately the same order, then the proposer is rewarded by having his proposal moved a modest number of places up the final list relative to those from proposers who have not matched the community’s view as well.
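
To make the steps concrete, here is a minimal sketch in Python. It is not the authors' implementation. The random assignment, the averaged Borda points standing in for the `optimized global list', the pairwise agreement measure, and the size of the `bonus` are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import random
from itertools import combinations

def assign_reviews(n, m, rng):
    """Step 2: give each agent m distinct proposals, never their own."""
    return {i: rng.sample([j for j in range(n) if j != i], m) for i in range(n)}

def global_scores(n, rankings):
    """Step 4 (assumed rule): average Borda points per proposal.

    rankings maps reviewer -> list of proposals, best first.
    """
    totals, counts = [0.0] * n, [0] * n
    for ranked in rankings.values():
        for pos, p in enumerate(ranked):
            totals[p] += len(ranked) - 1 - pos
            counts[p] += 1
    return [totals[p] / counts[p] if counts[p] else 0.0 for p in range(n)]

def agreement(ranked, scores):
    """Step 6 (assumed measure): share of pairs ordered as in the global scores.

    Ties in the global scores count as agreement. Requires m >= 2.
    """
    pairs = list(combinations(ranked, 2))  # each pair (a, b) has a ranked above b
    return sum(scores[a] >= scores[b] for a, b in pairs) / len(pairs)

def final_ranking(n, rankings, bonus=0.5):
    """Steps 4-6: agents missing from `rankings` missed the deadline and are dropped."""
    scores = global_scores(n, rankings)
    adjusted = {i: scores[i] + bonus * agreement(rankings[i], scores) for i in rankings}
    return sorted(adjusted, key=lambda i: -adjusted[i])

# Four agents, m = 2, every reviewer ranks the lower-numbered proposal first.
rankings = {0: [1, 2], 1: [0, 3], 2: [0, 1], 3: [1, 2]}
print(final_ranking(4, rankings))  # [0, 1, 2, 3]
```

Note that in this sketch a reviewer's own list feeds into the global scores it is later compared against, which is one of the practical details a real implementation would have to take a position on.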

What is missing from the paper or the summary on the NSF site is a clear specification of an environment in which such a mechanism is in any sense the best such mechanism satisfying (a-e). In this sense it is not a `mechanism design' approach to peer review. One could dismiss this as pettifogging, but that would be a mistake. To illustrate, let's posit an environment. Suppose N=3 and m=2, so each agent reviews the proposals of the other two. Suppose also each proposal is either good or bad, and conditional on its state anyone reviewing it receives a signal of its quality. Specifically, good proposals generate a high quality signal with probability 1 and bad proposals generate a low quality signal with probability 1. The signals are independent across reviewers. Finally, the cost of effort to review a proposal is zero.

Proposers read the proposals assigned to them, and report their signals. If two proposers disagree on the same proposal, both are shot (an extreme version of item (6) above). Thus, truthfully reporting one's signal is an equilibrium. However, it is not the only equilibrium. Randomizing one's report would also be an equilibrium (if everyone else randomizes, no report of mine changes my chance of a disagreement), which may obtain if there is a non-negligible cost of effort. Now, Merrifield & Saari might argue that the environment I've set up presumes there are objectively good proposals, which is ruled out by them. They write `…it should be borne in mind that there is no objective right answer in this kind of peer review process.' I would argue this is semantic. The `true' quality is simply the commonly known community perception of quality.
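
The multiplicity claim can be checked by brute force in the toy environment. The sketch below rests on assumptions that are mine rather than anything stated above: a prior of 1/2 that a proposal is good, a payoff of 1 for surviving and 0 for being shot, everyone else using the same reporting rule, and deviations restricted to the five rules listed. It is a sanity check, not a proof.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# A reporting rule maps the observed signal (0 = low, 1 = high) to P(report "good").
RULES = {
    "truthful": lambda s: Fraction(s),
    "inverted": lambda s: Fraction(1 - s),
    "always_good": lambda s: Fraction(1),
    "always_bad": lambda s: Fraction(0),
    "coin_flip": lambda s: HALF,
}

def survival_prob(mine, theirs, prior=HALF):
    """P(I am not shot) when I review two proposals, each co-reviewed by one other agent."""
    total = Fraction(0)
    for qualities in product((0, 1), repeat=2):
        p_state, p_agree = Fraction(1), Fraction(1)
        for q in qualities:  # the signal equals the true quality with probability 1
            p_state *= prior if q else 1 - prior
            p_agree *= mine(q) * theirs(q) + (1 - mine(q)) * (1 - theirs(q))
        total += p_state * p_agree
    return total

def is_equilibrium(name):
    """True if no rule in RULES does strictly better against `name` than `name` itself."""
    rule = RULES[name]
    best = survival_prob(rule, rule)
    return all(survival_prob(dev, rule) <= best for dev in RULES.values())

for name in RULES:
    print(name, survival_prob(RULES[name], RULES[name]), is_equilibrium(name))
```

Both `truthful` and `coin_flip` pass the check, with survival probabilities 1 and 1/4 respectively. Under these assumptions the uninformative constant rules pass as well, which only sharpens the point about multiplicity.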

There may be a way around the multiplicity of equilibria problem here using an idea explicated by David Rahman in his aptly titled paper, which I render in Latin: Quis custodiet ipsos custodes? What is it? Insert proposals with known quality into the review, i.e., test the agents.
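
A rough sketch of why planted proposals help, in the same toy environment. The modelling choices here are my own assumptions, not Rahman's construction: planted proposals are indistinguishable from real ones, a reviewer is also shot for misreporting a planted proposal, the prior on quality is 1/2, and everyone else flips coins.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def truthful(signal):
    """P(report "good") when reporting the signal as observed."""
    return Fraction(signal)

def coin_flip(signal):
    """P(report "good") when ignoring the signal."""
    return HALF

def p_match(rule, quality):
    """P(report equals the known quality), given that the signal equals the quality."""
    p_good = rule(quality)
    return p_good if quality == 1 else 1 - p_good

def survival_vs_coin_flippers(rule, n_real=2, n_planted=1, prior=HALF):
    """P(not shot) when every co-reviewer flips a coin (hypothetical helper)."""
    # Against a coin flipper, agreement on a real proposal has probability 1/2
    # whatever I report.
    p_real = HALF ** n_real
    # On each planted proposal I survive only if my report matches its known quality.
    p_one = prior * p_match(rule, 1) + (1 - prior) * p_match(rule, 0)
    return p_real * p_one ** n_planted

print(survival_vs_coin_flippers(truthful))   # 1/4
print(survival_vs_coin_flippers(coin_flip))  # 1/8
```

With even one planted proposal, truthful reporting does strictly better than coin flipping against coin-flipping colleagues, so universal randomization is no longer an equilibrium under these assumptions.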

I have also assumed the cost of effort is zero, and one goal of the Merrifield & Saari proposal is to encourage effort because it is costly. However, this suggests a trade-off between costly effort and information revealed. Might Merrifield and Saari’s proposal encourage too much effort?

By the by, why exclude agents reviewing their own proposals? Presumably item (6) will discourage grossly inflated rankings of one’s own proposals. It does bring to mind David Lodge’s parlour game `Humiliation’ in Changing Places. Players name classics of literature that they have not read, the winner being the one who confesses the most embarrassing omission. (In the book an untenured professor desperate to impress his colleagues admits to skipping Hamlet and for this lacuna is subsequently denied tenure.)

Step away, now, from the proposal and focus on the desiderata (a-e) that Merrifield and Saari identify. Some of it has to do with the moral hazard problem. Effort is costly; how can I ensure that you exerted effort? The interesting twist in the present context is that there is no obvious signal of effort that can be relied upon. Thus, any mechanism that meets criteria (a-e) must simultaneously elicit private information and induce the right level of effort without the injection of outside resources. I think this has to be impossible. So, there is an impossibility result lurking here to be formulated and proved.