You are currently browsing rvohra’s articles.
I don’t often go to empirical talks, but when I do, I fall asleep. Recently, while so engaged, I dreamt of the `replicability crisis’ in Economics (see Chang and Li (2015)). The penultimate line of their abstract is the following bleak assessment:
`Because we are able to replicate less than half of the papers in our sample even with help from the authors, we assert that economics research is usually not replicable.’
Eager to help my empirical colleagues snatch victory from the jaws of defeat, I did what all theorists do. Build a model. Here it is.
The journal editor is the principal and the agent is an author. Agent has a paper characterized by two numbers $(v, p)$. The first is the value of the findings in the paper assuming they are replicable. The second is the probability that the findings are indeed replicable. The expected benefit of the paper is $pv$. Assume that $v$ is common knowledge but $p$ is the private information of agent. The probability that agent is of type $(v, p)$ is $\pi(v, p)$.
Given a paper, the principal can at a cost $K$ inspect the paper. With probability $p$ the inspection process will replicate the findings of the paper. Principal proposes an incentive compatible direct mechanism. Agent reports their type, $(v, p)$. Let $a(v, p)$ denote the interim probability that agent’s paper is provisionally accepted. Let $c(v, p)$ be the interim probability of agent’s paper not being inspected given it has been provisionally accepted. If a provisionally accepted paper is not inspected, it is published. If a paper subject to inspection is successfully replicated, the paper is published. Otherwise it is rejected and, per custom, the outcome is kept private. Agent cares only about the paper being accepted. Hence, agent cares only about
$$a(v, p)\,c(v, p) + a(v, p)\,(1 - c(v, p))\,p.$$
The principal cares about replicability of papers and suffers a penalty of $R$ for publishing a paper that is not replicable. Principal also cares about the cost of inspection. Therefore she maximizes
$$\sum_{(v, p)} \pi(v, p)\,a(v, p)\,\bigl[\,pv - c(v, p)(1 - p)R - (1 - c(v, p))K\,\bigr].$$
The incentive compatibility constraint is
$$a(v, p)\,\bigl[\,c(v, p) + (1 - c(v, p))\,p\,\bigr] \ge a(v, p')\,\bigl[\,c(v, p') + (1 - c(v, p'))\,p\,\bigr] \quad \text{for all } v, p, p'.$$
Recall, an agent cannot lie about the value component of the type.
We cannot screen on $p$, so all that matters is the distribution of $p$ conditional on $v$. Let $p_v = E[p \mid v]$. For a given $v$ there are only 3 possibilities: accept always, reject always, inspect and accept. The first possibility has an expected payoff of
$$v p_v - (1 - p_v)R$$
for the principal. The second possibility has value zero. The third has value $v p_v - K$.
The principal prefers to accept immediately over inspection if
$$(1 - p_v)R \le K, \quad \text{i.e.} \quad p_v \ge \frac{R - K}{R}.$$
The principal will prefer inspection to rejection if $p_v \ge K/v$. The principal prefers to accept rather than reject if
$$p_v \ge \frac{R}{v + R}.$$
Under a suitable condition on $p_v$ as a function of $v$, the optimal mechanism can be characterized by two cutoffs $\tau_1 < \tau_2$. Choose $\tau_2$ to be the smallest $v$ such that
$$p_v \ge \max\left\{\frac{R}{v + R},\ \frac{R - K}{R}\right\}.$$
Choose $\tau_1$ to be the largest $v$ such that $p_v \le \min\{K/v,\ R/(v + R)\}$.
A paper with $v \ge \tau_2$ will be accepted without inspection. A paper with $v \le \tau_1$ will be rejected. A paper with $\tau_1 < v < \tau_2$ will be provisionally accepted and then inspected.
For empiricists the advice would be to shoot for high $v$ and damn the $p$!
More seriously, the model points out that even a journal that cares about replicability and bears the cost of verifying this will publish papers that have a low probability of being replicable. Hence, the presence of published papers that are not replicable is not, by itself, a sign of something rotten in Denmark.
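The editor's three-way choice for a paper of a given value can be sketched in a few lines of Python. This is only an illustration under assumed notation: `v` is the value of the paper if replicable, `p_v` the average probability of replication among papers of that value, `K` the cost of inspection and `R` the penalty for publishing something that fails to replicate. The function name and the numbers are hypothetical.

```python
def editor_decision(v, p_v, K, R):
    """Return the editor's best option and the payoff of each option.

    Assumed notation (not from any library): v is the value if replicable,
    p_v the mean replication probability given v, K the inspection cost,
    R the penalty for publishing a non-replicable paper.
    """
    payoffs = {
        "reject": 0.0,                      # nothing published, nothing spent
        "inspect": v * p_v - K,             # pay K, publish only if replicated
        "accept": v * p_v - (1 - p_v) * R,  # publish unseen, risk the penalty
    }
    # Ties are broken in favor of the option listed first.
    best = max(payoffs, key=payoffs.get)
    return best, payoffs


if __name__ == "__main__":
    # Hypothetical numbers: a valuable paper that is more likely than not
    # to fail replication, with inspection costly relative to the penalty.
    print(editor_decision(v=30, p_v=0.45, K=6, R=10))
```

With these made-up numbers the editor publishes without inspection even though the paper replicates less than half the time, which is the point of the paragraph above.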
One could improve outcomes by making authors bear the costs of a paper not being replicated. This points to a larger question. Replication is costly. How should the cost of replication be apportioned? In my model, the journal bore the entire cost. One could pass it on to the authors but this may have the effect of discouraging empirical research. One could rely on third parties (voluntary, like civic associations, or professionals supported by subscription). Or, one could rely on competing partisan groups pursuing their agendas to keep the claims of each side in check. The last seems at odds with the romantic ideal of disinterested scientists but could be efficient. The risk is partisan capture of journals which would shut down cross-checking.
When analyzing a mechanism it is convenient to assume that it is direct. The revelation principle allows one to argue that this restriction is without loss of generality. Yet, there are cases where one prefers to implement the indirect version of a mechanism rather than its direct counterpart. The clock version of the English ascending auction and the sealed bid second price auction are the most well known examples (one hopes not the only). There are few (i.e. I could not immediately recall any) theorems that uniquely characterize a particular indirect mechanism. It would be nice to have more. What might such a characterization depend upon?
1) Direct mechanisms require that agents report their types. A concern for privacy could be used to `kill’ off a direct mechanism. However, one would first have to rule out the use of trusted third parties (either human or computers implementing cryptographic protocols).
2) Indirect mechanisms can sometimes be thought of as extensive form games, and one might look for refinements of solution concepts for extensive form games that have no counterpart in the direct version of the mechanism. The notion of obviously dominant strategy-proof that appears here is an example. However, indirect mechanisms may introduce equilibria, absent in the direct counterpart, that are compelling for the agents but unattractive for the designer’s purposes.
3) One feature of observed indirect mechanisms is that they use simple message spaces, but compensate by using multiple rounds of communication. Thus a constraint on message spaces would be needed in a characterization but coupled with a constraint on the rounds of communication.
From Kris Shaw, a TA for my ECON 101 class, I learnt that the band Van Halen once required that brown M&M’s not darken their dressing room door. Why? Maybe it was a lark. Perhaps, a member of the band (or two) could not resist chuckling over the idea of a minor factotum appointed to the task of sorting the M&Ms. When minor factotum was asked what they did that day, the response was bound to elicit guffaws. However, minor factotum might have made it a point to not wash their hands before sorting the M&Ms. Then, who would be laughing harder?
A copy of the M&M rider can be found here, along with Van Halen’s explanation of why the rider was included:
……the group has said the M&M provision was included to make sure that promoters had actually read its lengthy rider. If brown M&M’s were in the backstage candy bowl, Van Halen surmised that more important aspects of a performance–lighting, staging, security, ticketing–may have been botched by an inattentive promoter.
So the rider helps screen, apparently, whether the promoter pays attention to detail. I think the explanation problematic. It suggests that it is hard to monitor effort expended by the promoter on important things like staging, for example. So, monitor something completely irrelevant. The strategic promoter should shirk on the staging and expend effort on the M&Ms.
Düppe and Weintraub date the birth of Economic Theory to June 1949. It was the year in which Koopmans organized the Cowles Commission Activity Analysis Conference. It is also counted as conference Zero of the Mathematical Programming Symposium. I mention this because the connections between Economic Theory and Mathematical Programming and Operations Research had, at one time, been very strong. The conference, for example, was conceived of by Tjalling Koopmans, Harold Kuhn, George Dantzig, Albert Tucker, Oskar Morgenstern, and Wassily Leontief with the support of the Rand Corporation.
One of the last remaining links to this period who straddled, like a Colossus, both Economic Theory and Operations Research, Herbert Eli Scarf, passed away on November 15th, 2015.
Scarf came to Economics and Operations Research by way of Princeton’s mathematics department. Among his classmates were Gomory of the cutting plane method, Milnor of topology fame, and Shapley. Subsequently, he went on to Rand (Dantzig, Bellman, Ford & Fulkerson). While there he met Samuel Karlin and Kenneth Arrow who introduced him to inventory theory. It was in this subject that Scarf made the first of many important contributions: the optimality of (S, s) policies. He would go on to establish equivalence of the core and competitive equilibrium (jointly with Debreu), identify a sufficient condition for non-emptiness of the core of a NTU game (now known as Scarf’s Lemma), anticipate the application of Groebner bases in integer programming (neighborhood systems) and of course write his magnificent `Computation of Economic Equilibria’.
Exegi monumentum aere perennius regalique situ pyramidum altius, quod non imber edax, non Aquilo impotens possit diruere aut innumerabilis annorum series et fuga temporum. Non omnis moriar…….
I have finished a monument more lasting than bronze and higher than the royal structure of the pyramids, which neither the destructive rain, nor wild North wind is able to destroy, nor the countless series of years and flight of ages. I will not wholly die………….
You shouldn’t swing a dead cat, but if you did, you’d hit an economist doing data. Wolfers wrote:
“…...modern microeconomists are more likely to spend their days knee-deep in large-scale data sets describing the real-world decisions made by millions of people, and less likely to be mired in Greek-letter abstractions.”
Knee-deep usually goes with shit, while mired with bog. I’ll pick bog over shit, but suspect that that was not Wolfers’ intent.
The recent paper by Chang and Li about the difficulty of replicating empirical papers does rather take the wind out of the empirical sails. One cannot help but wonder about the replicability of replicability studies. No doubt, a paper on the subject will be forthcoming.
Noah Smith on his blog wrote:
So the supply of both good and mediocre empirics has increased, but only the supply of mediocre theory has increased. And demand for good papers – in the form of top-journal publications – is basically constant. The natural result is that empirical papers are crowding out theory papers.
Even if one accepts the last sentence, the first can only be conjecture. One might very well think that the supply of mediocre empirical papers is caused entirely by an increase in the supply of mediocre theory papers whose deficiencies are glossed over with a patina of empirics. Interestingly, when reviewers could find nothing nice to say about Piketty’s theories they praised his data instead. It’s like praising the author of a false theorem by saying while the proof is wrong, it is long.
The whole business has the feel of tulip mania. Empirical papers as abundant as weeds. Analytics startups as plentiful as hedge funds. Analytics degree programs spreading like herpes. Positively Gradgrindian.
“THOMAS GRADGRIND, sir. A man of realities. A man of facts and calculations. A man who proceeds upon the principle that two and two are four, and nothing over, and who is not to be talked into allowing for anything over.”
In empirical econ classes around the world I imagine (because I’ve never been in one) Gradgrindian figures laying down the law:
“Facts alone are wanted in life. Plant nothing else, and root out everything else. You can only form the minds of reasoning animals upon Facts: nothing else will ever be of any service to them.”
I have nothing against facts. I am quite partial to some. But, they do not speak for themselves without an underlying theory.
Chu Kin Chan, an undergraduate student from the Chinese University of Hong Kong, has collected the placement statistics of the top 10 PhD programs in Economics from the last 4 years. You can find the report here. In it you will find the definition of top 10 as well as which placements `counted’. Given that not all PhD’s in economics who get academic positions do so in Economics departments, you can expect some judgement is required in deciding if a placement counts as a `top 10’ or `top 20’.
The results are similar to findings in other disciplines (the report refers to some of these). The top 10 departments place 5 times as many students in the top 20 departments as do those ranked 11 through 20. If you score a top 10 placement as +1, any other academic placement as a 0 and a non-academic placement as a -1, and then compute an average score per school, only one school gets a positive average score: MIT.
Chan also compares the ranking of departments by placement with a ranking based on a measure of scholarly impact proposed by Glenn Ellison. What is interesting is that departments that are very close to each other in the scholarly impact rating can differ quite a lot in terms of placement outcomes.
Trump’s rise in the Republican polls puzzles many. It shouldn’t. He is the Putin that some Republicans have longed for. Here is a sampling:
I looked the man in the eye. I found him to be very straight forward and trustworthy and we had a very good dialogue.
Mike Rogers, GOP chairman of the House Intelligence Committee:
Putin is playing chess while Obama is playing marbles.
Look it, people are looking at Putin as one who wrestles bears and drills for oil. They look at our president as one who wears mom jeans and equivocates and bloviates.
But he makes a decision and he executes it, quickly. Then everybody reacts. That’s what you call a leader.
If you think the comparison to Putin far fetched, here is Putin:
For the first time in the past 200–300 years, it (Russia) is facing the real threat of slipping down to the second, and possibly even third, rank of world states.
Now, compare with Trump’s slogan to make America great again.
Chicago’s Booth school surveys select Economics faculty (the IGM panel) on a variety of questions. Panelists are emailed a question and respond electronically, if so moved. They are asked to state whether they agree, strongly agree, disagree, are uncertain etc. as well as provide a level of confidence and, if they wish, some words of explanation. Here is one of the questions:
Using surge pricing to allocate transportation services — such as Uber does with its cars — raises consumer welfare through various potential channels, such as increasing the supply of those services, allocating them to people who desire them the most, and reducing search and queuing costs.
The correct answer to this question is: it depends. See below for the explanation. Back to the IGM panel. What is its purpose? According to the web site:
This panel explores the extent to which economists agree or disagree on major public policy issues. To assess such beliefs we assembled this panel of expert economists. Statistics teaches that a sample of (say) 40 opinions will be adequate to reflect a broader population if the sample is representative of that population.
Yes, but what is the underlying population? The IGM site does not say; instead it summarizes the CVs of the sample:
The panel members are all senior faculty at the most elite research universities in the United States. The panel includes Nobel Laureates, John Bates Clark Medalists, fellows of the Econometric society, past Presidents of both the American Economics Association and American Finance Association, past Democratic and Republican members of the President’s Council of Economics, and past and current editors of the leading journals in the profession. This selection process has the advantage of not only providing a set of panelists whose names will be familiar to other economists and the media, but also delivers a group with impeccable qualifications to speak on public policy matters.
This is the high table of Economists, a group so select that the sample probably is the population. Why bother with the remarks about sampling?
How did the panelists respond to the surge pricing question? One strongly agreed with the statement but with a level of confidence of 1 (which I think is the lowest). This panelist also provided an explanation that makes clear that the reported confidence level was incorrect. Another offers an `Agree’ with level of confidence of 3. Why not declare `uncertainty’? Or is the panelist trying to say: generally true but with some exceptions? The other responses suggest busy people trying to be helpful (recall Truman) on a task that is low priority for them.
Only one panelist provides an answer that can be interpreted as `it depends’. That panelist reports being uncertain with a level of confidence of 10. This panelist also provides an explanation:
`Consumer plus producer surplus should rise but in the absence of competition consumer surplus may not. With competition consumers will gain.’
To make things concrete, consider a monopolist who faces two states of the world characterized by two demand curves: peak and off-peak, with the off-peak state occurring most of the time. Now compare consumer surplus in two scenarios: same price in both states of the world, different price in each state. In which scenario will consumer surplus be higher? Which is a lovely intermediate micro question! In addition, if buyers are liquidity constrained, a price mechanism will not efficiently match rides to riders who value them the most.
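For the reader who would rather compute than ponder, here is a rough Python sketch of the monopolist comparison. Everything specific in it is an assumption made for illustration and not part of the question as posed: demand in each state is linear (quantity equals an intercept minus price), marginal cost is zero, there is no capacity constraint and no supply response, and the peak intercept is at least as large as the off-peak one. The function names and numbers are hypothetical.

```python
def consumer_surplus(intercept, price):
    """Consumer surplus under assumed linear demand q = intercept - price."""
    quantity = max(intercept - price, 0.0)
    return 0.5 * quantity * quantity


def best_uniform_price(a_off, a_peak, w_off, steps=10_000):
    """Grid-search the single price maximizing expected profit (zero cost)."""
    best_price, best_profit = 0.0, float("-inf")
    for i in range(steps + 1):
        price = a_peak * i / steps
        demand = (w_off * max(a_off - price, 0.0)
                  + (1 - w_off) * max(a_peak - price, 0.0))
        if price * demand > best_profit:
            best_price, best_profit = price, price * demand
    return best_price


def expected_surplus(a_off, a_peak, w_off):
    """Return expected consumer surplus as (uniform price, state prices)."""
    p_uniform = best_uniform_price(a_off, a_peak, w_off)
    uniform = (w_off * consumer_surplus(a_off, p_uniform)
               + (1 - w_off) * consumer_surplus(a_peak, p_uniform))
    # With linear demand and zero cost, the monopoly price in each state
    # is half that state's intercept.
    state = (w_off * consumer_surplus(a_off, a_off / 2)
             + (1 - w_off) * consumer_surplus(a_peak, a_peak / 2))
    return uniform, state


if __name__ == "__main__":
    print(expected_surplus(a_off=10, a_peak=20, w_off=0.8))
    print(expected_surplus(a_off=4, a_peak=20, w_off=0.7))
```

In the first hypothetical case the single price serves both states and consumers do better than under state-dependent prices. In the second, the single price is set so high that off-peak riders are shut out, and state-dependent pricing leaves consumers better off. Under these assumptions, then, the answer is again: it depends.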
I think the answer to the question posed reveals less about agreement on policy than the default assumption of the responder about the nature of the underlying market (passenger transportation).
Because I have white hair and that so sparse as to resemble the Soviet harvest of 1963, I am asked for advice. Just recently I was asked about `hot’ research topics in the sharing economy. `You mean a pure exchange economy?’, said I in reply. Because I have white hair etc, I sometimes forget to bite my tongue.
Returning to topic, the Economist piece I linked to above gets it about right. With a fall in certain transaction costs, trades that were otherwise infeasible are realized. At a high level there is nothing more to be said beyond what we know already about exchange economies.
A closer look suggests something of interest in the role of the mediator (eBay, Uber) responsible for the reduction in transaction costs. They are not indifferent Walrasian auctioneers but self interested ones. eBay and Uber provide an interesting contrast in `intrusiveness’. The first reduces the costs with respect to search, alleviates the lemons problem and moral hazard by providing information and managing payments. It does not, however, set prices. These are left to participants to decide. In sum, eBay, it appears, tries to eliminate the textbook obstacles to a perfectly competitive market. Uber also does these things, and more. It chooses prices and the supplier who will meet the reported demand. One might think eBay does not because of the multitude of products it would have to track. The same is true for Uber. A product on Uber is a triple of origin, destination and time of day. The rider and driver may not be thinking about things in this way, but Uber certainly must in deciding prices and which supplier will be chosen to meet the demand. Why doesn’t Uber allow riders to post bids and drivers to post asks?
Lamar Smith’s new bill to ensure that NSF research advances the national interest does not go far enough. Smith, who is Chairman of the House Science, Space and Technology Committee, writes:
We must set funding priorities that ensure America remains first in the global marketplace of basic research and technological innovation, while preventing misuse of Americans’ hard-earned tax dollars. Unfortunately, in the past NSF has funded too many questionable research grants – money that should have gone to projects in the national interest. For example, how does the federal government justify spending $220,000 to study animal photos in National Geographic? Or $50,000 to study lawsuits in Peru from 1600 – 1700? Federal research agencies have an obligation to explain to American taxpayers why their money is being used on such research instead of on more worthy projects.
To ensure that the NSF is not profligate, the bill requires that each grant award
“be accompanied by a non-technical explanation of the project’s scientific merits and how it serves the national interest.”
Why stop with the NSF? Public education consumes an even larger share of my tax dollars. Why must I support the good-for-nothing offspring of my neighbors who grow up to be actors, musicians and worse, number theorists? If they want their children to be artsy-fartsy pseudo intellectuals they should do it on their own dime. Would-be parents should be required to submit a grant proposal justifying their desire for children. Each successful award should be accompanied by an explanation of how their child will serve the national interest.