Nicolas Copernicus’s De Revolutionibus, in which he advocated his theory that Earth revolved around the sun, was first printed just before Copernicus’ death in 1543. It therefore fell on one Andreas Osiander to write the introduction. Here is a passage from the introduction:

[An astronomer] will adopt whatever suppositions enable [celestial] motions to be computed correctly from the principles of geometry for the future as well as for the past…. These hypotheses need not be true nor even probable. On the contrary, if they provide a calculus consistent with the observations, that alone is enough.

In other words, the purpose of the astronomer’s study is to capture the observed phenomena — to provide an analytic framework by which we can explain and predict what we see when we look at the sky. It turns out that it is more convenient to capture the phenomena by assuming that Earth revolved around the sun than by assuming, as the Greek astronomers did, a geo-centric epicyclical planet motion. Therefore let’s calculate the right time for Easter by making this assumption. As astronomers, we shouldn’t care whether this is actually true.

Whether or not Copernicus would have endorsed this approach is disputable. What is certain is that his book was at least initially accepted by the Catholic Church, whose astronomers used Copernicus’ model to develop the Gregorian Calendar. (Notice I said the word model btw, which is probably anachronistic but, I think, appropriately captures Osiander’s view). The person who caused the scandal was Galileo Galilei, who famously declared that if Earth behaves as if it moves around the sun then, well, it moves around the sun. Yet it moves. It’s not a model, it’s reality. Physicists’ subject matter is nature, not models about nature.

What about economists? Econ theorists at least don’t usually claim that the components of their modeling of economic agents (think utilities, beliefs, discount factors, ambiguity aversions) correspond to any real elements of the physical world or of the cognitive process that the agent performs. When we say that Adam’s utility from apples is log(c) we don’t mean that Adam knows anything about logs. We mean, wait for it, that he behaves as if this is his utility, or, as Osiander would have put it, this utility provides a calculus consistent with the observations, and that alone is enough.

The contrast between theoretical economists’ ‘as if’ approach and physicists’ ‘and yet it moves’ approach is not as sharp as I would like it to be. First, from the physics side, modern interpretations of quantum physics view it, and by extension the entire physics enterprise, as nothing more than a computational tool to produce predictions. On the other hand, from the economics side, while I think it is still customary to pay lip service to the ‘as if’ orthodoxy at least in decision theory classes, I don’t often hear it in seminars. And when neuro-economists claim to localize the decision-making process in the brain they seem to view the components of the model as more than just mathematical constructions.

Yep, I am advertising another paper. Stay tuned :)

Uber posts a price ${p}$ per ride and keeps a commission ${\alpha}$ on the price. Suppose Uber is the only ride matching service in town. If ${D(p)}$ is the demand function for rides at per ride price ${p}$ and ${S(w)}$ is the supply curve for drivers at wage ${w}$ per ride, Uber must choose ${\alpha}$ and ${p}$ to solve the following:

$\displaystyle \max_{\alpha, p} \alpha p D(p)$

subject to

$\displaystyle D(p) \leq S((1-\alpha)p)$

The last constraint comes from the assumption that Uber is committed to ensuring that every rider seeking a ride at the posted price gets one.

Suppose Uber did not link the payment to the driver to the price charged to the rider in this particular way. Then, Uber would solve

$\displaystyle \max_{p,w} pD(p) - wS(w)$

subject to

$\displaystyle D(p) \leq S(w)$

The first optimization problem is clearly more restrictive than the second. Hence, the claim that Uber is not profit maximizing. Which raises the obvious puzzle: why is Uber using a revenue sharing scheme?

Sydney Afriat arrived at Purdue in the late 60’s with a Bentley in tow. Mort Kamien described him as having walked out of the pages of an Ian Fleming novel. Why he brought the Bentley was a puzzle, as there were no qualified mechanics as far as the eye could see. In Indiana, that is a long way. Afriat would take his Bentley on long drives only to be interrupted by mechanical difficulties that necessitated the Bentley being towed to wait for parts or specialized help.

I came upon Afriat when I learnt about the problem of rationalizability.  One has a model of choice and a collection of observations about what an agent selected. Can one rationalize the observed choices by the given model of choice? In Afriat’s seminal paper on the subject, the observations consisted of price-quantity pairs for a vector of goods and a budget. The goal was to determine if the observed choices were consistent with an agent maximizing a concave utility function subject to the budget constraint. Afriat’s paper has prompted many other papers asking the same question for different models of choice. There is an aspect of these papers, including Afriat’s, that I find puzzling.

To illustrate, consider rationalizing expected utility (Eran Shmaya suggested that ‘expected consumption’ might be more accurate). Let ${S = \{1,2, \ldots, n\}}$ be the set of possible states. We are given a sequence of observations ${\{x^{i},p^{i}\}_{i=1}^{m}}$ and a single budget ${b}$. Here ${x^i_j}$ represents consumption in state ${j}$ and ${p^i_j}$ is the unit price of consumption in state ${j}$ in observation ${i}$. We want to know if there is a probability distribution over states, ${v=(v_{1},...,v_{n})}$, such that each ${x^i}$ maximizes expected utility at the prices ${p^i}$. In other words, ${x^i}$ solves

$\displaystyle \max_{x} \sum_{j=1}^{n}v_{j}x_{j}$

subject to

$\displaystyle \sum_{j=1}^{n}p^i_{j}x_{j}\leq b$

$\displaystyle x_{j}\geq 0\,\,\forall j \in S$

The solution to the above program is obvious. Identify the variable with the largest objective coefficient to constraint ratio and make it as large as possible. It is immediate that a collection of observations ${\{x^{i},p^{i}\}_{i=1}^{m}}$ can be rationalized by a suitable set ${\{v_{j}\} _{j=1}^{n}}$ of nonnegative ${v_{j}}$’s, not all zero, if the following system has a feasible solution:

$\displaystyle \frac{v_{r}}{p^i_r}\geq \frac{v_{j}}{p^i_{j}} \,\,\forall j, \,\, x^i_r> 0$

$\displaystyle \sum_{j \in S}v_{j}=1$

$\displaystyle v_{j}\geq 0\,\,\forall j \in S$

This completes the task as formulated by Afriat. A system of inequalities has been identified whose feasibility means the given observations can be rationalized. How hard is this to do in other cases? As long as the model of choice involves optimization and the optimization problem is well behaved in that first order conditions, say, suffice to characterize optimality, it’s a homework exercise. One can do this all day, thanks to Afriat; concave, additively separable concave, etc. etc.

Interestingly, no rationalizability paper stops at the point of identifying the inequalities. Even Afriat’s paper goes a step further and proceeds to ‘characterize’ when the observations can be rationalized. But feasibility of the inequalities themselves is just such a characterization. What more is needed?

Perhaps the characterization involving inequalities lacks ‘interpretation’. Or, if the given system for a set of observations was infeasible, we may be interested in the obstacle to feasibility. Afriat’s paper gave a characterization in terms of the strong axiom of revealed preference, i.e., an absence of cycles of certain kinds. But that is precisely the Farkas alternative to the system of inequalities identified in Afriat. The absence of cycles condition follows from the fact that the initial set of inequalities is associated with the problem of finding a shortest path (see the chapter on rationalizability in my mechanism design book). Let me illustrate with the example above. It is equivalent to finding a non-negative and nontrivial solution to

$\displaystyle \frac{v_{r}}{v_j}\geq \frac{p^i_{r}}{p^i_{j}} \,\,\forall j, \,\, x^i_r> 0$

Take logs:

$\displaystyle \ln{v_r} - \ln{v_j} \geq \ln{\frac{p^i_{r}}{p^i_{j}}} \,\,\forall j, \,\, x^i_r> 0$

This is exactly the dual to the problem of finding a shortest path in a suitable network (I believe that Afriat has a paper, that I’ve not found, which focuses on systems of the form ${b_{rs} > x_s - x_r}$). The cycle characterization would involve products of terms like ${\frac{p^i_{r}}{p^i_{j}}}$ being less than 1 (or greater than 1 depending on convention). So, what would this add?
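To make the shortest-path connection concrete, here is a minimal Python sketch. It assumes strictly positive prices, represents each observation as a pair of lists, and checks only the inequality system above (not whether each bundle exhausts the budget). The function name `rationalizing_beliefs` and the tolerance `tol` are invented for illustration. An edge from ${r}$ to ${j}$ with weight ${\ln(p^i_j/p^i_r)}$ encodes one inequality. Bellman-Ford either produces potentials, which exponentiate to a valid ${v}$, or detects a negative cycle, in which case no ${v}$ exists.

```python
import math


def rationalizing_beliefs(observations, tol=1e-12):
    """Return a probability vector v satisfying the system, or None.

    observations: list of (x, p) pairs; x and p are equal-length lists,
    with every price strictly positive (an assumption of this sketch).
    """
    n = len(observations[0][0])
    # Edge r -> j with weight ln(p_j / p_r) encodes
    #   ln v_j <= ln v_r + ln(p_j / p_r)   whenever x_r > 0.
    edges = []
    for x, p in observations:
        for r in range(n):
            if x[r] > 0:
                for j in range(n):
                    if j != r:
                        edges.append((r, j, math.log(p[j] / p[r])))

    # Bellman-Ford from a virtual source joined to every node at cost 0.
    dist = [0.0] * n
    for _ in range(n):
        changed = False
        for r, j, w in edges:
            if dist[r] + w < dist[j] - tol:
                dist[j] = dist[r] + w
                changed = True
        if not changed:
            # Potentials are ln v up to a constant; normalize to sum to 1.
            total = sum(math.exp(d) for d in dist)
            return [math.exp(d) / total for d in dist]
    return None  # still relaxing after n passes: a negative cycle exists


# Two states. In the first data set the choices contradict each other.
bad = [([1, 0], [1, 1]), ([0, 1], [1, 2])]
good = [([1, 0], [1, 1]), ([0, 1], [2, 1])]
print(rationalizing_beliefs(bad))
print(rationalizing_beliefs(good))
```

In the first data set, observation one requires ${v_1 \geq v_2}$ and observation two requires ${v_2 \geq 2v_1}$, which only ${v=0}$ satisfies. The corresponding cycle has price-ratio product ${1/2 < 1}$.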

Completed what I wanted about monopoly and launched into imperfect competition and introduced the Nash equilibrium. I follow the set up in the chapter on pricing from my pricing book with Lakshman Krishnamurthi. The novelty, if any, is to start with Bertrand competition, add capacity and then differentiation. I do this to highlight the different forces at play so that they are not obscured by the algebra of identifying reaction functions and finding where they cross. We’ll get to those later on. Midterm Day, 12. I am, as Enoch Powell once remarked in another, unattractive context,

…. filled with foreboding. Like the Roman, I seem to see “the River Tiber foaming with much blood”.

Much of my energy has been taken up with designing homework problems and a midterm exam on monopoly. Interesting questions are hard to come by. Those lying around make me want to gnaw my feet off. I started with the assumption, which I may live to regret, that my students are capable of the mechanical and have good memories. The goal, instead, is to get them to put what they have learnt in class to use. Here is an example. Two upstream suppliers, A and B, each supply an input to a Retailer. The Retailer is characterized by a production function that tells you how much output it generates from the inputs supplied by A and B as well as a demand curve for the final product. Fix the price set by B at w, say. Now compute the price that A should charge to maximize profit. It’s double marginalization with a twist. Suppose the inputs are substitutes for each other. If B raises its price above w what effect will that have on A’s profits? There are two effects. The retailer’s costs will go up of course, so reducing its output. However, A will retain a larger share of the smaller output. Which will be bigger? It’s a question that requires them to put various pieces together. I’ve had them work up to it by solving the various pieces under different guises in different problems. Am I expecting too much? I’ll find out after the midterm. Yet, I cannot see any way around having questions like this. What is the point of the mathematics we require them to use and know if we don’t ask them to apply it when blah-blah alone is insufficient?

I now have a modest store of such problems, but coming up with them has been devilish hard. Working out the solutions is painful, because it involves actual algebra and one cannot afford errors (on that account I’m behind the curve). To compound matters, one is unable to recycle the exam and homework problems given various sharing sites. I begin to regret not making them do just algebra.

Finally, if too late. It was announced October 3rd. For more on Blackwell see this post or the special GEB issue.

It’s all the rage. I’m not immune to jumping on a bandwagon, but by the time I get there the dogs have barked and the caravan has moved on. Is there anything new to be said on the subject? Perhaps we can get by with relabeling things we already know? Useful, but not exciting. To think about this I tried to come up with questions about privacy that struck me as important. Here is my list, in hopes that it will prompt others to improve upon it.

1) Is the concern for privacy intrinsic or instrumental?

The question matters because an answer would have a profound impact on how one evaluates the welfare consequences of various policies.

2) Property rights over information.

Much of the information about us that is of interest is the result of interactions with others. When I purchase a book from Amazon, who ‘owns’ the record of that transaction? It could be argued that the record of the transaction is as much Amazon’s as it is mine. The question is not new. It arises, for example, when one writes a biography.

What about when a transaction takes place via an intermediary? What rights does the intermediary have to the record of the transaction?

3) A full specification of property rights would spell out who has the right to disclose what and to whom and under what conditions.

Some of this will involve a balance between the public good and individual harm. Out of court settlements regarding commercial matters whose terms are secret prevent learning about systemic problems (Akerlof and his lemons seem like a likely candidate for relabeling).

When, and under what conditions, is mandated disclosure warranted? One can also imagine settings where one might wish to prohibit the voluntary disclosure of confidential information. Some professional schools, for example, prohibit the disclosure of grades to potential employers (Grossman and Milgrom, anyone?).

4) Compliance. How might one monitor and verify that the specification of property rights has been adhered to?

For example, a promise not to disclose to a third party. In a given setting, are some kinds of promises even feasible? One may promise not to use certain identifying characteristics in the allocation of resources but those characteristics may have good proxies in other ‘allowed’ characteristics.

Who should bear the costs of such monitoring? (Coase?) As much information of interest is collected by devices, one might think about the ‘regulation’ of devices as a part of compliance. What standards, if any, should devices that collect and transmit information adhere to? Is managing privacy best done through device standards or contracts?

I spent these two classes going over two-part tariffs. Were this just the algebra, it would be overkill. The novelty, if any, was to tie the whole business to how one should price in a razor & blade business (engines and spare parts, Kindle and ebooks etc). The basic 2-part model sets a high fixed fee (which one can associate with the durable) and sells each unit of the consumable at marginal cost. The analysis offers an opportunity to remind them of the problem of regulating the monopolist charging a uniform price.
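For readers who want to see the numbers, here is a minimal Python sketch of the basic 2-part model. It assumes a single consumer type with linear demand q = a - p and constant marginal cost c. These functional forms and all function names are my illustration, not the ones used in class.

```python
def uniform_price_outcome(a, c):
    """Monopoly price and profit with demand q = a - p and unit cost c."""
    p = (a + c) / 2
    return p, (p - c) * (a - p)


def two_part_profit(a, c, p):
    """Profit when the unit price is p and the fixed fee extracts the
    whole consumer surplus (a - p)^2 / 2 left at that price."""
    fee = (a - p) ** 2 / 2
    return fee + (p - c) * (a - p)


def two_part_outcome(a, c):
    """Unit price at marginal cost, fee equal to consumer surplus."""
    p = c
    fee = (a - p) ** 2 / 2
    return p, fee, two_part_profit(a, c, p)


a, c = 10, 2
print(uniform_price_outcome(a, c))  # price and profit under a uniform price
print(two_part_outcome(a, c))       # unit price, fixed fee, profit
```

With these forms, profit under the two-part tariff works out to ((a - c)^2 - (p - c)^2) / 2, so any unit price away from marginal cost lowers it. That is the sense in which the model says to make the money on the razor rather than the blades.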

The conclusion of the basic 2-part model suggests charging a high price for razors and a low price for blades. This seems to run counter to the prevailing wisdom. It’s an opportunity to solicit reasons for why the conclusion of the model might be wrong-headed. We ran through a litany of possibilities: heterogeneous preferences (opportunity to do a heavy vs light user calculation), hold up (one student observed that we can trust Amazon to keep the price of ebooks low otherwise we would switch to pirated versions!), liquidity constraints, competition. Tied this to Gillette’s history expounded in a paper by Randal Picker (see an earlier post) and then onto Amazon’s pricing of the Kindle and ebooks (see this post). This allowed for a discussion of the wholesale model vs agency model of pricing which the students had been asked to work out in the homework (nice application of basic monopoly pricing exercises!).

The ‘take-away’ I tried to emphasize was how models help us formulate questions (rather than simply provide prescriptions), which in turn gives us greater insight into what might be going on.

This post describes the main theorem in my new paper with Nabil. Scroll down for open questions following this theorem. The theorem asserts that a Bayesian agent in a stationary environment will learn to make predictions as if he knew the data generating process, so that as time goes by structural uncertainty dissipates. The standard example is when the sequence of outcomes is i.i.d. with an unknown parameter. As time goes by the agent learns the parameter.
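A minimal Python sketch of the standard example, under assumptions of my choosing: two outcomes, a uniform prior over the unknown probability of the first outcome (so the agent's prediction is Laplace's rule of succession), and a fixed random seed. The function name `predictive_gap` is hypothetical. It records, day by day, the distance between the agent's prediction and the true parameter.

```python
import random


def predictive_gap(theta, n_days, seed=0):
    """Daily |prediction - theta| for a Bayesian with a uniform prior
    observing i.i.d. draws that equal 1 with probability theta."""
    rng = random.Random(seed)
    ones = 0
    gaps = []
    for n in range(n_days):
        # Posterior predictive after n draws containing `ones` successes.
        prediction = (ones + 1) / (n + 2)
        gaps.append(abs(prediction - theta))
        if rng.random() < theta:
            ones += 1
    return gaps


gaps = predictive_gap(theta=0.3, n_days=20000)
print(gaps[0], gaps[-1])  # gap before any data, and after 19999 draws
```

Before any data the prediction is 1/2. As observations accumulate the gap shrinks toward zero, which is the sense in which structural uncertainty dissipates in this example.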

The formulation of ‘learning to make predictions’ goes through merging, which traces back to Blackwell and Dubins. I will not give Blackwell and Dubins’ definition in this post but a weaker definition, suggested by Kalai and Lehrer.

A Bayesian agent observes an infinite sequence of outcomes from a finite set ${A}$. Let ${\mu\in\Delta(A^\mathbb{N})}$ represent the agent’s belief about the future outcomes. Suppose that before observing every day’s outcome the agent makes a probabilistic prediction about it. I denote by ${\mu(\cdot|a_0,\dots,a_{n-1})}$ the element in ${\Delta(A)}$ which represents the agent’s prediction about the outcome of day ${n}$ just after he observed the outcomes ${a_0,\dots,a_{n-1}}$ of previous days. In the following definition it is instructive to think about ${\tilde\mu}$ as the true data generating process, i.e., the process that generates the sequence of outcomes, which may be different from the agent’s belief.

Definition 1 (Kalai and Lehrer) Let ${\mu,\tilde\mu\in\Delta(A^\mathbb{N})}$. Then ${\mu}$ merges with ${\tilde\mu}$ if for ${\tilde\mu}$-almost every realization ${(a_0,\dots,a_{n-1},\dots)}$ it holds that

$\displaystyle \lim_{n\rightarrow\infty}\|\mu(\cdot|a_0,\dots,a_{n-1})-\tilde\mu(\cdot|a_0,\dots,a_{n-1})\|=0.$

Assume now that the agent’s belief ${\mu}$ is stationary, and let ${\mu=\int \theta~\lambda(\mathrm{d}\theta)}$ be its ergodic decomposition. Recall that in this decomposition ${\theta}$ ranges over ergodic beliefs and ${\lambda}$ represents structural uncertainty. Does the agent learn to make predictions? Using the definition of merging we can ask, does ${\mu}$ merge with ${\theta}$? The answer, perhaps surprisingly, is no. I gave an example in my previous post.

Let me now move to a weaker definition of merging, that was first suggested by Lehrer and Smorodinsky. This definition requires the agent to make correct predictions in almost every period.

Definition 2 Let ${\mu,\tilde\mu\in\Delta(A^\mathbb{N})}$. Then ${\mu}$ weakly merges with ${\tilde\mu}$ if for ${\tilde\mu}$-almost every realization ${(a_0,\dots,a_{n-1},\dots)}$ it holds that

$\displaystyle \lim_{n\rightarrow\infty,n\in T}\|\mu(\cdot|a_0,\dots,a_{n-1})-\tilde\mu(\cdot|a_0,\dots,a_{n-1})\|=0$

for a set ${T\subseteq \mathbb{N}}$ of periods of density ${1}$.

The definition of weak merging is natural: patient agents whose belief weakly merges with the true data generating process will make almost optimal decisions. Kalai, Lehrer and Smorodinsky discuss these notions of merging and also their relationship with Dawid’s idea of calibration.

I am now in a position to state the theorem I have been talking about for two months:

Theorem 3 Let ${\mu\in\Delta(A^\mathbb{N})}$ be stationary, and let ${\mu=\int \theta~\lambda(\mathrm{d}\theta)}$ be its ergodic decomposition. Then ${\mu}$ weakly merges with ${\theta}$ for ${\lambda}$-almost every ${\theta}$.

In words: An agent who has some structural uncertainty about the data generating process will learn to make predictions in most periods as if he knew the data generating process.

Finally, here are the promised open questions. They deal with the two qualifications in the theorem. The first question is about the “${\lambda}$-almost every ${\theta}$” in the theorem. As Larry Wasserman mentioned this is unsatisfactory in some senses. So,

Question 1 Does there exist a stationary ${\mu}$ (equivalently a belief ${\lambda}$ over ergodic beliefs) such that ${\mu}$ weakly merges with ${\theta}$ for every ergodic distribution ${\theta}$?

The second question is about strengthening weak merging to merging. We already know that this cannot be done for an arbitrary belief ${\lambda}$ over ergodic processes, but what if ${\lambda}$ is concentrated on some natural family of processes, for example hidden Markov processes with a bounded number of hidden states? Here is the simplest setup for which I don’t know the answer.

Question 2 The outcome of the stock market on every day is either U or D (up or down). An agent believes that this outcome is a stochastic function of an unobserved (hidden) state of the economy which can be either G or B (good or bad): When the hidden state is B the outcome is U with probability ${q_B}$ (and D with probability ${1-q_B}$), and when the state is G the outcome is U with probability ${q_G}$. The hidden state changes according to a Markov process with transition probability ${\rho(B|B)=1-\rho(G|B)=p_B}$, ${\rho(B|G)=1-\rho(G|G)=p_G}$. The parameter is ${(p_B,p_G,q_B,q_G)}$ and the agent has some prior ${\lambda}$ over the parameter. Does the agent’s belief about outcomes merge with the truth for ${\lambda}$-almost every ${(p_B,p_G,q_B,q_G)}$?
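For a fixed parameter, the benchmark predictions in Question 2 (the ‘truth’ the agent’s belief is compared with) can be computed with the standard forward filter. Here is a minimal Python sketch with three assumptions that the question does not state: ${\rho(s'|s)}$ is read as the probability of moving to ${s'}$ from ${s}$, all four parameters lie strictly between 0 and 1, and the hidden state starts from its stationary distribution. The function name `hmm_predictions` is hypothetical.

```python
def hmm_predictions(outcomes, p_B, p_G, q_B, q_G):
    """Probability of U on each day given the outcomes of earlier days.

    outcomes: sequence of "U"/"D" strings.
    p_B = rho(B|B), p_G = rho(B|G); q_B, q_G = P(U | state).
    """
    # Stationary probability of state B (assumed initial condition).
    b = p_G / (1.0 - p_B + p_G)
    predictions = []
    for a in outcomes:
        predictions.append(b * q_B + (1.0 - b) * q_G)
        # Bayes update of P(state = B) on today's outcome.
        like_B = q_B if a == "U" else 1.0 - q_B
        like_G = q_G if a == "U" else 1.0 - q_G
        posterior = b * like_B / (b * like_B + (1.0 - b) * like_G)
        # Push the posterior one step through the transition kernel.
        b = posterior * p_B + (1.0 - posterior) * p_G
    return predictions


print(hmm_predictions("UUD", p_B=0.9, p_G=0.1, q_B=0.2, q_G=0.8))
```

The agent in the question does not know the parameter, so his prediction averages such filters over his posterior on ${(p_B,p_G,q_B,q_G)}$. The sketch covers only the known-parameter side of the comparison.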

On day 6, went through the standard 2 period durable goods problem, carefully working out the demand curve in each period. Did this to emphasize later how this problem is just like the problem of a multi-product monopolist with substitutes. Then, onto a discussion of JC Penney. In retrospect, not the best of examples. Doubt they shop at JC Penney, or follow the business section of the paper. One student gave a good summary of events as background to rest of class. Textbooks would have been better.

Subsequently, multi-product monopolist; substitutes and complements. Emphasized this meant each product could not be priced in isolation of the other. Now the puzzle. Why would a seller introduce a substitute to itself? Recalling discussion of durable goods monopolist, this seems like lunacy. A bright spark suggested that the substitute product might appeal to a segment that one is not currently selling to. Yes, but wouldn’t that cannibalize sales from existing product? Time for a model! Before getting to model, formally introduced price discrimination.

Day 7, talked briefly about homework and role of mathematics in economic analysis. Recalled the question of regulating the monopolist. Lowering price benefits consumers but harms seller. Do the benefits of customers exceed harm done to seller? Blah, blah cannot settle the issue. Need a model and have to analyze it to come to a conclusion. While we represent the world (or at least a part of it) mathematically, it does not follow that every mathematical object corresponds to something in reality. Made this point by pointing them to the homework question with demand curve having a constant elasticity of 1. Profit maximizing price is infinity, which is clearly silly. Differentiating and setting to zero is not a substitute for thinking.
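A quick sketch of why the price runs off to infinity, assuming demand ${D(p)=k/p}$ with ${k>0}$ and a constant marginal cost ${c>0}$ (the notation is mine; the exact form in the homework is not reproduced here):

$\displaystyle \pi(p) = (p-c)\frac{k}{p} = k - \frac{ck}{p}, \qquad \pi'(p) = \frac{ck}{p^{2}} > 0$

Revenue is the constant ${k}$ at every price, while cost falls as the price rises. Profit therefore increases with every price increase and approaches ${k}$ without ever attaining it, so the first order condition has no solution.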

Went on to focus on versioning and bundling. Versioning provides natural setting to talk about cannibalization and catering to new segment. Went through a model to show how the competing forces play out. Then to bundling.

Discussion of reasons to bundle that do not involve price discrimination. Then a model and its analysis. Motivated it by asking whether they would prefer to have à la carte programming from cable providers. In the model, unbundling results in higher prices which surprises them and was a good note to end on.

On day 5, unhappy with the way I covered regulation of monopolist earlier, went over it again. To put some flesh on the bone, I asked at conclusion of the analysis if they would favor regulating the price of a drug on which the seller had a patent. Some discomfort with the idea. A number suggested the need to provide incentives to invest in R&D. In response I asked why not compensate them for their R&D? Ask for the R&D costs and pay them that plus something extra if we want to cover opportunity cost. Some discussion of how one would monitor and verify these costs. At which point someone piped up that if R&D costs were difficult to monitor, why not have the Government just do the R&D? Now we really are on the road to socialized medicine. Some appeals to the efficiency of competitive markets which I put on hold with the promise that we would return to this issue later on in the semester.

Thus far class had been limited to a uniform price monopolist. Pivoted to discussing a multi-product monopolist by way of a small example of a durable goods monopolist selling over two periods. Had the class act out the role of buyers and me the seller cutting price over time. It provided an opportunity to discuss the role of commitment and tie it back to the ultimatum game played on Day 1. On day 6 will revisit this with a discussion of JC Penney, which will allow one to get to next item on the agenda: price discrimination.