`….a place of breathtaking barbarity……. On Norfolk Island an Irishman named William Riley received 100 lashes for ”Singing a Song” (no doubt a rebel one) and 50 for asking a warder for a chew of tobacco. Deranged by cruelty and misery, some men would opt for a lifetime at the bottom of the carceral heap by blinding themselves; thus, they reasoned, they would be left alone.’

It is in this portion of his book that Hughes recalls an eyewitness account of a suicide lottery of the type mentioned in Clarke’s novel. Here is Clarke’s succinct description of it:

The scheme of escape hit upon by the convict intellect was simply this. Three men being together, lots were drawn to determine whom should be murdered. The drawer of the longest straw was the “lucky” man. He was killed. The drawer of the next longest straw was the murderer. He was hanged. The unlucky one was the witness. He had, of course, an excellent chance of being hung also, but his doom was not so certain, and he therefore looked upon himself as unfortunate.

Clarke and Hughes differ slightly on the precise incentives that would drive participation in the scheme. As many of the convicts on Norfolk Island were Irish, the scheme was concocted as a way to circumvent the Catholic prohibition on suicide. Hughes suggests that, after the murder, witness and culprit would be shipped back to the mainland for trial. Conditions there were better, so for both there was brief respite and a greater opportunity for escape.

It’s an arresting story, one that one is loath to give up. But, one is compelled to ask, is it true? If yes, was it common? Tim Causer of King’s College London went back to look at the records and says the answers are `maybe’ and `no’. Here is his summing up:

`Capital offences committed with apparent suicidal intent are an important part of Norfolk Island’s history, but they need to be understood more fully. It should be recognised just how rare they were, that ‘suicide lotteries’ are embellishments upon actual cases of state-assisted suicide and repeating the myth only reinforces the sensationalised interpretation of Norfolk Island’s history.’

You can find the full account here.

]]>`Whenever the Vice President and a majority of either the principal officers of the executive departments or of such other body as Congress may by law provide, transmit to the President pro tempore of the Senate and the Speaker of the House of Representatives their written declaration that the President is unable to discharge the powers and duties of his office, the Vice President shall immediately assume the powers and duties of the office as Acting President.’

The VP is Pence. The President pro tempore of the Senate is the senior senator of the majority party, and Paul Ryan is the Speaker of the House.

The President can object, at which point Congress resolves the matter. Specifically,

`….two-thirds vote of both Houses that the President is unable to discharge the powers and duties of his office, the Vice President shall continue to discharge the same as Acting President; otherwise, the President shall resume the powers and duties of his office.’

]]>

I generally agree with Tao’s sentiment and argument, but I have a quibble. Tao describes the current situation as mutual knowledge without common knowledge. This, I think, is wrong. To get politics out of the way, let me explain my position using a similar situation which Tao also mentions: The Emperor’s new clothes. I have already come across people casting the Emperor’s story in terms of mutual knowledge without common knowledge, and I think that is also wrong. The way I understand the story, before the kid shouts, each of the Emperor’s subjects sees that the Emperor is naked, but after observing everybody else’s reaction, each subject updates her own initial belief and deduces that she was probably wrong. The subjects now don’t think that the Emperor is naked. Rather, each subject thinks that her own eyes deceived her.

But when game theorists and logicians say that an assertion is mutual knowledge (or mutual belief) we mean that each of us, after taking into account our own information, including what we deduce about other people’s information, thinks the assertion is true. In my reading of the Emperor’s new clothes story this is not the case.

For an assertion to be common knowledge, we need in addition that everybody knows that everybody knows that the assertion is true, and that everybody knows that everybody knows that everybody knows that the assertion is true, and onwards to infinity. A good example of a situation with mutual knowledge and no common knowledge is the blue-eyed islanders puzzle (using the story as it appears on Terence Tao’s blog, and a big spoiler ahead if you are not familiar with the puzzle): Before the foreigner makes an announcement, it is mutual knowledge that there are at least 99 blue-eyed islanders (there are in fact 100, and each of them sees 99), but this fact is not common knowledge: If Alice and Bob are both blue-eyed then Alice, not knowing the color of her own eyes, thinks that Bob might observe only 98 blue-eyed islanders. In fact it is not even common knowledge that there are at least 98 blue-eyed islanders, because Alice thinks that Bob might think that Craig might only observe 97 blue-eyed islanders. By similar reasoning, before the foreigner’s announcement, it is not even common knowledge that there is at least one blue-eyed islander. Once the foreigner announces it, this fact becomes common knowledge.
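The counting in the puzzle can be checked by brute force on a small island. The sketch below is only an illustration under modelling assumptions of my own: a world is a tuple of eye colours, an islander cannot tell apart two worlds that differ only in her own colour, and the names `everyone_knows` and `knowledge_depth` are made up for this example. It counts how many times "everybody knows that" can be stacked in front of "there are at least m blue-eyed islanders", assuming that statement is true in the given world.

```python
from itertools import product


def everyone_knows(event, universe):
    """Worlds of `universe` in which every islander knows `event`.

    A world is a tuple of 0/1 eye colours (1 = blue). Islander i sees all
    colours but her own, so at world w she also considers possible the world
    with coordinate i flipped, provided that world is in `universe`.
    """
    known = set()
    for w in event:
        flips = (w[:i] + (1 - w[i],) + w[i + 1:] for i in range(len(w)))
        if all(f in event for f in flips if f in universe):
            known.add(w)
    return known


def knowledge_depth(world, m, announced=False):
    """Largest k such that 'everybody knows' can be stacked k times in front
    of 'at least m islanders are blue-eyed' at `world`.

    Returns float('inf') when the statement is common knowledge. If
    `announced` is True, worlds with no blue-eyed islander are ruled out,
    which models the foreigner's public announcement.
    """
    universe = set(product((0, 1), repeat=len(world)))
    if announced:
        universe = {w for w in universe if sum(w) >= 1}
    current = {w for w in universe if sum(w) >= m}
    depth = 0
    while True:
        nxt = everyone_knows(current, universe)
        if world not in nxt:
            return depth
        if nxt == current:
            return float("inf")  # a fixed point: common knowledge
        current = nxt
        depth += 1


island = (1, 1, 1, 1)  # four islanders, all blue-eyed (a toy-sized island)
print(knowledge_depth(island, 3))                  # "at least 3 are blue-eyed"
print(knowledge_depth(island, 1))                  # "at least 1 is blue-eyed"
print(knowledge_depth(island, 1, announced=True))  # after the announcement
```

With four blue-eyed islanders, "at least 3" is mutual knowledge but goes no deeper (depth 1), "at least 1" survives three levels of "everybody knows" but not a fourth, and only the public announcement makes it common knowledge. The island of the puzzle behaves the same way with 100 in place of 4.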

No mutual knowledge and no common knowledge are two situations that can have different behavioral implications. Suppose that we offer each of the subjects the following private voting game: Is the Emperor wearing clothes? You have to answer yes or no. If you answer correctly you get a free ice cream sandwich, otherwise you get nothing. According to my reading of the story they will all give the wrong answer, and get nothing. On the other hand, suppose you offer a similar game to the islanders, even before the foreigner arrives: Do you think that there is at least one blue-eyed islander? They will answer correctly.

There is an alternative reading of the Emperor’s story, according to which it is indeed a story about mutual knowledge without common knowledge: Even after observing the crowd’s reaction, each subject still knows that the Emperor is naked, but she keeps her mouth shut because she suspects that her fellow subjects don’t realize it and she doesn’t want to make a fool of herself. This reading strikes me as less psychologically interesting, but, more importantly, if that’s how you understand the story then there is nothing to worry about. All the subjects will vote correctly anyway and get the ice cream even without the little kid making it common knowledge. And Trump will not be elected president even if people continue to keep their mouths shut.

]]>When I came across this editorial I was dumbfounded by the arrogance of the editors, who seem to know as much about statistics as I know about social psychology. But I hadn’t heard of this journal until yesterday, and if I had, I am pretty sure I wouldn’t have believed anything they publish, p-value or no p-value. So I don’t have the right to complain here.

Here are somebodies who do have the right to complain: the American Statistical Association. Concerned with the misuse, mistrust and misunderstanding of the p-value, the ASA has recently issued a policy statement on p-values and statistical significance, intended for researchers who are not statisticians.

How do you explain the p-value to practitioners who don’t care about things like the Neyman-Pearson Lemma, independence and UMP tests? First, you use language that obscures conceptual difficulties: “the probability that a statistical summary of the data would be equal to or more extreme than its observed value”, without saying what “more extreme” means. Second, you use warnings and slogans about what the p-value doesn’t mean or can’t do, like “p-value does not measure the size of an effect or the importance of a result.”

Among these slogans my favorite is

P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone

What’s cute about this statement is that it assumes that everybody understands what “there is a 5% chance that the studied hypothesis is true” means, and that the notion of a p-value is the one that is difficult to understand. In fact, the opposite is true.

Probability is conceptually tricky. Its meaning is somewhat clear in a situation of a repeated experiment: I more or less understand what it means that a coin has a 50% chance to land on Heads. (Yes. Only more or less). But without going full subjective I have no idea what the meaning of the probability that a given hypothesis (boys who eat pickles in kindergarten have higher SAT scores than girls who play firefighters) is true might be. On the other hand, the meaning of the corresponding p-value relies only on the conceptually simpler notion of probabilities in a repeated experiment.
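To make the repeated-experiment reading concrete, here is a minimal sketch for the coin. It is an illustration of my own rather than anything in the ASA statement: the null hypothesis is "the coin is fair", the statistical summary is the number of Heads, and "more extreme" is taken to mean "at least as far from half the flips as what was observed", which is only one of several possible conventions. The function name `fair_coin_p_value` is hypothetical.

```python
from math import comb


def fair_coin_p_value(heads, flips):
    """Probability, under a fair coin, that the number of Heads in `flips`
    tosses lands at least as far from flips/2 as the observed `heads`.

    This is a statement about repetitions of the experiment with a fair coin,
    not about the probability that the coin is fair.
    """
    observed_gap = abs(heads - flips / 2)
    favourable = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= observed_gap
    )
    return favourable / 2 ** flips


print(fair_coin_p_value(8, 10))  # 112 of the 1024 equally likely sequences
```

Eight Heads in ten flips gives 112/1024, roughly 0.11: in about one out of nine repetitions a fair coin would look at least this lopsided.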

Why, then, do the committee members (rightly!) assume that people are comfortable with the difficult concept of the probability that a hypothesis is true and are uncomfortable with the easy concept of a p-value? I think the reason is that unlike the word “p-value”, the word “probability” is a word that we use in everyday life, so most people feel they know what it means. Since they have never thought about it formally, they are not aware that they actually don’t.

So here is a modest proposal for preventing the misuse and misunderstanding of statistical inference: Instead of saying “this hypothesis holds with p-value 0.03” say “We are 97% confident that this hypothesis holds”. We all know what “confident” means, right?

]]>To think through the implications of this, it’s useful to revisit an example of Arthur Pigou. There is a measure 1 of travelers all of whom wish to leave the same origin ($s$) for the same destination ($t$). There are two possible paths from $s$ to $t$. The `top’ one has a travel time of 1 unit independent of the measure of travelers who use it. The `bottom’ one has a travel time that grows linearly with the measure of travelers who employ it. Thus, if a fraction $x$ of travelers take the bottom path, each incurs a travel time of $x$ units.

A central planner, say, Uber, interested in minimizing total travel time will route half of all travelers through the top and the remainder through the bottom. Total travel time will be $\frac{1}{2} \times 1 + \frac{1}{2} \times \frac{1}{2} = \frac{3}{4}$. The only Nash equilibrium of the path selection game is for all travelers to choose the bottom path, yielding a total travel time of $1$. Thus, if the only choice is to delegate my route selection to Uber or make it myself, there is no equilibrium where all travelers delegate to Uber.

Now suppose there are two competing ride hailing services. Assume a fraction $\alpha$ of travelers are signed up with Uber and the remaining fraction $1-\alpha$ are signed up with Lyft. To avoid annoying corner cases, $\alpha \in [1/3, 2/3]$. Each firm routes its users so as to minimize the total travel time that its users incur. Uber will choose a fraction $\lambda_1$ of its subscribers to use the top path and the remaining fraction $1-\lambda_1$ will use the bottom path. Lyft will choose a fraction $\lambda_2$ of its subscribers to use the top path and the remaining fraction $1-\lambda_2$ will use the bottom path.

A straightforward calculation reveals that the only Nash equilibrium of the Uber vs. Lyft game is $\lambda_1 = 1 - \frac{1}{3\alpha}$ and $\lambda_2 = 1 - \frac{1}{3(1-\alpha)}$. An interesting case is when $\alpha = 2/3$, i.e., Uber has a dominant market share. In this case $\lambda_2 = 0$, i.e., Lyft sends none of its users through the top path. Uber, on the other hand, will send half its users via the top and the remainder by the bottom path, so a fraction $2/3$ of all travelers end up on the bottom path. Assuming Uber randomly assigns its users to top and bottom with equal probability, the average travel time for an Uber user will be

$\frac{1}{2} \times 1 + \frac{1}{2} \times \frac{2}{3} = \frac{5}{6}.$

The travel time for a Lyft user will be

$\frac{2}{3}.$

Total travel time will be $\frac{2}{3} \times \frac{5}{6} + \frac{1}{3} \times \frac{2}{3} = \frac{7}{9}$, less than in the Nash equilibrium outcome. However, Lyft would offer travelers a lower travel time than Uber. This is because Uber, which has the bulk of travelers, must use the top path to reduce total travel times. If this were the case, travelers would switch from Uber to Lyft. This conclusion ignores prices, which at present are not part of the model.
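The equilibrium of the two-firm routing game can be checked numerically. The sketch below is an illustration under assumptions of my own: `alpha` stands for Uber's share of travelers (strictly between 0 and 1), each firm's strategy is the mass of its users it places on the bottom path, and the equilibrium is found by letting the firms best-respond to each other in turn. The function names are hypothetical.

```python
def best_response(own_mass, rival_bottom):
    """Bottom-path mass that minimises a firm's total travel time.

    A firm with `own_mass` users that puts mass b on the bottom path incurs
    (own_mass - b) * 1 + b * (b + rival_bottom). This is convex in b, so the
    minimiser is (1 - rival_bottom) / 2 clipped to [0, own_mass].
    """
    return min(own_mass, max(0.0, (1.0 - rival_bottom) / 2.0))


def uber_lyft_equilibrium(alpha, rounds=200):
    """Alternate best responses until they settle, then report the outcome."""
    uber_bottom = lyft_bottom = 0.0
    for _ in range(rounds):
        uber_bottom = best_response(alpha, lyft_bottom)
        lyft_bottom = best_response(1.0 - alpha, uber_bottom)
    bottom_time = uber_bottom + lyft_bottom  # travel time on the bottom path
    uber_top = 1.0 - uber_bottom / alpha     # share of Uber users sent to the top
    lyft_top = 1.0 - lyft_bottom / (1.0 - alpha)
    uber_avg = uber_top * 1.0 + (1.0 - uber_top) * bottom_time
    lyft_avg = lyft_top * 1.0 + (1.0 - lyft_top) * bottom_time
    return {
        "uber_top": uber_top,
        "lyft_top": lyft_top,
        "uber_avg": uber_avg,
        "lyft_avg": lyft_avg,
        "total": alpha * uber_avg + (1.0 - alpha) * lyft_avg,
    }


print(uber_lyft_equilibrium(2 / 3))
```

For a two-thirds Uber share the iteration settles, up to rounding, on Uber sending half its users to the top and Lyft none, with average travel times of 5/6 and 2/3 and a total of 7/9.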

Suppose we include prices and assume that travelers now evaluate a ride hailing service based on delivered price, that is, price plus travel time. Thus, we are assuming that all travelers value time at $1 a unit of time. The volume of customers served by Uber and Lyft is no longer fixed and they will focus on minimizing average travel time per customer. A plausible guess is that there will be an equal price equilibrium where travelers divide evenly between the two services, i.e., each serves a fraction $1/2$ of travelers. Each service will route $1/3$ of its customers through the top and the remainder through the bottom. Average travel time per customer will be $\frac{1}{3} \times 1 + \frac{2}{3} \times \frac{2}{3} = \frac{7}{9}$. However, travel time on the bottom will be $\frac{2}{3}$, giving every customer an incentive to opt out and drive their own car on the bottom path.

What this simple-minded analysis highlights is that the benefits of coordination may be hard to achieve if travelers can opt out and drive themselves. To minimize congestion, the ride hailing services must limit traffic on the bottom path. This is the one that is congestible. However, doing so makes it attractive in terms of travel time, encouraging travelers to opt out.

]]>

Shapley got the Nobel in 2012 and, according to Robert Aumann, deserved to get it along with Nash. Shapley himself however was not completely on board: “I consider myself a mathematician and the award is for economics. I never, never in my life took a course in economics.” If you are wondering what he means by “a mathematician”, read the following quote, from the last paragraph of his stable matching paper with David Gale:

The argument is carried out not in mathematical symbols but in ordinary English; there are no obscure or technical terms. Knowledge of calculus is not presupposed. In fact, one hardly needs to know how to count. Yet any mathematician will immediately recognize the argument as mathematical…

What, then, to raise the old question once more, is mathematics? The answer, it appears, is that any argument which is carried out with sufficient precision is mathematical

In the paper Gale and Shapley considered a problem of matching (or assignment as they called it) of applicants to colleges, where each applicant has his own preference over colleges and each college has its preference over applicants. Moreover, each college has a quota. Here is the definition of stability, taken from the original paper:

Definition: An assignment of applicants to colleges will be called unstable if there are two applicants $\alpha$ and $\beta$ who are assigned to colleges $A$ and $B$, respectively, although $\beta$ prefers $A$ to $B$ and $A$ prefers $\beta$ to $\alpha$.

According to the Gale-Shapley algorithm, applicants apply to colleges sequentially following their preferences. A college with quota $q$ maintains a `waiting list’ of size $q$ with the top $q$ applicants that have applied to it so far, and rejects all other applicants. When an applicant is rejected from a college he applies to his next favorite college. Gale and Shapley proved that the algorithm terminates with a stable assignment.
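The procedure just described fits in a few lines of code. What follows is a minimal sketch rather than the paper's own formulation: the function name `deferred_acceptance`, the dictionary-based input format and the toy applicants and colleges are assumptions made for illustration, and a college simply never admits an applicant missing from its preference list.

```python
def deferred_acceptance(applicant_prefs, college_prefs, quotas):
    """Applicant-proposing Gale-Shapley procedure with college quotas.

    applicant_prefs: applicant -> list of colleges, most preferred first
    college_prefs:   college -> list of applicants, most preferred first
    quotas:          college -> number of places
    Returns college -> waiting list (best applicant first) at termination.
    """
    rank = {c: {a: i for i, a in enumerate(prefs)}
            for c, prefs in college_prefs.items()}
    next_choice = {a: 0 for a in applicant_prefs}  # index of next college to try
    waiting = {c: [] for c in college_prefs}
    unplaced = list(applicant_prefs)

    while unplaced:
        a = unplaced.pop()
        if next_choice[a] >= len(applicant_prefs[a]):
            continue  # a has been rejected everywhere and stays unassigned
        c = applicant_prefs[a][next_choice[a]]
        next_choice[a] += 1
        if a not in rank[c]:
            unplaced.append(a)  # c does not list a at all: immediate rejection
            continue
        waiting[c].append(a)
        waiting[c].sort(key=lambda x: rank[c][x])
        if len(waiting[c]) > quotas[c]:
            unplaced.append(waiting[c].pop())  # reject the lowest-ranked applicant
    return waiting


# Hypothetical toy instance.
applicants = {"ann": ["X", "Y"], "bob": ["X", "Y"], "cat": ["Y", "X"]}
colleges = {"X": ["bob", "ann", "cat"], "Y": ["ann", "cat", "bob"]}
print(deferred_acceptance(applicants, colleges, {"X": 1, "Y": 2}))
```

In the toy instance ann is bumped from X by bob, applies to Y and is held there together with cat, so the final assignment is bob at X and ann and cat at Y.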

One reason that the paper was so successful is that the Gale-Shapley method is actually used in practice. (A famous example is the National Resident Matching Program that assigns budding physicians to hospitals). From a theoretical perspective my favorite follow-up is a paper of Dubins and Freedman, “Machiavelli and the Gale-Shapley Algorithm” (1981): Suppose that some applicant, Machiavelli, decides to `cheat’ and apply to colleges in a different order than his true ranking. Can Machiavelli improve his position in the assignment produced by the algorithm? Dubins and Freedman prove that the answer to this question is no.

Shapley’s contribution to game theory is too vast to mention in a single post. Since I mainly want to say something about his mathematics let me mention the Shapley-Folkman-Starr Lemma, a kind of discrete analogue of Lyapunov’s theorem on the range of non-atomic vector measures, and the KKMS Lemma, whose meaning I still don’t understand, but which has something to do with fixed points, and which Yaron and I have used in our paper about rental harmony.

I am going to talk in more detail about stochastic games, introduced by Shapley in 1953, since this area has been flourishing recently with some really big developments. A (two-player, zero-sum) stochastic game is given by a finite set $S$ of states, finite sets $A, B$ of actions for the players, a period payoff function $u: S \times A \times B \to \mathbb{R}$, a distribution $q(\cdot \mid s, a, b)$ over $S$ for every state $s$ and actions $a, b$, and a discount factor $0 < \beta < 1$. At every period the system is at some state $s$, and the players choose actions $a, b$ simultaneously and independently. Then the column player pays $u(s, a, b)$ to the row player. The game then moves to a new state in the next period, randomized according to $q(\cdot \mid s, a, b)$. Players evaluate their infinite stream of payoffs via the discount factor $\beta$. The model is a generalization of the single player dynamic programming model which was studied by Blackwell and Bellman. Shapley proved that every zero-sum stochastic game admits a value, by imitating the familiar single player argument, which has been the joy and pride of macroeconomists ever since Lucas’s asset pricing model (think Bellman Equation and the contraction operators). Fink later proved using similar ideas that non-zero sum discounted stochastic games admit perfect Markov equilibria.
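The contraction argument can be watched at work on a toy example. The sketch below rests on assumptions of my own: each player has exactly two actions in every state (so that the one-shot matrix games have a closed-form value), payoffs are normalised by a factor of one minus the discount factor, and the names `value_2x2` and `shapley_iteration` are hypothetical. The example game, a version of what is known as the Big Match, is chosen only because its discounted value is easy to verify by hand: it equals 1/2 in the non-absorbing state for every discount factor.

```python
def value_2x2(m):
    """Value of the zero-sum matrix game m (row player maximises)."""
    (a, b), (c, d) = m
    maxmin = max(min(a, b), min(c, d))
    minmax = min(max(a, c), max(b, d))
    if maxmin >= minmax:  # saddle point in pure strategies
        return maxmin
    return (a * d - b * c) / (a + d - b - c)  # completely mixed case


def shapley_iteration(payoff, transition, beta, tol=1e-12):
    """Iterate v -> val[(1 - beta) * u + beta * E v] until it stops moving.

    payoff[s][a][b] is the period payoff; transition[s][a][b] maps next
    states to probabilities. The map is a beta-contraction in the sup norm,
    so the iteration converges to the discounted value of each state.
    """
    v = {s: 0.0 for s in payoff}
    while True:
        new_v = {}
        for s in payoff:
            aux = [[(1 - beta) * payoff[s][a][b]
                    + beta * sum(p * v[t] for t, p in transition[s][a][b].items())
                    for b in range(2)] for a in range(2)]
            new_v[s] = value_2x2(aux)
        if max(abs(new_v[s] - v[s]) for s in payoff) < tol:
            return new_v
        v = new_v


# Row action 0 keeps the game going; row action 1 freezes the payoff at 0 or
# 1 forever, depending on the column player's action.
payoff = {"play": [[1, 0], [0, 1]],
          "zero": [[0, 0], [0, 0]],
          "one": [[1, 1], [1, 1]]}
transition = {
    "play": [[{"play": 1.0}, {"play": 1.0}], [{"zero": 1.0}, {"one": 1.0}]],
    "zero": [[{"zero": 1.0}, {"zero": 1.0}], [{"zero": 1.0}, {"zero": 1.0}]],
    "one": [[{"one": 1.0}, {"one": 1.0}], [{"one": 1.0}, {"one": 1.0}]],
}
print(shapley_iteration(payoff, transition, beta=0.9))
```

With more than two actions per player the closed-form `value_2x2` would have to be replaced by a linear program; the iteration itself stays the same.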

A major question, following a similar question in the single player setup, is the limit behavior of the value and the optimal strategies when players become more patient (i.e., the discount factor $\beta$ goes to $1$). Mertens and Neyman have proved that the limit exists, and moreover that for every $\epsilon > 0$ there are strategies which are $\epsilon$-optimal for every sufficiently large discount factor. Whether a similar result holds for Nash equilibrium in $n$-player stochastic games is probably the most important open question in game theory. Another important question is whether the limit of the value exists for zero-sum games in which the state is not observed by both players. Bruno Ziliotto has recently answered this question by providing a counter-example. I should probably warn that you need to know how to count and also some calculus to follow this literature. Bruno Ziliotto will give the Shapley Lecture at Games2016 in Maastricht. Congrats, Bruno! And thanks to Shapley for leaving us with so much stuff to play with!

]]>Here is the handwritten letter which Nash wrote to the NSA in 1955 (pdf), fifteen years before Cook formalized the P/NP problem. In the letter Nash conjectures that for most encryption mechanisms, recovering the key from the cipher requires an exponential amount of time. And here is what Nash had to say about proving this conjecture:

The first appearance in print of a version of the game with Colonel Blotto’s name attached is, I believe, in The Weekend Puzzle Book by Caliban (June 1924). Caliban was the pen name of Hubert Phillips, one time head of Economics at the University of Bristol and a puzzle contributor to The New Statesman.

Blotto itself is a slang word for inebriation. It does not, apparently, derive from the word `blot’, meaning to absorb liquid. One account credits a French manufacturer of delivery tricycles (Blotto Freres, see the picture) that were infamous for their instability. This inspired Laurel and Hardy to title one of their movies Blotto. In it they get blotto on cold tea, thinking it whiskey.

Over time, the Colonel has been promoted: to General in 2006 and to Field Marshal in 2011.

]]>