There are two equivalent ways to understand the best response property of a Nash Equilibrium strategy. First, we can say that the player plays a mixed strategy whose expected payoff is maximal among all possible mixed strategies. Second, we can say that the player randomly chooses a pure strategy from the set of pure strategies whose expected payoff is maximal among all possible pure strategies.
So far so good, and every student of game theory is aware of this equivalence. What I think is less well known is that the two perspectives are not identical for ε-best response and ε-equilibrium: A mixed strategy whose expected payoff is almost optimal might put some positive (though small) probability on a pure strategy which gives a horrible payoff. In this post I am going to explain why I used to think the difference between the two perspectives is inconsequential, and why, following a conversation with Ayala Mashiah-Yaakovi about her work on subgame perfect equilibrium in Borel games, I changed my mind.
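To make the gap between the two perspectives concrete, here is a minimal sketch. The payoff numbers, the strategy names and the function names are all illustrative assumptions rather than anything taken from the post: one player has two pure strategies against a fixed opponent strategy, one paying 1 and one paying -1000.

```python
# Hypothetical payoffs of one player's pure strategies against a fixed
# opponent strategy. The numbers are illustrative assumptions.
PAYOFFS = {"good": 1.0, "horrible": -1000.0}


def expected_payoff(mixed):
    """Expected payoff of a mixed strategy given as {pure strategy: probability}."""
    return sum(prob * PAYOFFS[action] for action, prob in mixed.items())


def is_eps_optimal_in_expectation(mixed, eps):
    """First perspective: the expected payoff is within eps of the best
    achievable payoff (which is attained by some pure strategy)."""
    return expected_payoff(mixed) >= max(PAYOFFS.values()) - eps


def is_eps_optimal_on_support(mixed, eps):
    """Second perspective: every pure strategy played with positive
    probability is itself within eps of the best pure strategy."""
    best = max(PAYOFFS.values())
    return all(PAYOFFS[a] >= best - eps for a, p in mixed.items() if p > 0)


eps = 0.01
# Almost always play "good", but tremble onto "horrible" with tiny probability.
trembling = {"good": 1 - 1e-6, "horrible": 1e-6}

print(is_eps_optimal_in_expectation(trembling, eps))  # first perspective
print(is_eps_optimal_on_support(trembling, eps))      # second perspective
```

With these assumed numbers the trembling strategy passes the first test and fails the second. Setting `eps = 0` makes the two tests agree again, which is the exact best response equivalence described above.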
Here is a slight modification of the example Christoph presented in the last MEDS lunch (I don’t remember the attribution). I am going to describe three games, and, because for my main point I need to think of the game as taking place in the laboratory, I will call the game `an experiment’ and the players `subjects’, denoted S1 and S2. Below are full descriptions of three experiments. These descriptions are also given to the subjects.
Experiment 1: S2 leaves the room. S1 faces a disk divided into two sides and has to choose one side. His choice is recorded (say, marked on the back of the disk). Then the disk is randomly rotated. Then S2 returns and has to choose a side. If both subjects chose the same side, they get one dollar each. Otherwise, they are cast into the lake of fire.
Experiment 2: Same as Experiment 1, except that the disk is not rotated.
Experiment 3: The experiment is called `Driving in Illinois’. The rest of the game is as in Experiment 2.
`At every day a player takes an action’. This is the starting point of many models of repeated interaction. We let time run to infinity to reflect the fact that players don’t have in mind a fixed termination point for the game. We do, however, fix the starting point, which I think in many cases is unnatural: By the time I realize I know the bartender in my local Starbucks and maybe I should start tipping, I have already lost count of the number of times I have been there. This is why I would like to model it as a game with infinite past. Also, it would be cool to have a paper that starts with `At every day ‘ for a change. But, as I am sure many game theorists have independently discovered, it is not clear how to proceed.
I heard this from Marco who heard it from Tzachi. Not sure what to make of it, but that will not deter me from ruminating publicly.
There is a sack of chocolate and you have two options: either take one piece from the sack for yourself, or take three pieces which will be given to Dylan. Dylan also has two options: one piece for himself or three for you. After you have both made your choices independently, each goes home with the amount of chocolate he collected.
Write down the payoff matrix for this game and you’ll get the Prisoner’s Dilemma. But where is the dilemma here? On the few occasions that I presented the Nash solution to the Prisoner’s Dilemma to a friend who didn’t know about game theory, the response was always `something is wrong here’. I am pretty sure that if I try to do the same with the Chocolate Dilemma the response will be `well, duh’. Obviously every sensible person will prefer one piece of chocolate for himself to three pieces for somebody else.
One can say that this hypothetical response vindicates the Nash solution also in the Prisoner’s Dilemma. After all, these are the same games. If people find the solution intuitive in the Chocolate Dilemma they should also accept it in the Prisoner’s Dilemma. But this argument is not very convincing since these are the same games only because we game theorists identify a game with its payoff matrix. The rest of the world might disagree.
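For readers who want to see the payoff matrix of the Chocolate Dilemma written out, here is a minimal sketch. The payoffs are simply counts of chocolate pieces as described above; the action labels and function names are my own illustrative choices.

```python
# Each player either takes one piece for himself or gives three to the other.
TAKE, GIVE = "take one for myself", "give three to the other"
ACTIONS = (TAKE, GIVE)


def chocolate(mine, other):
    """Pieces I go home with: 1 if I take, plus 3 if the other player gives."""
    return (1 if mine == TAKE else 0) + (3 if other == GIVE else 0)


# Payoff matrix: (row player's pieces, column player's pieces).
payoff_matrix = {
    (row, col): (chocolate(row, col), chocolate(col, row))
    for row in ACTIONS
    for col in ACTIONS
}

# TAKE strictly dominates GIVE: it is better whatever the other player does.
take_dominates = all(chocolate(TAKE, b) > chocolate(GIVE, b) for b in ACTIONS)

# Yet both players are strictly better off if both give than if both take.
both_give_better = all(
    g > t for g, t in zip(payoff_matrix[(GIVE, GIVE)], payoff_matrix[(TAKE, TAKE)])
)

for profile, payoffs in payoff_matrix.items():
    print(profile, payoffs)
print(take_dominates, both_give_better)
```

Taking is a strictly dominant action, while mutual giving leaves both players with more chocolate than mutual taking. That is the Prisoner’s Dilemma structure, even though the story feels very different.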