Sergiu Hart organized a small conference on Game Dynamics that took place last week at the Center for Rationality in the Hebrew University of Jerusalem. Luckily for me, Sergiu also likes dynamic games, and therefore there were talks on both subjects.

The conference was Sergiu’s style: first talk at 10:30, second at 11:45, then a long lunch break, and then two additional talks, at 15:00 and 16:15. In one of the lunch breaks I asked John Levy to explain to me his construction of a discounted game without a stationary discounted equilibrium. I will try to share with you what I learned. (John’s discussion paper appears here.)

There are m players labeled 1,2,…,m. What is m? It is sufficiently large that the total discounted payoff from stage m onward has little effect on a player’s payoff. That is, if the discount factor is β (so that the total discounted payoff is the sum over n of (β to the power n) times (stage payoff at stage n)), then we require that (β to the power m) is a small number.
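To get a feel for the magnitudes involved, here is a quick numeric check (the values β = 0.9 and m = 200 are my own illustrative choices, not taken from John’s paper):

```python
beta = 0.9   # an illustrative discount factor
m = 200      # an illustrative number of players

# The total discounted payoff from stage m onward is at most
# beta**m times a bound on the stage payoffs, which is negligible:
print(beta ** m)   # roughly 7e-10
```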

To describe the set of states, let A be the set of real numbers y in the unit interval [0,1] that satisfy the following condition: the binary expansion of y does not end with an infinite stream of 0’s or with an infinite stream of 1’s. In other words, A is the set of real numbers in the unit interval [0,1] that are not dyadic. Below I will denote the k-th bit in the binary expansion of a number y in the unit interval by yk, so that y=(y1,y2,y3,…). We define the shift operator T : [0,1] to [0,1] by T(y) = (y2,y3,…); that is, we remove the first bit from the binary expansion of y and move all other bits one place to the left. As the set A contains exactly the numbers in the unit interval that are not dyadic, if y is in A then so is T(y).
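A quick way to play with binary expansions and the shift operator in Python (with the caveat that machine floats are themselves dyadic, so this only illustrates finite prefixes; the function names are mine):

```python
def bits(y, n):
    """Return the first n bits of the binary expansion of y in [0, 1)."""
    out = []
    for _ in range(n):
        y *= 2
        b = int(y)       # the leading bit
        out.append(b)
        y -= b           # keep the fractional part
    return out

def shift(y):
    """The shift operator T: drop the first bit, i.e. take the fractional part of 2y."""
    return 2 * y - int(2 * y)

print(bits(0.625, 3))   # [1, 0, 1], since 0.625 = 0.101 in binary
print(shift(0.625))     # 0.25, whose expansion is the tail (0, 1)
```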

(Note: John Levy defines A to be the Cantor set. I hope that my definition is equivalent; I apologize to John for trying to simplify the construction.)

The set of states contains m copies of the set A, one copy per player, and, in addition, an absorbing state 0, in which all players receive 0. That is, once the state 0 is reached, the game essentially ends. A state in the i-th copy of A is denoted by (i,y).

The game is a game with perfect information, which means that in each state only one player can make a move. Quite naturally, player i is the player who moves in states of the form (i,y). In the absorbing state 0 no player has any move (the players get 0 and remain in that state).

In each state of the form (i,y) player i has two actions, L and R (as mentioned before, the other players do not make any choice in this state).

What about payoffs and transitions? As I already said twice, state 0 is absorbing: once it is reached, the game stays there forever, and the game essentially ends. If at some stage the game is in state (i,y), then the stage payoff and the transition are defined as follows:

If y1=0, that is, y is below 1/2, and player i plays L: payoff to player i is 0, payoff to player i-1 is 1/β, and the next state is (i+1, T(y)). (Note: when i is a player index, i+1 and i-1 are taken modulo m.)

If y1=0, that is, y is below 1/2, and player i plays R: payoff to player i is 0.3, payoff to player i-1 is 0, the next state is the absorbing state 0.

If y1=1, that is, y is above 1/2, and player i plays L: payoff to player i is 0.7, payoff to player i-1 is 1/β, the next state is the absorbing state 0.

If y1=1, that is, y is above 1/2, and player i plays R: payoff to player i is 0, payoff to player i-1 is 0, the next state is (i+1, T(y)).
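Under my reading of the four rules above, the stage payoffs and the transition can be encoded in a few lines of Python (a sketch; the function name and the encoding of the absorbing state as None are my own):

```python
def step(i, y, action, m, beta):
    """One stage in state (i, y): return (payoff to player i,
    payoff to player i-1, next state).  The next state is either
    (i+1, T(y)) or None, which encodes the absorbing state 0."""
    y1 = int(2 * y)              # first bit of the binary expansion of y
    Ty = 2 * y - y1              # the shift operator T
    nxt = ((i % m) + 1, Ty)      # players are labeled 1..m, so m+1 wraps to 1
    if y1 == 0:                  # y is below 1/2
        return (0.0, 1 / beta, nxt) if action == "L" else (0.3, 0.0, None)
    else:                        # y is above 1/2
        return (0.7, 1 / beta, None) if action == "L" else (0.0, 0.0, nxt)

# For instance, in state (1, 0.625) the first bit is 1, so playing L
# stops the game with payoff 0.7 to player 1 and 1/beta to player m:
print(step(1, 0.625, "L", m=5, beta=0.5))   # (0.7, 2.0, None)
```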

Here are the properties that one should note in this game:

1) As long as the game does not move to the absorbing state 0, the play cycles through the players: at the first stage the state is in player 1’s copy of A, in the second stage it is in player 2’s copy of A, …, in the m-th stage it is in player m’s copy of A, and then in stage m+1 it is again in player 1’s copy of A.

2) Player i can receive a non-zero payoff only in stages i and i+1 (modulo m). That is, player 1 can receive a non-zero payoff only in stages 1, 2, m+1, m+2, 2m+1, 2m+2, … By the choice of m as very large, at a stage n that is equal to i modulo m, only the payoffs at stages n and n+1 affect player i’s strategic decision at stage n. Because of discounting, the stage-(n+1) payoff of 1/β is worth 1 to player i at stage n; so player i’s continuation payoff is either 1 (if player i+1 plays L in the next stage) or 0 (if player i+1 plays R in the next stage).

3) Player i wants player i+1 to play L (because then he receives 1/β in the next stage, while if player i+1 plays R, then player i receives 0 in the next stage).

4) Player i’s best response in stage n (n equals i modulo m) depends on the probability with which player i+1 plays L in stage n+1. If y1=0 and the probability that player i+1 plays L in the next stage is below 0.3, player i will play R in stage n (and stop the game). If y1=0 and the probability that player i+1 plays L in the next stage is above 0.3, player i will play L in stage n (and the game will continue). Here the 0.3 in “below 0.3” and “above 0.3” is an approximation: the payoff to player i from stage n+2 onward is very small (because m is large), but it does affect the cut-off 0.3 a little.

If y1=1 and the probability that player i+1 plays L in the next stage is below 0.7, player i will play L in stage n (and stop the game). If y1=1 and the probability that player i+1 plays L in the next stage is above 0.7, player i will play R in stage n (and the game will continue). Even though the cut-offs of 0.3 and 0.7 are approximate, because m is large there is a clear comparison between the two: the approximation of 0.3 is below the approximation of 0.7.
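The cut-off logic in point 4 can be spelled out in a few lines of Python (a sketch of the approximate, myopic best response derived from the payoff rules; I ignore the payoffs beyond stage n+1, which are of order β to the power m):

```python
def best_response(y1, p_next_L):
    """Approximate best response of player i in a state whose first bit is y1,
    when player i+1 plays L in the next stage with probability p_next_L.
    Continuing is worth about beta * (1/beta) * p_next_L = p_next_L to player i."""
    cont = p_next_L
    if y1 == 0:
        # L continues the game (worth about cont); R stops it with payoff 0.3.
        return "L" if cont > 0.3 else "R"
    else:
        # L stops the game with payoff 0.7; R continues it (worth about cont).
        return "R" if cont > 0.7 else "L"

print(best_response(0, 0.5))   # L: continuing is worth about 0.5 > 0.3
print(best_response(1, 0.5))   # L: stopping yields 0.7 > 0.5
```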

A stationary behavior strategy of player i in this game is a function σi from his copy of A to [0,1]. We interpret σi(y) as the probability that player i plays L in state (i,y). We would like this function to be measurable. Why? Suppose that the initial state is chosen at random according to the uniform distribution. If the strategies of the players are not measurable, we will not be able to compute the expected payoff.

Denote the initial state by y. Then the state in stage 2 is T(y) or the absorbing state 0, the state in stage 3 is T(T(y)) or the absorbing state 0, the state in stage 4 is T(T(T(y))) or the absorbing state 0, etc. In particular, the play can reach only countably many states, and therefore a stationary equilibrium (given the initial state) exists. Using transfinite induction one can glue together the equilibria for different initial states and prove that there is a stationary equilibrium in the whole game. This equilibrium, however, need not be a measurable function. In fact, John proved that no measurable stationary equilibrium exists.

To see this, assume by way of contradiction that a measurable stationary equilibrium exists. Denote the current state by (i,y). We now use point 4 above. If player i+1 plays L in the next stage (in state (i+1,T(y))), then player i plays R in the current stage if y1=1, and player i plays L in the current stage if y1=0. Similarly, if player i+1 plays R in the next stage (in state (i+1,T(y))), then player i plays L in the current stage if y1=1, and player i plays R in the current stage if y1=0.

Because the cut-offs of the one-shot best-response function of a player are 0.3 and 0.7 (again, point 4 above), for every y in A at least one of the actions that player i plays in (i,(1,y)) and in (i,(0,y)) is pure. Here, (1,y) is the real number in A whose first bit is 1 and whose subsequent bits are given by y; that is, (1,y) := 1/2 + y/2. Similarly, (0,y) is the real number in A whose first bit is 0 and whose subsequent bits are given by y; that is, (0,y) := y/2. Indeed, both states (i,(1,y)) and (i,(0,y)) lead to the same next state (i+1,y), and player i can mix at (i,(1,y)) only if σi+1(y) is (approximately) 0.7, and at (i,(0,y)) only if σi+1(y) is (approximately) 0.3; these two conditions cannot hold simultaneously.

The last fact implies that the Lebesgue measure of the set of states in which player i plays a mixed action is 0, so that in almost all states player i plays a pure action. Finally, John proves that one cannot partition each of the m copies of A into two measurable subsets such that the following holds:

• One of the subsets in the partition of the i-th copy corresponds to action L, and the other to action R.
• If (i+1,T(y)) is in the subset that corresponds to action L and if y1 = 1, then (i,y) is in the subset that corresponds to action R.
• If (i+1,T(y)) is in the subset that corresponds to action L and if y1 = 0, then (i,y) is in the subset that corresponds to action L.
• If (i+1,T(y)) is in the subset that corresponds to action R and if y1 = 1, then (i,y) is in the subset that corresponds to action L.
• If (i+1,T(y)) is in the subset that corresponds to action R and if y1 = 0, then (i,y) is in the subset that corresponds to action R.
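Chasing these requirements around one full cycle of the m players shows (under the best responses computed from the payoff rules) that the action label must flip exactly when a bit equal to 1 is consumed, so any measurable labeling would have to satisfy a parity (“cocycle”) equation over the shift: label(y) XOR (y1+…+ym mod 2) = label(T to the power m of y). The sketch below (the function name and the candidate threshold labeling are my own) checks this forced relation and shows that a simple candidate rule violates it:

```python
def forced_relation_holds(y, m, label):
    """Check label(y) XOR parity(first m bits of y) == label(T^m(y)),
    the relation forced by the consistency requirements above.
    label maps [0,1] to {0, 1}, encoding L as 0 and R as 1."""
    bits, z = [], y
    for _ in range(m):
        b = int(2 * z)       # leading bit of the current state
        bits.append(b)
        z = 2 * z - b        # apply the shift operator T
    return (label(y) ^ (sum(bits) % 2)) == label(z)

# A simple threshold labeling ("R iff y >= 1/2") already violates the relation:
threshold = lambda y: int(y >= 0.5)
print(forced_relation_holds(1/3, 2, threshold))   # False
```

This is only an illustration for one candidate labeling, of course; John’s proof shows that no measurable labeling can satisfy the relation.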

As simple as that. One nice aspect of this example is that the conclusion is robust to perturbations in the payoffs, in the transitions, and in the discount factor. Therefore, the non-existence of a stationary equilibrium is not a non-generic case, but it happens for a large set of games.

One last piece of news for those who succeeded in getting to the end of the post: Ziv Hellman, who is a Ph.D. student of Sergiu Hart, adapted John’s example to Bayesian games, and provided a Bayesian game with no measurable Bayesian equilibrium. I will explain his construction in a different post.