You are currently browsing the monthly archive for February 2010.

US forces have now spent almost as much as time in Afghanistan as the Greeks at Troy. Even by the most optimistic assessments, they will return well after the Greeks did from Illium. That a campaign on distant shores has been sustained for close to a decade by the blood, bone and sinew of volunteers beggars the imagination. If every nation has its Illiad, this is the American one.

I think I know why US troops are fighting. But I am puzzled about why the Taliban are. Consider the following `what if’. What if, when the US entered Afghanistan, the Taliban leadership scurried over the border to Pakistan (which they did) whilst rank and file hid their weapons and passed themselves off as `civvies’. In short they, the Taliban, sat on their hands. Afghanistan would have been considered `pacified’ or `safe for democracy’. The desire to bring troops home or redeploy to Iraq coupled with no pressing peace keeping duties in Afghanistan would have resulted in elections in a year, two at most. A feckless, corruptible government would come to power. The US would exit leaving a token force as advisors and trainers. The conditions would be ripe for the country to descend again into the chaos which the Taliban exploited to come to power the first time. It is unlikely the US would return to eject the Taliban. All told, the Taliban would have been back on top by 2006.

Before you arrive from your home to the gym, you have to arrive halfway. Then you must do half of the remaining way, then half of what remains… Infinitely many happenings must occur before anybody arrives anywhere. Therefore, says Zeno of Elea, nobody arrives anywhere and all motion is an illusion.

This post is about a cute new paper (pdf) by Dov Samet, which uses arguments similar to Zeno’s to study social interaction.

Zero-sum games are usually easier to handle than non-zero-sum games. It is often easier to prove the existence of the value in zero-sum games than to prove the existence of equilibrium payoffs in non-zero-sum games, and it is easier to calculate optimal strategies in zero-sum games than equilibria in non-zero-sum games.

I found it striking that there are cases where the solution of zero-sum games delivers for free the solution of non-zero-sum games. Such examples include Borel games, that I discussed in length in previous posts, differential games, and stopping games in continuous time.

Suppose that we are given a multi-player Borel game: there are n players, who alternately choose actions in a given set A of actions, which can be finite or infinite. The payoff function of each player i is some real valued function u_i(a_1,a_2,…) of the infinite play (so that a_t is the action that was chosen by player t modulo n at stage t). As I mentioned in the post “Borel Games: II”, in two-player zero-sum Borel games, the value exists, and each player has a subgame-perfect epsilon-optimal strategy: a strategy that, in every subgame, guarantees the value of that subgame, up to epsilon. How can we use optimal strategies in some properly crafted two-player zero-sum games to construct an equilibrium in the multi-player game?

For every player i, consider an auxiliary two-player zero-sum game G_i, where player i is the maximizer, his payoff is u_i(a_1,a_2,…) , and all other players try to minimize player i’s payoff. Thus, all other players form a coalition, effectively a single player, who tries to lower player i’s payoff as much as possible. The game G_i has a value, player i has a subgame-perfect epsilon-optimal strategy sigma_i, and the minimizers, all players except of player I, have a subgame-perfect epsilon-optimal strategy tau_i = (tau^j_i)_{j \neq i}. Note that sigma_i is also a strategy of player i in the original game, and tau_i^j is also a strategy of player j in the original game.

Let us now construct a strategy profile in the original game: each player i follows a subgame-perfect epsilon-optimal strategy sigma_i. Because the strategy sigma_i is subgame-perfect epsilon-optimal, when the players follow the strategy profile (sigma_i)_i, the payoff of each player is at least his min-max value minus epsilon. And this is true in every subgame of the game. Suppose that some player, say player i, deviates from sigma_i. Because the game has perfect information, a deviation is detected immediately. Let’s add threat strategies: if some player j deviates, the other players switch to the epsilon-optimal strategy of the minimizers in the auxiliary game G_j. To summarize, the strategy profile that we constructed is: follow (sigma_i)_i as long as no player deviates. Once some player, say player j, deviates, then all other players switch to an epsilon-optimal strategy of the minimizers in the game G_j.

Let us verify that this strategy profile is a 2epsilon-eqilibrium. Consider the game after the history h, and suppose it is player j’s turn to play. If no player deviates, the payoff of player j is at least his min-max value in the subgame that starts at h, v(h), minus epsilon. If player j deviates and play an action a’ that he is not supposed to play after h, then his deviation is immediately detected, and he will be punished by the other players; his payoff will then be at most his min-max value in the subgame that starts at (h,a’), v(h,a’), plus epsilon. But v(h) is at least v(h,a’), because when the other players try to punish him, at h player j will choose the action that leads him to the maximal min-max value: v(h) = sup_{a’’} v(h,a’’). Therefore, the payoff upon deviation is at most v(h,a’)+epsilon, which is smaller than v(h) + epsilon, which is smaller than the payoff if player j does not deviate + two epsilons.

What did we need so that this proof method would work? First, we needed the existence of a subgame-perfect epsilon-optimal strategy. Second, we need that a deviation be detected immediately.

Are there other models where these two requirements are satisfied, so that the solution of zero-sum games yields the solution of non-zero-sum games for free? Yes!

Differential games are games played in continuous time. The state variable is a point in a Euclidean space, and its location is controlled by two players. The game is played on a finite time interval, and the payoff is a sum of two quantities. The first depends on the final location of the state variable, and the second is an integral of “stage payoffs”, that depend on the whole path the state variable did along the game. Under proper continuity conditions, differential games have a subgame-perfect epsilon-optimal strategy, and the value is a continuous function of the state variable and the horizon. Thus, the first requirement of the construction holds. The nature of continuous time implies that deviations are detected in infinitesimal time. Because the value function is continuous, if a player deviates at time t, and his deviation is detected at time t+dt, then the player can be punished approximately at the level of his min-max value at time t. Therefore the second requirement holds as well, and the construction presented above holds.

Another model where the method applies is stopping games in continuous time. In this model at every time instance each player decides whether to stop the game or to continue; the game terminates once at least one player stops, and the payoff is a function of the terminal time and the set of players who decide to stop at the terminal moment. Under proper continuity conditions, zero-sum games have a value and subgame-perfect epsilon-optimal strategies, and the value is right-continuous, so that as for differential game the construction applies.

Are there additional classes of games where this approach works? Are there other solution concepts where it may be useful?

Anne and Bill play the following game: at odd stages Anne chooses either 0 or 1, and at even stages Bill chooses either 0 or 1. This way an infinite sequence of bits is generated. This sequence is the binary representation of a unique number x in the interval [0,1]. The payoff is given by some real-valued function u(x); the game is zero-sum, Anne is the maximizer and Bill is the minimizer.

This game is a two-player zero-sum Borel game. As I mentioned in a previous post, the action sets of the two players can in fact be arbitrary sets (finite or infinite), and they may be history dependent. In such a case, the payoff function will be a real-valued function defined over the set of all possible infinite plays.

In a previous post I argued that a two-player zero-sum game admit a value, provided the payoff function is bounded and Borel measurable. That is, there is a real number v, and for every positive epsilon Anne has a strategy sigma_* and Bill has a strategy tau_*, such that:

1) When Anne plays according to sigma_*, then the payoff is at least v-epsilon, whatever Bill plays.

2) When Bill plays according to tau_*, then the payoff is at most v+epsilon, whatever Anne plays.

The strategies sigma_* and tau_* are epsilon-optimal strategies of Anne and Bill.

If Anne and Bill are rational, then the outcome of the game will be close to v.

When Anne plays according to the strategy sigma_*, and Bill plays according to the strategy tau_*, the payoff will be between v-epsilon and v+epsilon. If Anne decides to deviate from sigma_*, while Bill sticks to tau_*, the payoff will be at most v+epsilon, so that Anne’s gain from such a deviation is at most 2epsilon. Similarly, if Bill decides to deviate from tau_*, while Anne sticks to sigma_*, the payoff will be at least v-epsilon, so that Bill’s gain from such a deviation is at most 2epsilon as well.

If the range of the payoff function u is finite, then for epsilon sufficiently small, every epsilon-optimal strategy is actually 0-optimal: because payoffs are discrete, if Anne’s strategy guarantees a payoff at least v-epsilon, then because for epsilon small enough there are no possible outcomes between v-epsilon and v, this strategy in fact guarantees v.

Let’s take a concrete example. Suppose that the function u is determined by the first three bits that Anne and Bill choose:

u(0) = 1

u(10) = 1

u(110) = 1

u(111) = 2

That is, if Anne chooses 0 at the first stage, the payoff is 1;

if Anne chooses 1 at the first stage, and Bill chooses 0 at the second stage, the payoff is 0;

if Anne chooses 1 at the first stage, and Bill chooses 1 at the second stage, and Anne chooses 0 at the third stage, the payoff is 1;

and if Anne chooses 1 at the first stage, Bill chooses 1 at the second stage, and Anne chooses 1 at the third stage, the payoff is 2.

The value of this game is 1: Anne can guarantee obtaining 1 by choosing 0 at the first stage, and Bill can guarantee paying no more than 1 by choosing 0 at the second stage, in the case that Anne chose 1 at the first stage.

Another strategy for Anne that guarantees her a payoff of 1 is to choose 1 at the first stage, and 0 at the second stage. In that case, whatever Bill chooses at the second stage, Anne’s payoff is 1. Call this strategy sigma_1.

Of course, a better strategy for Anne is to choose 1 at the first stage, and 1 at the second stage. With this strategy Anne guarantees for herself at least 1, and, if Bill chooses 1 at the second stage, she actually gets 2. Call this strategy sigma_2.

The strategy sigma_2 is subgame-perfect optimal: it guarantees to Anne the highest quantity that she can guarantee, in every subgame. Here, a subgame corresponds to a finite sequence of actions that Anne and Bill could have played in the first stages of the game. The subgame G(0) corresponds to the continuation game if Anne chose 0 at the first stage; the subgame G(11) corresponds to the continuation game if Anne chose 1 at the first stage, and Bill chose 1 at the second stage; etc.

The value of the game G(0) is 1, as Anne’s payoff is at least 1 whatever the players play, and Bill can guarantee that the payoff is at most 1 (by choosing 0 at the second stage).

The value of the game G(11) is 2, as Anne can guarantee a payoff 2 by choosing 1 at the third stage, and Bill can guarantee that the payoff is at most 2 (because all payoffs in the game are at most 2).

As I mentioned above, by Martin’s (1975) result, every two-player zero-sum game have a value, and both players have epsilon-optimal strategies. Does there exist a subgame-perfect epsilon-optimal strategy? That is, a strategy sigma_* that satisfies the following condition: for every subgame, the strategy sigma_* restricted to that subgame is epsilon-optimal in the subgame?

The answer is positive, and in fact follows quite easily from Martin’s result. To see why, let’s add one more definition: say that a strategy sigma of Anne guarantees an amount w if whatever Bill play, the payoff is at least w. Plainly in such a case w is smaller than the value of the game, because the value is the supermum of all the amounts w that Anne can guarantee.

No for the construction of a subgame-perfect epsilon-optimal strategy: ask Anne to start playing an epsilon/2-optimal strategy sigma_1 in the whole game. At every one of her decision node x, Anne compares the value w(x) that the strategy she plays guarantees, and the value v(x) of the subgame that starts at the current decision node. If the difference v(x)-w(x) is less than epsilon, then sigma_1 guarantees the value (up to epsilon) in the subgame that starts at x, so Anne continues following the strategy sigma_1. If the difference is higher than epsilon, Anne switches to an epsilon/2-optimal strategy sigma_2 in the subgame that starts at x. So that now her strategy guarantees at least v(x)-epsilon/2, which is strictly higher than the amount w(x) that the previous strategy she was following guarantees.

Anne continues to compare the amount guaranteed by the current strategy that she follows to the value of the subgame that starts at the current node, and she switches to a new epsilon/2-optimal strategy whenever the difference is higher than epsilon.

Whenever Anne switches a strategy, the amount that she can guarantee increases by epsilon/2. Because the payoffs are bounded, the number of times she can switch strategies is finite. In particular, from some point on Anne sticks to some epsilon/2-optimal strategy in some subgame, and for every subgame that the play visits, this strategy guarantees at least the value of the subgame minus epsilon. To summarize, this strategy is subgame-perfect epsilon-strategy for Anne.

So we know that a subgame-perfect epsilon-optimal strategy exists. As the next post will show, such a strategy is needed to construct epsilon-equilibria in non-zero-sum Borel games.

Job candidates, departments, journals, professors… Nobody is safe from the unrelenting rating machine of this profession. My intuition and some anecdotal evidence say that economists are more obsessed with ranking than our colleagues from humanities and natural sciences. To investigate the issue more systematically, I thought to define ranking obsession using google search count. So, for a string let be the number of results in google search for S.

Definition 1Let be a name of an academic field. Theranking obsession factorof is given by

Here stands for concatenation. For example, the ROF of Economics is the ratio of N(“ranking of economics departments”) and N(“economics departments”).

Yes, some would answer, since the cognitive and physical processes that take place in my mind/body/whatever while I make decisions or choose actions can be simulated by a computer program.

Even if we accept this assertion as true, I find the argument unsatisfactory. For me, the subject matter of game theory and decision theory is the idea of rationality, not the lousy shadow of rationality that evolved in my neighbourhood. When I come across all sort of dumb things people do, my response is `So what ?’ Why should we redo our theory of rationality to accommodate the petty goings-on of a bunch of talking monkeys on a mostly harmless planet in the middle of nowhere ?

So, the fact that human beings happen to be walking computers doesn’t mean players are also computers.

And yet, even if like me you don’t think of your players as models for human beings, I still believe you may want to consider computability of their strategies. My reason is that game theory traditionally studies not only rational behavior but also the more vague notion of rational reasoning:

Imagine each player instead of making each decision as the necessity for it arises, makes up his mind in advance for all possible contingencies; i.e. that the player begins to play with a complete plan which specifies what choices he will make in every possible situation …. We call such a plan a strategy. (von-Neumann and Morgenstern, Section 11.1)

Back to Rabin’s game that I talked about yesterday. Even if we accept the existence of the function that appears in the second item as a mathematical entity, such a function presents a `plan’ that cannot, even in principle, be executed, described, or reasoned about.

So I think a case can be made that by restricting our attention to computable functions we don’t arbitrarily restrict the player’s strategy set. On the contrary, the standard game theoretic formulation, by allowing non-computable functions, expands it to mathematical creatures that actually don’t capture our ideal concept of strategy

Since in the near future I will have to start advocating a computability argument, and since this position is somewhat in contrast to what was my view a couple of years ago when I first came across algorithmic game theory, I thought I should try to clarify my thoughts about the issue in a couple of posts.

The most beautiful example I know for the surprising outcomes we get when taking computability into account in game theory is Michael Rabin’s paper from 1957. Rabin considers a finite win-loss game with perfect information between two players, Alice and Bob. The game is played in three rounds, at each round a player announces a natural number (an action): First Alice announces . Then Bob announces and then Alice announces . After these declaration are made, the game coordinator determines the winner: If Alice wins. If Bob wins. The set , *Alice’s winning set*, is part of the description of the games. This is a finite game, which can be solved by a background induction. By the so called Zermelo’s Theorem, the game is determined: One of the player has a winning strategy. The identity of the winner depends of course on .

Rabin gives an example for a game with the following propertie:

- is a decidable set: this means that there is an effective way to decide, for every triple whether or not . So the job of the game coordinator in this game can be performed by a computer program. Games with effectively computable payoff are called
*actual games*: These are the games that can actually be played. - Bob has a winning strategy : whatever actions and Alice chooses in the first and third stages, Bob will win the game if in the second stage he chooses : so that for any
- Bob has no computable winning strategy: For every function that can be computed by a computer program there exist some actions for Alice such that .

Feel the blow ? Bob finds himself in a strange predicament. On the one hand, after observing Alice’s action from the first round, Bob knows that there is some action he can choose that beats any possible response of Alice. However, if Bob tries to `plan ahead’ which action to take after every possible choice of , he cannot produce such a plan. Bob can beat all Alice’s strategies and Alice can beat all Bob’s (computable) strategies. von-Neumann and Morgenstern’s least controversial idea, solving a zero-sum game with perfect information by backward induction, trembles. The argument that in finite zero-sum games with perfect information a player doesn’t care if the opponent finds out his strategy falls apart.

(To be continued…)

Anne and Bill play the following sequential game: at odd stages Anne chooses either 0 or 1, and at even stages Bill chooses either 0 or 1. This way they generate an infinite sequence of bits, which is the binary expansion of a real number x in the interval [0,1]. Who wins the game? Well, there is a subset W of [0,1], the winning set of Anne, which is known to the players in advance. If the real number x that they generated is in W, Anne wins. If it is outside W, Bill wins.

Suppose for example that W=[0,1/2]. Then, by choosing 0 at the first stage of the game, Anne guarantees a win. If, on the other hand, W is the union of the two intervals [0,1/4) and (1/2,3/4), then by choosing 0 at the second stage, Bill guarantees a win.

Zermelo’s Theorem guarantees that every finite two-player game with two outcomes: “player I wins” or “player II wins”, is determined: one of the players has a winning strategy. The game between Anne and Bill is not a finite game, and therefore Zermelo’s Theorem does not apply.

Gale and Stewart (1953) proved that if the set W is closed, then the game is determined. This result was significantly extended by Martin (1975), who proved that if W is Borel measurable then the game is determined.

Borel games are similar to the game described above between Anne and Bill, but instead of choosing a bit at every stage, Anne and Bill have some set of actions A available to them; at odd stages Anne chooses an action from A, and at even stages Bill chooses an action from A. The set A may be finite or infinite. When the set of actions A is not {0,1}, Anne’s winning set is a subset of A^∞.

In fact, it does not matter whether Anne and Bill have the same set of actions, or different sets of actions. Actually, the set of actions available to each of them may be history dependent. Also, the players need not alternate in choosing actions: the identity of the player who currently chooses the action may be history dependent as well. The result of Martin (1975) was proved to this general setup.

Up to now, the outcome of the game was either 1 (Anne wins) or -1 (Bill wins). However, we could think of general two-player zero-sum Borel games. Here the data of the game includes a payoff function u that assigns a real number u(p) to each play p (=infinite sequence of actions).

One game that falls into this class of games is the following: we are given a directed graph. The vertices of the graph are divided into four types: some vertices are “win vertices of Anne”, others are “win vertices of Bill”, yet others are “vertices controlled by Anne”, and the rest are “vertices controlled by Bill”. The game starts at one of the vertices, and moves between the vertices as follows: if the current vertex s is controlled by Anne, Anne chooses an outgoing edge from s, and the vertex at the end of the edge is the new vertex. If s is controlled by Bill, it is he who chooses an outgoing edge. If s is a winning vertex of one of the players, then that player is the winner, and the game ends. If the play never reaches a winning vertex, the game ends with a draw.

Martin’s (1975) result implies that if the payoff function u is measurable and bounded, then the game has a value. The argument is as follows: for every real number x denote by W(x) the set of all plays p where u(p) ≥ x. Because u is Borel measurable, the set W(x) is a Borel set. By Martin, the game G(x) with winning set W(x) is determined for every x. If Anne can guarantee a win in G(x), it means that she can guarantee a payoff of at least x in the original game; if Bill can guarantee a win in G(x), it means that he can guarantee a payoff smaller than x in the original game. Because u is bounded, for x=∞ Bill can guarantee a win in the game G(x), and for x=-∞ Anne can guarantee a win in the game G(x). Let then x* be the maximal x for which the game G(x) is determined for Anne. Then x* is the value of the original game.

So, every two-player zero-sum Borel game has a value? What about equilibria in multi-player Borel games? This will be discussed in a subsequent post.

A couple of years ago I wrote an article for a popular journal on game theory. The standard content: John Nash and the Beautiful Mind, strategic analysis, auctions, the value of information and the winner’s curse. A social worker, who works with local action groups, read the article, and realized that it may be interesting and useful for his community to model the interaction between local action groups, and municipalities and large firms, using game theory. I found this idea worthwhile pursuing, and after some time, two years, to be precise, we ended up analyzing the construction of the new central bus station at Tel Aviv, the second largest city in Israel, and the struggle of the local neighborhood to receive fair compensation for the damages caused by this project.

The modeling is routine and not difficult. One player is the municipality; the other player(s) is the locals. The municipality can be fair or unfair; the locals can fight together, separately or do nothing. Determine payoffs, find equilibria. Lo and behold, in equilibrium the municipality if unfair, it plays divide-and-conquer, it plays a war of attrition.

The mathematical analysis is just a tool to obtain insights: what do we learn from this interaction? This struggle made me think of the tyranny of the municipalities and government. In the back of our minds we all know of this tyranny, we read of it in the news and we see it around us. This interaction is nevetheless a nice example where a municipality mistreats its people: instead of providing fair compensation, it dragged the people for years in court, and did its best to kill their fight.

Unions where established to fight the tyranny of employers. But what can one do to fight the tyranny of municipalities against the weaker sections of the populations? A popular union won’t do. It is fun to strike and to remain at home for others, but it is less fun to demonstrate 100 kilometers away from home for a cause you may not even agree to.

But then, what can be done? Can one establish a non-for-profit organization that helps action groups, backed by first-rate lawyers and experts in public relations, and supported by thousands of people who value justice and fairness and are willing to support struggles against the tyranny of the rich, the strong, and the various institutions of state? Who can be the driving force behind such an organization? Who will finance it? Who ensures us that it, too, will not become part of the rich and strong? Other ideas? Other solutions that exist someplace?

## Recent Comments