You are currently browsing the monthly archive for January 2012.

Here, recounting the debate before the Bin Laden operation.

[The President] said, I have to make this decision what is your opinion. He started with the National Security adviser, the Secretary of State and he ended with me. Every single person in that room hedged their bet, except Leon Panetta. Leon said GO. Everyone else said 49, 51, this…

It got to me. Joe what do you think? I said “You know I didn’t know we have so many economists around the table. We owe the man a direct answer.”

Me: First, is `economist’ a synonym for somebody who can’t give a direct answer? I thought we had a better reputation. Second, the way I see it, 49, 51 is actually a direct answer and a bold prediction. It means that if they start making independent attempts to kill Bin Laden, and substantially fewer or substantially more than half of these attempts succeed, then the president has every reason to give these guys the boot.

A mathematical detour today. Will state and prove some basic results in combinatorial optimization that have applications in matching and auction theory. The first is the max-flow-min-cut theorem. I’ll use it to prove Hall’s marriage theorem as well as derive the Border-Matthews characterization of implementable interim allocation rules. It has other applications. Itai Sher, for example, uses it in the analysis of a special case of the Glazer-Rubinstein model of persuasion. Mobius et al use it in their study of capital flows in social networks (the Peruvian village paper). Olszewski and myself have a forthcoming paper where we use it to talk about team formation.

Then, it’s on to polymatroids, which can be used to show the assignment problem retains its integrality property in the presence of additional constraints. These additional constraints arise in, for example, school choice problems. See the paper by Budish, Che, Kojima and Milgrom for a list of examples.

1. The Maximum Flow Problem

Let {G= (V,E)} be a network with vertex set {V} and edge set {E}. Assume, for convenience, that edges are `bi-directional’. In {V} there are two special vertices. One called {s}, the source and the other {t}, the sink. Each edge {(i,j) \in E} has a capacity {c_{ij}} that is the maximum flow that can traverse {(i,j)} (in either direction). Assume {c_{ij}} is integral for all {(i,j) \in E}. To avoid trivialities assume at least one path in {G} from {s} to {t}.

The maximum flow problem is to find the maximum amount of flow that can be sent from {s}, through the network, to {t}. That there is only one source-sink pair is not a restriction. An instance with multiple sources and sinks can be converted into one with a single {s-t} pair through the introduction of a dummy super source and super sink.

A cut in {G} is a set of edges whose removal disconnects {s} from {t}. Equivalently, a cut is a set of edges that intersects every path from {s} to {t} in {G}. A cut can also be defined in terms of a partition of {V} into two sets {S} and {T} such that {s \in S} and {t \in T}. A cut would be the set of edges with one endpoint in {S} and the other in {T}. The capacity of a cut is the sum of the capacities of the edges in the cut.

The maximum {s-t} flow is clearly bounded above by the capacity of any cut. Indeed, the max {s-t} flow must be bounded above by the minimum capacity cut. Remarkably, this upper bound is tight, i.e., the max flow is the min cut. Note that the min cut must be integer valued but it is not obvious that the maximum flow should be integer valued.

We can formulate the problem of finding a maximum flow as a linear program. To do this, let {x_{ij}} denote the amount of flow on edge {(i,j)} from {i} to {j}. Then, the maximum flow problem can be expressed as follows:

\displaystyle  \max f

subject to

\displaystyle  f - \sum_{(s,j) \in E}x_{sj} = 0

\displaystyle \sum_{i \in V: (i,j) \in E}x_{ij} - \sum_{k \in V: (j,k) \in E}x_{jk} = 0 \quad \forall j \in V\setminus\{s,t\}

\displaystyle  -f + \sum_{(i,t) \in E}x_{it} = 0

\displaystyle 0 \leq x_{ij} \leq c_{ij}\quad, \forall (i,j) \in E

The first constraint says that the flow, {f}, out of {s} must use edges of the form {(s,j)}. The second constraint imposes flow conservation in each vertex, i.e., flow in must equal flow out. The third constraint ensures that the total flow {f} entering {t} must use edges of the form {(j,t)}.

The constraint matrix is a network matrix and so TUM. You can see this by noting that each variable {x_{ij}} appears in at most two constraints with opposite sign. Once as an `incoming’ edge and once as an `outgoing’ edge. The variable {f} also appears exactly twice with opposite sign.

TUM of the constraint matrix, together with integrality of the capacities, means that there is a maximum flow in which the flow on each edge, as well as the total amount of flow, is integer valued.

Theorem 1 The maximum flow is equal to the capacity of the minimum cut.

Proof: Let {f^*} be the value of the maximum flow. By LP duality:

\displaystyle  f^* = \min \sum_{(i,j) \in E}c_{ij}z_{ij}

\displaystyle  z_{ij} \geq |y_i - y_j| \quad \forall (i,j) \in E

\displaystyle y_s - y_t = 1

\displaystyle  z_{ij} \geq 0 \quad \forall (i,j) \in E

To see where it came from, observe that we have one dual variable {y_i} for each {i \in V} and one variable {z_{ij}} for each {(i,j) \in E}. Recall the bidirectional assumption. So, {z_{ij}} is different from {z_{ji}}. Thus one should have two constraints for each pair {(i,j)}:

\displaystyle z_{ij} \geq y_i - y_j

\displaystyle z_{ji} \geq y_j - y_i

I have simply folded these into a single constraint {z_{ij} \geq |y_i - y_j|}. Notice that at optimality at most one of {z_{ij}} and {z_{ji}} can be positive which is why this folding is legitimate.

Notice also that we can WLOG fix {y_s =1} and {y_t = 0}. This allows us to write the dual as:

\displaystyle  f^* = \min \sum_{(i,j) \in E}c_{ij}z_{ij}

\displaystyle z_{ij} \geq |y_i - y_j| \quad \forall (i,j) \in E

\displaystyle y_s = 1

\displaystyle  y_t = 0

\displaystyle  z_{ij} \geq 0 \quad \forall (i,j) \in E

By TUM of the constraint matrix, we know that an optimal solution to this program that is integral exists. An integer solution to this program corresponds to a cut. Let {S = \{i: y_i =1\}} and {T = \{i: y_i = 0\}}. Observe

\displaystyle f^* = \sum_{i \in S, j \in T}c_{ij},

which is the capacity of the cut associated with the partition. It is easy to see that every cut corresponds to a feasible integer solution to the program.{\Box}
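To see Theorem 1 at work on a concrete network, here is a minimal Python sketch. It is not the LP argument above; it swaps in the standard shortest augmenting path method (Edmonds-Karp) and then reads a cut off the final residual network. The function names and the five-edge example network are my own illustrative assumptions.

```python
from collections import defaultdict, deque

def bfs_parents(res, s):
    """BFS from s along edges with positive residual capacity.
    Returns a dict mapping each reached vertex to its BFS parent."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v, r in res[u].items():
            if r > 0 and v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def max_flow_min_cut(edges, s, t):
    """edges maps a pair (i, j) to the capacity c_ij of a bi-directional edge.
    Returns (max flow value, source side S of a min cut, capacity of that cut)."""
    res = defaultdict(lambda: defaultdict(int))
    for (i, j), c in edges.items():
        res[i][j] += c   # the edge can carry up to c_ij from i to j ...
        res[j][i] += c   # ... or from j to i
    flow = 0
    while True:
        parent = bfs_parents(res, s)
        if t not in parent:
            break                      # no augmenting path is left
        path, v = [], t
        while parent[v] is not None:   # walk back from t to s
            path.append((parent[v], v))
            v = parent[v]
        delta = min(res[u][w] for u, w in path)
        for u, w in path:
            res[u][w] -= delta
            res[w][u] += delta
        flow += delta
    S = set(parent)                    # vertices still reachable from s
    cut = sum(c for (i, j), c in edges.items() if (i in S) != (j in S))
    return flow, S, cut

# Hypothetical example network (capacities are made up for illustration).
example_edges = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
                 ("a", "t"): 1, ("b", "t"): 3}
flow, S, cut = max_flow_min_cut(example_edges, "s", "t")
print(flow, cut, sorted(S))
```

The set {S} returned is the set of vertices reachable from {s} once no augmenting path remains, and the capacity of the edges leaving it matches the flow value, as the theorem says it must.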

I can’t resist offering a probabilistic proof due to Bertsimas, Teo and myself. Let {(y^*, z^*)} be an optimal solution to the dual program. Suppose, for a contradiction that it is fractional. Consider the following randomization scheme:

  1. Position node {i} in {[0,1]} according to the value of {y^*_i}.
  2. Generate a single random variable {U} uniformly distributed in {[0,1]}. Round all nodes {i} with {y^*_i \leq U} to {y_i=0} and all nodes {i} with {y^*_i > U} to {y_i=1}.
  3. Set {z_{ij} = |y_i-y_j|}.

The randomization scheme clearly returns a feasible cut. More importantly the randomization scheme defines a probability distribution over the set of cuts. Now let’s compute the expected capacity of the cut generated by the randomization scheme:

\displaystyle \sum_{(i,j) \in E}c_{ij} P\{\min (y^*_i, y^*_j) \leq U \leq \max (y^*_i, y^*_j)\}

\displaystyle = \sum_{(i,j) \in E}c_{ij}|y^*_i - y^*_j| = \sum_{(i,j) \in E}c_{ij}z^*_{ij}.

Thus, the expected capacity of our random cut is the same as the optimal objective function value, {f^*}. Every cut has capacity at least {f^*}, and the randomization scheme is a distribution over cuts, so it follows that there must be a cut with capacity equal to {f^*}.
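The key identity in the rounding argument, that the expected capacity of the threshold cut equals {\sum c_{ij}|y_i - y_j|}, holds for any vector {y} with entries in {[0,1]}, optimal or not. A small Python sketch that checks it exactly with rational arithmetic follows; the network and the fractional {y} are made-up illustrations, and the function names are mine.

```python
from fractions import Fraction as F

def expected_cut_capacity(edges, y):
    """Exact expected capacity of the threshold cut when U ~ Uniform[0, 1].
    Between two consecutive values of y the rounded cut does not change,
    so it is enough to evaluate one threshold per interval."""
    points = sorted(set(y.values()) | {F(0), F(1)})
    total = F(0)
    for lo, hi in zip(points, points[1:]):
        u = (lo + hi) / 2                      # representative threshold
        rounded = {i: (1 if yi > u else 0) for i, yi in y.items()}
        cap = sum(c for (i, j), c in edges.items() if rounded[i] != rounded[j])
        total += (hi - lo) * cap               # P(U in (lo, hi)) = hi - lo
    return total

def dual_objective(edges, y):
    """Value of sum c_ij * z_ij with z_ij = |y_i - y_j|."""
    return sum(c * abs(y[i] - y[j]) for (i, j), c in edges.items())

# Hypothetical network and fractional node positions (illustration only).
edges = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
         ("a", "t"): 1, ("b", "t"): 3}
y = {"s": F(1), "a": F(1, 2), "b": F(1, 4), "t": F(0)}

print(expected_cut_capacity(edges, y), dual_objective(edges, y))
```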

1.1. Hall Marriage Theorem

Recall, that in the DGS auction from the previous session, an important step relied on the Hall Marriage theorem. Here, I will use maxflow-mincut to prove it.

Let {(V_1, V_2, E)} be a bipartite graph with vertices {V_1 \cup V_2} and edge set {E}. Suppose {|V_1| = |V_2| = n}. Note edges run between {V_1} and {V_2} only. A perfect matching is a subset of edges {M} such that each vertex is incident to exactly one edge in {M}. A matching is a subset of edges {M} such that each vertex is incident to at most one edge in {M}.

For any {S \subseteq V_1}, let {N(S) \subseteq V_2} be the set of neighbors of {S}. That is {N(S) = \{j \in V_2: (i,j) \in E,\,\, i \in S\}}.

Theorem 2 A bipartite graph, {(V_1, V_2, E)} contains a perfect matching iff. { |S| \leq |N(S)|} for all {S \subseteq V_1}.

Proof: This is clearly necessary. So, suppose { |S| \leq |N(S)|} for all {S \subseteq V_1} and for a contradiction assume that the graph does not contain a perfect matching.

Introduce a dummy source vertex {s} with an edge directed to each vertex in {V_1}. To each edge {(s,i)} with {i \in V_1}, assign a capacity of 1 unit. Introduce a dummy sink {t} with an edge directed from each vertex of {V_2} to {t}. To each edge of the form {(j,t)} assign a capacity of 1 unit. To each edge in {E} assign a capacity of {\infty}.

Consider now a maximum flow from {s} to {t}. By Theorem 1 we can assume the flow is integral, i.e., the flow on each edge is an integer amount. Second, it is easy to see that no edge will contain more than 1 unit of flow. Third, the set of edges, {M} in {E} that carry a non-zero amount of flow must be a matching. Why? No more than one unit of flow can leave any {i \in V_1} and no more than one unit of flow can enter any {j \in V_2}. Fourth, each matching corresponds to a feasible flow. Thus, the value of the maximum flow gives us the size of a maximum cardinality matching. Because we assumed no perfect matching, the value of the maximum flow must be less than {n}, i.e., {|M| < n}.

Now a maximum flow must equal the value of a minimum cut. Clearly, no minimum cut can contain edges in {E} because they have infinite capacity. So, the minimum cut must consist of edges of the form {(s,i)} and {(j,t)}. Let {A_1 \subseteq V_1} and {A_2 \subseteq V_2} be the set of vertices incident to the edges in a minimum cut. The capacity of this cut will be {|A_1| + |A_2| < n}. Because they form a cut, removing them from the graph would delete all paths from {s} to {t}. Do that.

Consider any {i \in V_1 \setminus A_1}. There cannot be a {j \in V_2 \setminus A_2} such that {(i,j) \in E}. Otherwise, there would be a path {s \rightarrow i \rightarrow j \rightarrow t}. That means every neighbor of a vertex {i \in V_1 \setminus A_1} resides in {A_2}. Hence, by assumption:

\displaystyle |A_2| \geq |V_1 \setminus A_1| = n- |A_1| \Rightarrow |A_1| + |A_2| \geq n,

a contradiction. {\Box}
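A brute-force sanity check of Theorem 2 on small graphs can be sketched in Python as follows. The matching routine is the textbook augmenting path method rather than the flow construction used in the proof, and the two example graphs are hypothetical.

```python
from itertools import combinations

def hall_condition_holds(V1, adj):
    """Check |S| <= |N(S)| for every non-empty S contained in V1.
    adj[i] is the set of neighbors of i in V2. Exponential in |V1|."""
    for r in range(1, len(V1) + 1):
        for S in combinations(V1, r):
            neighbors = set().union(*(adj[i] for i in S))
            if len(S) > len(neighbors):
                return False
    return True

def max_matching_size(V1, adj):
    """Size of a maximum cardinality matching via augmenting paths."""
    match = {}  # vertex in V2 -> the vertex in V1 it is matched to

    def augment(i, seen):
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            # j is free, or the vertex currently holding j can move elsewhere
            if j not in match or augment(match[j], seen):
                match[j] = i
                return True
        return False

    return sum(augment(i, set()) for i in V1)

# Two hypothetical graphs with |V1| = |V2| = 3.
V1 = [1, 2, 3]
good = {1: {"a", "b"}, 2: {"a"}, 3: {"b", "c"}}
bad = {1: {"a"}, 2: {"a"}, 3: {"b", "c"}}   # S = {1, 2} has N(S) = {a}

for adj in (good, bad):
    print(hall_condition_holds(V1, adj), max_matching_size(V1, adj) == len(V1))
```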

The Hall marriage theorem is easily generalized to something called Gale’s demand theorem. Suppose each vertex {i \in V_1} is a demand vertex, demanding {d_i} units of a homogeneous good. Each vertex {j \in V_2} is a supply vertex, supplying {s_j} units of that same good. Supply can be shipped to demand nodes only along the edges in {E}. Is there a feasible way to match supply with demand? Yes, iff. for all {S \subseteq V_1} we have

\displaystyle \sum_{i \in S}d_i \leq \sum_{j \in N(S)}s_j.

1.2. Border’s Theorem

For simplicity assume two agents and a single good. Let {T} be the type space of agent 1 and {S} the type space of agent 2. Suppose, also for simplicity, that types are independent draws. Denote by {f_i(x)} the probability of agent {i} having type {x}. Let {a_i(t,s)} be the probability that the good is allocated to agent {i} when agent 1 reports {t} and agent 2 reports {s}. Feasibility requires that

\displaystyle a_1(t, s) + a_2(t, s) \leq 1 \,\, \forall t, s

\displaystyle a_i(t,s) \geq 0\,\, \forall i,\,\, \forall t, s

{a} is an allocation rule. Denote by {q_1(t)} the interim allocation probability to agent 1 when she reports {t}. Note that

\displaystyle q_1(t) = \sum_{s \in S}a_1(t,s)f_2(s)

A similar expression applies for {q_2(s)}. Call an interim allocation probability implementable if there exists an allocation rule that corresponds to it. Here is the question originally posed by Steve Matthews (who conjectured the answer): characterize the implementable {q}’s. The high level idea first. Roll up all the inequalities relating {a} to {q} and feasibility into matrix form:

\displaystyle {\cal I}q - {\cal P}a = 0

\displaystyle {\cal A}a \leq 1

\displaystyle a \geq 0

Denote by {F} the set of {(q,a)} that satisfy the system above. The purely mathematical question is this: for which {q}’s does there exist at least one {a} such that {(q,a) \in F}? In other words, we want to identify the set {K}:

\displaystyle K = \{q: \exists a\,\, with\,\, (q,a) \in F\}.

The set {K} is called the reduced form. Geometrically, {K} is just the projection of {F} onto the {q}-space. By the Farkas lemma, the set {K} can be characterized in the following way. Let {C = \{(u,v): v \geq 0, -u{\cal P} + v{\cal A} = 0\}}. {C} is a cone. Then,

\displaystyle K = \{q: u \cdot q \leq v \cdot 1\,\, \forall (u,v) \in C\}.

It would seem that {K} is described by a continuum of inequalities. However, because {C} is a cone, it suffices to consider only the generators of the cone {C} and there are finitely many of them. The challenge now reduces to identifying the generators of {C}. When the matrices {{\cal A}} and {{\cal P}} have suitable structure this is possible and is the case here. For convenience assume {T = \{t, t'\}} and {S = \{s,s'\}}. Here are all the inequalities:

\displaystyle  a_1(t,s) + a_2(t,s) \leq 1

\displaystyle  a_1(t',s) + a_2(t',s) \leq 1

\displaystyle  a_1(t',s') + a_2(t',s') \leq 1

\displaystyle  a_1(t,s') + a_2(t,s') \leq 1

\displaystyle  a_1(t,s)f_2(s) + a_1(t,s')f_2(s') = q_1(t)

\displaystyle  a_1(t',s)f_2(s) + a_1(t',s')f_2(s') = q_1(t')

\displaystyle  a_2(t,s)f_1(t) + a_2(t',s)f_1(t') = q_2(s)

\displaystyle  a_2(t,s')f_1(t) + a_2(t',s')f_1(t') = q_2(s')

Now some appropriate scaling:

\displaystyle  f_1(t)f_2(s) a_1(t,s) + f_1(t)f_2(s)a_2(t,s) \leq f_1(t)f_2(s)

\displaystyle  f_1(t')f_2(s)a_1(t',s) + f_1(t')f_2(s)a_2(t',s) \leq f_1(t')f_2(s)

\displaystyle  f_1(t')f_2(s')a_1(t',s') + f_1(t')f_2(s')a_2(t',s') \leq f_1(t')f_2(s')

\displaystyle  f_1(t)f_2(s')a_1(t,s') + f_1(t)f_2(s')a_2(t,s') \leq f_1(t)f_2(s')

\displaystyle  a_1(t,s)f_1(t)f_2(s) + a_1(t,s')f_1(t)f_2(s') = f_1(t)q_1(t)

\displaystyle  a_1(t',s)f_1(t')f_2(s) + a_1(t',s')f_1(t')f_2(s') = f_1(t')q_1(t')

\displaystyle  a_2(t,s)f_1(t)f_2(s) + a_2(t',s)f_1(t')f_2(s) = f_2(s)q_2(s)

\displaystyle  a_2(t,s')f_1(t)f_2(s') + a_2(t',s')f_1(t')f_2(s') = f_2(s')q_2(s')

Now a change of variables to make things nice and neat: {x_i(u,v) = a_i(u,v)f_1(u)f_2(v)}.

\displaystyle  x_1(t,s) + x_2(t,s) \leq f_1(t)f_2(s)

\displaystyle x_1(t',s) + x_2(t',s) \leq f_1(t')f_2(s)

\displaystyle  x_1(t',s') + x_2(t',s') \leq f_1(t')f_2(s')

\displaystyle  x_1(t,s') + x_2(t,s') \leq f_1(t)f_2(s')

\displaystyle  x_1(t,s)+ x_1(t,s') = f_1(t)q_1(t)

\displaystyle  x_1(t',s) + x_1(t',s') = f_1(t')q_1(t')

\displaystyle  x_2(t,s)+ x_2(t',s) = f_2(s)q_2(s)

\displaystyle  x_2(t,s') + x_2(t',s') = f_2(s')q_2(s')

This system corresponds to the flow balance conditions for a flow problem on a bipartite graph. For each profile of types {(u,v)} introduce a supply vertex with supply {f_1(u)f_2(v)}. For each {u \in T} introduce a demand vertex with demand {f_1(u)q_1(u)}, and for each {v \in S} introduce a demand vertex with demand {f_2(v)q_2(v)}. From each supply vertex of the form {(u,v)}, introduce two edges. One is directed from {(u,v)} to the demand vertex for {u}; the flow on this edge is {x_1(u,v)}. The other is directed from {(u,v)} to the demand vertex for {v}; the flow on this edge is {x_2(u,v)}.

The first four inequalities say that the flow out of a supply vertex cannot exceed its available supply. The last four equations ensure that the flow into a demand vertex matches the demand.

Using this flow interpretation we can use Gale’s generalization of the Hall marriage theorem to determine when a feasible flow exists. Existence of a feasible flow means that {q} is implementable. Let {A \subseteq T} and {B \subseteq S}. Then, for all such {A} and {B},

\displaystyle \sum_{t \in A}f_1(t)q_1(t) + \sum_{s \in B}f_2(s)q_2(s) \leq \sum_{(u,v):\, u \in A \,\, or \,\, v \in B}f_1(u)f_2(v)

is necessary and sufficient for {q} to be implementable. The left hand side of this inequality is the expected quantity of the good allocated to agents when their types come from {A} and {B}. The right hand side is the probability that at least one agent has a type in {A \cup B}. In fact we can rewrite this inequality to read

\displaystyle \sum_{t \in A}f_1(t)q_1(t) + \sum_{s \in B}f_2(s)q_2(s) \leq 1 - \sum_{(u,v) \in A^c \times B^c}f_1(u)f_2(v).

Here, {A^c} and {B^c} are the complements of {A} and {B}.
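For a two-agent example small enough to enumerate, the inequalities can be checked directly. The Python sketch below computes the interim probabilities {q} from an allocation rule {a} and tests the condition with the right hand side {1 - \sum_{A^c \times B^c}f_1(u)f_2(v)} for every {A} and {B}. The priors, the priority-style allocation rule and the function names are all illustrative assumptions of mine.

```python
from fractions import Fraction as F
from itertools import chain, combinations

def subsets(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def interim(a, f1, f2):
    """a[t, s] = (a_1(t, s), a_2(t, s)). Returns the interim rules (q1, q2)."""
    q1 = {t: sum(a[t, s][0] * f2[s] for s in f2) for t in f1}
    q2 = {s: sum(a[t, s][1] * f1[t] for t in f1) for s in f2}
    return q1, q2

def border_holds(q1, q2, f1, f2):
    """Check the inequality for every A in T and B in S (exponential)."""
    for A in subsets(f1):
        for B in subsets(f2):
            lhs = sum(f1[t] * q1[t] for t in A) + sum(f2[s] * q2[s] for s in B)
            miss = sum(f1[u] * f2[v]
                       for u in f1 if u not in A
                       for v in f2 if v not in B)
            if lhs > 1 - miss:
                return False
    return True

# Hypothetical priors on T = {t, t'} and S = {s, s'}.
f1 = {"t": F(1, 2), "t'": F(1, 2)}
f2 = {"s": F(1, 3), "s'": F(2, 3)}

# A made-up priority rule: type t' of agent 1 wins first, then type s' of
# agent 2, and otherwise agent 1 gets the good.
a = {("t'", "s"): (1, 0), ("t'", "s'"): (1, 0),
     ("t", "s'"): (0, 1), ("t", "s"): (1, 0)}
q1, q2 = interim(a, f1, f2)

# An interim rule that promises the good to both high priority types.
bad_q1 = {"t": F(0), "t'": F(1)}
bad_q2 = {"s": F(0), "s'": F(1)}

print(border_holds(q1, q2, f1, f2), border_holds(bad_q1, bad_q2, f1, f2))
```

The second rule fails at {A = \{t'\}}, {B = \{s'\}}: both types cannot win with probability one when they meet.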

As you will see later one can generalize this analysis to cases where the allocation rule, {a} must satisfy additional constraints. For details see Che, Kim and Mierendorff as well as Federgruen and Groenvelt.

2. Polymatroids

Let {E} be a ground set. A real valued function {f} defined on subsets of {E} is

  • non-decreasing if {S \subseteq T \Rightarrow f(S) \leq f(T)},
  • submodular if for all {S, T \subseteq E}

    \displaystyle f(S) + f(T) \geq f(S \cup T) + f(S \cap T),

  • supermodular if {-f} is submodular.

Let {f} be a submodular function on {E}. The polytope:

\displaystyle P(f) = \{x \in \Re^n_+: \sum_{j \in S}x_j \leq f(S)\quad \forall S \subseteq E\}

is the polymatroid associated with {(E,f)}. Notice that {P(f) \neq \emptyset} iff {f(S) \geq 0\,\, \forall S \subseteq E}. Polymatroids arise in a number of economic settings. One that we have just discussed is the Border inequalities. Suppose there is a single object to allocate among {n} bidders whose types are independent draws from the same distribution (with density {\pi}). Restrict attention to allocation rules that do not depend on the identity of the agent. Let {q(t)} be the interim allocation probability if an agent reports type {t}. Then, {q} is implementable iff.

\displaystyle n \sum_{t \in S}\pi(t)q(t) \leq 1 - \left(\sum_{t \not \in S}\pi(t)\right)^n\,\, \forall S \subseteq T.

Call the right hand side of the above {f(S)} and consider the change of variables {x_t = n\pi(t)q(t)}:

\displaystyle \sum_{t \in S}x_t \leq f(S)\,\, \forall S \subseteq T.

Now {f} is non-decreasing and submodular. So, the feasible region described by the Border inequalities is a polymatroid. The theorem below tells us that every extremal interim allocation probability can be implemented as a priority rule. That is, types are assigned priorities and in any profile the good is allocated to the agent with the highest priority. This connection to priority rules is useful for analyzing optimal auctions in dynamic settings (see my survey paper on dynamic mechanism design) as well as competition between auctions (see Pai’s jmp).

Theorem 3 Let {f} be a non-decreasing, integer valued, submodular function on {E} with {f( \emptyset) =0}. Then, the polymatroid {P(f)} has all integral extreme points.

Proof: Choose a weight vector {c} and order the elements of {E} by decreasing weight:

\displaystyle c_1 \geq c_2 \ldots \geq c_k \geq 0 > c_{k+1} \ldots \geq c_n.

Let {S^0 = \emptyset} and {S^j = \{1, 2, \ldots, j\}} for all {j \in E}. Consider the vector {x} obtained by the following greedy procedure:

  1. {x_j = f(S^j) - f(S^{j-1})} for {1 \leq j \leq k}
  2. {x_j = 0} for {j \geq k+1}.

I show that {x \in \arg \max\{cz : z \in P(f)\}.} Note {x} is integral because {f} is integer valued, and non-negative because {f} is non-decreasing. To verify that {x \in P(f)} observe:

\displaystyle  \sum_{j \in T}x_j = \sum_{j \in T\cap S^k}[f(S^j) - f(S^{j-1})] \leq \sum_{j\in T\cap S^k}[f(S^j \cap T) - f(S^{j-1} \cap T)] \leq f(S^k \cap T) - f(\emptyset) \leq f(T).

The dual to our optimization problem is:

\displaystyle  \min \sum_{S \subseteq E}f(S)y_S

subject to

\displaystyle  \sum_{S \ni j}y_S \geq c_j\quad \forall j \in E

\displaystyle  y_S \geq 0 \quad \forall S \subseteq E

To show that {x} is an optimal primal solution we identify a dual feasible solution with an objective function value of {cx}. Set {y_{S^j} = c_j - c_{j+1}} for {1 \leq j < k}. Set {y_{S^k} = c_k} and {y_{S^j} = 0} for {j \geq k+1}.

Notice that {y_S \geq 0} for all {S \subseteq E} and is feasible in the dual because:

\displaystyle \sum_{S \ni j}y_S = y_{S^j} + \ldots + y_{S^k} = c_j

for all {j \leq k} and

\displaystyle \sum_{S \ni j}y_S \geq 0 \geq c_j

if {j \geq k+1}.

The dual objective function value is:

\displaystyle \sum_{j=1}^{k-1}(c_j - c_{j+1})f(S^j) + c_kf(S^k) = \sum_{j=1}^kc_j[f(S^j) - f(S^{j-1})] = cx.

Now suppose {P(f)} has a fractional extreme point. Choose {c} to be a weight vector whose unique optimizer over {P(f)} is that point. However, as shown above, every weight vector has an integral point in {P(f)} that optimizes it, a contradiction.{\Box}
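The greedy procedure and the matching dual solution from the proof are easy to run on a toy instance. In the Python sketch below, {f} is a coverage function (the number of items covered by a set), which is a standard example of a non-decreasing, integer valued, submodular function with {f(\emptyset) = 0}. The instance, weights and function names are illustrative assumptions.

```python
from itertools import combinations

def greedy(E, f, c):
    """Greedy primal solution: scan elements by decreasing weight and give
    each one with c_j >= 0 its marginal contribution f(S^j) - f(S^{j-1})."""
    x = {j: 0 for j in E}
    prefix = []
    for j in sorted(E, key=lambda j: -c[j]):
        if c[j] < 0:
            break
        x[j] = f(prefix + [j]) - f(prefix)
        prefix.append(j)
    return x

def in_polymatroid(E, f, x):
    """Brute-force membership test for P(f) (exponential in |E|)."""
    return all(x[j] >= 0 for j in E) and all(
        sum(x[j] for j in S) <= f(S)
        for r in range(len(E) + 1) for S in combinations(E, r))

def dual_value(E, f, c):
    """Objective value of the dual solution y_{S^j} = c_j - c_{j+1},
    y_{S^k} = c_k built in the proof."""
    order = [j for j in sorted(E, key=lambda j: -c[j]) if c[j] >= 0]
    total = 0
    for pos, j in enumerate(order):
        nxt = c[order[pos + 1]] if pos + 1 < len(order) else 0
        total += (c[j] - nxt) * f(order[:pos + 1])
    return total

# Hypothetical coverage instance: element j covers the items in cover[j].
cover = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"d"}}

def f(S):
    return len(set().union(*(cover[j] for j in S)))

E = [1, 2, 3, 4]
c = {1: 5, 2: 3, 3: 4, 4: -1}
x = greedy(E, f, c)
primal = sum(c[j] * x[j] for j in E)
print(x, primal, dual_value(E, f, c))
```

Equality of the primal and dual values certifies, by weak duality, that the greedy point is optimal for this weight vector.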

If we drop the requirement that {f} be non-decreasing then {\max\{cz: z \in P(f)\}} can still be solved by using the greedy algorithm because

\displaystyle \max\{cz: z \in P(f)\} = \max \{cz: z \in P(f_{mon})\}.

Here {f_{mon}(S) = \min\{f(T): S \subseteq T \subseteq E\}}, which is submodular and non-decreasing.

Now I describe a non-constructive proof of Theorem 3 which uses an `uncrossing’ trick that is good to know. Let {y^*} be an optimal solution to the dual of the polymatroid optimization problem. Let {{\cal F} = \{S: y^*_S > 0\}}. Then, we can assume that {{\cal F}} is laminar. If true, this means that the optimal basis in the dual is TUM. But this is also an optimal basis for the primal. Thus, there is an optimal primal solution that is integral.

Ok, suppose {{\cal F}} is not laminar. Then there exist {S, T \in {\cal F}} such that {S \cap T \neq \emptyset} and neither set contains the other. Construct a new dual solution as follows: for small enough {\epsilon > 0}, increase {y^*_{S \cup T}} and {y^*_{S \cap T}} by {\epsilon} each, and decrease {y^*_S} and {y^*_T} by {\epsilon} each. This new dual solution is feasible. The change in objective function value is

\displaystyle \epsilon f(S \cup T) + \epsilon f(S \cap T) - \epsilon f(S) - \epsilon f(T)

which cannot exceed zero by submodularity. This is the uncrossing step. Repeated application produces an optimal dual solution which is laminar.

The following remarkable result on polymatroid intersection is due to Edmonds.

Theorem 4 Let {f} and {g} be non-decreasing, integer valued, submodular functions on {E} with {f( \emptyset) = g(\emptyset) =0}. Then {P(f) \cap P(g)} is integral.

Proof: Pick an arbitrary integral {c} and consider {\max \{cx: x \in P(f) \cap P(g)\}}. The dual to this linear program is

\displaystyle  \min \sum_{S \subseteq E}f(S)y_S + \sum_{S \subseteq E}g(S)z_S

subject to

\displaystyle  \sum_{S \ni j}y_S + \sum_{S \ni j}z_S \geq c_j\quad \forall j \in E

\displaystyle  y_S, z_S \geq 0 \quad \forall S \subseteq E

Let {(y^*,z^*)} be an optimal dual solution. Let {{\cal F} = \{S: y^*_S > 0\}} and {{\cal G} = \{S: z^*_S > 0\}}. By uncrossing we can assume that each of these is laminar. Recall from the first session that the constraint matrix associated with {{\cal F}} can be arranged so that it has exactly one non-zero entry in each row. Similarly with {{\cal G}}. Hence, the constraint matrix formed from the concatenation of {{\cal F}} and {{\cal G}} (which is also the optimal basis) is TUM. {\Box}

From the proof we get another result for free. Let {{\cal F}} and {{\cal G}} be two laminar collections on the same ground set {E}. Consider the following polyhedron:

\displaystyle  \sum_{j \in T} x_j \leq b_T\,\, \forall T \in {\cal F}

\displaystyle \sum_{j \in T} x_j \leq b'_T\,\, \forall T \in {\cal G}

As long as the {b}‘s are integral, this polyhedron is integral.

How is this useful? Return to the SS model. For any set of edges, {K} (of the form {(i,j)} with {i \in B} and {j \in S}) define {f(K)} to be the number of vertices in {B} that the edges in {K} are incident to. Similarly, define {g(K)} to be the number of vertices in {S} that edges in {K} are incident to. Note {f} and {g} are submodular. Here is an equivalent formulation of the problem of finding an efficient assignment:

\displaystyle \max \sum_{i \in B}\sum_{j \in S}u_{ij}x_{ij}

subject to

\displaystyle  x \in P(f) \cap P(g)

Why is this useful? Notice, by choosing {f} and {g} differently we can model more constraints than just the number of edges incident to a vertex is at most 1.
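On a tiny instance one can confirm by enumeration that these {f} and {g} are submodular, and that membership in {P(f) \cap P(g)} picks out exactly what the usual degree constraints do. A Python sketch, with a hypothetical two-buyer, two-seller edge set and helper names of my own choosing:

```python
from itertools import combinations

def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def f(K):
    """Number of buyers that the edges in K are incident to."""
    return len({i for i, _ in K})

def g(K):
    """Number of sellers that the edges in K are incident to."""
    return len({j for _, j in K})

def is_submodular(h, ground):
    """Brute-force check of h(S) + h(T) >= h(S | T) + h(S & T)."""
    subsets = powerset(ground)
    return all(h(S) + h(T) >= h(S | T) + h(S & T)
               for S in subsets for T in subsets)

def in_intersection(x, ground):
    """Is x in P(f) and in P(g)? Checks every subset of edges."""
    return all(x[e] >= 0 for e in ground) and all(
        sum(x[e] for e in K) <= min(f(K), g(K)) for K in powerset(ground))

# Hypothetical complete bipartite graph on buyers b1, b2 and sellers s1, s2.
edges = [("b1", "s1"), ("b1", "s2"), ("b2", "s1"), ("b2", "s2")]
matching = {("b1", "s1"): 1, ("b1", "s2"): 0, ("b2", "s1"): 0, ("b2", "s2"): 1}
not_matching = {("b1", "s1"): 1, ("b1", "s2"): 1, ("b2", "s1"): 0, ("b2", "s2"): 0}

print(is_submodular(f, edges), is_submodular(g, edges))
print(in_intersection(matching, edges), in_intersection(not_matching, edges))
```

The second vector assigns two objects to buyer b1, and it is the constraint for {K = \{(b1,s1),(b1,s2)\}}, where {f(K) = 1}, that rules it out.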

Bit the bullet and started a (reading) course on market design. The plan is to begin with some lectures that cover the usual material, followed by readings. First session was on the Shapley-Shubik (SS) assignment model. A summary can be obtained by clicking on mktdesign1

The readings cover the following topics:

1) Credit Default Swaps

du & zhu

gupta & sundarjan

chernov, gorbenko & makarov

2) Bankruptcy
The Economics of Bankruptcy Reform Author(s): Philippe Aghion, Oliver Hart, John Moore Source: Journal of Law, Economics, & Organization, Vol. 8, No. 3 (Oct., 1992), pp. 523-546

New Approach to Corporate reorganization by Lucien Bebchuk, Harvard Law Review, 1988

Analysis of secured debt, Stultz and johnson, JFE, 1985

3) Healthcare Markets

Ho, Katherine 2009. “Insurer-Provider Networks in the Medical Care Market.” American Economic Review, 99(1): 393–430.

Fong and Schwarz

Arrow & cast of thousands

4)  IPO’s

Jaganathan et al

Jovanovich et al

5)  Affirmative Action
Hickman
Fryer & Loury 
Chung

6) Wages, centralized matching, unraveling
Bulow & Levin
Roth
Halaburda

1. Shapley-Shubik Model

There is a set of buyers {B} and a set of sellers {S} each selling one unit of a good (could be divisible or not). Let {v_{ij} \geq 0} be the monetary value that buyer {j \in B} assigns to seller {i \in S}‘s object. In other words, we assume quasi-linear preferences. No buyer wants more than one of any good. By adding dummy sellers or buyers we can ensure that {|B|=|S|}.

Depending on the {v_{ij}}’s you can assign a couple of interpretations to the SS model. First, {v_{ij} = u_j - c_i}. Here {u_j} is the value to buyer {j} of acquiring a good irrespective of who sells it. In effect the sellers all sell the same good. {c_i} is the opportunity cost to seller {i}. Under this interpretation, SS is a model of bilateral trade.

Second, {v_{ij}} is the output generated when worker {j} is matched with firm {i}. Sometimes one assumes a specific functional form for {v_{ij}}, for example, {v_{ij} = a_ib_j}. Here each agent (firm or worker) has a type and their joint output is a function of their types. Third, each seller is an advertising slot and each buyer an advertiser. Then, {a_i} is the click through rate on slot {i} and {b_j} is the value per click to advertiser {j}. Because billions (and if one were Carl Sagan one would repeat the word `billions’) of $’s are transacted this way, it is obligatory to mention this example in a class on market design to give the whole enterprise a certain gravitas.

There is a natural optimization problem one can associate with SS, which is to find an assignment of goods to buyers to maximize the total value achieved so that no buyer consumes more than one good and no seller gives more than one good. In other words, find an efficient allocation. This is called the assignment problem or maximum weight bipartite matching. The problem has a long history going back to Monge (1700’s). Indeed, well before Becker, it was Monge who considered the problem under the assumption that {v_{ij} = a_ib_j}. Since Monge’s time, the literature has split. One branch, via Kantorovich, interprets the problem in terms of coupling of measures and this has blossomed into an active and influential area of Mathematics called optimal transport of measure. It has important applications in pde’s and concentration of measure. There are also some applications in economic theory. I won’t go down that branch. The other, via Koopmans and Hitchcock, sticks to the discrete set up described here. That is the branch I will follow.

2. Shapley-Shubik (Divisible)

So that the problem of finding an efficient allocation can be expressed as an LP, suppose the goods are divisible and each buyer’s utility for goods is linear in quantity. However, no buyer wants to consume more than one unit in total (unit demand). Let {x_{ij}} be the quantity of good {i} allocated to buyer {j}. For any {N \subseteq B} and {M \subseteq S}, let

\displaystyle  V(N,M) = \max \sum_{j \in N}\sum_{i \in M}v_{ij}x_{ij}

subject to

\displaystyle  \sum_{j \in N}x_{ij} \leq 1\quad \forall i \in M

\displaystyle  \sum_{i \in M}x_{ij} \leq 1\quad \forall j \in N

\displaystyle  x_{ij} \geq 0\quad, \forall i \in M,\: j \in N

The first constraint ensures no seller supplies more than they have. The second arises from the unit demand assumption. These constraints imply that {x_{ij} \leq 1} for all {i} and {j}.

Let {p_i} be the dual variable associated with each constraint {\sum_{j \in N}x_{ij} \leq 1} and {s_j} the dual variable associated with each constraint {\sum_{i \in M}x_{ij} \leq 1}. The dual to the above program is:

\displaystyle  \min \sum_{j \in N}s_j + \sum_{i \in M}p_i

subject to

\displaystyle  s_j + p_i \geq v_{ij}\,\, \forall j \in N,\,\forall i \in M

\displaystyle  s_j, p_i \geq 0 \,\, \forall j \in N,\,\forall i \in M

The dual has an interesting interpretation. Think of {p_i} as the price of object {i}. Given a collection of prices, the optimal solution to the dual is found by setting each {s_j} to {\max\{\max_{i \in M}(v_{ij} - p_i), 0\}}. Thus, each {s_j} represents the maximum surplus that agent {j} can receive from the consumption of a single object at prices {\{p_i\}_{i \in M}}.

At prices {\{p_i\}_{i \in M}}, buyer {j} must solve:

\displaystyle  \max \sum_{i \in M}(v_{ij}- p_i)x_{ij}

subject to

\displaystyle  \sum_{i \in M}x_{ij} \leq 1

\displaystyle  x_{ij} \geq 0\quad \forall i \in M

to determine its demand. It is easy to check that the optimal solution is any convex combination of objects that yield maximum surplus to the buyer.

Suppose {x^*} is an optimal primal solution and {(s^*,p^*)} an optimal dual solution. Then, the prices {p^*} `support’ the efficient allocation {x^*} in the following sense. Suppose we post a price {p^*_i} for each {i \in M}. Ask each buyer to name their demand correspondence at the posted prices. Then, it is possible to give each agent one of their demanded bundles and not exceed supply. To see why, recall complementary slackness:

\displaystyle (s^*_j + p^*_i - v_{ij})x^*_{ij} = 0.

So, if {x^*_{ij} > 0} it follows that {s^*_j = v_{ij} - p^*_i = \max\{\max_{r \in M}(v_{rj} - p^*_r), 0\}}. Hence, in this economy there is a set of prices that can be posted for each good so as to balance supply with demand. In other words, LP duality gives us the existence of Walrasian prices in this economy.
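A two-buyer, two-seller example makes the duality concrete. The Python sketch below finds the efficient assignment by brute force, computes each buyer's surplus at given prices, and checks that the dual objective meets the primal value and that complementary slackness holds. The valuation matrix is made up, and the price vector {(1, 0)} was worked out by hand for this matrix; both are assumptions for illustration.

```python
from itertools import permutations

def efficient_assignment(v):
    """v[i][j] is the value of seller i's object to buyer j (square matrix).
    Brute force over one-to-one assignments; returns (value, assign) where
    assign[j] is the object given to buyer j."""
    n = len(v)
    best = max(permutations(range(n)),
               key=lambda a: sum(v[a[j]][j] for j in range(n)))
    return sum(v[best[j]][j] for j in range(n)), best

def surplus(v, p):
    """s_j = max(max_i (v_ij - p_i), 0): the best a buyer can do at prices p."""
    n = len(v)
    return [max(max(v[i][j] - p[i] for i in range(n)), 0) for j in range(n)]

# Hypothetical valuations: rows are objects, columns are buyers.
v = [[5, 3],
     [4, 1]]
value, assign = efficient_assignment(v)

p_star = [1, 0]                 # candidate Walrasian prices (found by hand)
s_star = surplus(v, p_star)

# Strong duality: sum of surpluses plus sum of prices equals the primal value.
print(value, sum(s_star) + sum(p_star))
# Complementary slackness: each buyer's assigned object attains their surplus.
print(all(v[assign[j]][j] - p_star[assign[j]] == s_star[j] for j in range(2)))

# Prices of zero are dual feasible but not optimal: both buyers demand object 0.
p_zero = [0, 0]
print(sum(surplus(v, p_zero)) + sum(p_zero))
```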

The set of feasible dual solutions forms a lattice with respect to the partial order {(s, p) \succeq (s', p')} if and only if {(-s,p) \geq (-s',p')}. In this partial order there is a smallest element, which corresponds to the smallest Walrasian price. This can be determined by choosing from among all optimal dual solutions one that minimizes {\sum_{i \in M}p_i} (equivalently maximizes total surplus to buyers).

Hold {M} fixed and consider {V( \cdot, M)} as a function of subsets of {B}. Then, {V( \cdot, M)} is non-decreasing. It is also submodular. Why? From the dual we see that it is obtained by minimizing a linear (i.e. submodular) function over a lattice. As submodularity is preserved under minimization, it follows that {V(N, M)} is submodular in {N}. A similar argument shows that {V(N,M)} is supermodular in {M}. This submodularity/supermodularity is useful in deriving certain incentive properties of the efficient allocation.

One can associate a tu co-op game with the assignment model. For any coalition of buyers ({N}) and sellers ({M}) let {V(N,M)} be the value of the coalition. The core, {C(B,S)} will be the set of solutions to the following:

\displaystyle \sum_{j \in B}s_j + \sum_{i \in S}p_i = V(B,S)

\displaystyle \sum_{j \in N}s_j + \sum_{i \in M}p_i \geq V(N,M)\,\, \forall N \subseteq B, \,\, M \subseteq S

The core has the usual interpretation. It is the set of allocations that cannot be blocked by any coalition of buyers and sellers.

Consider the following dual program (DP):

\displaystyle  V(B,S) = \min \sum_{j \in B}s_j + \sum_{i \in S}p_i

subject to

\displaystyle  s_j + p_i \geq v_{ij}\,\, \forall j \in B,\,\,\forall i \in S

\displaystyle  s_j, p_i \geq 0 \,\, \forall j \in B,\,\forall i \in S

It is straightforward to check that every optimal solution to (DP) is in {C(B,S)} and every point in {C(B,S)} corresponds to an optimal solution to (DP). Why is this interesting? It says that most of the constraints that define the core are redundant. It suffices to consider only pairs of agents consisting of one buyer and one seller. If the core is the set of efficient allocations immune to deviations by any subset of agents, then equivalence to (DP) says that it suffices to consider only pairwise deviations.

From the core, consider the following two relations:

\displaystyle \sum_{j \in B}s_j + \sum_{i \in S}p_i = V(B,S)

\displaystyle \sum_{j \in B \setminus k}s_j + \sum_{i \in S}p_i \geq V(B \setminus k,S)

Negate the second and add to the first:

\displaystyle s_k \leq V(B,S) - V(B \setminus k, S).

The term on the right hand side is buyer {k}’s marginal product. Hence, no point in the core can give any buyer more than their marginal product. Submodularity of {V(\cdot, S)} implies that there is a point in the core that gives each buyer their marginal product (an old result of Shapley’s). What is this point? It is the one that corresponds to the minimal Walrasian price, i.e., the optimal solution to (DP) that maximizes {\sum_{j \in B}s_j}.

Suppose sellers are non-strategic and we run a Vickrey auction to allocate the goods amongst the buyers. Then, each buyer would walk away with a surplus equal to their marginal product (which is what the Vickrey auction gives buyers to give them the incentive to reveal their valuations). Thus, in this assignment economy, minimal Walrasian prices correspond to the prices that would be charged in a Vickrey auction.
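A small brute-force sketch of marginal products, in Python, may help here. The helper names `coalition_value` and `marginal_product`, the layout `values[j][i]` and the numbers are hypothetical; the recursion simply tries every matching, so it is only meant for toy instances with non-negative values.

```python
def coalition_value(values, buyers, goods):
    """V(N, M): best total value from matching buyers to distinct goods."""
    buyers = list(buyers)
    goods = list(goods)
    if not buyers:
        return 0
    first, rest = buyers[0], buyers[1:]
    # Option 1: leave the first buyer unmatched.
    best = coalition_value(values, rest, goods)
    # Option 2: give the first buyer some good i and recurse on the rest.
    for i in goods:
        remaining = [g for g in goods if g != i]
        best = max(best, values[first][i] + coalition_value(values, rest, remaining))
    return best


def marginal_product(values, k):
    """V(B, S) - V(B \\ k, S) for buyer k."""
    buyers = range(len(values))
    goods = range(len(values[0]))
    others = [j for j in buyers if j != k]
    return coalition_value(values, buyers, goods) - coalition_value(values, others, goods)


# Illustrative data: values[j][i] for 3 buyers and 2 goods.
values = [[6, 3], [5, 4], [2, 1]]
print(coalition_value(values, range(3), range(2)))
print([marginal_product(values, k) for k in range(3)])
```

In this toy instance buyer 0 wins good 0 (value 6) with marginal product 4, and buyer 1 wins good 1 (value 4) with marginal product 3, so their Vickrey payments would be 2 and 1 respectively.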

3. Shapley-Shubik (Indivisible)

Now suppose the goods in SS are indivisible. I’m going to show the results outlined above carry through unchanged. To do that I need to tell you about totally unimodular matrices.

3.1. Total Unimodularity

A matrix is called totally unimodular (TUM) iff the determinant of each square submatrix has value 1, -1 or 0. If a matrix is TUM then so is its transpose. Multiplying a row or column of a TUM matrix by -1 also leaves it TUM. (The product of two TUM matrices, however, need not be TUM.)

Example 1 The following matrix

\displaystyle  \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix}

is TUM. The following is not:

\displaystyle  \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 &1\\ \end{bmatrix}

Every proper square submatrix has determinant with absolute value 0 or 1, but the determinant of the entire matrix is 2.
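For small matrices the definition can be checked directly. The following Python sketch enumerates every square submatrix and computes its determinant by cofactor expansion; the function names are mine, and the approach is exponential, so it is only a sanity check for examples like the two above.

```python
from itertools import combinations


def det(m):
    """Integer determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        if entry:
            minor = [row[:col] + row[col + 1:] for row in m[1:]]
            total += (-1) ** col * entry * det(minor)
    return total


def submatrix_dets(a):
    """Yield (size, determinant) for every square submatrix of a."""
    n_rows, n_cols = len(a), len(a[0])
    for k in range(1, min(n_rows, n_cols) + 1):
        for rows in combinations(range(n_rows), k):
            for cols in combinations(range(n_cols), k):
                yield k, det([[a[i][j] for j in cols] for i in rows])


def is_tum(a):
    return all(d in (-1, 0, 1) for _, d in submatrix_dets(a))


identity = [[1, 0], [0, 1]]
bad = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
print(is_tum(identity), is_tum(bad), det(bad))
```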

Theorem 1 Let {A} be an {m \times n} TUM matrix. Let {b} be an {m \times 1} integral vector. Then, every extreme point of {\{Ax = b, x \geq 0\}} is integral.

Proof: To every extreme point {w} of {\{Ax = b, x \geq 0\}} there is a basis {B} of {A} such that the basic components of {w} equal {B^{-1}b} and the remaining components are zero. By Cramer’s rule, we can write {B^{-1} = {B^*}/{\det B}} where {B^*} is the adjoint of {B}. Since {A} has all integer entries, {B^*} has all integer entries. Since {A} is TUM and {B} is non-singular, it follows that {|\det B| = 1}. Hence {B^{-1}} has all integer entries. Thus, {B^{-1}b} is integral.{\Box}

Paul Seymour gave a complete (polytime) characterization of TUM matrices. The upshot of it is that most TUM matrices are network matrices. {A} is a network matrix if {a_{ij} = 0, 1, -1} for all {i,j} and each column contains at most two non-zero entries of opposite sign.

Theorem 2 If {A} is a network matrix, then {A} is TUM.

Proof: The proof is by induction on the size of a submatrix. Consider a {k \times k} square submatrix of {A}, call it {C}. If each column of {C} has exactly two non-zero entries then {\det C = 0}. Why? Summing up the rows of {C} gives us zero, meaning that the rows of {C} are linearly dependent. If there is a column of {C} that contains exactly one non-zero entry, then compute the determinant of {C} using the minor associated with this entry. By induction the determinant must have value 0, 1 or -1. A column with all zeros means that {\det C = 0}.{\Box}

3.2. Back to SS

I’ll show that the constraint matrix for the assignment program that defines {V(B,S)} is TUM. This would mean that there is always an efficient allocation that is integral.

Fix a good {i} and agent {j}. Consider the column associated with the variable {x_{ij}}. The variable appears with a coefficient of 1 in exactly two rows. One occurs in a row corresponding to agent {j} and the other in a row corresponding to object {i}. Let {L} consist of all rows corresponding to objects and {R} the set of all rows corresponding to agents. Multiply all the rows in {L} by -1. We now have a constraint matrix where each column contains exactly two non-zero entries of opposite sign, i.e., a network matrix, which is TUM by Theorem 2. Multiplying a row by -1 changes the determinant of a square submatrix at most in sign, so the original constraint matrix is TUM as well.
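The row-flipping step is easy to see on a small instance. Below is a Python sketch that builds the assignment constraint matrix for 2 goods and 3 buyers, flips the object rows, and checks the network-matrix condition as defined above. The helpers `assignment_matrix`, `flip_rows` and `is_network` are illustrative names of my own.

```python
def assignment_matrix(n_goods, n_buyers):
    """Rows: one per good, then one per buyer. Columns: pairs (i, j)."""
    pairs = [(i, j) for i in range(n_goods) for j in range(n_buyers)]
    good_rows = [[1 if i == g else 0 for i, _ in pairs] for g in range(n_goods)]
    buyer_rows = [[1 if j == b else 0 for _, j in pairs] for b in range(n_buyers)]
    return good_rows + buyer_rows


def flip_rows(a, rows):
    """Multiply the listed rows by -1."""
    rows = set(rows)
    return [[-x for x in row] if r in rows else list(row) for r, row in enumerate(a)]


def is_network(a):
    """Entries in {0, 1, -1}; each column has at most two non-zeros of opposite sign."""
    for col in zip(*a):
        nonzero = [x for x in col if x != 0]
        if any(x not in (1, -1) for x in nonzero) or len(nonzero) > 2:
            return False
        if len(nonzero) == 2 and sum(nonzero) != 0:
            return False
    return True


a = assignment_matrix(2, 3)
print(is_network(a))                       # two +1's per column: fails
print(is_network(flip_rows(a, range(2))))  # after negating the good rows: passes
```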

3.3. Ascending Auctions

The equivalence between minimal Walrasian prices and Vickrey prices in SS means that the Vickrey outcome can be obtained from a tâtonnement process that terminates in the minimal Walrasian price. Two have been proposed in the literature. One by Crawford and Knoer and the other by Demange, Gale and Sotomayor (see also Leonard). Both are variations of dual ascent algorithms for solving the LP formulation of the assignment model.

I’ll outline Demange, Gale and Sotomayor (DGS). Assume that the {v_{ij}}’s are integral. In each iteration we have a vector of prices {\{p_i\}_{i \in S}}. Initially all prices are set to zero. In each iteration buyers report their demand correspondence. That is, the set of goods that maximize their surplus at current prices. Let {D_j(p)} be the demand correspondence of buyer {j}. Consider now the bipartite graph defined on {B \cup S} as follows: an edge {(i,j)} exists iff {i \in D_j(p)}. If this graph contains a perfect matching, stop. At current prices demand equals supply. A perfect matching means that there is a way to give each {j \in B} an {i \in S} such that {i \in D_j(p)} and no good is allocated to more than one buyer. If a perfect matching does not exist, by the Hall marriage theorem (I will state and prove this later), there must exist a set {N \subseteq B} such that {|N| > |\cup_{j \in N}D_j(p)|.} The set {\cup_{j \in N}D_j(p)} is called overdemanded. Identify a minimally overdemanded set and increase the price of each good in this set by 1 unit. Repeat.
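The price-adjustment loop just described can be sketched in a few lines of Python. This is a brute-force illustration under assumptions of my own: a buyer whose best surplus is zero or negative is treated as content to walk away (so they never contribute to overdemand), a set of goods counts as overdemanded when the buyers whose demand lies inside it outnumber it, and a minimal such set is found by trying subsets in order of size. The names `demand_sets`, `minimal_overdemanded` and `dgs_prices` are hypothetical.

```python
from itertools import combinations


def demand_sets(values, prices):
    """Surplus-maximizing goods per buyer; None if buying nothing is optimal."""
    demands = []
    for row in values:
        surplus = [v - p for v, p in zip(row, prices)]
        best = max(surplus)
        if best <= 0:
            demands.append(None)  # assumption: this buyer can opt out
        else:
            demands.append({i for i, s in enumerate(surplus) if s == best})
    return demands


def minimal_overdemanded(demands, n_goods):
    """Smallest set T of goods with more buyers demanding only inside T than |T|."""
    for size in range(1, n_goods + 1):
        for goods in combinations(range(n_goods), size):
            t = set(goods)
            count = sum(1 for d in demands if d is not None and d <= t)
            if count > size:
                return t  # no smaller set qualified, so t is minimal
    return None


def dgs_prices(values):
    """Raise prices on a minimal overdemanded set by 1 until none exists."""
    n_goods = len(values[0])
    prices = [0] * n_goods
    while True:
        t = minimal_overdemanded(demand_sets(values, prices), n_goods)
        if t is None:
            return prices
        for i in t:
            prices[i] += 1


# Illustrative integral valuations: values[j][i] for 3 buyers, 2 goods.
print(dgs_prices([[6, 3], [5, 4], [2, 1]]))
```

On this toy instance the loop first raises the price of good 0, then raises both prices together, and stops at (2, 1).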

4. Bilateral Trade

Let’s put this machinery to work on bilateral trade. This is the case when {v_{ij} = u_j-c_i}. Suppose {u_j} is the private information of buyer {j} and {c_i} is the private information of seller {i}.

  1. The core of the associated tu game is non-empty.
  2. The point in the core that maximizes the total surplus to buyers makes it a dominant strategy for them to reveal their private information.
  3. The point in the core that maximizes the total profit to the sellers makes it a dominant strategy for them to reveal their private information.
  4. In general, there is no point in the core that is jointly `best’ for buyers and sellers. Hence, there is no way to obtain an outcome in the core for which it is a dominant strategy for both sides to reveal their private information.

We have the outlines of an archetypal matching paper. First, show that a stable or core outcome exists. Second, show that when only one side is strategic, one can implement the core outcome in an incentive compatible way. Third, observe that it is impossible to implement the core outcome when both sides are strategic. Fourth, show that as the economy gets large, one can implement something asymptotically close to the core in an incentive compatible way (or implement a core outcome in an asymptotically incentive compatible way). So, let’s do the fourth.

The idea is due to Preston McAfee. Order the buyers so that {u_1 \geq u_2 \geq \ldots \geq u_n}. Order the sellers so that {c_1 \leq c_2 \leq \ldots \leq c_n}. The efficient allocation can be computed by determining the largest {k} such that {u_k \geq c_k}. Buyer {i} is matched with seller {i} for all {i \leq k}. McAfee suggests stopping at {k-1}. Charge all buyers {i \leq k-1} a price of {u_k}. Pay all sellers {i \leq k-1}, {c_k}. What each trading agent pays or receives does not depend on their own report. So, reporting their private information is a dominant strategy. Further, the mechanism runs a slight surplus and is individually rational. However, it is not efficient. The efficiency loss is {u_k-c_k}. Assuming {u_i}’s and {c_i}’s are independent draws from a distribution with bounded support, the percentage loss in efficiency approaches zero as {n \rightarrow \infty}.
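Here is a minimal Python sketch of the trade-reduction rule as described in the paragraph above (not of McAfee’s full mechanism). The function name `trade_reduction` and the sample numbers are my own; the function returns the number of trades together with the price charged to buyers and the price paid to sellers.

```python
def trade_reduction(u, c):
    """Drop the k-th efficient trade; price the remaining k-1 at (u_k, c_k).

    u: buyers' reported values, c: sellers' reported costs (any order).
    Returns (number_of_trades, buyer_price, seller_price).
    """
    u = sorted(u, reverse=True)  # u_1 >= u_2 >= ...
    c = sorted(c)                # c_1 <= c_2 <= ...
    k = 0
    while k < min(len(u), len(c)) and u[k] >= c[k]:
        k += 1                   # k ends as the largest index with u_k >= c_k
    if k == 0:
        return 0, None, None     # no gains from trade at all
    return k - 1, u[k - 1], c[k - 1]


trades, buyer_price, seller_price = trade_reduction([10, 8, 6, 3], [1, 4, 5, 9])
print(trades, buyer_price, seller_price)
# Budget surplus collected by the mechanism on the executed trades.
print(trades * (buyer_price - seller_price))
```

With these numbers the efficient allocation has three trades; the rule executes two of them, charges buyers 6, pays sellers 5, and forgoes the third trade worth {u_3 - c_3 = 1}.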

Alternatively, one can implement the Vickrey outcome. In this case each buyer pays {u_{k+1}} and each seller receives {c_{k+1}}. The deficit of the Vickrey auction will grow like {k|u_{k+1} - c_{k+1}|}. One can then use properties of order statistics and the Kolmogorov-Smirnov bounds to show that the deficit goes to zero as {n \rightarrow \infty}.

5. More on TUM

Recall the constraints for the assignment model:

\displaystyle  \sum_{j \in B}x_{ij} \leq 1\quad \forall i \in S

\displaystyle  \sum_{i \in S}x_{ij} \leq 1\quad \forall j \in B

When {|B| = |S|} and the constraints hold with equality, an integer solution, {x^*}, to these constraints defines a permutation matrix whose {(i,j)^{th}} entry is {x^*_{ij}}. A fractional solution to these constraints is a doubly stochastic matrix. TUM of the constraint matrix means that every doubly stochastic matrix is a convex combination of permutation matrices. This is known as the Birkhoff-von Neumann theorem. Alternatively, the assignment constraints define the convex hull of permutation matrices.

For our purposes, TUM means that every fractional solution to the assignment constraints can be interpreted as a lottery over integer assignments. Thus, the assignment constraints give us a succinct description of the set of all lotteries over integer assignments. This will be useful in applications when we search for randomized allocation rules satisfying other properties.

Network matrices are an important class of TUM matrices. There are matrices that don’t appear to be network matrices but after certain elementary row operations can be converted into network matrices. One such class is the set of 0-1 matrices with the consecutive 1’s property (C1). A 0-1 matrix has the consecutive 1’s property if there is a permutation of its rows such that the non-zero entries in each column are consecutive. The following matrix is C1:

\displaystyle  \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0\\ 1 & 0 & 1\\ 0 & 0 & 1\\ \end{bmatrix}

C1 matrices arise in a variety of applications (interval graphs, cyclic staffing). Fulkerson and Gross were the first to give a polytime algorithm for recognizing C1 matrices. The following is due to Veinott and Wagner (1962).

Theorem 3 If {A} is a C1 matrix, then, {A} is TUM.

Proof: Suppose the rows of {A} have already been permuted so that the columns have the consecutive 1’s property. Suppose that {A} is {n \times n}. Define {E} to be the following {n \times n} matrix:

  1. For all {i < n}, {e_{ii} = 1}, {e_{i, i+1} = -1}.
  2. For {i = n}, {e_{nn} = 1}.
  3. For all {i} and {j \notin \{i, i+1\}}, {e_{ij} = 0}.

Here is a {5 \times 5} example of {E}:

\displaystyle  \begin{bmatrix} 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0\\ 0 & 0 & 1 & -1 & 0\\ 0 & 0 & 0 & 1 & -1\\ 0 & 0 & 0 & 0 & 1\\ \end{bmatrix}

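To complete the proof, note that every submatrix of a C1 matrix is again C1, so it suffices to show that {\det A \in \{0,1,-1\}}. Since {E} is upper triangular with 1’s on the diagonal, {\det E = 1} and so {\det A = \det (EA)}. It remains to verify that {EA} is a network matrix, which is TUM by Theorem 2. Note that pre-multiplying {A} by {E} corresponds to negating row {i+1} of {A} and adding it to row {i} of {A}, so a column whose 1’s run from row {a} to row {b} is left with a {+1} in row {b} and, if {a > 1}, a {-1} in row {a-1}. {\Box}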
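To see the row operation at work, the following Python sketch builds {E}, multiplies it into the C1 example matrix given earlier, and checks that the product satisfies the network-matrix condition. The helper names `difference_matrix`, `matmul` and `is_network` are illustrative choices of mine.

```python
def difference_matrix(n):
    """E: 1 on the diagonal, -1 just right of it, 0 elsewhere."""
    return [[1 if j == i else -1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain integer matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def is_network(a):
    """Entries in {0, 1, -1}; each column has at most two non-zeros of opposite sign."""
    for col in zip(*a):
        nonzero = [v for v in col if v != 0]
        if any(v not in (1, -1) for v in nonzero) or len(nonzero) > 2:
            return False
        if len(nonzero) == 2 and sum(nonzero) != 0:
            return False
    return True


# The C1 example from the text (rows already in consecutive-1's order).
a = [[1, 0, 0], [1, 1, 0], [1, 1, 0], [1, 0, 1], [0, 0, 1]]
ea = matmul(difference_matrix(5), a)
for row in ea:
    print(row)
print(is_network(a), is_network(ea))
```

Each column of the product keeps a single +1 at the last row of its run of 1’s and picks up a -1 just above the first row of the run, which is exactly the network-matrix pattern.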

I turn now to a class of C1 matrices that will be useful later.

Let {N} be a ground set of elements and {{\cal F}} a collection of subsets of {N}. {{\cal F}} is called laminar if for any {S, T \in {\cal F}} either {S \subset T}, {T \subset S} or {S \cap T = \emptyset}. If one drops the condition that {S \cap T = \emptyset}, then {{\cal F}} is called a chain.

Given a collection of subsets, {{\cal F}}, we can represent it using a 0-1 matrix as follows. A column for each member of {{\cal F}} and a row for each element of {N}. Set {a_{ij} = 1} if the set corresponding to column {j} contains {i \in N}. It’s easy to see that if {{\cal F}} is laminar, then {A} is C1. Call a 0-1 matrix that arises in this way a laminar matrix.

In fact, {A} is `equivalent’ to a 0-1 matrix with exactly one non-zero entry in each row. Here is how. Pick any two columns of {A}, {j} and {k}. Let {S_j} and {S_k} be the sets they correspond to in {{\cal F}}. Suppose {S_j \subseteq S_k}. Negate column {j} and add it to column {k}. Note that this can at most flip the sign of the determinant of any square submatrix of {A}. Repeat. The result is a 0-1 matrix whose columns are disjoint, i.e., exactly one non-zero entry in each row.

So, Wikipedia is dark today in protest of an initiative in Congress to block sites that link to sites that infringe on copyrighted intellectual property. Ever noticed before how many times a day you use Wikipedia?

Here is what I don’t get about this whole idea of “copyrighted intellectual property”. Is it something advocated on moral grounds or on economic grounds? I mean, when Bob sneaks into Alice’s vineyard and eats the grapes without permission, we view it as a moral atrocity; it’s just a wicked thing to do; it invokes the wrath of the gods; Moses explicitly forbade it. To be sure, it’s hard to pin down what exactly makes the vineyard belong to Alice without getting into a recursive definition of ownership, and if we try tracing back the vineyard from one legitimate owner to another we arrive at the first man who just fenced a piece of land and said “This is mine”. But here the economic argument kicks in: most of us don’t begrudge this initial act of illegitimate fencing because the bastard who committed it was the founder of civil society. We like the idea of civil society. We like prosperity and growth. Without protection of private property we will have none of these.

But what about protection of “intellectual property”? Clearly this is not a necessary condition for a civil society. It’s also not a necessary condition for production of knowledge and culture. We had Plato and Archimedes and Cicero and Shakespeare and Newton before it occurred to anybody that Bob has to get Alice’s permission to reproduce a code that Alice wrote. Come to think of it, when did the concept of intellectual property pop up anyway? Waitaminute, let me just look it up on Wikipedia. Oops.. What did we ever do before Wikipedia?

The White House thinks that intellectual property is justified on economic grounds:

Online piracy is a real problem that harms the American economy, and threatens jobs for significant numbers of middle class workers and hurts some of our nation’s most creative and innovative companies and entrepreneurs. It harms everyone from struggling artists to production crews, and from startup social media companies to large movie studios.

I wonder if this assertion is backed by some empirical research? I realize some people lose their jobs because of online piracy. Also, some people lost their jobs following the introduction of ATMs. But we view ATMs as a positive development since they made a certain service way cheaper. My guess is that the same is true about intellectual piracy: it makes distribution of culture and knowledge cheaper and therefore also makes the production of culture and knowledge cheaper. True, some companies, particularly the established ones, are damaged by intellectual theft. Other companies, particularly startups, benefit. One may argue that intellectual piracy destroys the incentive to produce and therefore no new culture or knowledge will be produced absent some protection for intellectual property. But this is a claim that can be empirically checked, no? We live in a world of file sharing and user generated (often stolen) content sites. Are fewer books being written?

Embassy Suite hotel Saturday morning. Photo by Jacob Leshno.

Update (May 2017): McLennan and Tourky seem to be the first to make the argument in this post (link to their paper).

 

They say that when Alfred Tarski came up with his theorem that the axiom of choice is equivalent to the statement that, for every set {A}, {A} and {A\times A} have the same cardinality, he first tried to publish it in the French PNAS. Both venerable referees rejected the paper: Frechet argued there is no novelty in equivalence between two well known theorems; Lebesgue argued that there is no interest in equivalence between two false statements. I don’t know if this ever happened but it’s a cool story. I like to think about it every time a paper of mine is rejected and the referees contradict each other.

Back to game theory, one often hears that the existence of Nash Equilibrium is equivalent to Brouwer’s fixed point theorem. Of course we all know that Brouwer implies Nash but the other direction is trickier and less known. I heard a satisfying argument for the first time a couple of months ago from Rida. I don’t know whether this is a folk theorem or somebody’s theorem but it is pretty awesome and should appear in every game theory textbook.

So, assume Nash’s Theorem and let {X} be a compact convex set in {\mathbf{R}^n} and {f:X\rightarrow X} be a continuous function. We claim that {f} has a fixed point. Indeed, consider the two-player normal-form game in which the set of pure strategies of every player is {X}, and the payoff under strategy profile {(x,y)\in X^2} is {-\|x-y\|^2} for player I and {-\|f(x)-y\|^2} for player II. Since strategy sets are compact and the payoff functions are continuous, the game has an equilibrium in mixed strategies. In fact, the equilibrium strategies must be pure. (Indeed, for every mixed strategy {\mu} of player II, player I has a unique best response, the one concentrated on the barycenter of {\mu}.) But if {(x,y)} is a pure equilibrium then it is immediate that {x=y=f(x)}.

Update: I should add that I believe that the transition from existence of mixed Nash Equilibrium in games with finite strategy sets to existence of mixed Nash Equilibrium in games with compact strategy sets and continuous payoffs is not hard. In the case of the game that I defined above, if {\{x_1,x_2,\dots\}} is a dense subset of {X} and {(\mu_n,\nu_n)\in \Delta(X)\times\Delta(X)} is a mixed equilibrium profile in the finite game with the same payoff functions and in which both players are restricted to the pure strategy set {\{x_1,\dots,x_n\}}, then an accumulation point of the sequence {\{(\mu_n,\nu_n)\}_{n\geq 1}} in the weak{^\ast} topology is a mixed strategy equilibrium in the original game.

 
