You are currently browsing the category archive for the ‘Auctions’ category.

According to the NY Times,  some Californians

would have to cut their water consumption by 35 percent under the terms of a preliminary plan issued by state officials on Tuesday to meet a 25 percent mandatory statewide reduction in urban water use.

There is an obvious way to achieve this, raise the price of water. If its obvious why isn’t it the first thing California did some years back when it was clear that water was scarce? In some cases the hands of the state are tied by water rights allocated for commercial purposes, so lets focus on household consumption.

We know that the first tool the state reaches for is regulation. See, for example, the following memo from the California State water board. Interestingly, it begins by noting that the state is in the 4th year of a drought! Eventually, it is clear that regulation is insufficient and then, price increases are considered. In fact, the water reduction targets quoted from the NY Times above come from a report by the Governor of the state that also urges the use of

rate structures and other pricing mechanisms

to achieve reductions. Again, why is price last rather first? Is this because the state must maintain a credible reputation for not exploiting its monopoly power with water?

If one is going reduce consumption by raising prices, should it be an across the board price increase? Note that consumption is metered so the exact amount that is purchased by a household is known to the provider. The state also has access to other information: location, home size, family size and income. In principle, the state could price discriminate. Here is an example from the Irvine Ranch Water district. Each household is given an initial `allotment’ that depends on household size and area that is landscaped. Exceed the allotment and the price of water rises. For more details and the impact on consumption see the following paper. Is it obvious that this is the `correct’ mechanism?



Penn state runs auctions to license its intellectual property. For each license on the block there is a brief description of what the relevant technology is and an opening bid which I interpret as a reserve price. It also notes whether the license is exclusive or not. Thus, the license is sold for a single upfront fee. No royalties or other form of contingent payment. As far as I can tell the design is an open ascending auction.

In an earlier pair of posts I discussed a class of combinatorial auctions when agents have binary quadratic valuations. To formulate the problem of finding a welfare maximizing allocation let {x^k_j = 1} if object {j \in M} is given to agent {k \in N} and zero otherwise. Denote the utility of agent {k \in N} from consuming bundle {S \subseteq M} by

\displaystyle u_k(S) = \sum_{i \in S}u^k_i + \sum_{i, j \in S}w^k_{ij}.

The problem of maximizing total welfare is

\displaystyle \max \sum_{k \in N}\sum_{i \in M}u^k_ix^k_i + \sum_{k \in N}\sum_{i \neq j}w^k_{ij}x^k_ix^k_j

subject to

\displaystyle \sum_{k \in N}x^k_i \leq 1\,\, \forall i \in M

\displaystyle x^k_i \in \{0,1\}\,\, \forall i \in M, k \in N

I remarked that Candogan, Ozdaglar and Parrilo (2013) identified a solvable instance of the welfare maximization problem. They impose two conditions. The first is called sign consistency. For each {i,j \in M}, the sign of {w^k_{ij} } and {w^r_{ij}} for any {k, r \in N} is the same. Furthermore, this applies to all pairs {i, j \in M}.

Let {G^w} be a graph with vertex set {M} and for any {i,j \in M} such that {w^k_{ij} \neq 0} introduce an edge {(i,j)}. Because of the sign consistency condition we can label the edges of {G^w} as being positive or negative depending on the sign of {w^k_{ij}}. Let {E^+ = \{(i,j): w^k_{ij} \geq 0\}} and {E^- = \{(i,j): w^k_{ij} < 0\}}. The second condition is that {G^w} be a tree.

The following is the relaxation that they consider:

\displaystyle \max \sum_{k \in N}\sum_{i \in M}u^k_ix^k_i + \sum_{k \in N}\sum_{(i,j) \in E^+ \cup E^-}w^k_{ij}z^k_{ij}

subject to

\displaystyle \sum_{k \in N}x^k_i \leq 1\,\, \forall i \in M

\displaystyle z^k_{ij} \leq x^k_i, x^k_j\,\, \forall k \in N, (i,j) \in E^+

\displaystyle z^k_{ij} \geq x^k_i + x^k_j - 1\,\, \forall k \in N, (i,j) \in E^-

\displaystyle x^k_i, z^k_{ij} \in \{0,1\}\,\, \forall i \in M, k \in N

Denote by {P} the polyhedron of feasible solutions to the last program. I give a new proof of the fact that the extreme points of {P} are integral. My thanks to Ozan Candogan for (1) patiently going through a number of failed proofs and (2) being kind enough not to say :“why the bleep don’t you just read the proof we have.”

Let {{\cal C}} be the maximal connected components of {G^w} after deletion of the edges in {E^-} (call this {G^+}). The proof will be by induction on {|{\cal C}|}. The case {|{\cal C}| = 1} follows from total unimodularity. I prove this later.

Suppose {|{\cal C}| > 1}. Let {({\bar z}, {\bar x})} be an optimal solution to our linear program. We can choose {({\bar z}, {\bar x})} to be an extreme point of {P}. As {G^w} is a tree, there must exist a {C \in {\cal C}} incident to exactly one negative edge, say {(p,q)}. Denote by {P'} the polyhedron {P} restricted to just the vertices of {C} and by {Q} the polyhedron {P} restricted to just the vertices in the complement of {C}. By the induction hypothesis, both {P'} and {Q} are integral polyhedrons. Each extreme point of {P'} ({Q}) assigns a vertex of {C} (the complement of {C}) to a particular agent. Let {X_1, X_2, \ldots, X_a} be the set of extreme points of {P'}. If in extreme point {X_r}, vertex {p} is assigned to agent {k} we write {X_{rk} = 1} and zero otherwise. Similarly with the extreme points {Y_1, Y_2, \ldots, Y_b} of {Q}. Thus, {Y_{rk} = 1} is {Y_r} assigns vertex {q} to agent {k}. Let {v(X_{r})} be the objective function value of the assignment {X_r}, similarly with {v(Y_{rk})}.

Now {({\bar z}, {\bar x})} restricted to {P'} can be expressed as {\sum_r\lambda_{r}X_{r}}. Similarly, {({\bar z}, {\bar x})} restricted to {Q} can be expressed as {\sum_r\mu_{r}Y_{r}}. We can now reformulate our linear program as follows:

\displaystyle \max \sum_r\lambda_{r}v(X_{r}) + \sum_r\mu_{r}v(Y_{r}) -\sum_{k \in N} |w^k_{pq}|y^k_{pq}

subject to

\displaystyle -\sum_r\lambda_{rk} = -1

\displaystyle -\sum_r\mu_{rk} = -1

\displaystyle \sum_{r: X_{rk} = 1}\lambda_{r}X_{r} + \sum_{r: Y_{rk} = 1}\mu_{r}Y_{r} \leq y^k_{pq} + 1\,\, \forall k \in N

\displaystyle \lambda_{r}, \mu_{r}, y^k_{pq} \geq 0\,\, \forall r, k

The constraint matrix of this last program is totally unimodular. This follows from the fact that each variable appears in at most two constraints with coefficients of opposite sign and absolute value 1 (this is because {X_{rk}} and {X_{rk'}} cannot both be 1, similarly with the {Y}‘s). Total unimodularity implies that the last program has integral optimal solution and we are done. In fact, I believe the argument can be easily modified to to the case where in {G^w} every cycle must contain a positive even number of negative edges.

Return to the case {|{\cal C}| = 1}. Consider the polyhedron {P} restricted to just one {C \in {\cal C}}. It will have the form:

\displaystyle \sum_{k \in N}x^k_i \leq 1\,\, \forall i \in C

\displaystyle z^k_{ij} \leq x^k_i, x^k_j\,\, \forall k \in N, (i,j) \in E^+ \cap C

\displaystyle x^k_i, z^k_{ij} \in \{0,1\}\,\, \forall i \in C, k \in N

Notice the absence of negative edges. To establish total unimodularity we use the Ghouila-Houri (GH) theorem. Fix any subset, {S}, of rows/constraints. The goal is to partition them into two sets {L} and {R} so that column by column the difference in the sum of the non-zero entries in {L} and and the sum of the nonzero entries in {R} differ by at most one.

Observe that the rows associated with constraints {\sum_{k \in N}x^k_i \leq 1} are disjoint, so we are free to partition them in any way we like. Fix a partition of these rows. We must show to partition the remaining rows to satisfy the GH theorem. If {y^k_{ij} - x^k_i \leq 0} is present in {S} but {y^k_{ij} -x^k_j \leq 0} is absent (or vice-versa), we are free to assign the row associated with {y^k_{ij} - x^k_i \leq 0} in any way to satisfy the GH theorem. The difficulty will arise when both {y^k_{ij}}, {x^k_i} and {x^k_j} are present in {S}. To ensure that the GH theorem is satisfied we may have to ensure that the rows associated with {y^k_{ij} - x^k_i \leq 0} and {y^k_{ij} -x^k_j \leq 0} be separated.

When {S} is the set of all constraints we show how to find a partition that satisfies the GH theorem. We build this partition by sequentially assigning rows to {L} and {R} making sure that after each assignment the conditions of the GH theorem are satisfied for the rows that have been assigned. It will be clear that this procedure can also be applied when only a subset of constraints are present (indeed, satisfying the GH theorem will be easier in this case).

Fix an agent {k \in N}. The following procedure will be repeated for each agent in turn. Pick an arbitrary vertex in {C} (which is a tree) to be a root and direct all edges `away’ from the root (when {S} is a subset of the constraints we delete from {C} any edge {(i,j)} in which at most one from the pair {y^k_{ij} - x^k_i \leq 0} and {y^k_{ij} -x^k_j \leq 0} appears in {S}) . Label the root {L}. Label all its neighbors {R}, label the neighbors of the neighbors {L} and so on. If vertex {i \in C} was labeled {L} assign the row {\sum_{k \in N}x^k_i \leq 1} to the set {L} otherwise to the row {R}. This produces a partition of the constraints of the form {\sum_{k \in N}x^k_i \leq 1} satisfying GH.

Initially, all leaves and edges of {C} are unmarked. Trace out a path from the root to one of the leaves of {C} and mark that leaf. Each unmarked directed edge {(i,j)} on this path corresponds to the pair {y^k_{ij} - x^k_i \leq 0} and {y^k_{ij} -x^k_j \leq 0}. Assign {y^k_{ij} - x^k_i \leq 0} to the same set that is the label of {i}. Assign {y^k_{ij} - x^k_j \leq 0} to the same set that is the label of vertex {j}. Notice that in making this assignment the conditions of the GH theorem continues to satisfied. Mark the edge {(i,j)}. If we repeat this procedure again with another path from the root to an unmarked leaf, we will violate the GH theorem. To see why suppose the tree contains edge {(i,j)} as well as {(i,t)}. Suppose {i} was labeled {L} on the first iteration and {(i,j)} was marked. This means {y^k_{ij} - x^k_{i} \leq 0} was assigned to {L}. Subsequently {y^k_{it} - x^k_i \leq 0} will also be assigned to {L} which will produce a partition that violates the GH theorem. We can avoid this problem by flipping the labels on all the vertices before repeating the path tracing procedure.

What is the institutional detail that makes electricity special? Its in the physics that I will summarize with a model of DC current in a resistive network. Note that other sources, like Wikipedia give other reasons, for why electricity is special:

Electricity is by its nature difficult to store and has to be available on demand. Consequently, unlike other products, it is not possible, under normal operating conditions, to keep it in stock, ration it or have customers queue for it. Furthermore, demand and supply vary continuously. There is therefore a physical requirement for a controlling agency, the transmission system operator, to coordinate the dispatch of generating units to meet the expected demand of the system across the transmission grid.

I’m skeptical. To see why, replace electricity by air travel.

Let {V} be the set of vertices and {E^*} the set of edges a the network. It will be convenient in what follows to assign (arbitrarily) an orientation to each edge in {E^*}. Let {E} be the set of directed arcs that result. Hence, {(i,j) \in E} mens that the edge {(i,j)} is directed from {i} to {j}. Notice, if {(i,j) \in E}, then {(i,j) \not \in E}.

Associated with each {(i,j) \in E} is a number {x_{ij}} that we interpret as a flow of electricity. If {x_{ij} > 0} we interpret this to be a flow from {i} to {j}. If {x_{ij} < 0} we interpret this as a flow from {j} to {i}.

  1. Let {\rho_{ij}} is the resistance on link {(i,j)}.
  2. {c_i} unit cost of injecting current into node {i}.
  3. {v_i} marginal value of current consumed at node {i}.
  4. {d_i} amount of current consumed at node {i}.
  5. {s_i} amount of current injected at node {i}.
  6. {K_{ij}} capacity of link {(i,j)}.

Current must satisfy two conditions. The first is conservation of flow at each node:

\displaystyle s_i + \sum_{k: (k,i) \in E}x_{ji} = \sum_{j: (i,j) \in E}x_{ij} + d_i\,\, \forall i \in V

The second is Ohm’s law. There exist node potentials {\{\phi_i\}_{i \in V}} such that

\displaystyle \rho_{ij}x_{ij} = \phi_i - \phi_j\,\, \forall (i,j) \in E.

Using this systems equations one can derive the school boy rules for computing the resistance of a network (add them in series, add the reciprocals in parallel). At the end of this post is a digression that shows how to formulate the problem of finding a flow that satisfies Ohm’s law as an optimization problem. Its not relevant for the economics, but charming nonetheless.

At each node {i \in V} there is a power supplier with constant marginal cost of production of {c_i} upto {S_i} units. At each {i \in V} there is a consumer with constant marginal value of {v_i} upto {D_i} units. A natural optimization problem to consider is

\displaystyle \max \sum_{i \in V}[v_id_i - c_is_i]

subject to

\displaystyle \sum_{j: (i,j) \in E}x_{ij} -\sum_{j: (j,i) \in E}x_{ji} - s_i + d_i= 0\,\, \forall i \in V

\displaystyle \rho_{ij}x_{ij} = \mu_i - \mu_j\,\, \forall (i,j) \in E

\displaystyle -K_{ij} \leq x_{ij} \leq K_{ij}\,\, \forall (i,j) \in E

\displaystyle 0 \leq s_i \leq S_i\,\, \forall i \in V

\displaystyle 0 \leq d_i \leq D_i\,\, \forall i \in V

This is the problem of finding a flow that maximizes surplus.

Let {{\cal C}} be the set of cycles in {(V, E^*)}. Observe that each {C \in {\cal C}} corresponds to a cycle in {(V, E)} if we ignore the orientation of the edges. For each cycle {C \in {\cal C}}, let {C^+} denote the edges in {E} that are traversed in accordance with their orientation. Let {C^-} be the set of edges in {C} that are traversed in the opposing orientation.

We can project out the {\phi} variables and reformulate as

\displaystyle \max \sum_{i \in V}[v_id_i - c_is_i]

subject to

\displaystyle \sum_{j: (i,j) \in E}x_{ij} -\sum_{j: (j,i) \in E}x_{ji} - s_i + d_i= 0\,\, \forall i \in V

\displaystyle \sum_{(i,j) \in C^+}\rho_{ij}x_{ij} - \sum_{(i,j) \in C^-}\rho_{ij}x_{ij} = 0\,\, \forall \,\, C \in {\cal C}

\displaystyle -K_{ij} \leq x_{ij} \leq K_{ij}\,\, \forall (i,j) \in E

\displaystyle 0 \leq s_i \leq S_i\,\, \forall i \in V

\displaystyle 0 \leq d_i \leq D_i\,\, \forall i \in V

Recall the scenario we ended with in part 1. Let {V = \{1, 2, 3\}}, {E = \{(1,3), (1,2), (2,3)\}} and in addition suppose {\rho_{ij} =1} for all {(i,j)}. Only {(1,3)} has a capacity constraint of 600. Let {D_1 = D_2 = 0} and {S_3 = 0}. Also {c_1 = 20} and {c_2 = 40} and each have unlimited capacity. At node 3, the marginal value is {V > 40} upto 1500 units and zero thereafter. The optimization problem is

\displaystyle \max Vd_3 - 20s_1 - 40 s_2

subject to

\displaystyle x_{12} + x_{13} - s_1 = 0

\displaystyle x_{23} - s_2 - x_{12} = 0

\displaystyle d_3 - x_{13} - x_{23} = 0

\displaystyle x_{13} - x_{23} - x_{12} = 0

\displaystyle -600 \leq x_{13} \leq 600

\displaystyle 0 \leq d_3 \leq 1500

Notice, for every unit of flow sent along {(1,3)}, half a unit of flow must be sent along {(1,2)} and {(2,3)} as well to satisfy the cycle flow constraint.

The solution to this problem is {x_{13} = 600}, {x_{12} = -300}, {x_{23} = 900}, {s_1 = 300}, {s_2 = 1200} and {d_3 = 1500}. What is remarkable about this not all of customer 3’s demand is met by the lowest cost producer even though that producer has unlimited capacity. Why is this? The intuitive solution would have been send 600 units along {(1,3)} and 900 units along {(1,2) \rightarrow (2,3)}. This flow violates the cycle constraint.

In this example, when generator 1 injects electricity into the network to serve customer 3’s demand, a positive amount of that electricity must flow along every path from 1 to 3 in specific proportions. The same is true for generator 2. Thus, generator 1 is unable to supply all of customer 3’s demands. However, to accommodate generator 2, it must actually reduce its flow! Hence, customer 3 cannot contract with generators 1 and 2 independently to supply power. The shared infrastructure requires that they co-ordinate what they inject into the system. This need for coordination is the argument for a clearing house not just to manage the network but to match supply with demand. This is the argument for why electricity markets must be designed.

The externalities caused by electricity flows is not a proof that a clearing house is needed. After all, we know that if we price the externalities properly we should be able to implement the efficient outcome. Let us examine what prices might be needed by looking at the dual to the surplus maximization problem.

Let {y_i} be the dual variable associated with the flow balance constraint. Let {\lambda_C} be associated with the cycle constraints. Let {\nu_i} and {\theta_i} be associated with link capacity constraints. Let {\mu_i} and {\sigma_i} be associated with the remaining tow constraints. These can be interpreted as the profit of supplier {i} and the surplus of customer {i} respectively. For completeness the dual would be:

\displaystyle \min \sum_{(i,j) \in E}[\nu_{ij} + \theta_{ij}]K_{ij} + \sum_{i \in V}[S_i \mu_i + D_i \sigma_i]

subject to

\displaystyle -\theta_{ij} + \nu_{ij} + \rho_{ij}\sum_{C^+ \ni (i,j)}\lambda_C - \rho_{ij}\sum_{C^- \ni (i,j)}\lambda_C + y_i - y_j = 0\,\, \forall (i,j) \in E

\displaystyle \mu_i - y_i \geq -c_i\,\, \forall i \in V

\displaystyle \sigma_i + y_i \geq v_i\,\, \forall i \in V

\displaystyle \nu_{ij}, \theta_{ij}, \mu_i, \sigma_i \geq 0\,\, \forall i \in V,\,\,\forall (i,j) \in E

Now {y_i} has a natural interpretation as a price to be paid for consumption at node {i} for supply injected at node {i}. {\mu_i} and {\nu_i} can be interpreted as the price of capacity. However, {\lambda_C} is trickier, price for flow around a cycle? It would seem that one would have to assign ownership of each link as well as ownership of cycles in order to have a market to generate these prices.

In part two, as promised, I turn to the welfare maximization problem. To formulate the problem of finding a welfare maximizing allocation let {x^k_j = 1} if object {j \in M} is given to agent {k \in N} and zero otherwise. Denote the utility of agent {k \in N} from consuming bundle {S \subseteq M} by

\displaystyle u_k(S) = \sum_{i \in S}u^k_i + \sum_{i, j \in S}w^k_{ij}.

The problem of maximizing total welfare is

\displaystyle \max \sum_{k \in N}\sum_{i \in M}u^k_ix^k_i + \sum_{k \in N}\sum_{i \neq j}w^k_{ij}x^k_ix^k_j

subject to

\displaystyle \sum_{k \in N}x^k_i \leq 1\,\, \forall i \in M

\displaystyle x^k_i \in \{0,1\}\,\, \forall i \in M, k \in N

Welfare maximization with BQP preferences is in general NP-hard. One proof relies on a reduction to the multi-way cut problem. Given a graph {G=(V,E)} with edge weight {c_{ij}} for each {(i,j) \in E}, and a set of terminal vertices {T = \{v_1, v_2, \ldots, v_n\}}, a multiway cut is a set of edges whose removal disconnects every pair of terminal vertices. The problem of finding the multiway cut of minimum total weight is called the multiway cut problem. When {T} consists of only two terminals ({k = 2}) the problem reduces to the well known minimum cut problem. For {k \geq 3}, it is known that the problem is {NP-}hard even on planar graphs.

We can obtain the multiway cut problem by setting {u^k_i = 0} for all {k \in N} and {i \in M} and {w^k_{ij} = c_{ij}} for all {k \in N} and {(i,j) \in E}. Any pair {(i,j)} such that {x^k_ix^r_j = 1} for {k \neq r} will be part of a multi-way cut. This reduction implies that welfare maximization when {w^k_{ij} \geq 0} for all {k \in N} and {i,j \in M} is NP-hard. This is in contrast to the case of surplus maximization.

,Candogan, Ozdaglar and Parrilo (2013), the paper that prompted this post, identify a solvable instance of the welfare maximization problem. They impose two conditions. The first is called sign consistency. For each {i,j \in M} the sign of {w^k_{ij} } and {w^r_{ij}} for any {k, r \in N} is the same. Furthermore, this applies to all pairs {i, j \in M}. Sign consistency by itself is not sufficient to obtain a solvable instance. Another condition is needed. Let {G^w} be a graph with vertex set {M} and for any {i,j \in M} such that {w^k_{ij} \neq 0} introduce an edge {(i,j)}. The second condition is that {G^w} be a tree. Interestingly, Erdos and Sz\'{e}kely (1995) show that a generalization of the multiway cut problem which corresponds to welfare maximization under sign consistency and {u^k_i = 0} for all {k \in N} and {i \in M}, is polynomially solvable when the underlying graph {G} is a tree. The Candogan, Ozdaglar and Parrilo (COP) proof is based on a dynamic programming argument similar to the one used in Erdos and Sz\'{e}kely (1994).

The key result in COP is the following natural linearization of the welfare maximization problem admits an integral optimal solution.

\displaystyle \max \sum_{k \in N}\sum_{i \in M}u^k_ix^k_i + \sum_{k \in N}\sum_{i \neq j}w^k_{ij}y^k_{ij}

subject to

\displaystyle \sum_{k \in N}x^k_i \leq 1\,\, \forall i \in M

\displaystyle y^k_{ij} \leq \{x^k_i, x^k_j\}^-\,\, \forall w^k_{ij} > 0

\displaystyle y^k_{ij} \geq x^k_i + x^k_j -1\,\, \forall w^k_{ij} < 0

\displaystyle x^k_i \in \{0,1\}\,\, \forall i \in M, k \in N

There is a connection between the welfare maximization problem and packing subtrees into trees that I want to highlight. It suggests a possible avenue by which one might enlarge the class of preferences COP consider.

Because of the sign consistency condition we can label the edges of {G^w} as being positive or negative depending on the sign of {w^k_{ij}}. Let {E^+ = \{(i,j): w^k_{ij} \geq 0\}} and {E^- = \{(i,j): w^k_{ij} < 0\}}. Let {{\cal C}} be the maximal connected components of {G^w} after deletion of the edges in {E^-} (call this {G^+}). For any {C \in {\cal C}} and {S \subseteq C} let

\displaystyle v_k(S) = \sum_{i \in S}u^k_i + \sum_{i,j \in S}w^k_{ij}.

By the way, for each {C \in {\cal C}} and {k \in N}, {v_k} is supermodular over the subsets of {C}. Let {z_k(S) = 1} if {S \subseteq C} is assigned to agent {k} and zero otherwise. The problem of finding a welfare maximizing allocation can be expressed as follows:

\displaystyle \max \sum_{k \in N}\sum_{C \in {\cal C}}\sum_{S \subseteq C}v_k(S)z_k(S) - \sum_{k \in N}\sum_{(i,j) \in E^-}|w^k_{ij}|y^k_{ij}

subject to

\displaystyle \sum_{S \ni i}\sum_{k \in N}z_k(S) \leq 1\,\, \forall i \in M,\,\, S \subseteq C \in {\cal C}

\displaystyle y^k_{ij} \geq \sum_{S \ni i}z_k(S) + \sum_{S \ni j}z_k(S) - 1\,\, \forall k \in N, (i,j) \in E^-

\displaystyle z_k(S) \in \{0,1\}\,\, \forall k \in N, S \subseteq C \in {\cal C}

\displaystyle y^k_{ij} \in \{0,1\}\,\, \forall k \in N, (i,j) \in E^-

In the above program, we can, without loss, restrict attention to subsets {S} that a subtrees (connected subgraphs) of {E^+}. To see why, suppose in that in some optimal solution to the above integer program, {z_k(S) = 1} where {S} is not a subtree. Then, we can write {S = A \cup B} where both {A} and {B} are in the same component of {C} as {S} is. Furthermore, it must be the the case that there is no edge {(i,j) \in E^+} such that {i \in A} and {j \in B}. Therefore, {v_k(S) = v_k(A) + v_k(B)}. That means we can construct a new optimal solution by setting {z_k(S) = 0} and raising {z_k(A)} and {z_k(B)} to 1. Note, in the original solution {z_k(A), z_k(B) =0} by virtue of the first constraint. As long as {S} is not a subtree we can repeat this argument.

Hence, if we let {{\cal T}} be the set of subtrees of {G^+}, the welfare maximization problem can be expressed as follows:

\displaystyle \max \sum_{k \in N}\sum_{T \in {\cal T}}v_k(T)z_k(T) - \sum_{k \in N}\sum_{(i,j) \in E^-}|w^k_{ij}|y^k_{ij}

subject to

\displaystyle \sum_{T \in {\cal T}}\sum_{T \ni i}\sum_{k \in N}z_k(T) \leq 1\,\, \forall i \in M,

\displaystyle y^k_{ij} \geq \sum_{T \ni i}z_k(T) + \sum_{S \ni j}z_k(S) - 1\,\, \forall k \in N, (i,j) \in E^-

\displaystyle z_k(S) \in \{0,1\}\,\, \forall k \in N, T \in {\cal T}

\displaystyle y^k_{ij} \in \{0,1\}\,\, \forall k \in N, (i,j) \in E^-

Its important to emphasize that no {T \in {\cal T}} contains a negative edge.

Were it not for the second set of the constraints, integrality would follow from Barany, Edmonds and Wolsey (1986). I’m not aware of this variation of the tree packing problem having been considered. A follow up paper by Aghezzaf and Wolsey (1994) comes close in the sense of allowing for a piecewise linear concave objective function.

A paper by Azevdo, Weyl and White in a recent issue of Theoretical Economics caught my eye. It establishes existence of Walrasian prices in an economy with indivisible goods, a continuum of agents and quasilinear utility. The proof uses Kakutani’s theorem. Here is an argument based on an observation about extreme points of linear programs. It shows that there is a way to scale up the number of agents and goods, so that in the scaled up economy a Walrasian equilibrium exists.
First, the observation. Consider {\max \{cx: Ax = b, x \geq 0\}}. The matrix {A} and the RHS vector {b} are all rational. Let {x^*} be an optimal extreme point solution and {\Delta} the absolute value of the determinant of the optimal basis. Then, {\Delta x^*} must be an integral vector. Equivalently, if in our original linear program we scale the constraints by {\Delta}, the new linear program has an optimal solution that is integral.

Now, apply this to the existence question. Let {N} be a set of agents, {G} a set of distinct goods and {u_i(S)}the utility that agent {i} enjoys from consuming the bundle {S \subseteq G}. Note, no restrictions on {u} beyond non-negativity and quasi-linearity.

As utilities are quasi-linear we can formulate the problem of finding the efficient allocation of goods to agents as an integer program. Let {x_i(S) = 1} if the bundle {S} is assigned to agent {i} and 0 otherwise. The program is

\displaystyle \max \sum_{i \in N}\sum_{S \subseteq G}u_i(S)x_i(S)

subject to
\displaystyle  \sum_{S \subseteq G}x_i(S) \leq 1\,\, \forall i \in N

\displaystyle \sum_{i \in N}\sum_{S \ni g} x_i(S) \leq 1 \forall g \in G

\displaystyle x_i(S) \in \{0,1\}\,\, \forall i \in N, S \subseteq G

If we drop the integer constraints we have an LP. Let {x^*} be an optimal solution to that LP. Complementary slackness allows us to interpret the dual variables associated with the second constraint as Walrasian prices for the goods. Also, any bundle {S} such that {x_i^*(S) > 0} must be in agent {i}‘s demand correspondence.
Let {\Delta} be the absolute value of the determinant of the optimal basis. We can write {x_i^*(S) = \frac{z_i^*(S)}{\Delta}} for all {i \in N} and {S \subseteq G} where {z_i^*(S)} is an integer. Now construct an enlarged economy as follows.

Scale up the supply of each {g \in G} by a factor of {\Delta}. Replace each agent {i \in N} by {N_i = \sum_{S \subseteq G}z_i^*(S)} clones. It should be clear now where this is going, but lets dot the i’s. To formulate the problem of finding an efficient allocation in this enlarged economy let {y_{ij}(S) = 1} if bundle {S} is allocated the {j^{th}} clone of agent {i} and zero otherwise. Let {u_{ij}(S)} be the utility function of the {j^{th}} clone of agent {i}. Here is the corresponding integer program.

\displaystyle \max \sum_{i \in N}\sum_{j \in N_i}\sum_{S \subseteq G}u_{ij}(S)y_{ij}(S)

subject to
\displaystyle  \sum_{S \subseteq G}y_{ij}(S) \leq 1\,\, \forall i \in N, j \in N_i

\displaystyle \sum_{i \in N}\sum_{j \in N_i}\sum_{S \ni g} y_{ij}(S) \leq \Delta \forall g \in G

\displaystyle y_{ij}(S) \in \{0,1\}\,\, \forall i \in N, j \in N_i, S \subseteq G

Its easy to see a feasible solution is to give for each {i} and {S} such that {z_i^*(S) > 0}, the {z_i^*(S)} clones in {N_i} a bundle {S}. The optimal dual variables from the relaxation of the first program complements this solution which verifies optimality. Thus, Walrasian prices that support the efficient allocation in the augmented economy exist.

In this post I describe an alternative proof of a nifty result that appears in a forthcoming paper by Goeree, Kushnir, Moldovanu and Xi. They show (under private values) given any Bayesian incentive compatible mechanism, M, there is a dominant strategy mechanism that gives each agent the same expected surplus as M provides.

For economy of exposition only, suppose 2 agents and a finite set of possible outcomes, {A}. Suppose, also the same type space, {\{1, \ldots, m\}} for both. Let {f(\cdot)} be the density function over types. To avoid clutter, assume the uniform distribution, i.e., {f(\cdot) = \frac{1}{m}}. Nothing in the subsequent analysis relies on this.

When agent 1 reports type {s} and agent 2 reports type {t}, denote by {z_r(s,t) } the probability that outcome {r \in A} is selected. The {z}‘s must be non-negative and satisfy {\sum_{r \in A}z_r(s,t) = 1.}

Associated with each agent {i} is a vector {\{a_{ir}\}_{r \in A}} that determines, along with her type, the utility she enjoys from a given allocation. In particular, given the allocation rule {z}, the utility that agent {1} of type {t} enjoys when the other agent reports type {s} is {\sum_{r \in A}ta_{1r}z_r(t,s).} A similar expression holds agent 2.

Let {q_i(s,t) = \sum_{r \in A}a_{ir}z_r(s,t).} Interpret the {q}‘s as the `quantity’ of goods that each agent receives. Dominant strategy implies that {q_1(s,t)} should be monotone increasing in {s} for each fixed {t} and {q_2(s,t)} should be monotone increasing in {t} for each fixed {s}. The interim `quantities will be: {mQ_i(s) = \sum_t\sum_{r \in A}a_{ir}z_r(s,t).} Bayesian incentive compatibility (BIC) implies that the {Q_i}‘s should be monotone. To determine if given BIC interim `quantities’ {Q_i}‘s can be implemented via dominant strategies, we need to know if the following system is feasible

\displaystyle \sum_{r \in A}a_{1r}[z_r(i,j) - z_r(i-1,j)] \geq 0\,\, \forall \,\, i = 2, \ldots, m \ \ \ \ \ (1)

\displaystyle \sum_{r \in A}a_{2r}[z_r(i,j) - z_r(i,j-1)] \geq 0\,\, \forall \,\, j = 2, \ldots, m \ \ \ \ \ (2)

\displaystyle \sum_t\sum_{r \in A}a_{1r}z_r(s,t) = mQ_1(s) \ \ \ \ \ (3)

\displaystyle \sum_s\sum_{r \in A}a_{2r}z_r(s,t) = mQ_2(t) \ \ \ \ \ (4)

\displaystyle \sum_{r \in A}z_r(s,t) = 1\,\, \forall s,t \ \ \ \ \ (5)

\displaystyle z_r(i,j) \geq 0\,\, \forall i,j,r \ \ \ \ \ (6)

System (1-6) is feasible iff the optimal objective function value of the following program is zero:

\displaystyle \min \sum_{i,j}w_{ij} + \sum_{i,j}h_{ij}

subject to
\displaystyle \sum_{r \in A}a_{1r}[z_r(i,j) - z_r(i-1,j)] + w_{ij} - u_{ij} = 0\,\, \forall \,\, i = 2, \ldots, m\,\, (\mu_{ij}) \ \ \ \ \ (7)

\displaystyle \sum_{r \in A}a_{2r}[z_r(i,j) - z_r(i,j-1)] + h_{ij} - v_{ij} = 0\,\, \forall \,\, j = 2, \ldots, m\,\, (\nu_{ij}) \ \ \ \ \ (8)

\displaystyle \sum_j\sum_{r \in A}a_{1r}z_r(i,j) = mQ_1(i)\,\, (\alpha_i) \ \ \ \ \ (9)

\displaystyle \sum_i\sum_{r \in A}a_{2r}z_r(i,j) = mQ_2(j)\,\, (\beta_j) \ \ \ \ \ (10)

\displaystyle \sum_{r \in A}z_r(i,j) = 1\,\, \forall i,j\,\, (\lambda(i,j)) \ \ \ \ \ (11)

\displaystyle w_{ij},\,\, h_{ij}\,\,z_r(i,j) \geq 0\,\, \forall i,j,r \ \ \ \ \ (12)

Let {(z^*, w^*, h^*, u^*, v^*)} be an optimal solution to this program.
Suppose, for a contradiction there is a pair {(p,q)} such that {w^*_{pq} > 0}. I will argue that there must exist an {s} such that {u^*_{ps} > 0}. Suppose not, then for each {j \neq q}, either {w^*_{pj} > 0} or {w^*_{pj} = 0} and {u^*_{pj} = 0} (at optimality, it cannot be that {w^*_{pq}} and {u^*_{pq}} are both non-zero). In this case

\displaystyle m[Q_1(p)-Q_1(p-1)] = \sum_j\sum_{r \in A}a_{1r}z^*_r(p,j) - \sum_j\sum_{r \in A}a_{1r}z^*_r(p-1,j)

= \sum_{j: w^*_{pj} > 0}a_{1r}[z^*_r(p,j) - z^*_r(p-1,j)] = -\sum_j w^*_{pj}. This last term is negative, a contradiction. Therefore, there is a s such that w^*_{ps} = 0 but u^*_{ps} > 0.

Let {Z_1(i,j) = \sum_{r \in A}a_{1r}z^*_r(i,j)}, {Z_2(i,j) = \sum_{r \in A}a_{2r}z^*_r(i,j)} and denote by {Z(i,j)} the point {(Z_1(i,j), Z_2(i,j))}. Observe that {Z(i,j)} is in the convex hull, {K} of {\{(a_{1r}, a_{2r}\}_{r \in A}} for all {i,j}. Thus choosing {\{z_r(i,j)\}_{r \in A}}‘s amounts to choosing a point {Z(i,j) \in K}. Equivalently, choosing a point {Z(i,j) \in K} gives rise to a set of {\{z_r(i,j)\}_{r \in A}}‘s. For convenience assume that {Z(i,j)} is in the strict interior of {K} for all {(i,j)} and that K is full dimensional. This avoids having to deal with secondary arguments that obscure the main idea.

Recall, {w^*_{pq} > 0} implies {Z_1(p,q)  0} implies that {Z_1(p,s) > Z_1(p,s)}. Take all points {\{Z(i,q)\}_{i \geq p}} and shift them horizontally to the right by {\delta}. Call these new points {\{Z'(i,q)\}_{i \geq p}}. Observe that {Z'(i,q) \in K} for all {i \geq p}. Next, take all points {\{Z(i,s)\}_{i \geq p}} and shift them horizontally to the left by {\delta} to form new points {\{Z'(i,s)\}_{i \geq p}}. These points are also in {K}. Leave all other points {\{Z(p,j)\}_{j \neq, q,s}} unchanged.

Because the vertical coordinates of all points were left unchanged, (8) and (10) are satisfied by this choice of points. Because {\{Z(i,q)\}_{i \geq p}} and {\{Z(i,s)\}_{i \geq p}} were shifted in opposite directions along the horizontal, (9) is still true. Finally, because all points in {\{Z(i,q)\}_{i \geq p}} and {\{Z(i,s)\}_{i \geq p}} were shifted by the same amount, (7) continues to hold.

The shift leftwards of {Z(p,s)} reduces {u_{ps}} while the rightward shift of {Z(p,q)} reduces {w_{pq}}. Thus, we get a new solution with higher objective function value, a contradiction.

If {Z(p,q)} and {Z(p,s)} are not the interior of {K} but on the boundary, then horizontal shifts alone may place them outside of {K}. In the case of {Z(p,q)} this can only happen if {Z_2(p,q) > Z_2(p-1,q)}. In this case, shift {Z(p,q)} across and to the right by {\delta} as well and then downwards by the same amount. This would have to be matched by a corresponding upward shift by some point {Z_2(h,q)}. Similarly with {Z(p,s)}.

Thanks to Alexey Kushnir and Ahmad Peivandi for comments.

This session was devoted to three different perspectives on Gale and Shapley’s stable matching problem. The first, is traditional wherein nothing more than a command of the English language coupled with a patience for moderately long chains of reasoning is needed. The second, is Adachi’s (from his OR letters paper) lattice formulation of the question and an example to illustrate how it can be generalized to other settings. The third, based on Vandevate’s characterization of the convex hull of stable matchings. I also stuck in a new result that fills a much needed gap in the literature that I hope is correct (if incorrect perhaps it will stimulate others).

1. Stable Matchings

The stable matching problem was introduced by Gale and Shapley as a model of how to assign students to colleges. One can view it as the ordinal version of the SS model. The problem has a set~{M} of men and a set~{W} of women. Each {m \in M} has a strict preference ordering over the elements of {W} and each {w \in W} has a strict preference ordering over the men. The preference ordering of agent~{i} is denoted {\succ_i} and {x \succ_i y} means agent~{i} ranks {x} above {y}. One can accommodate the possibility of an agent choosing to remain single by including for each man (woman) a dummy woman (man) in the set {W } ({M}) that corresponds to being single (or matched with oneself). With this construction we can assume {|M| = |W|}.

A matching is called \index{matching!unstable}unstable if there are two men {m, m'} and two women {w}, {w'} such that

  1. {m} is matched to {w},
  2. {m'} is matched to {w'},
  3. and {w' \succ_m w} and {m \succ_{w'} m'}

The pair {(m,w')} is called a blocking pair. A matching that has no blocking pairs is called stable.

Given the preferences of the men and women, its always possible to find a stable matching using the deferred acceptance algorithm.

Deferred Acceptance Algorithm, male-propose version

  1. First, each man proposes to his top ranked choice.
  2. Next, each woman who has received at least two proposals keeps (tentatively) her top ranked proposal and rejects the rest.
  3. Then, each man who has been rejected, proposes to his top ranked choice amongst the women who have not rejected him.
  4. Again each woman who has at least two proposals (including ones from previous rounds), keeps her top ranked proposal and rejects the rest.
  5. Process repeats until no man has a woman to propose to or each woman has at most one proposal.

Theorem 1 The male propose algorithm terminates in a stable matching.

Proof: When it terminates, it clearly does so in a matching. Suppose the matching is not stable. Then, there exists a blocking pair {(m_1, w_1)} with {m_1} matched to {w_2}, say, and {w_1} matched to {m_2}. Since {(m_1, w_1)} is blocking and {w_1 \succ_{m_1} w_2}, in the proposal algorithm, {m_1} would have proposed to {w_1} before {w_2}. Since {m_1} was not matched with {w_1} by the algorithm it must be because {w_1} received a proposal from a man that she ranked higher than {m_1}. Since the algorithm matches her to {m_2} it follows that {m_2 \succ_{w_1} m_1}. This contradicts the fact that {(m_1, w_1)} is a blocking pair. {\Box}

One could describe an algorithm where the females propose and the outcome would also be a stable matching and possibly different from the one returned using the male propose version. Thus, not only is a stable matching guaranteed to exist but there can be more than one. If there can be more than one stable matching, is there a reason to prefer one to another?

Denote a matching by {\mu}. The woman matched to man {m} in the matching {\mu} is denoted {\mu(m)}. Similarly, {\mu(w)} is the man matched to woman {w}. A matching {\mu} is male-optimal if there is no stable matching {\nu} such that {\nu(m) \succ_m \mu(m)} or {\nu(m) = \mu(m)} for all {m} with {\nu(j) \succ_j \mu(j)} for at least one {j \in M}. Similarly define female-optimal.

Theorem 2 The stable matching produced by the (male-proposal) Deferred Acceptance Algorithm is male-optimal.

Proof: Let {\mu} be the matching returned by the male-propose algorithm. Suppose {\mu} is not male optimal. Then, there is a stable matching {\nu} such that {\nu(m) \succ_m \mu(m)} or {\nu(m) = \mu(m)} for all {m} with {\nu(j) \succ_j \mu(j)} for at least one {j \in M}. Therefore, in the application of the proposal algorithm there must be an iteration where some man {j} proposes to {\nu(j)} before {\mu(j)} since {\nu(j) \succ_j \mu(j)} and is rejected by woman {\nu(j)}. Consider the first such iteration. Since woman {\nu(j)} rejects {j} she must have received a proposal from a man {i} she prefers to man {j}. Since this is the first iteration at which a male is rejected by his partner under {\nu} it follows that man {i} ranks woman {\nu(j)} higher than {\nu(i)}. Summarizing, {i \succ_{\nu(j)} j} and {\nu(j) \succ_i \nu(i)} implying that {\nu} is not stable, a contradiction. {\Box}

You can replace the word “male” by the word “female” in the statement of the theorem. There is no stable matching that is simultaneously optimal with respect to males and females.

A stable matching is immune to a pair of agents opting out of the matching (the blocking pair). We could ask that no subset of agents should have an incentive to opt out of the matching. Formally, a matching {\mu'} dominates a matching {\mu} if there is a set {S \subset M \cup W} that satisfies the following two conditions:

  1. {\mu'(m), \mu'(w) \in S} for all {m,w \in S}
  2. {\mu'(m) \succ_m \mu(m)} and {\mu'(w) \succ_w \mu(w)} for all {m,w \in S}.

Stability is a special case of this dominance condition when we restrict attention to sets~{S} consisting of a single couple. The set of undominated matchings is called the core of the matching game.

Theorem 3 The core of the matching game is the set of all stable matchings.

Now assume the preference orderings of the agents are private information to the agent.

Theorem 4 The direct mechanism associated with the male propose algorithm is strategy proof for the males.

Proof: If not, there is a preference profile {\pi =(\succ_{m_1}, \succ_{m_2}, \ldots, \succ_{m_n})} for the men, such that man {m_1}, say, can misreport his preferences and obtain a better match. Formally, let {\mu} be the stable matching obtained by applying the male proposal algorithm to the profile {\pi}. Suppose {m_1} reports the preference ordering {\succ_* } instead. Let {\nu} be the matching that results when the male-proposal algorithm is applied to the profile {\pi^1 = (\succ_*, \succ_{m_2}, \ldots, \succ_{m_n})}. For a contradiction suppose {\nu(m_1) \succ_{m_1} \mu(m_1)}. For convenience write {a \succeq _{m} b} to mean that {a \succ_m b} or {a=b}.

Let {R = \{m: \nu(m) \succ_m \mu(m)\}} be the set of men better off under matching {\nu} than matching {\mu}. We show that for any {m \in R} and {w = \nu(m)} that {m' = \mu(w) \in R}. In other words, {S = \{w: \nu(w) \in R\} = \{w: \mu(w) \in R\}}. For a contradiction suppose {m' \not \in R}. In particular {m' \neq m_1}. Now, {w \succ_m \mu(m)} because {m \in R}, and stability of {\mu} wrt {\pi} implies {m' \succ_w m}. Stability of {\nu} wrt {\pi^1} requires that {\nu(m') \succ_{m'} w}, otherwise {(m',w)} would block {\nu}. Therefore, {m' \in R}.

Now, {\mu(w) \succ_w \nu(w)} for any {w \in S}, because during execution of the male-propose algorithm on {\pi}, each {w \in S} rejects {\nu(w) \in R} at some iteration. Let {m} be the last man in {R} to make a proposal during execution of male-propose algorithm. This proposal is made to {w = \mu(m) \in S} who, by choice of {m}, must have rejected {\nu(w)} at some earlier iteration. This means that when {m} proposes to {w}, she must reject a pending proposal from {m' \not \in R} such that {m' \succ_w \nu(w)}. As {m' \not \in R}, we have {w \succ_{m'} \mu(m') \succeq_{m'} \nu(m')}. Therefore, {(m', w)} form a blocking pair for {\nu} wrt {\pi^1} (since {m' \neq m_1}). {\Box}

The mechanism associated with the male propose algorithm is not strategy proof for the females.

2. A Lattice Formulation

We describe a proof (due to Adachi; Fleiner also independently but later) of the existence of stable matchings using Tarski’s fixed point theorem.

Call an assignment of women to men such that each man is assigned to at most one woman (but a woman may be assigned to more than one man) a male semi-matching. The analogous object for women will be called a female semi-matching. For example, assigning each man his first choice would be a male semi-matching. Assigning each woman her third choice would be an example of a female semi-matching.

A pair of male and female semi-matchings will be called a semi-matching denoted {\mu}, {\nu}, etc. An example of a semi-matching would consist of each man being assigned his first choice and each woman being assigned her last choice.

The woman assigned to man {m} under the semi-matching {\mu} will be denoted {\mu(m)}. If man {m} is assigned to no woman under {\mu}, then {\mu(m) = m}. Similarly for {\mu(w)}. Next, define a partial order over the set of semi-matchings. Write {\mu \succeq \nu} if

  1. {\mu(m) \succ_m \nu(m)} or {\mu(m) = \mu(m)} for all {m \in M}, and,
  2. {\mu(w) \prec_w \nu(w)} or {\mu(w) = \nu(w)} for all {w \in W}.

Therefore {\mu \succeq \nu} if all the men are better off under {\mu} than in {\nu} and all the women are worse off under {\mu} than in {\nu}.

Next, define the meet and join operations. Given two semi-matchings {\mu} and {\nu} define {\lambda = \mu \vee \nu} as follows:

  1. {\lambda(m) = \mu(m)} if {\mu(m) \succ_m \nu(m)} otherwise {\lambda (m) = \nu (m)},
  2. {\lambda(w) = \mu(w)} if {\mu(w) \prec_w \nu(w)} otherwise {\lambda (w) = \nu (w).}

Define {\lambda' = \mu \wedge \nu} as follows:

  1. {\lambda'(m) = \mu(m)} if {\mu(m) \prec_m \nu(m)} otherwise {\lambda (m) = \nu (m)},
  2. {\lambda(w) = \mu(w)} if {\mu(w) \succ_w \nu(w)} otherwise {\lambda (w) = \nu (w).}

With these definitions it is easy to check that the set of semi-matchings form a compact lattice.

Now define a function {f} on the set of semi-matchings that is non-decreasing. Given a semi-matching {\mu} define {f(\mu)} to be the following semi-matching:

  1. {f(\mu)(m)} is man {m}‘s most preferred woman from the set {\{w: m \succeq_w \mu(w) \} }. If this set is empty, set {f(\mu)(m) = m}.
  2. {f(\mu)(w)} is woman {w}‘s most preferred man from the set {\{m: w \succeq_m \mu(m)\}}. If this set is empty set {f(\mu)(w) = w}.

It is clear that {f} maps semi-matchings into semi-matchings.

Theorem 5 There is a semi-matching {\mu} such that {f(\mu) = \mu} and that {\mu} is a stable matching.

Proof: We use Tarski’s theorem. To apply it we need to check that {f} is non-decreasing. Suppose {\mu\succeq \nu}. Pick any {m \in M}. From the definition of {\succeq}, the women are worse off under {\mu} than in {\nu}. Thus

\displaystyle \{w: m \succ_w \nu(w)\} \subseteq \{w: m \succ_w \mu(w)\}

and so {f(\mu)(m) \succ_m f(\nu)(m)} or {f(\mu)(m) = f(\nu)(m).} When considering the women, for each {w \in W} we must have (because the women are worse off under {\mu} than under {\nu}):

\displaystyle \{m: w \succ_m \mu(m)\} \subseteq \{m: w \succ_m \nu(m)\}.

Therefore, {f(\nu)(w) \succ_w f(\mu)(w) }.

As the conditions of Tarski’s theorem hold, there is a semi-matching {\mu} such that {f(\mu) = \mu}. We show that the semi-matching is a stable matching.

By the definition of a semi-matching we have for every {m \in M}, {\mu(m)} single valued as is {\mu(w)} for all {w \in W}. To show that {\mu} is a matching, suppose not. Then there is a pair {m_1, m_2 \in M}, say, such that {\mu(m_1) = \mu(m_2) = w^*}. Since {f(\mu) = \mu} it follows that {w^*} is {m_1}‘s top ranked choice in {\{w: m_1 \succeq_w \mu(w) \}} and {m_2}‘s top ranked choice in {\{w: m_2 \succeq_w \mu(w) \}}. From this it follows that if {\mu(w^*) = m_3}, then, {m_1, m_2 \succeq _{w^*} m_3}. As {m_1 \neq m_2}, we can assume, WLOG that {m_2 \succ_{w^*} m_3}. However, {m_3 =\mu(w^*) = f(\mu^*)(w^*)} which is woman {w^*}‘s top ranked choice in {\{m: w^* \succeq_m \mu(m) \}}. Since {m_1} belongs to this set, we get a contradiction.

To show that {\mu} is stable, suppose not. Then, there is a blocking pair {(m^*,w^*)}. Let {w' = \mu(m^*)} and {m' = \mu(w^*)}, {m' \neq m^*} and {w^* \neq w'}. As {(m^*,w^*)} blocks {\mu}, {m^* \succ_{w^*} m'} and {w^* \succ_{m^*} w'}. Now {w' = \mu(m^*) = f(\mu)(m^*)} which is man {m^*}‘s top ranked choice from {\{w: m^* \succ_w \mu(w), m^* = \mu(w) \} }. But this set contains {w^*} which is ranked higher by man {m^*} than {w'}, a contradiction. {\Box}

2.1. Splittable Stable Marriage

To show how versatile Adachi’s lattice approach is, we examine an extension due to Sethuraman and Teo called the splittable stable marriage problem.

We have two disjoint agents, say,`sellers” (indexed by {i} or {m}) and “buyers” (indexed by {j} or {w}). Imagine a “market” for a product that is infinitely divisible. Seller {i} has {s_i \geq 0} units to sell; buyer {j} wishes to buy {t_j \geq 0} and the quantity of sale between {i} and {j} can be at most {u_{ij}} (“capacity” constraint). Each agent has a strict preference ordering over agents on the opposite side. For convenience assume each agent ranks themselves last. Assume that the number of buyers is equal to the number of sellers. In the sequel the set of sellers will be {\{m_1, m_2, \ldots, m_n\}}, and the set of buyers will be {\{w_1, w_2, \ldots, w_n\}}.

A splittable marriage is defined as an {(n+1) \times (n+1)} matrix {Z = (z_{ij})} such that:

\displaystyle  \sum_{i=1}^{n+1} z_{ij} = t_j\,\, \forall j

\displaystyle \sum_{j=1}^{n+1} z_{ij} = s_i, \,\, \forall i

\displaystyle z_{ij} \leq u_{ij}, \,\, \forall i, j

\displaystyle z_{ij} \geq 0, \,\, \forall i, j

\displaystyle z_{n+1, n+1} = 0

Here {z_{i,j}} is the quantity of the product sold by seller {i} to buyer {j}, if neither {i} nor {j} is {n+1}; if {i = n+1} and {j < n+1}, then {z_{ij}} is the additional quantity that buyer {j} would like to buy but cannot; and if {i < n+1} and {j = n+1}, then {z_{ij}} is the additional quantity that seller {i} would like to sell but cannot.

A blocking pair for a splittable marriage {Z} is a pair of agents {m_i} and {w_j} such that

  1. {z_{ij} < u_{ij}};
  2. {\sum_{k: k \geq_i j} z_{ik} < s_i}; and
  3. {\sum_{k: k \geq_j i} z_{kj} < t_j}.

If {(i,j)} blocks {Z}, the agents {i} and {j} can do better by trading more with each other, and giving up some of the quantities they transact with less-preferred agents. A splittable stable marriage is a splittable marriage that is not blocked.

Next we define the appropriate generalization of semi-matching. A demand-list for seller {i} is a vector {x^i \equiv (x_{i,1}, x_{i,2}, \ldots, x_{i,n}, x_{i,(n+1)})} that satisfies (1) and (3). Similarly, a demand-list for buyer {j} is a vector {y^j \equiv (y_{1,j}, y_{2,j}, \ldots, y_{n,j}, y_{(n+1),j})} that satisfies (2) and (3).

The demand-lists of sellers and buyers can be encoded by matrices {X} and {Y}. Assume matrices {X} and {Y} have their rows indexed by the sellers and the columns indexed by the buyers. Specifically, {X} has {n} rows and {(n+1)} columns, whereas {Y} has {(n+1)} rows and {n} columns.

The seller demand-list matrix {X} is a valid “matching” of the sellers, but not a valid matching of the sellers to the buyers. Similarly, for the buyer demand-list matrix {Y}.

Let {L_s} and {L_b} be a collection of seller and buyer demand-lists respectively. Define partial orders {\geq_s} and {\geq_b} on the sets {L_s} and {L_b} respectively. For any two elements {X, \hat{X} \in L_s},

\displaystyle X \geq_s \hat{X} \;\;\;\; {\rm if} \;\;\; \forall i, j,\;\; 		\sum_{k:k \geq_i j} x_{ik} \geq 		\sum_{k:k \geq_i j} \hat{x}_{ik}.

Similarly, for any two elements {Y, \hat{Y} \in L_b},

\displaystyle Y \geq_b \hat{Y} \;\;\;\; {\rm if} \;\;\; \forall i, j,\;\; 		\sum_{l:l \geq_j i} x_{lj} \leq 		\sum_{l:l \geq_j i} \hat{x}_{lj}.

(Notice the change in the direction of the inequality in the definition of {\geq_b}.)

Given any {X, \hat{X} \in L_s}, let {M \equiv X \wedge \hat{X}} be defined as follows:

\displaystyle M_{ij} \; = \; \min \{ 		\sum_{k:k \geq_i j} x_{ik}, 		\sum_{k:k \geq_i j} \hat{x}_{ik} \} - 		\sum_{k:k >_i j} M_{ik}.

Similarly, let {J \equiv X \vee \hat{X}} be defined as follows:

\displaystyle J_{ij} \; = \; \max \{ 		\sum_{k:k \geq_i j} x_{ik}, 		\sum_{k:k \geq_i j} \hat{x}_{ik} \} - 		\sum_{k:k >_i j} J_{ik}.

The matrices {M} and {J} can be computed as follows: for each seller, compute the entries corresponding to his row, starting with the most preferred buyer. Clearly, {M} and {J} are in {L_s}, and are, respectively, the greatest lower bound and the least upper bound of {X} and {\hat{X}}. Again, the meet and join of any two elements of {L_b} can be defined in a similar way. Thus {(L_s, \vee, \wedge)} and {(L_b, \vee, \wedge)} are lattices. By replacing “min” and “max” with “inf” and “sup” respectively, we can infer that meets and joins exist for arbitrary subsets of elements of {L_s} or {L_b}; hence {(L_s, \vee, \wedge)} and {(L_b, \vee, \wedge)} are complete lattices. The lattice of interest is the product lattice {L \equiv (L_s, L_b)} with the partial order {\geq} defined as follows:

\displaystyle (X, Y) \geq (\hat{X}, \hat{Y}) \;\;\; {\rm if} 	\;\; X \geq_s \hat{X} \; {\rm and} \; 	Y \geq_b \hat{Y}.

From standard results from lattice theory we see that {L} is a complete lattice.

Call demand-lists {X} and {Y} compatible if {X_{ij} = Y_{ij}} for all { 1 \leq i, j \leq n}. Any splittable stable marriage {Z} can be viewed as a pair of compatible demand-lists: for any seller {i} and buyer {j}, set {X_{ij}} and {Y_{ij}} to be {Z_{ij}}; allocate the remaining supply and unfulfilled demand to the last entries in each row and column of {X} and {Y} respectively. The converse is false.

Next we need an appropriate monotone function on the lattice {L}. As intuition, suppose the buyers form a coalition and “agree” to hand over their demand-list {Y} to the sellers. How should the sellers respond? Clearly, the sellers would “update” their demand-lists as follows:

\displaystyle   X_{ij} = \min \{ u_{ij}, \max (t_j - \sum_{l: l >_j i} Y_{lj}, 0),\; \max (s_i - \sum_{k: k >_i j} X_{ik}, 0) \}. \ \ \ \ \ (1)

Notice that {X_{.,.}} appears on both sides of (1). One way to perform this update is as follows: for each seller, update the entries corresponding to his row, starting with the most preferred buyer. In this manner, the “old” demand-list of {X} is never used in the update. We call this the canonical computation.

Similar considerations suggest:

\displaystyle   Y_{ij} = \min \{ u_{ij}, \max (t_j - \sum_{l: l >_j i} Y_{lj}, 0),\; \max (s_i - \sum_{k: k >_i j} X_{ik}, 0) \}. \ \ \ \ \ (2)

Again, Eq.(2) can be computed as follows: for each buyer, update the entries corresponding to his row, starting with the most preferred seller.

Equations (1) and (2) have identical right hand sides. Unsurprisingly, compatible demand-lists arising from solutions of Equations (1) and (2) correspond to splittable stable marriages.

Lemma 6 If {X} and {Y} are solutions to Equations (1) and (2), then {Z \equiv (X_{ij})} for {1 \leq i, j \leq n} defines a splittable stable marriage.

Proof: By the hypothesis of the lemma, {X} and {Y} are identical. Thus,

\displaystyle  X_{ij} = \min \bigg\{ u_{ij}, \; \max\bigg(t_j - \sum_{l: l >_j i} X_{lj}, 0\bigg),\; \max\bigg(s_i - \sum_{k: k >_i j} X_{ik}, 0\bigg) \bigg\}.

So if {X_{ij} < u_{ij}}, either {\sum_{l: l \geq_j i} X_{lj} \; = \; t_j} or {\sum_{k: k \geq_i j} X_{ik} \; = \; s_i}. {\Box}

We next show that the computations of Eqs.(1) and (2) define monotone maps.

Theorem 7 For {Y \in L_b}, let {f(Y)} be the seller demand-list obtained from Eq.~(1). Then,

\displaystyle Y \geq_b \hat{Y} \;\; \Rightarrow \;\; f(Y) \geq_s f(\hat{Y}).

Proof: Consider seller {i} and buyer {j}. Since {Y \geq_b \hat{Y}}, by definition,

\displaystyle  \sum_{l:l \geq_j i} y_{lj} \; \leq \; 		\sum_{l:l \geq_j i} \hat{y}_{lj}.


\displaystyle  t_j - \sum_{l:l \geq_j i} y_{lj} \; \geq \; 		t_j - \sum_{l:l \geq_j i} \hat{y}_{lj}.

If {j} is {i}‘s most preferred buyer, {(f(Y))_{ij} \geq_i (f(\hat{Y}))_{ij}} follows from the definition. Inductively, we can argue along these lines that {f(Y) \geq_s f(\hat{Y})}. {\Box}


Theorem 8 For {X \in L_s}, let {g(X)} be the buyer demand-list obtained from Eq.~(2). Then,

\displaystyle X \geq_s \hat{X} \;\; \Rightarrow \;\; g(X) \geq_b g(\hat{X}).

Consider the function {h} defined on {L} as follows:

\displaystyle h(X,Y) := (g(Y), f(X)).

We claim that {h} is monotone. Why? Let {(X, Y) \geq (\hat{X}, \hat{Y})}. Then {Y \geq_b \hat{Y}}, and so {g(Y) \geq_s g(\hat{Y})}; similarly, {X \geq_s \hat{X}}, and so {f(X) \geq_b f(\hat{X})}. Thus {h(X,Y) \geq h(\hat{X}, \hat{Y})}. By Tarski’s theorem, {h} has a fixed point. This fixed point is clearly a splittable stable marriage.

One can generalize even further. Let {f} and {g} be two monotone integer valued submodular functions. Define a splittable matching to be any {z \in P(f) \cap P(g)}. Preferences would defined similarly. Each buyer seeks to get at much as possible from their most preferred seller subject to feasibility etc.

3. An LP Formulation

Vandevate identified a collection of linear inequalities that describe the convex hull of stable matchings. For each man {m} and woman {w} let {x_{mw} = 1} if man {m} is matched with woman {w} and zero otherwise.

Then, every stable matching must satisfy the following.

\displaystyle  \sum_{w \in W}x_{mw} = 1\,\, \forall m \in M

\displaystyle \sum_{m \in M}x_{mw} =1\,\, \forall w \in W

\displaystyle \sum_{j \prec_m w}x_{mj} + \sum_{i \prec_w m}x_{iw} + x_{mw} \leq 1\,\, \forall m \in M, w \in W

\displaystyle x_{mw} \geq 0 \,\, \forall m \in M, w \in W

Let {P} be the polyhedron defined by these inequalities.

The first two constraints of {P} ensure that each agent is matched with exactly one other agent of the opposite sex. The third constraint ensures stability by ruling out blocking pairs (call it the blocking constraint). To see why, suppose {\sum_{j \prec_m w}x_{mj} = 1} and {\sum_{i \prec_w m} x_{iw} = 1}. Then man {m} is matched to a woman, {j} that he ranks below {w}. Similarly, woman {w} is matched to a man she ranks below {m}. This would make the pair {(m,w)} a blocking pair.

Lemma 9 Suppose {P \neq \emptyset}. Let {x \in P}. Then, {x_{ij} > 0} implies that

\displaystyle \sum_{j \prec_m w}x_{mj} + \sum_{i \prec_w m}x_{iw} + x_{mw} = 1.

Proof: Consider {\min \{\sum_i \sum_j x_{ij}: x \in P\}}.The dual to this program is

\displaystyle  \max \sum_{i \in M}\alpha_i + \sum_{j \in W}\beta_j - \sum_{i \in M}\sum_{j \in W} \nu_{ij}

subject to

\displaystyle  \alpha_i + \beta_j - \sum_{k \in W: k \succeq_{i} j}\nu_{ik} - \sum_{k \in M: k \succ_{j} i}\nu_{kj} \leq 1\,\, \forall\,\, i \in M \,\,,j \in W

\displaystyle \nu_{ij} \geq 0\,\, \forall\,\, i \in M \,\,,j \in W

Set {\alpha_i = \sum_{j \in W}\nu_{ij}} and {\beta_j = \sum_{i \in M}\nu_{ij}}. Substituting this into the dual constraints yields:

\displaystyle  \sum_{k \in W:k \prec_i j}\nu_{ik} + \sum_{k \in M:k \prec_j i}\nu_{kj} + \nu_{ij} \leq 1\,\, \forall\,\, i \in M \,\,,j \in W.

Choose any {x^* \in P} and set {\nu_{ij} = x^*_{ij}}. This choice of {\nu} is clearly dual feasible. It has an objective function value of {\sum_{i \in M}\sum_{j \in W}x^*_{ij}} and so, is dual optimal. The lemma now follows by complementary slackness.{\Box}

The proof below is due to Sethuraman and Teo.

Theorem 10 Suppose {P \neq \emptyset}. Then, {P} is the convex hull of stable matchings.

Proof: Choose any weight vector {\{c_{ij}\}_{i \in M, j \in W}} and let

\displaystyle x^* \in \arg \max \{\sum_i \sum_j c_{ij}x_{ij}: x \in P\}.

With each member of {M \cup W} we associate an interval {(0,1]}. For each {i \in M}, partition the associated interval {(0,1]_i} into subintervals of length {x^*_{ij}} for all {j \in W}. Arrange these subintervals left to right by by man {i}‘s decreasing preference over {W}. For each woman {j \in W}, partition the associated interval {(0,1]_j} into subintervals of length {x^*_{ij}} for all {i \in M}. Arrange these subintervals left to right by by woman {j}‘s increasing preference over {M}.

Lemma 6 means the subinterval spanned by {x^*_{ij}} in {(0,1]_i} and {(0,1]_j} coincides. Pick a random number {U} uniformly in (0, 1] and construct a semi-matching in the following way:

  1. Match {i \in M} to {k \in W} if {U} lies in the subinterval of {(0,1]_i} spanned by {x^*_{ik}}.
  2. Match {j \in W} to {i \in M} if {U} lies in the subinterval of {(0,1]_j} spanned by {x^*_{ij}}.

By Lemma 6, {i \in M} is matched to {j \in W} iff {j \in W} is matched to {i \in M}. Furthermore, no two men can be matched to the same woman, and similarly no two women can be matched to the same man. So the above semi-matching gives rise to a perfect matching. To show this matching is stable, consider man {i} who is matched to {j \in W} but prefers {k \in W}. Then, in {(0,1]_i}, the subinterval corresponding to {x^*_{ik}} is to the left of the subinterval corresponding to {x^*_{ij}}. Because {i \in M} is matched to {k \in W}, it means that {U} is to the right of the subinterval corresponding to {x^*_{ik}} in {(0,1]_i}. Therefore, {U} is to the left of the subinterval corresponding to {x^*_{ik}} in {(0,1]_j}. In other words, {j \in W} is matched to someone she prefers to {i \in M}. Set {X^U_{ij} = 1} iff man {i} is matched to woman {j} by the randomized scheme above. Then

\displaystyle E(\sum_{i \in M}\sum_{j \in W}c_{ij}X^U_{ij}) = \sum_{i \in M}\sum_{j \in W}c_{ij}E(X^U_{ij})) = \sum_{i \in M}\sum_{j \in W}c_{ij}x^*_{ij}.

Thus, we have exhibited a probability distribution over stable matchings whose expected objective function value coincides with {c \cdot x^*}. It follows then, that there is a stable matching with objective function value {c \cdot x^*}.{\Box}

3.1. A Subgradient Algorithm

The LP approach requires that we know that {P \neq \emptyset}. It would be nice to verify this independently of the proposal algorithm. Here I will outline (not warranted correct) how one can show directly from the LP that {P \neq \emptyset}. If woman { j} is man { i}‘s first choice set { r_{ij} = n}. If woman { j} is man { i}‘s second choice set { r_{ij} = n-1} and so on. Consider:

\displaystyle \max \{\sum_{m \in M}\sum_{w \in W}r_{mw}x_{mw}: x \in P\}.

Let {\nu} be the multiplier associated with the blocking constraints and {p} the multiplier associated with the constraint {\sum_{w \in W}x_{mw} = 1} for all {m}. Consider the following Lagrangean relaxation of the optimization problem:

\displaystyle  \max \sum_{m \in M}\sum_{w \in W} [r_{mw} - p_w - \nu_{mw} - \sum_{k \in W:k \succ_m w } \nu_{mk} - \sum_{k \in M: k \succ_w m} \nu_{km}]x_{mw}

subject to

\displaystyle \sum_{w \in W}x_{mw} = 1\,\, \forall m \in M

\displaystyle  x_{mw} \geq 0\,\, \forall (m,w)

This is easy to solve. For each man {m \in M} choose the woman {w \in W} that maximizes

\displaystyle r_{mw} - p_w - \nu_{mw} - \sum_{k \in W:k \succ_m w } \nu_{mk} - \sum_{k \in M: k \succ_w m} \nu_{km}.

In case of a tie, chooses the top ranked woman.

Initially set {(p, \nu) = 0}. Let {(p^t, \nu^t)} be the value of the multipliers at start of iteration {t}. Denote by {x^t} the optimal choice of {x} given the multipliers {(p^t, \nu^t)}. If {x^t_{mw} = 1} we will say that man {m} proposed to woman {w}. Given {x^t} we adjust {(p^t, \nu^t)} by {(p, \nu)} as follows:

  1. Set {p_j = 0}
  2. Set {\nu_{ij} = x^t_{ij}} for all {i,j}.
  3. Set {(p^{t+1}, \nu^{t+1}) = (p^t, \nu^t) + (p, \nu)}
  4. Set {r^{t+1}_{mw} = r^t_{mw} - x^t_{mw} - \sum_{k \in W:k \succ_m w } x^t_{mk} - \sum_{k \in M: k \succ_w m} x^t_{km}.}

Stop once {x^t} is an assignment. By complementary slackness this must be an optimal stable assignment. Call {r^t} the dual adjusted rank.

Suppose at iteration {t}, {x^t_{mw} = 1}.

Case 1 If among all {i \in M} such that {x^t_{iw} = 1}, man {m} is woman {w}‘s top ranked choice, {r^{t+1}_{mw} = r^t_{mw} - 1}.
Because {i} is woman {w}‘s top ranked choice it follows that {\sum_{k \in M: k \succ_w m} x^t_{km} = 0}. Because {x^t_{mw} = 1} it follows that {\sum_{k \in W:k \succ_m w } x^t_{mk} = 0}.

Case 2 If among all {i \in M} such that {x^t_{iw} = 1}, man {m} is not woman {w}‘s top ranked choice, {r^{t+1}_{mw} \leq r^t_{mw} - 2}.
Because {i} is not woman {w}‘s top ranked choice it follows that {\sum_{k \in M: k \succ_w m} x^t_{km} \geq 1}. Because {x^t_{mw} = 1} it follows that {\sum_{k \in W:k \succ_m w } x^t_{mk} = 0}.

Case 3 If {w \succ_m j} and {x^t_{kj} = 0} for all {k \in M} such that {k \succ_j m}, then {r^{t+1}_{mj} = r^t_{mj}-1}.
Clearly {\sum_{k \in M: k \succ_w m} x^t_{km} = 0} and {x^t_{mj} = 0}. Also, {\sum_{k \in W:k \succ_m w } x^t_{mk} = 1}.

Case 4 If {w \succ_m j} and {x^t_{kj} = 1} for at least one {k \in M} such that {k \succ_j m}, then {r^{t+1}_{mj} \leq r^t_{mj} - 2}.
Clearly {\sum_{k \in M: k \succ_w m} x^t_{km} \geq 1} and {x^t_{mj} = 0}. Also, {\sum_{k \in W:k \succ_m w } x^t_{mk} = 1}.

Case 5 If {j \succ_m w} then, then {r^{t+1}_{mj} = r^t_{mj} - \sum_{k \in M: k \succ_j m} x^t_{km}}.
Clearly, {x^t_{mj} = 0}. Also, {\sum_{k \in W:k \succ_m w } x^t_{mk} = 0}.

Let {P^t = \{w \in W: \sum_{m \in M}x_{mw} \geq 1\}}. I show first that {P^t \subseteq P^{t+1}}. Consider first the case {t=0}. Suppose man {m} proposed to woman {w} and man {m} is woman {w}‘s top choice among those who have proposed. Then the dual adjusted rank of woman {w} by man {m} declines by exactly 1 (see case 1). The dual adjusted rank of all other women decline by at least 1 (case 2 to 4). Note that case 5 does not apply in this iteration. Therefore, in the next iteration man {m} will continue to propose to woman {w}. More generally, any woman who receives a proposal in iteration {0} continues to receive a proposal in iteration {1}. Suppose this holds until iteration {t}. Suppose man {m} proposes to woman {w} in iteration {t} and man {m} is woman {w}‘s top choice among those who have proposed at iteration {t}. By cases (1-4) the dual adjusted rank for all woman ranked below {w} declines by at least 1. Consider a woman {j \succ_m w}. By case 5

\displaystyle r^{t+1}_{mj} = r^t_{mj} - \sum_{k \in M: k \succ_j m} x^t_{km}.

However, {\sum_{k \in M: k \succ_j m} x^t_{km} \geq 1} because any woman who received a proposal before iteration {t} continues to receive a proposal at iteration {t}. Thus the dual adjusted rank of all women ranked above {w} decline by at least 1. Because man {m} did not proposes to these women in iteration {t} their dual adjusted rank must be at least 2 smaller than {r^t_{mw}}. Thus, woman {w} continues to have the highest dual adjusted rank for man {m} and at iteration {t+1}, man {m} will continue to propose to woman {w}.

Next, I show there must be some {t} such that {P^t = W}. If not, then, it must be the case that {P^t = P^{t+1} = \ldots \subset W}. Because {P^t \subset W} there must be a woman {w} who is overdemanded. Consider a man {m} who has proposed to her at iteration {t} who is not her top choice among those proposing to her. Notice the following:

  1. By case 2, {r^{t+1}_{mw} \leq r^t_{mw} - 2.}
  2. By case 5 for all {j \in P^t} such that {w \succ_m j}, {r^{t+1}_{mj} \leq r^t_{mj} -2}.
  3. By case 5, for all {j \in P^t} such that {j \succ_m w}, {r^{t+1}_{mj} \leq r^t_{mj} -1}.
  4. By case 3, for all {j \not \in P^t}, {r^{t+1}_{mj} = r^t_{mj} - 1}. Note, for all such {j} it must be that {w \succ_m j}.

To summarize, the dual adjusted rank of all women {j \in P^t} such that {w \succeq_m j}, goes down by at least 2. That means in all subsequent iterations, man {m} must propose to a woman ranked at or above {w} in {P^t}. Eventually, there is a woman, {w' \in P^t}, say, that man {m} keeps proposing to. This means that the dual adjusted rank of all women in {P^t} declines by at least 2 while the dual adjusted rank of all women outside of {P^t} declines by exactly 1. Eventually, some woman outside of {P^t} must have a dual adjusted rank that is largest and man {m} will propose to her, contradiction.

A mathematical detour today. Will state and prove some basic results in combinatorial optimization that have applications in matching and auction theory. The first is the max-flow-min-cut theorem. I’ll use it to prove Hall’s marriage theorem as well as derive the Border-Matthews characterization of implementable interim allocation rules. It has other applications. Itai Sher, for example, uses it in the analysis of a special case of the Glazer-Rubinstein model of persuasion. Mobius et al use it in their study of capital flows in social networks (the Peruvian village paper). Olszewski and myself have a forthcoming paper where we use it to talk about team formation.

Then, its on to polymatroids which can be used to show the assignment problem retains its integrality property in the presence of additional constraints. These additional constraints arise in, for example, school choice problems. See the paper by Budish, Che, Kojima and Milgrom for a list of examples.

1. The Maximum Flow Problem

Let {G= (V,E)} be a network with vertex set {V} and edge set {E}. Assume, for convenience, that edges are `bi-directional’. In {V} there are two special vertices. One called {s}, the source and the other {t}, the sink. Each edge {(i,j) \in E} has a capacity {c_{ij}} that is the maximum flow that can traverse {(i,j)} (in either direction). Assume {c_{ij}} is integral for all {(i,j) \in E}. To avoid trivialities assume at least one path in {G} from {s} to {t}.

The maximum flow problem is to find the maximum amount of flow that can be sent from {s}, through the network, to {t}. That there is only source-sink pair is not a restriction. An instance with multiple sources and sinks can be converted into one with a single {s-t} pair through the introduction of a dummy super source and super sink.

A cut in {G} is a set of edges whose removal disconnects {s} from {t}. Equivalently, a cut is a set of edges that intersects every path from {s} to {t} in {G}. A cut can also be defined in terms of a partition of {V} into two sets {S} and {T} such that {s \in S} and {t \in T}. A cut would be the set of edges with one endpoint in {S} and the other in {T}. The capacity of a cut is the sum of the capacities of the edges in the the cut.

The maximum {s-t} flow is clearly bounded above by the capacity of any cut. Indeed, the max {s-t} flow must be bounded above by the minimum capacity cut. Remarkably, this upper bound is tight, i.e., the max flow is the min cut. Note that the min cut must be integer valued but it is not obvious that the maximum flow should be integer valued.

We can formulate the problem of finding a maximum flow as a linear program. To do this, let {x_{ij}} denote the amount of flow on edge {(i,j)} from {i} to {j}. Then, the maximum flow problem can be expressed as follows:

\displaystyle  \max f

subject to

\displaystyle  f - \sum_{(s,j) \in E}x_{sj} = 0

\displaystyle \sum_{i \in V: (i,j) \in E}x_{ij} - \sum_{k \in V: (j,k) \in E}x_{jk} = 0 \quad \forall j \in V\setminus\{s,t\}

\displaystyle  -f + \sum_{(i,t) \in E}x_{it} = 0

\displaystyle 0 \leq x_{ij} \leq c_{ij}\quad, \forall (i,j) \in E

The first constraint says that the flow, {f}, out of {s} must use edges of the form {(s,j)}. The second constraint imposes flow conservation in each vertex, i.e., flow in must equal flow out. The third constraint ensures that the total flow {f} entering {t} must use edges of the form {(j,t)}.

The constraint matrix is a network matrix and so TUM. You can see this by noting that each variable {x_{ij}} appears in at most two constraints with opposite sign. Once as an `incoming’ edge and once as an `outgoing’ edge. The variable {f} also appears exactly twice with opposite sign.

TUM of the constraint matrix means that the flow on each edge and as well as the amount of flow is integer valued.

Theorem 1 The maximum flow is equal to the capacity of the minimum cut.

Proof: Let {f^*} be the value of the maximum flow. By LP duality:

\displaystyle  f^* = \min \sum_{(i,j) \in E}c_{ij}z_{ij}

\displaystyle  z_{ij} \geq |y_i - y_j| \quad \forall (i,j) \in E

\displaystyle y_s - y_t = 1

\displaystyle  z_{ij} \geq 0 \quad \forall (i,j) \in E

To see where it came from, observe that we have one dual variable {y_i} for each {i \in V} and one variable {z_{ij}} for each {(i,j) \in E}. Recall the bidirectional assumption. So, {z_{ij}} is different from {z_{ji}}. Thus one should have two constraints for each pair {(i,j)}:

\displaystyle z_{ij} \geq y_i - y_j

\displaystyle z_{ji} \geq y_j - y_i

I have simply folded these into a single constraint {z_{ij} \geq |y_i - y_j|}. Notice that at optimality at most one of {z_{ij}} and {z_{ji}} can be positive which is why this folding is legitimate.

Notice also, that we can WLOG fix {y_s =1} and {y_t = 0}. This allows us to right the dual as:

\displaystyle  f^* = \min \sum_{(i,j) \in E}c_{ij}z_{ij}

\displaystyle z_{ij} \geq |y_i - y_j| \quad \forall (i,j) \in E

\displaystyle y_s = 1

\displaystyle  y_t = 0

\displaystyle  z_{ij} \geq 0 \quad \forall (i,j) \in E

By TUM of the constraint matrix, we know that an optimal solution to this program that is integral exists. An integer solution to this program corresponds to a cut. Let {S = \{i: y_i =1\}} and {T = \{i: y_i = 0\}}. Observe

\displaystyle f^* = \sum_{i \in S, j \in T}c_{ij},

which is the capacity of the cut associated with the partition. It is easy to see that every cut corresponds to a feasible integer solution to the program.{\Box}

I can’t resist offering a probabilistic proof due to Bertsimas, Teo and myself. Let {(y^*, z^*)} be an optimal solution to the dual program. Suppose, for a contradiction that it is fractional. Consider the following randomization scheme:

  1. Position node {i} in {[0,1]} according to the value of {y^*_i}.
  2. Generate a single random variable {U} uniformly distributed in {[0,1]}. Round all nodes {i} with {y^*_i \leq U} to {y_i=0} and all nodes {i} with {y^*_i > U} to {_i=1}.
  3. Set {z_{ij} = |y_i-y_j|}.

The randomization scheme clearly returns a feasible cut. More importantly the randomization scheme defines a probability distribution over the set of cuts. Now lets compute the expected capacity of the cut generated by the randomization scheme:

\displaystyle \sum_{(i,j) \in E}c_{ij} P\{\min (y^*_i, y^*_j) \leq U \leq \max (y^*_i, y^*_j)\}

\displaystyle = \sum_{(i,j) \in E}c_{ij}|y^*_i - y^*_j| = \sum_{(i,j) \in E}c_{ij}z^*_{ij}.

Thus, the expected capacity of our random cut is the same as the optimal objective function value. Because the randomization scheme is a distribution over cuts, it follows that there must be a cut with capacity equal to {f^*}.

1.1. Hall Marriage Theorem

Recall, that in the DGS auction from the previous session, an important step relied on the Hall Marriage theorem. Here, I will use maxflow-mincut to prove it.

Let {(V_1, V_2, E)} be a bipartite graph with vertices {V_1 \cup V_2} and edge set {E}. Suppose {V_1| = |V_2| = n}. Note edges run between {V_1} and {V_2} only. A perfect matching is a subset of edges {M} such that each vertex is incident to exactly one edge in {M}. A matching is a subset of edges {M} such that each vertex is incident to at most one edge in {M}.

For any {S \subseteq V_1}, let {N(S) \subseteq V_2} be the set of neighbors of {S}. That is {N(S) = \{j \in V_2: (i,j) \in E,\,\, i \in S\}}.

Theorem 2 A bipartite graph, {(V_1, V_2, E)} contains a perfect matching iff. { |S| \leq |N(S)|} for all {S \subseteq V_1}.

Proof: This is clearly necessary. So, suppose { |S| \leq |N(S)|} for all {S \subseteq V_1} and for a contradiction assume that the graph does not contain a perfect matching.

Introduce a dummy source vertex {s} with an edge directed to each vertex in {V_1}. To each edge {(s,i)} with {i \in V_1}, assign a capacity of 1 unit. Introduce a dummy sink {t} with an edge directed from each vertex of {V_2} to {t}. To each edge of the form {(j,t)} assign a capacity of 1 unit. To each edge in {E} assign a capacity of {\infty}.

Consider now a maximum flow from {s} to {t}. By Theorem 1 we can assume the flow is integral, i.e., the flow on each edge is an integer amount. Second, it is easy to see that no edge will contain more than 1 unit of flow. Third, the set of edges, {M} in {E} that carry a non-zero amount of flow must be a matching. Why? No more than one unit of flow can leave any {i \in V_1} and no more than one unit of flow can enter an {j \in V_2}. Fourth, each matching corresponds to a feasible flow. Thus, the value of the maximum flow gives us the size of a maximum cardinality matching. Because we assumed no perfect matching, the value of the maximum flow must be less than {n}. i.e., {|M| < n}.

Now a maximum flow must equal the value of a minimum cut. Clearly, no minimum cut can contain edges in {E} because they have infinite capacity. So, the minimum cut must consist of edges of the form {(s,i)} and {(j,t)}. Let {A_1 \subseteq V_1} and {A_2 \subseteq V_2} be the set of vertices incident to the edges in a minimum cut. The capacity of this cut will be {|A_1| + |A_2| < n}. Because they form a cut, removing them from the graph would delete all paths from {s} to {t}. Do that.

Consider any {i \in V_1 \setminus A_1}. There cannot be a {j \in V_2 \setminus A_2} such that {(i,j) \in E}. If not, there would be a path {s \rightarrow i \rightarrow j \rightarrow t}. That means every neighbor of a vertex {i ]\in V_1 \setminus A_1} resides in {A_2}. Hence, by assumption:

\displaystyle |A_2| \geq |V_1 \setminus A_1| = n- |A_1| \rightarrow |A_1| + |A_2| \geq n,

a contradiction. {\Box}

The Hall marriage theorem is easily generalized to something called Gale’s demand theorem. Suppose each vertex in {i \in V_1} is a demand vertex, demanding {d_i} units of a homogenous good. Each vertex {j \in V_2} is a supply vertex, supplying {s_j} units of that same good. Supply can be shipped to demand nodes only along the edges in {E}. Is there a feasible way to match supply with demand? Yes, iff. for all {S \subseteq V_1} we have

\displaystyle \sum_{i \in S}d_i \leq \sum_{j \in N(S)}s_j.

1.2. Border’s Theorem

For simplicity assume two agents, {T} be the type space of agent 1, {S} the type space of agent 2 and a single good. Suppose, also for simplicity, that types are independent draws. Denote by {f_i(x)} the probability of agent {i} having type {x}. Let {a_i(t,s)} the probability that the good is allocated to agent {i} when agent 1 reports {t} and agent 2 reports {s}. Feasibility requires that

\displaystyle a_1(t, s) + a_2(t, s) \leq 1 \,\, \forall t, s

\displaystyle a_i(t,s) \geq 0\,\, \forall i,\,\, \forall t, s

{a} is an allocation rule. Denote by {q_1(t)} the interim allocation probability to agent 1when she reports {t}. Note that

\displaystyle q_1(t) = \sum_{s \in S}a_1(t,s)f_2(s)

A similar expression applies for {q_2(s)}. Call an interim allocation probability implementable if there exists an allocation rule that corresponds to it. Here is the question posed originally posed by Steve Matthews (who conjectured the answer): characterize the implementable {q}‘s. The high level idea first. Roll up all the inequalities relating {a} to {q} and feasibility into matrix form:

\displaystyle {\cal I}q - {\cal P}a = 0

\displaystyle {\cal A}a \leq 1

\displaystyle a \geq 0

Denote by {F} the set of {(q,a)} that satisfy the system above. The purely mathematical question is this: for which {q}‘s does there exist at least one {a} such that {(q,a) \in K}? In other words, we want to identify the set {K}:

\displaystyle K = \{q: \exists a\,\, with\,\, (q,a) \in F\}.

The set {K} is called the reduced form. Geometrically, {K} is just the projection of {F} onto the {q}-space. By the Farkas lemma, the set {K} can be characterized in the following way. Let {C = \{(u,v): v \geq 0, -u{\cal P} + v{\cal A} = 0\}}. {C} is a cone. Then,

\displaystyle K = \{q: u \cdot q \leq v \cdot 1\,\, \forall (u,v) \in C\}.

It would seem that {K} is described by a continuum of inequalities. However, because {C} is a cone, it suffices to consider only the generators of the cone {C} and there are finitely many of them. The challenge now reduces to identifying the generators of {C}. When the matrices {{\cal A}} and {{\cal P}} have suitable structure this is possible and is the case here. For convenience assume {T = \{t, t'\}} and {S = \{s,s'\}}. Here are all the inequalities:

\displaystyle  a_1(t,s) + a_2(t,s) \leq 1

\displaystyle  a_1(t',s) + a_2(t',s) \leq 1

\displaystyle  a_1(t',s') + a_2(t',s') \leq 1

\displaystyle  a_1(t,s') + a_2(t,s') \leq 1

\displaystyle  a_1(t,s)f_2(s) + a_1(t,s')f_2(s') = q_1(t)

\displaystyle  a_1(t',s)f_2(s) + a_1(t',s')f_2(s') = q_1(t')

\displaystyle  a_2(t,s)f_1(t) + a_2(t',s)f_1(t') = q_2(s)

\displaystyle  a_2(t,s')f_1(t) + a_2(t',s')f_1(t') = q_2(s')

Now some appropriate scaling:

\displaystyle  f_1(t)f_2(s) a_1(t,s) + f_1(t)f_2(s)a_2(t,s) \leq f_1(t)f_2(s)

\displaystyle  f_1(t')f_2(s)a_1(t',s) + f_1(t')f_2(s)a_2(t',s) \leq f_1(t')f_2(s)

\displaystyle  f_1(t')f_2(s')a_1(t',s') + f_1(t')f_2(s')a_2(t',s') \leq f_1(t')f_2(s')

\displaystyle  f_1(t)f_2(s')a_1(t,s') + f_1(t)f_2(s')a_2(t,s') \leq f_1(t)f_2(s')

\displaystyle  a_1(t,s)f_1(t)f_2(s) + a_1(t,s')f_1(t)f_2(s') = f_1(t)q_1(t)

\displaystyle  a_1(t',s)f_1(t')f_2(s) + a_1(t',s')f_1(t')f_2(s') = f_1(t')q_1(t')

\displaystyle  a_2(t,s)f_1(t)f_2(s) + a_2(t',s)f_1(t')f_2(s) = f_2(s)q_2(s)

\displaystyle  a_2(t,s')f_1(t)f_2(s') + a_2(t',s')f_1(t')f_2(s') = f_2(s')q_2(s')

Now a change of variables to make things nice and neat: {x_i(u,v) = a_i(u,v)f_1(u)f_2(v)}.

\displaystyle  x_1(t,s) + x_2(t,s) \leq f_1(t)f_2(s)

\displaystyle x_1(t',s) + x_2(t',s) \leq f_1(t')f_2(s)

\displaystyle  x_1(t',s') + x_2(t',s') \leq f_1(t')f_2(s')

\displaystyle  x_1(t,s') + x_2(t,s') \leq f_1(t)f_2(s')

\displaystyle  x_1(t,s)+ x_1(t,s') = f_1(t)q_1(t)

\displaystyle  x_1(t',s) + x_1(t',s') = f_1(t')q_1(t')

\displaystyle  x_2(t,s)+ x_2(t',s) = f_2(s)q_2(s)

\displaystyle  x_2(t,s') + x_2(t',s') = f_2(s')q_2(s')

This system corresponds to the flow balance conditions for a flow problem on a bipartite graph. For each profile of types {(u,v)} introduce a supply vertex with supply {f_1(u)f_2(v)}. For each {u \in T} introduce a demand vertex with demand {f_1(t)q_1(t)}. From each supply vertex of the form {(u,v)}, introduce two edges. One directed from {(u,v)} to demand vertex {f_1(u)q_1(t)}; the flow on this edge is {x_1(u,v)}. The other edge directed into demand vertex {f_2(v)q_2(v)}; the flow on this edge is {x_2(u,v)}.

The first four inequalities say that the flow out of a supply vertex cannot exceed its available supply. The second four ensure that the flow into a demand vertex matches the demand.

Using this flow interpretation we can use Gale’s generalization of the Hall marriage theorem to determine when a feasible flow exists. Existence of a feasible flow means that {q} is implementable. Let {A \subseteq T} and {B \subseteq S}. Then, for all such {A} and {B},

\displaystyle \sum_{t \in A}f_1(t)q_1(t) + \sum_{s \in B}f_2(t)q_2(s) \leq \sum_{t \in A}f_1(t)[\sum_{s \in S}f_2(s)] + \sum_{s \in B}f_2(s)[\sum_{t \in T}f_1(t)] = \sum_{t \in A}f_1(t) + \sum_{s \in B}f_2(s)

is necessary and sufficient for {q} to be implementable. The left hand side of this inequality is the expected quantity of the good allocated to agents when their types come from {A} and {B}. The right hand side is the probability that at least one agent has a type in {A \cup B}. In fact we can rewrite this inequality to read

\displaystyle \sum_{t \in A}f_1(t)q_1(t) + \sum_{s \in B}f_2(t)q_2(s) \leq 1 - \sum_{(u,v) \in A^c \times B^c}f_1(u)f_2(v).

Here, {A^c} and {B^c} are the complements of {A} and {B}.

As you will see later one can generalize this analysis to cases where the allocation rule, {a} must satisfy additional constraints. For details see Che, Kim and Mierendorff as well as Federgruen and Groenvelt.

2. Polymatroids

Let {E} be a ground set. A real valued function {f} defined on subsets of {E} is

  • non-decreasing if {S \subseteq T \Rightarrow f(S) \leq f(T)}, and
  • {f} is submodular if {\forall S, T \subset E}

    \displaystyle f(S) + f(T) \geq f(S \cup T) + f(S \cap T).

  • {f} is supermodular if {-f} is submodular.

Let {f} be a submodular function on {E}. The polytope:

\displaystyle P(f) = \{x \in \Re^n_+: \sum_{j \in S}x_j \leq f(S)\quad \forall S \subseteq E\}

is the polymatroid associated with {(E,f)}. Notice that {P(f) \neq \emptyset} iff {f(S) \geq 0\,\, \forall S \subseteq E}. Polymatroids arise in a number of economic settings. One that we have just discussed are the Border inequalities. Suppose a single object to allocate among {n} bidders whose types are independent draws from the same distribution (with density {\pi}). Restrict attention to allocation rules that do not depend on the identity of the agent. Let {q(t)} be the interim allocation probability if an agent reports type {t}. Then, {q} is implementable iff.

\displaystyle n \sum_{t \in S}\pi(t)q(t) \leq 1 - \Pi_{t \not \in S}\pi(t)\,\, \forall S \subseteq T.

Call the right hand side of the above {f(S)} and consider the change of variables {x_t = n\pi(t)q(t)}:

\displaystyle \sum_{t \in S}x_t \leq f(S)\,\, \forall S \subseteq T.

Now {f} is a monotone polymatroid. So, the feasible region described by the Border inequalities is a polymatroid. The theorem below tells us that every extremal interim allocation probability can be implemented as a priority rule. That is, types are assigned priorities and in any profile the good is allocated to the agent with the highest priority. This connection to priority rules is useful for analyzing optimal auctions in dynamic settings (see my survey paper on dynamic mechanism design) as well as competition between auctions (see Pai’s jmp).

Theorem 3 Let {f} be a non-decreasing, integer valued, submodular function on {E} with {f( \emptyset) =0}. Then, the polymatroid {P(f)} has all integral extreme points.

Proof: Choose a weight vector {c} and order the elements of {E} by decreasing weight:

\displaystyle c_1 \geq c_2 \ldots \geq c_k \geq 0 > c_{k+1} \ldots \geq c_n.

let {S^0 = \emptyset} and {S^j = \{1, 2, \ldots, j\}} for all {j \in E}. Consider the vector {x} obtained by the following greedy procedure:

  1. {x_j = f(S^j) - f(S^{j-1})} for {1 \leq j \leq k}
  2. {x_j = 0} for {j \geq k+1}.

I show that {x \in \arg \max\{cz : z \in P(f)\}.} Note {x} is integral and non-negative because {f} is non-decreasing. To verify that {x \in P(f)} observe:

\displaystyle  \sum_{j \in T}x_j = \sum_{j \in T\cap S^k}[f(S^j) - f(S^{j-1})] \leq \sum_{j\in T\cap S^k}[f(S^j \cap T) - f(S^{j-1} \cap T)] \leq f(S^k \cap T) - f(\emptyset) \leq f(T).

The dual to our optimization problem is:

\displaystyle  \min \sum_{S \subseteq E}f(S)y_S

subject to

\displaystyle  \sum_{S \ni j}y_S \geq c_j\quad \forall j \in E

\displaystyle  y_S \geq 0 \quad \forall S \subseteq E

To show that {x} is an optimal primal solution we identify a dual feasible solution with an objective function value of {cx}. Set {y_{S^j} = c_j - c_{j+1}} for {1 \leq j < k}. Set {y_{S^k} = c_k} and {y_{S^j} = 0} for {j \geq k+1}.

Notice that {y_S \geq 0} for all {S \subseteq E} and is feasible in the dual because:

\displaystyle \sum_{S \ni j}y_S = y_{S^j} + \ldots y_{S^k} = c_j

for all {j \leq k} and

\displaystyle \sum_{S \ni j}y_S \geq 0 \geq c_j

if {j \geq k+1}.

The dual objective function value is:

\displaystyle \sum_{j=1}^{k-1}(c_j - c_{j+1})f(S^j) + c_kf(S^k) = \sum_{j=1}^kc_j[f(S^j) - f(S^{j-1})] = cx.

Now suppose {P(f)} has a fractional extreme point. Choose {c} to be the unique weight vector that is optimized at that point. However, as shown above, every weight vector has an integral point in {P(f)} that optimizes it, contradiction.{\Box}

If we drop the requirement that {f} be non-decreasing then {\max\{cz: z \in P(f)\}} can still be solved by using the greedy algorithm because

\displaystyle \max\{cz: z \in P(f)\} = \max \{cz: z \in P(f_{mon})\}.

Here {f_{mon}(S) = \min\{f(T): S \subseteq T \subseteq E\}} which is submodular and non-decreasing. Now I describe a non-constructive proof of Theorem 3 which uses an `uncrossing’ trick that is good to know. Let {y^*} be an optimal solution to the polymatroid optimization problem. Let {{\cal F} = \{S: y^*_S > 0\}}. Then, we can assume that {{\cal F}} is laminar. If true, this means that the optimal basis in the dual is TUM. But this is also an an optimal basis for the primal. Thus, there is an optimal primal solution that is integral. Ok, suppose {{\cal F}} is not laminar. Then there exist {S, T \in {\cal F}} such that {S \cap T \neq \emptyset}. Construct a new dual solution as follows: Increase {y^*_{S \cup T}, y^*_{S \cap T}} by {\epsilon} each. Decrease {y^*_S}, {y^*_T} by {\epsilon} each. This new dual solution feasible. The change in objective function value is

\displaystyle \epsilon f(S \cup T) + \epsilon f(S \cap T) - \epsilon f(S) - \epsilon f(T)

which cannot exceed zero by submodularity. This is the uncrossing step. Repeated applications produces an optimal dual solution which is laminar.

The following remarkable result on polymatroid intersection is due to Edmonds.

Theorem 4 Let {f} and {g} be non-decreasing, integer valued, submodular functions on {E} with {f( \emptyset) = g(\emptyset) =0}. Then {P(f) \cap P(g)} is integral.

Proof: Pick an arbitrary integral {c} and consider {max \{cx: x \in P(f) \cap P(g)\}}. The dual to this linear program is

\displaystyle  \min \sum_{S \subseteq E}f(S)y_S + \sum_{S \subseteq E}g(S)z_S

subject to

\displaystyle  \sum_{S \ni j}y_S + \sum_{S \ni j}z_S \geq c_j\quad \forall j \in E

\displaystyle  y_S, z_S \geq 0 \quad \forall S \subseteq E

Let {(y^*,z^*)} be an optimal dual solution. Let {{\cal F} = \{S: y^*_S > 0\}} and {{\cal G} = \{S: z^*_S > 0\}}. By uncrossing we can assume that each of these laminar. Recall from the first session that the constraint matrix associated with {{\cal F}} can be arranged so that it has exactly one non-zero entry in each row. Similarly with {{\cal G}}. Hence, the constraint matrix formed from the concatenation of {{\cal F}} and {{\cal G}} (which is also the optimal basis) is TUM. {\Box}

From the proof we get another result for free. Let {{\cal F}} and {{\cal G}} be two laminar collections on the same ground set {E}. Consider the following polyhedron:

\displaystyle  \sum_{j \in T} x_j \leq b_T\,\, \forall T \in {\cal F}

\displaystyle \sum_{j \in T} x_j \leq b'_T\,\, \forall T \in {\cal G}

As long as the {b}‘s are integral, this polyhedron is integral.

How is this useful? Return to the SS model. For any set of edges, {K} (of the form {(i,j)} with {i \in B} and {j \in S}) define {f(K)} to be the number of vertices in {B} that the edges in {K}are incident to. Similarly, define {g(K)} to be the number of vertices in {S} that edges in {K} are incident to. Note {f} and {g} are submodular. Here is an equivalent formulation of the problems of finding an efficient assignment:

\displaystyle \max \sum_{i \in B}\sum_{j \in S}u_{ij}x_{ij}

subject to

\displaystyle  x \in P(f) \cap P(g)

Why is this useful? Notice, by choosing {f} and {g} differently we can model more constraints than just the number of edges incident to a vertex is at most 1.

Bit the bullet and started a (reading) course on market design. The plan is to begin with some  lectures that cover some of  the usual stuff that is covered followed by readings. First session was on the Shapley-Shubik (SS) assignment model. A summary can be obtained by clicking on mktdesign1

The readings cover the following topics:

1) Credit Default Swaps

du & zhu

gupta & sundarjan

chernov, gorbenko & makarov

2) Bankruptcy
The Economics of Bankruptcy Reform Author(s): Philippe Aghion, Oliver Hart, John Moore Source: Journal of Law, Economics, & Organization, Vol. 8, No. 3 (Oct., 1992), pp. 523-546

New Approach to Corporate reorganization by Lucien Bebchuk, Harvard Law Review, 1988

Analysis of secured debt, Stultz and johnson, JFE, 1985

3) Healthcare Markets

Ho, Katherine 2009. “Insurer-Provider Networks in the Medical Care Market.” American Economic Review, 99(1): 393–430.

Fong and Schwarz

Arrow & cast of thousands

4)  IPO’s

Jaganathan et al

Jovanovich et al

5)  Affirmative Action
Fryer & Loury 

5) Wages, centralized matching, unraveling
Bulow & Levin

1. Shapley-Shubik Model

There is a set of buyers {B} and a set of sellers {S} each selling one unit of a good (could be divisible or not). Let {v_{ij} \geq 0} be the monetary value that buyer {j \in B} assigns to seller {i \in S}‘s object. In other words, we assume quasi-linear preferences. No buyer wants more than one of any good. By adding dummy sellers or buyers we can ensure that {|B|=|S|}.

Depending on the {v_{ij}}‘s you can assign a couple of interpretations to the SS model. First, {v_{ij} = u_j - c_i}. Here {u_i} is the value to buyer {i} of acquiring a good irrespective who sells it. In effect the sellers all sell the same good. {c_i} is the opportunity cost to seller {i}. Under this interpretation, SS is a model of bilateral trade.

Second, {v_{ij}} is the output generated when worker {j} is matched with firm {i}. Sometimes one assumes a specific functional form form for {v_{ij}}, for example, {v_{ij} = a_ib_j}. Here each agent (firm or worker) has a type and their joint output is a function of their types. Third, each seller is an advertising slot and each buyer an advertiser. Then, {a_i} is the the click through rate on slot {i} and {b_j} is the value per click to advertiser {j}. Because billions (and if one were Carl Sagan one would repeat the word `billions’) of $’s are transacted this way, it is obligatory to mention this example in a class on market design to give the whole enterprise a certain gravitas. There is a natural optimization problem one can associate with SS, which is to find an assignment of goods to buyers to maximize the total value achieved so that no buyer consumes more than one good and no seller gives more than one good. In other words, find an efficient allocation. This is called the assignment problem or maximum weight bipartite matching. The problem has a long history going back to Monge (1700’s). Indeed, well before Becker, it was Monge who considered the problem under the assumption that {v_{ij} = a_ib_j}. Since Monge’s time, the literature has split. One branch, via Kantorovich, interprets the problem in terms of coupling of measures and this has blossomed into an active and influential area of Mathematics called optimal transport of measure. It has important applications in pde’s and concentration of measure. There are also some applications in economic theory. I won’t go down that branch. The other, via Koopmans and Hitchcock, sticks to the discrete set up described here. That is the branch I will follow.

2. Shapley-Shubik (Divisible)

So that the problem of finding an efficient allocation can be expressed as an LP suppose the goods are divisible and each buyers utility for goods is linear in quantity. However, no buyer wants to consume more than one unit in total (unit demand). Let {x_{ij}} be the quantity of good {i} allocated to buyer {j}. For any {N \subseteq B} and {M \subseteq S}, let

\displaystyle  V(M,N) = \max \sum_{j \in N}\sum_{i \in M}v_{ij}x_{ij}

subject to

\displaystyle  \sum_{j \in N}x_{ij} \leq 1\quad \forall i \in M

\displaystyle  \sum_{i \in M}x_{ij} \leq 1\quad \forall j \in N

\displaystyle  x_{ij} \geq 0\quad, \forall i \in M,\: j \in N

The first constraint arises from the unit demand assumption. The second ensures no seller supplies more than they have. These constraints imply that {x_{ij} \leq 1} for all {i} and {j}.

Let {p_i} be the dual variable associated with each constraint {\sum_{j \in N}x_{ij} \leq 1} and {s_j} the dual variable associated with each constraint {\sum_{i \in M}x_{ij} \leq 1}. The dual to the above program is:

\displaystyle  \min \sum_{j \in N}s_j + \sum_{i \in M}p_i

subject to

\displaystyle  s_j + p_i \geq v_{ij}\,\, \forall j \in N,\,\forall i \in M

\displaystyle  s_j, p_i \geq 0 \,\, \forall j \in N,\,\forall i \in M

The dual has an interesting interpretation. Think of {p_i} as the price of object {i}. Given a collection of prices, the optimal solution to the dual is found by setting each {s_j} to {\{\max_{i \in M}(v_{ij} - p_i), 0\}^+.} Thus, each {s_j} represents the maximum surplus that agent {j} can receive from the consumption of a single object at prices {\{p_i\}_{i \in M}}.

At prices {\{p_i\}_{i \in M}}, buyer {j} must solve:

\displaystyle  \max \sum_{i \in M}(v_{ij}- p_j)x_{ij}

subject to

\displaystyle  \sum_{i \in M}x_{ij} \leq 1\quad \forall j \in N

\displaystyle  x_{ij} \geq 0\quad, \forall i \in M,\: j \in N

to determine its demand. It is easy to check that the optimal solution is any convex combination of objects that yield maximum surplus to the buyer.

Suppose {x^*} is an optimal primal solution and {(s^*,p^*)} an optimal dual solution. Then, the prices {p^*} `support’ the efficient allocation {x^*} in the following sense. Suppose we post a price {p^*_i} for each {i \in M}. Ask each buyer to name their demand correspondance at the posted prices. Then, it is possible to give each agent one of their demanded bundles and not exceed supply . To see why, recall complementary slackness:

\displaystyle (s^*_j + p^*_i - v_{ij})x^*_{ij} = 0.

So, if {x^*_{ij} > 0} it follows that {s^*_j = v_{ij} - p^*_i = \{\max_{r \in M}(v_{rj} - p^*_r), 0\}^+.} Hence, in this economy there is a set of prices that can be posted for each good so as to balance supply with demand. In other words, LP duality gives us the existence of Walrasian prices in this economy.

The set of feasible dual solutions forms a lattice with respect to the partial order {(s, p) \succeq (s', p')} if and only if {(-s,p) \geq (-s',p')}. In this partial order there is a smallest element, which corresponds to the smallest Walrasian price. This can be determined by choosing from among all optimal dual solutions one that minimizes {\sum_{i \in M}p_i} (equivalently maximizes total surplus to buyers).

Hold {M} fixed and consider {V( \cdot, M)} as a function of subsets of {B}. Then, {V( \cdot, M)} is non-decreasing. It is also submodular. Why? From the dual we see that it is obtained by minimizing a linear (i.e. submodular) function over a lattice. As submodularity is preserved under minimization, it follows that {V(N, M)} is submodular in {N}. A similar argument shows that {V(N,M)} is supermodular in {M}. This submodularity/supermodularity is useful in deriving certain incentive properties of the efficient allocation.

One can associate a tu co-op game with the assignment model. For any coalition of buyers ({N}) and sellers ({N}) let {V(N,M)} be the value of the coalition. The core, {C(B,S)} will be the set of solutions to the following:

\displaystyle \sum_{j \in B}s_j + \sum_{i \in S}p_i = V(B,S)

\displaystyle \sum_{j \in N}s_j + \sum_{i \in M}p_i \geq V(N,M)\,\, \forall N \subseteq B, \,\, M \subseteq S

The core has the usual interpretation. It is the set of allocations that cannot be blocked by any coalition of buyers and sellers.

Consider the following dual program (DP):

\displaystyle  V(B,S) = \min \sum_{j \in B}s_j + \sum_{i \in S}p_i

subject to

\displaystyle  s_j + p_i \geq v_{ij}\,\, \forall j \in B,\,\,\forall i \in S

\displaystyle  s_j, p_i \geq 0 \,\, \forall j \in B,\,\forall i \in S

It is straight forward to check that every optimal solution to (DP) is in {C(B,S)} and every point in {C(B,S)} corresponds to an optimal solution to (DP). Why is this interesting? It says that most of the constraints that define the core are redundant. It suffices to consider only pairs of agents consisting of one buyer and one seller. If the core is the set of efficient allocations immune to deviations by any subset of agents, then equivalence to (DP) says that it suffices to consider only pairwise deviations.

From the core, consider the following two relations:

\displaystyle \sum_{j \in B}s_j + \sum_{i \in S}p_i = V(B,S)

\displaystyle \sum_{j \in B \setminus k}s_j + \sum_{i \in S}p_i \geq V(B \setminus k,S)

Negate the second and add to the first:

\displaystyle s_k \leq V(B,S) - V(B \setminus k, M).

The term on the right hand side is buyer {k}‘s marginal product. Hence, no point in the core can give any buyer more than their marginal product. Submodularity of {V(\cdot, S)} implies that there is a point in the core that gives each buyer their marginal product (an old result of Shapley’s). What is this point? It is the one that corresponds to the minimal Walrasian price, i.e., the optimal solution to (DP) that maximizes {\sum_{j \in B}s_j}.

Suppose sellers are non-strategic and we run a Vickrey auction to allocate the goods amongst the buyers. Then, each buyer would walk away with a surplus equal to their marginal product (which is what the Vickrey auction gives buyers to give them the incentive to reveal their valuations). Thus, in this assignment economy, minimal Walrasian prices correspond to the prices that would be charged in a Vickrey auction.

3. Shapley-Shubik (Indivisible)

Now suppose the goods in SS are indivisible. I’m going to show the results outlined above carry through unchanged. To do that I need to tell you about totally unimodular matrices.

3.1. Total Unimodularity

A matrix is called totally unimodular (TUM) iff the determinant of each square submatrix has value 1, -1 or 0. If a matrix is TUM then so is its transpose. If {A} and {E} are TUM, then so is {AE}.

Example 1 The following matrix

\displaystyle  \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix}

is TUM. The following is not:

\displaystyle  \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 &1\\ \end{bmatrix}

Every proper square submatrix has determinant with absolute value 0 or 1, but the determinant of the entire matrix is 2.

Theorem 1 Let {A} be a {m \times n} TUM matrix. Let {b} be a {m \times 1} integral vector. Then, every extreme point of {\{Ax = b, x \geq 0\}} is integral.

Proof: To every extreme point {w} of {\{Ax = b, x \geq 0\}} there is a basis of {A} such that {w = B^{-1}b}. By Cramer’s rule, we can write {B^{-1} = {B^*}/{\det B}} where {B^*} is the adjoint of {B}. Since {A} has all integer entries, {B^*} has all integer entires. Since {A} is TUM and {B} is non-singular, it follows that {|\det B| = 1}. Hence {B^{-1}} has all integer entries. Thus, {B^{-1}b} is integral.{\Box}

Paul Seymour gave a complete (polytime) characterization of TUM matrices. The upshot of it is that most TUM matrices are network matrices. {A} is a network matrix if {a_{ij} = 0, 1, -1} for all {i,j} and each column contains at most two non-zero entires of opposite sign.

Theorem 2 If {A} is a network matrix, then {A} is TUM.

Proof: Proof is by induction on the size of a submatrix. Consider a {k \times k} square submatrix of {A}, call it {C}. If each column of {C} has exactly two non-zero entries then {\det C = 0}. Why? Summing up the rows of {C} gives us zero, meaning that the rows of {C} are linearly dependent. If there is a column of {C} that contains exactly one non-zero entry, then compute the determinant of {C} using the minor associated with this entry. By induction the determinant must have value, 0,1, -1. A column with all zeros means that {\det C = 0}.{\Box}

3.2. Back to SS

I’ll show that the constraint matrix for the assignment program that defines {V(B,S)} is TUM. This would mean that there is always an efficient allocation which produces an integral allocation.

Fix a good {i} and agent {j}. Consider the column associated with the variable {x_{ij}}. The variable appears with a coefficient of 1 in exactly two rows. One occurs in a row corresponding to agent {j} and the other to a row corresponding to object {i}. Let {L} consist of all rows corresponding to objects and {R} the set of all rows corresponding to agents. Multiply all the rows in {L} by -1. We now have a constraint matrix where each column contains exactly two non-zero entries of opposite sign.

3.3. Ascending Auctions

The equivalence between minimal Walrasian prices and Vickrey prices in SS means that the Vickrey outcome can be obtained from a t\^{a}tonnement process that terminates in the minimal Walrasian price. Two have been proposed in the literature. One by Crawford and Knoer and the other Demange, Gale and Sotomayor (see also Leonard). Both a variations of dual ascent algorithms for solving the LP formulation of the assignment model.

I’ll outline Demange, Gale and Sotomayor (DGS). Assume that the {v_{ij}}‘s are integral. In each iteration we have a vector of prices {\{p_i\}_{i \in S}}. Initially all prices are set to zero. In each iteration buyers report their demand correspondance. That is, the set of goods that maximize their surplus at current prices. Let {D_j(p)} be the demand correspondance of buyer {j}. Consider now the bipartite graph defined on {B \cup S} as follows: an edge {(i,j)} exists iff {i \in D_j(p)}. If this graph contains a perfect matching, stop. At current prices demand equals supply. A perfect matching means that there is a way to give each {j \in B} an {i \in S} such that {i \in D_j(p)} and no good is allocated to more than one buyer. If a perfect matching does not exist, by the Hall marriage theorem (I will state and prove this later), there must exists a set {N \subseteq B} such that {|N| > |\cup_{j \in N}D_j(p)|.} The set {\cup_{j \in N}D_j(p)} is called overdemanded. Identify a minimally overdemanded set and increase the price of each good in this set by 1 unit. Repeat.

4. Bilateral Trade

Lets put this machinery to work on bilateral trade. This is the case when {v_{ij} = u_j-c_i}. Suppose {u_j} is the private information of buyer {j} and {c_i} is the private information of seller {i}.

  1. The core of the associated tu game is non-empty.
  2. The point in the core that maximizes the total surplus to buyers makes it a dominant strategy for them to reveal their private information.
  3. The point in the core that maximizes the total profit to the sellers makes it a dominant strategy for them to reveal their private information.
  4. In general, there is no point in the core that is jointly `best’ for buyers and sellers. Hence, there is no way to obtain an outcome in the core for which it is a dominant strategy for both sides to reveal their private information.

We have the outlines of an archetypal matching paper. First, show that a stable or core outcome exists. Second, show that when only one side is strategic, one can implement the core outcome in an incentive compatible way. Third, observe that it is impossible to implement the core outcome when both sides are strategic. Fourth, show that as the economy gets large, one can implement something asymptotically close to the core in an incentive compatible way (or implement a core outcome in an asymptotically incentive compatible way). So, lets do the fourth.

The idea is due to Preston McAfee. Order the buyers so that {u_1 \geq u_2 \geq \ldots \geq u_n}. Order the sellers so that {c_1 \leq c_2 \leq \ldots \leq c_n}. The efficient allocation can be computed by determining the largest {k} such that {u_k \geq c_k}. Buyer {i} is matched with seller {i} for all {i \leq k}. McAfee suggests stopping at {k-1}. Charge all buyers {i \leq k-1} a price of {b_k}. Pay all sellers {i \leq k-1}, {c_k}. What each agent pays or receives does not depend on their reports. So, reporting their private information is a dominant strategy. Further, the mechanism runs a slight surplus and is individual rational. However, it is not efficient. The efficiency loss is {u_k-c_k}. Assuming {u_i}‘s and {c_i}‘s are independent draws from a distribution with bounded support, the percentage loss in efficiency approaches zero as {n \rightarrow \infty}.

Alternatively, one can implement the Vickrey outcome. In this case each buyer pays {b_{k+1}} and each seller receives {c_{k+1}}. The deficit of the Vickrey auction will grow like {k|b_{k+1} - c_{k+1}|}. One can then use properties of order statistics and the Kolmogorv-Smirnov bounds to show that the deficit goes to zero as {n \rightarrow \infty}.

5. More on TUM

Recall the constraints for the assignment model:

\displaystyle  \sum_{j \in B}x_{ij} \leq 1\quad \forall i \in S

\displaystyle  \sum_{i \in S}x_{ij} \leq 1\quad \forall j \in B

An integer solution, {x^*}, to these constraints defines a permutation matrix whose {(i,j)^{th}} entry is {x^*_{ij}}. A fractional solution to these constraints is a doubly stochastic matrix. TUM of the constraint matrix means that every doubly stochastic matrix is a convex combination of permutation matrices. This is is known as the Birkhoff-von Neuman theorem. Alternatively, the assignment constraints define the convex hull of doubly stochastic matrices.

For our purposes, TUM means that every fractional solution to the assignment constraints can be interpreted as a lottery over integer assignments. Thus, the assignment constraints give us a succinct description of the set of all lotteries over integer assignments. This is will be useful in applications when we search for randomized allocation rules satisfying other properties.

Network matrices are an important class of TUM matrices. There are matrices that don’t appear to be network matrices but after certain elementary row operations can be converted into network matrices. One such class is the set of 0-1 matrices with the consecutive 1’s property (C1). A 0-1 matrix has the consecutive 1’s property if there is a permutation of its rows such that the non-zero entries in each column are consecutive. The following matrix is C1:

\displaystyle  \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0\\ 1 & 0 & 1\\ 0 & 0 & 1\\ \end{bmatrix}

C1 matrices arise in a variety of applications (interval graphs, cyclic staffing). Fulkerson and Gross were the first to give a polytime algorithm for recognizing C1 matrices. The following is due to Veinott and Wagner (1962).

Theorem 3 If {A} is a C1 matrix, then, {A} is TUM.

Proof: Suppose the rows of {A} have already been permuted so that the columns have the consecutive 1’s property. Suppose that {A} is {n \times n}. Define {E} to be the following {n \times n} matrix:

  1. For all {i < n}, {e_{ii} = 1}, {e_{i, i+1} = -1}.
  2. For {i = n}, {e_{nn} = 1}.
  3. For all {i} and {j \neq i+1}, {e_{ij} = 0}.

Here is a {5 \times 5} example of {E}:

\displaystyle  \begin{bmatrix} 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0\\ 0 & 0 & 1 & -1 & 0\\ 0 & 0 & 0 & 1 & -1\\ 0 & 0 & 0 & 0 & 1\\ \end{bmatrix}

To complete the proof it suffices to verify that {E} is TUM and {EA} is a network matrix. Note that pre-multiplying {A} by {E} corresponds to negating row {i+1} of {A} and adding it to row {i} of {A}. {\Box}

I turn now to a class of C1 matrices that will be useful later.

Let {N} be a ground set of elements and {{\cal F}} a collection of subsets of {N}. {{\cal F}} is called laminar if for any {S, T \in {\cal F}} either {S \subset T}, {T \subset S} or {S \cap T = \emptyset}. If one drops the condition that {S \cap T = \emptyset}, then {{\cal F}} is called a chain.

Given a collection of subsets, {{\cal F}} we can represent it using a 0-1 matrix as follows. A column for each member of {{\cal F}} and a row for each element of {N}. Set {a_{ij} = 1} if the set corresponding to column {j} contains {i \in N}. Its easy to see that if {{\cal F}} is laminar, then {A} is C1. Call a 0-1 matrix that arises in this way a laminar matrix.

In fact, {A} is `equivalent’ to a 0-1 matrix with exactly one non-zero entry in each row. Here is is how. Pick any two columns of {A}, {j} and {k}. Let {S_j} and {S_k} be the sets they correspond to in {{\cal F}}. Suppose {S_j \subseteq S_k}. Negate column {j} and add it to column {k}. Note that this can at most flip the sign of the determinant of any square submatrix of {A}. Repeat. The result is a 0-1 matrix whose columns are disjoint, i.e., exactly one non-zero entry in each row.


Get every new post delivered to your Inbox.

Join 162 other followers