Department of self-promotion: sequential tests, Blackwell games and the axiom of determinacy.
Recap: Every day an outcome is realized (1 means a rainy day, 0 a non-rainy day). An expert claims to know the distribution of the stochastic process that generates the sequence of outcomes. A sequential test is given by a Borel set of infinite sequences of predictions and outcomes. An expert who delivers a forecast fails on a realization if where is the prediction made by about given the previous outcomes .
Let me remind you what I mean by `a test that does not reject the truth with probability ’. I am going to write a definition equivalent to the one I have used up to now, this time using the language of stochastic processes.
We are going to prove the following theorem:
So a charlatan, who doesn’t know anything about the true distribution of the process, can randomize a forecast according to and pass the test with high probability regardless of the actual distribution.
— Discussion —
Before we delve into the proof of the theorem, a couple of words about where we are. Recall that a forecast specifies the Expert’s prediction about rain tomorrow after every possible history . We denote by the set of all such forecasts. The most general tests are given by a function , and specify for every such the set of realizations over which the forecast fails. Since is infinite we know that there exist tests that pass a true expert and are not manipulable by a strategic charlatan.
Sequential tests have the additional property that the test’s verdict depends only on predictions made by along the realized path: When deciding whether a forecast passes or fails when the realization is the test only considers the predictions made by along x. We also say that the test does not depend on counter-factual predictions, i.e. predictions about the probability of a rainy day after histories that never happen. It seems that counter-factual predictions would be irrelevant to testing anyway, but, as the theorem shows, if the test does not use counter-factual predictions then it is manipulable.
One situation in which sequential tests are the only available tests is when, instead of providing his entire forecast before any outcome is realized, at every day the expert only provides his prediction about the outcome of day just before it is realized. At infinity, all the information available to the tester is the sequence of predictions and realized outcomes.
— Sketch of Proof —
We can transform the expert’s story into a two-player normal form zero-sum game as we did before: Nature chooses a realization and Expert chooses a forecast . Then Expert pays Nature 1 if fails on and 0 otherwise. The fact that the test does not reject the true expert translates to the fact that the maximin of the game is small. If we knew that the minimax is also small then an optimal mixed strategy for the Expert would satisfy (3). We only need to prove the existence of a value, or as game theorists say, that the game is determined.
Unfortunately, this time we cannot use Fan’s Theorem since we made no topological assumption about the set , so there is no hope to get semi-continuity of the payoff function. Indeed, as we shall see in a moment, the normal form representation misses an important part of the expert’s story. Instead of using a normal form game, we are going to write the game in extensive form. I will call this game .
- The game is played in stages .
- At stage Nature chooses an outcome and Expert chooses a prediction simultaneously and independently.
- Nature does not monitor past actions of Expert.
- Expert monitors past actions of Nature.
- At infinity, Expert pays Nature 1 if and 0 otherwise.
Now I am going to assume that you are familiar with the concept of a strategy in an extensive form game, and are aware of Kuhn’s Theorem about the equivalence between behavioral strategies and mixtures of pure strategies (I will make implicit use of both directions of Kuhn’s Theorem in what follows). We can then look at the normal form representation of this game, in which the players choose pure strategies. A moment’s thought will convince you that this is exactly the game from the previous paragraph: Nature’s set of pure strategies is , Expert’s set of pure strategies is and the payoff for a strategy profile is . So far no real gain. Extensive form games in which one of the players doesn’t monitor the opponent’s actions need not be determined. In order to get a game with a value we are going to twist the game , and allow Nature to observe past actions of the Expert player. This makes life more difficult for the Expert. Up to a minor inaccuracy which I will comment on later, the resulting game is what is called a Blackwell game, and it admits a value by a seminal theorem of Donald Martin.
Here is the game after the twist. I call this game .
- The game is played in stages .
- At stage Nature chooses an outcome and Expert chooses a prediction simultaneously and independently.
- Each player monitors past actions of the opponent.
- At infinity, Expert pays Nature 1 if and 0 otherwise.
Now if you internalized the method of proving manipulability that I was advocating in the previous two episodes, you know what’s left to prove: that the maximin of is small, i.e. that, for every strategy of Nature, Expert has a response that makes the payoff at most . We know this is true for the game but in Nature is more powerful.
Here is the most important insight of the proof: The fact that an expert who knows the distribution of the process can somehow pass the test implies that the maximin in is small, but this fact alone doesn’t say anything about the maximin of . To show that the maximin of is also small we will use the fact that the way such an expert passes the test is by providing the correct forecast. Until now the distinction was not really important to us. Now it comes into play.
Let be a behavioral strategy of Nature in , i.e. a contingent plan that specifies the probability that Nature plays after every . Let be the pure strategy of Expert in that is given by
So the pure action taken by the Expert player at day is the mixed action that Nature is going to take at day according to her strategy . Now assume that Nature follows and Expert follows . Let and be the random variables representing the actions taken by Expert and Nature at day . Then the stochastic process satisfies (2). Therefore from (1) we get that the expected payoff when Nature plays and Expert plays is indeed smaller than .
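Here is a minimal Python sketch of the copying strategy described above, under assumptions of mine that are not in the text: a history is a tuple of (prediction, outcome) pairs, and `nature_strategy` and `example_nature` are hypothetical names. The point it illustrates is that when Expert's pure action each day is Nature's mixed action for that day, the predictions are the correct conditional probabilities by construction, even though Nature reacts to past predictions.

```python
import random
from typing import Callable, List, Tuple

History = Tuple[Tuple[float, int], ...]      # past (prediction, outcome) pairs
NatureStrategy = Callable[[History], float]  # probability of rain after a history


def play_copying_expert(nature_strategy: NatureStrategy, days: int,
                        rng: random.Random) -> List[Tuple[float, int]]:
    """Play the twisted game with Expert copying Nature's mixed action."""
    history: List[Tuple[float, int]] = []
    for _ in range(days):
        rain_probability = nature_strategy(tuple(history))
        prediction = rain_probability  # Expert's pure action for today
        outcome = 1 if rng.random() < rain_probability else 0  # Nature's draw
        history.append((prediction, outcome))
    return history


def example_nature(history: History) -> float:
    """Hypothetical Nature who reacts to Expert's last prediction."""
    if not history:
        return 0.5
    last_prediction, last_outcome = history[-1]
    shift = last_prediction if last_outcome == 0 else 1 - last_prediction
    return 0.25 + 0.5 * shift


path = play_copying_expert(example_nature, days=10, rng=random.Random(0))
```

Whatever the random draws are, each prediction in `path` equals Nature's rain probability at the corresponding history, which is the property the argument relies on.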
Now for the minor inaccuracy that I mentioned: For Martin’s Theorem we need the set of actions at every stage to be finite. We can handle this obstacle by restricting Expert’s action at every stage to a grid and applying the coupling argument.
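A small Python sketch of one way to restrict predictions to a finite grid, for illustration only. The grid itself is my assumption (the text does not specify one): on day n the grid has mesh 2^-(n+k) for a hypothetical parameter k, so rounding errors shrink geometrically and their sum over all days is at most 2^-k. That summability is what a coupling argument would need.

```python
from fractions import Fraction


def grid_size(day: int, k: int) -> int:
    """Number of cells in the day's grid; the mesh is 2**-(day + k)."""
    return 2 ** (day + k)


def round_to_grid(prediction: Fraction, day: int, k: int) -> Fraction:
    """Nearest point to `prediction` on the grid {0, 1/m, ..., 1}."""
    m = grid_size(day, k)
    return Fraction(round(prediction * m), m)


def max_rounding_error(day: int, k: int) -> Fraction:
    """Rounding to the nearest grid point moves a prediction by at most half the mesh."""
    return Fraction(1, 2 * grid_size(day, k))


rounded = round_to_grid(Fraction(1, 3), day=0, k=2)
total_error_bound = sum(max_rounding_error(n, 3) for n in range(50))
```

With these choices the grid on each day is finite, and a larger k buys a smaller total deviation from the unrestricted forecast.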
— Non-Borel Tests —
What about pathological sequential tests that are given by a non-Borel set ? Well, if, like me, you find it preposterously impossible to choose one sock from each of infinitely many pairs of socks, then perhaps you live in the AD paradise. Here every set is measurable, Blackwell Games are determined even when the payoff function is not Borel, and Theorem 2 is true without the assumption that is Borel. See, the AD universe is a paradise for charlatans, since they can do as well as the true experts.
If, on the other hand, you subscribe to the axiom of choice, then you have a non-manipulable test:
— Summary —
If you plan to remember one conclusion from my last three posts, I suggest you pick this: There exist non-manipulable tests, but they must rely on counter-factual predictions, or be extremely pathological.
Muchas gracias to everyone who read to this point. Did I mention that I have a paper about this stuff?
In which I talked about Olszewski and Sandroni’s paper `Manipulability of future independent tests’ and coupling of stochastic processes.
Recap: Every day Nature randomizes the weather for that day (1 for rain, 0 for no rain). An Expert delivers a forecast which he claims governs Nature’s selection, so that the expert claims that is the probability of rain at day when the weather in previous days was . We denote by the set of all forecasts. We compare the expert’s forecast to the actual infinite realization of weather. A test function is given by . An expert who provided a forecast is rejected if . We say that the test does not reject the true expert with probability if for every forecast , where is the distribution over realizations induced by .
We now look at a special class of tests, which are sequential and also reject the expert in finite time. Such tests are given by a set of finite sequences of predictions and outcomes: Let be the outcome of day and be the prediction about day generated by the forecast . Then the expert fails if after some day it turns out that . Thus,
Note that it is possible (for some choices of ; see comment below) that the expert is never off the hook: there is always the potential that he will fail the test. (On the other hand, in Sandroni’s Theorem with set of realizations at day there is a final verdict: either the expert passed or failed.)
Theorem 1 (Olszewski and Sandroni (2009)) Let be a test of the form (1) that does not reject the true expert with probability . Then for every there exists such that
Again, if the expert randomizes his forecasts according to a distribution , he is unlikely to fail the test regardless of the actual realization . In fact, Olszewski and Sandroni state and prove their theorem under a weaker condition on the test, which they call `future independence’. The proof is the same as the proof given below, so I can skip the definition of future independence without depriving you of any math.
The proof of the theorem follows the idea of the proof of Sandroni’s Theorem for the case of a finite set of realizations. We define a game between two players, Nature and Expert: Nature chooses a realization and Expert chooses a forecast , and the payoff that Expert pays Nature is given by , where is the indicator function of . In order to use Fan’s Theorem we will need a small twist on the game: We are going to restrict the Expert’s set of pure strategies in the game so that the prediction of day is in some finite grid. (It is a good idea at this point to figure out for yourself what goes wrong if we don’t add this restriction.)
Fix . We denote by the set of forecasts defined below, in which the expert’s predictions are restricted at every day to a finite grid in :
The following Lemma, which I will prove later, will imply that the Expert player in the game does not lose much if we restrict his strategy set to .
— Proof of Theorem —
Consider a two-player zero-sum game between Nature and Expert: Nature chooses a realization and Expert, simultaneously and independently, chooses a forecast . Then Expert pays Nature an amount .
We first claim that the maximin of the game is at most . Indeed, let be a mixed strategy of Nature. Then for some . Let satisfy (3). If Nature plays and Expert responds with then the expected payoff that Expert pays Nature is
where the equality follows from the definition of , the first inequality from Lemma 2 and the second inequality from the fact that does not reject the truth with probability .
If we can show that the minimax is smaller than then an optimal mixed strategy of the Expert in the game will satisfy (2). All that is left to do is to check that the conditions of Fan’s Theorem are satisfied, so that the minimax equals the maximin. Indeed, the set of pure strategies of the Expert is compact as a countable product of finite sets, each equipped with the discrete topology. Moreover, in this topology is l.s.c. for every (Skip the next paragraph if this is obvious).
Lower semi-continuity means that whenever . For the case of indicator functions the inequality means that if then for sufficiently large . So let and assume that . Then for some where . Convergence of to in the product topology is pointwise convergence, so for every . Since for every the last limit means that for sufficiently large . Take large enough so that the equality is satisfied for every of length or smaller. Then for such the predictions made by along equal those made by until day . It follows that , as desired.
— Weather coupling —
The best way I know to prove Lemma 2 is by using coupling. This is a powerful technique to prove relations between two random variables by embedding them in the same probability space.
Imagine that Zeus randomizes the weather in Athens according to and Baal randomizes the weather in Askelon according to .
One way Zeus can perform the randomization task is by using a sequence of i.i.d. random variables distributed uniformly on : After he has decided on the weather Zeus makes the -th day rainy (i.e. decides that ) iff . Assume now that Baal uses the same technique, only with instead of , to randomize the weather in Askelon, and that Baal uses the same sequence of i.i.d. uniform random variables that Zeus uses.
Then the distribution of the weather in Askelon is . Moreover because of (3), the weather in Askelon is likely to be the same as the weather in Athens. In fact the probability that the weather will differ at some day is at most ! Now is the probability that the realized weather sequence in Athens is in and is the probability that the realized weather sequence in Askelon is in . Since the probability that these sequences differ is smaller than it follows that , as desired.
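The Zeus and Baal construction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: forecasts are functions from a history (a tuple of past outcomes) to a probability of rain, a day is rainy when the shared uniform draw is strictly below the prediction, and the forecasts `f` and `g` below are hypothetical examples whose predictions differ by 0.1 * 2^-(n+1) on day n.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Forecast = Callable[[Tuple[int, ...]], float]  # history -> probability of rain


def coupled_weather(uniforms: Sequence[float], f: Forecast,
                    g: Forecast) -> Tuple[List[int], List[int]]:
    """Generate Athens weather from f and Askelon weather from g with shared draws."""
    athens: List[int] = []
    askelon: List[int] = []
    for u in uniforms:
        athens.append(1 if u < f(tuple(athens)) else 0)
        askelon.append(1 if u < g(tuple(askelon)) else 0)
    return athens, askelon


def first_disagreement(a: Sequence[int], b: Sequence[int]) -> Optional[int]:
    """First day on which the two weather sequences differ, or None."""
    for day, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return day
    return None


def f(history: Tuple[int, ...]) -> float:
    return 0.5


def g(history: Tuple[int, ...]) -> float:
    return 0.5 + 0.1 * 2 ** -(len(history) + 1)


athens, askelon = coupled_weather([0.2, 0.9, 0.51, 0.4], f, g)
```

As long as the two histories agree, the weather can differ on a given day only if the shared draw lands between the two predictions, so by a union bound the probability of any disagreement is at most the sum of the daily gaps, here at most 0.1.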
I am visiting the rationality center in the Hebrew University, and I am presenting some papers from the expert testing literature. Here are the lecture notes for the first talk. If you read this and find typos please let me know. The next paragraph contains the background story, and can be safely skipped.
A self-proclaimed expert opens a shop with a sign at the door that says `Here you can buy probabilities’. So the expert is a kind of fortune-teller: he provides a service, or a product, and the product is a real number, namely the probability of some event or, more generally, the distribution of some random variable. You can ask for the probability of rain tomorrow, give the expert some green papers with a picture of George Washington and receive in return a paper with a real number between 0 and 1. The testing literature asks whether you can, after the fact, check the quality of the product you got from the expert, i.e. whether the expert gave you the correct probability or whether he just emptied your pocket for a worthless number.
So, let be a set of realizations. Nature randomizes an element from according to some distribution and an expert claims to know Nature’s distribution. A test is given by a function : the expert delivers a forecast and fails if the realization turned out to be in . A good test will be such that only `true’ experts, i.e. those who deliver the correct , will not fail.
— Manipulability —
I start with Sandroni’s paper (pdf). The following definition formalises the idea that the true expert is unlikely to fail the test:
Definition 1 The test does not reject the truth with probability if for every .
If a test does not reject the truth with probability then a true expert, who knows how Nature randomizes the realization, is unlikely to fail the test. Alas, a charlatan who acts strategically is also unlikely to fail:
Theorem 2 (Sandroni (2003)) Let be finite. If is a test that does not reject the true expert with probability then there exists such that for every .
So, a charlatan who knows nothing about how Nature chooses can randomize a forecast according to , and is unlikely to fail, regardless of the realization. We say that the test is manipulable.
For Sandroni’s Theorem we do not need to assume any structure on . However, the situation we have in mind is that for some finite set of outcomes. So every day Nature randomizes an outcome of and the Expert claims to know the stochastic process that governs Nature’s choices.
Proof: Let be a finite set such that .
Consider the following two-player zero-sum game with the players called Nature and Expert. Nature is the maximizer with pure strategies set , and Expert is the minimizer, with pure strategies set . If Nature plays and Expert plays then Expert pays Nature 1 if and 0 otherwise.
By von Neumann’s Minimax Theorem the game admits a value in mixed strategies, so that the maximin equals the minimax. We claim that . Indeed, let be a mixed strategy of Nature and let be such that . Then the expected payoff that Expert pays Nature if Nature plays and Expert plays is since the test does not reject the true expert with probability .
Now let be a mixed optimal strategy of the Expert in the game. Then
for every pure strategy of Nature, since the left hand side is the expected payoff that the Expert pays Nature if he plays and Nature plays .
The argument in the proof of Sandroni’s Theorem captures the essence of all the manipulability theorems we will see. We define a zero-sum game between two players, Nature and Expert. The minimax theorem is the core of the proof: The fact that the maximin is smaller than follows from the assumption that the test is unlikely to reject the true expert. The fact that the minimax is smaller than implies that the test is manipulable. In the middle, we use the most wonderful miracle that the minimax and maximin are equal.
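To see the argument at work on something tiny, here is a toy Python example of my own (it is not from Sandroni's paper). I assume a single day with two realizations, a forecast given by the probability p of rain, and a hypothetical test that rejects on rain when p is at most 1/2 and on no rain otherwise. I also assume the usual reading of Definition 1: under the forecast's own distribution, the forecast fails with probability at most epsilon.

```python
from fractions import Fraction
from typing import Dict, Set

X = (0, 1)  # realizations: 0 = no rain, 1 = rain
FORECASTS = tuple(Fraction(i, 4) for i in range(5))  # candidate values of P(rain)


def rejected(p: Fraction) -> Set[int]:
    """Toy test: the realizations on which the forecast p fails."""
    return {1} if p <= Fraction(1, 2) else {0}


def truth_failure_prob(p: Fraction) -> Fraction:
    """Probability, under p itself, that the forecast p fails the test."""
    return sum(p if x == 1 else 1 - p for x in rejected(p))


def failure_prob(zeta: Dict[Fraction, Fraction], x: int) -> Fraction:
    """Probability that a forecast drawn from zeta fails on realization x."""
    return sum(weight for p, weight in zeta.items() if x in rejected(p))


# Worst case over true experts on this set of forecasts.
EPS = max(truth_failure_prob(p) for p in FORECASTS)

# Charlatan's mixed strategy: predict "surely dry" or "surely rain" by a fair coin.
ZETA = {Fraction(0): Fraction(1, 2), Fraction(1): Fraction(1, 2)}
worst_case_for_charlatan = max(failure_prob(ZETA, x) for x in X)
```

In this toy test every pure forecast fails for sure on some realization, yet the randomized charlatan fails with probability 1/2 whatever the weather, which matches the guarantee of the true expert. So randomization is doing real work here.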
— Fan’s Theorem —
Here is the more general minimax theorem which we will use later: In a zero-sum game, if one of the players has a compact set of pure strategies, and the payoff to that player is u.s.c. in his own strategy, then the game admits a value in mixed strategies.
Proposition 3 (Ky Fan) Consider a two-player zero-sum game in normal form with pure strategy sets and payoff function . If is a compact metrizable space and is upper semi-continuous for every then the game has a value in mixed strategies, i.e.
and all suprema are attained.
The set is not topologized and is the set of distributions over with finite support, so the integral on the right side is just a summation. The challenge in using Fan’s Theorem is to find a topology over which is sufficiently weak to make compact and sufficiently strong to make the functions upper semi-continuous.
— Non-manipulability —
Sandroni’s Theorem does not hold when is infinite, precisely because the game we defined in the proof will have no value. We now take a game without a value and translate it to a non-manipulable test for an infinite set of realizations. I only know about one game without a value (`I know a larger such game’ quipped Sergiu when I said it in the talk): Two players simultaneously and independently pick natural numbers, and the player who chose the larger number wins. Here is the translation of the game to a test:
Example 1 Let . Let be the test that is given by
Then does not reject the true expert with probability , and for every there exists a realization such that
Not only is the test not manipulable, but for every that a charlatan might employ to randomize his worthless forecast , there is some realization under which he fails with high probability.
Proof: For every let ( for tail) be given by . Then can be equivalently written as .
We first show that the test has small probability to reject a true expert. Indeed,
for every , where the first equality follows from the definition of and the second from the definition of .
Now let . Let be the push-forward of under and let . Then
The first equality follows from the definition of , the second from the definition of and the inequality from the choice of .
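The game behind this example, in which the larger number wins, can be illustrated with a short Python sketch. The names are hypothetical, and I assume for simplicity that the opponent's mixed strategy `mu` has finite support (a dictionary from numbers to probabilities). Whoever gets to respond to a known mixed strategy can win with probability as close to 1 as desired, which is why the game has no value.

```python
from fractions import Fraction
from typing import Dict


def win_probability(mu: Dict[int, Fraction], n: int) -> Fraction:
    """Probability that n is strictly larger than a number drawn from mu."""
    return sum(weight for k, weight in mu.items() if k < n)


def beating_number(mu: Dict[int, Fraction], eps: Fraction) -> int:
    """Smallest n that beats a draw from mu with probability at least 1 - eps."""
    mass = Fraction(0)
    for k in sorted(mu):
        mass += mu[k]
        if mass >= 1 - eps:
            return k + 1
    raise ValueError("the weights of mu must sum to 1")


EXAMPLE_MU = {0: Fraction(1, 2), 5: Fraction(1, 4), 100: Fraction(1, 4)}
response = beating_number(EXAMPLE_MU, Fraction(1, 4))
```

The same search works for either player, so each side can guarantee almost 1 when moving second and nothing when moving first.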
— Sequential Tests —
So, Sandroni’s Theorem only holds for finite sets . We are now aiming at proving a manipulability theorem for an infinite set by adding some structure on the set and some restrictions on the test. From now on we assume that . The set is the set of outcomes. Everything works for an arbitrary finite set of outcomes but I stick with two outcomes for notational convenience. Note that is a compact metrizable space in the product topology.
There is an almost equivalent representation of an element as the conditional probability that given for a -randomly chosen . Let be the set of functions , then this gives rise to a natural map , which is surjective and continuous when the domain is equipped with the product topology and the range with the weak topology. It will be useful to identify with , so I will think of a test as a function .
For every and every realization , the sequence of forecasts of along is given by where .
Definition 4 A test is sequential if implies for every and every realization such that the forecasts made by and along are the same.
Equivalently, a sequential test is given by a subset ( for rejection): iff for every realization and every forecast , where is the sequence of predictions of along .
— Next Episode (Thursday 12:00) —
I will talk about Olszewski and Sandroni’s paper `The manipulability of future independent tests’ and my paper `Many inspections are manipulable’. The main goal is to prove that sequential tests are always manipulable.