You are currently browsing the tag archive for the ‘learning’ tag.
In the lasts posts I talked about a Bayesian agent in a stationary environment. The flagship example was tossing a coin with uncertainty about the parameter. As time goes by, he learns the parameter. I hinted about the distinction between `learning the parameter’, and `learning to make predictions about the future as if you knew the parameter’. The former seems to imply the latter almost by definition, but this is not so.
Because of its simplicity, the i.i.d. example is in fact somewhat misleading for my purposes in this post. If you toss a coin then your belief about the parameter of the coin determines your belief about the outcome tomorrow: if at some point your belief about the parameter is given by some then your prediction about the outcome tomorrow will be the expectation of . But in a more general stationary environment, your prediction about the outcome tomorrow depends on your current belief about the parameter and also on what you have seen in the past. For example, if the process is Markov with an unknown transition matrix then to make a probabilistic prediction about the outcome tomorrow you first form a belief about the transition matrix and then uses it to predict the outcome tomorrow given the outcome today. The hidden markov case is even more complicated, and it gives rise to the distinction between the two notions of learning.
The formulation of the idea of `learning to make predictions’ goes through merging. The definition traces back at least to Blackwell and Dubins. It was popularized in game theory by the Ehuds, who used Blackwell and Dubins’ theorem to prove that rational players will end up playing approximate Nash Equilibrium. In this post I will not explicitly define merging. My goal is to give an example for the `weird’ things that can happen when one moves from the i.i.d. case to an arbitrary stationary environment. Even if you didn’t follow my previous posts, I hope the following example will be intriguing for its own sake.
A Bayesian agent is observing a sequence of outcomes in . The agent does not know the outcomes in advance, so he forms some belief over sequences of outcomes. Suppose that the agent believes that the number of successes in consecutive outcomes is distributed uniformly in and that all configuration with successes are equally likely:
for every where .
You have seen this belief already though maybe not in this form. It is a belief of an agent who tosses an i.i.d. coin and has some uncertainty over the parameter of the coin, given by a uniform distribution over .
In this post I am gonna make a fuss about the fact that as time goes by the agent learns the parameter of the coin. The word `learning’ has several legitimate formalizations and today I am talking about the oldest and probably the most important one — consistency of posterior beliefs. My focus is somewhat different from that of textbooks because 1) As in the first paragraph, my starting point is the belief about outcome sequences, before there are any parameters and 2) I emphasize some aspects of consistency which are unsatisfactory in the sense that they don’t really capture our intuition about learning. Of course this is all part of the grand marketing campaign for my paper with Nabil, which uses a different notion of learning, so this discussion of consistency is a bit of a sidetrack. But I have already came across some VIP who i suspect was unaware of the distinction between different formulations of learning, and it wasn’t easy to treat his cocky blabbering in a respectful way. So it’s better to start with the basics.
When I give a presentation about expert testing there is often a moment in which it dawns for the first time on somebody in the audience that I am not assuming that the processes are stationary or i.i.d. This is understandable. In most modeling sciences and in statistics stationarity is a natural assumption about a stochastic process and is often made without stating. In fact most processes one comes around are stationary or some derivation of a stationary process (think the white noise, or i.i.d. sampling, or markov chains in their steady state). On the other hand, most game theorists and micro-economists who work with uncertainty don’t know what is a stationary process even if they have heard the word (This is a time for you to pause and ask yourself if you know what’s stationary process). So a couple of introductory words about stationary processes is a good starting point to promote my paper with Nabil
First, a definition: A stationary process is a sequence of random variables such that the joint distribution of is the same for all -s. More explicitly, suppose that the variables assume values in some finite set of outcomes. Stationarity means that for every , the probability is independent in . As usual, one can talk in the language of random variables or in the language of distributions, which we Bayesianists also call beliefs. A belief about the infinite future is stationary if it is the distribution of a stationary process.
Stationarity means that Bob, who starts observing the process at day , does not view this specific day as having any cosmic significance. When Alice arrives two weeks later at day and starts observing the process she has the same belief about her future as Bob had when he first arrives (Note that Bob’s view at day about what comes ahead might be different from Alice’s since he has learned something meanwhile, more on that later). In other words, each agent can denote by the first day in which they start observing the process, but there is nothing in the process itself that day corresponds to. In fact, when talking about stationary processes it will clear our thinking if we think of them as having infinite past and infinite future . We just happen to pop up at day .