This post describes the main theorem in my new paper with Nabil. Scroll down for open questions following this theorem. The theorem asserts that a Bayesian agent in a stationary environment will learn to make predictions as if he knew the data generating process, so that, as time goes by, structural uncertainty dissipates. The standard example is when the sequence of outcomes is i.i.d. with an unknown parameter. As time goes by the agent learns the parameter.
The formulation of `learning to make predictions' goes through the notion of merging, which traces back to Blackwell and Dubins. I will not give Blackwell and Dubins' definition in this post but a weaker one, suggested by Kalai and Lehrer.
A Bayesian agent observes an infinite sequence of outcomes from a finite set $A$. Let $\mu \in \Delta(A^{\mathbb{N}})$ represent the agent's belief about the future outcomes. Suppose that before observing every day's outcome the agent makes a probabilistic prediction about it. I denote by $\mu(\cdot \mid a_0,\dots,a_{n-1})$ the element in $\Delta(A)$ which represents the agent's prediction about the outcome of day $n$ just after he observed the outcomes $a_0,\dots,a_{n-1}$ of previous days. In the following definition it is instructive to think about $\tilde\mu$ as the true data generating process, i.e., the process that generates the sequence of outcomes, which may be different from the agent's belief $\mu$.
Definition 1 (Kalai and Lehrer) Let $\tilde\mu, \mu \in \Delta(A^{\mathbb{N}})$. Then $\mu$ merges with $\tilde\mu$ if for $\tilde\mu$-almost every realization $a = (a_0, a_1, \dots)$ it holds that

$$\lim_{n\to\infty}\left\|\mu(\cdot\mid a_0,\dots,a_{n-1})-\tilde\mu(\cdot\mid a_0,\dots,a_{n-1})\right\| = 0.$$
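Here is how merging plays out in the i.i.d. example from the opening paragraph, as a quick simulation sketch (the uniform prior, the bias $0.7$, and the code itself are my own illustration, not anything from the paper): the agent's one-step-ahead prediction, computed from a uniform prior over the unknown bias via Laplace's rule of succession, approaches the true bias along almost every realization, which is exactly the convergence Definition 1 asks for. Here the true process $\tilde\mu$ is the i.i.d. process itself, so its prediction is constantly the true bias.

```python
import numpy as np

rng = np.random.default_rng(0)

p_true = 0.7          # true i.i.d. bias (the process generating the data)
T = 10_000            # number of periods

outcomes = rng.random(T) < p_true   # simulated realization a_0, a_1, ...

# Agent's one-step-ahead predictions under a uniform (Beta(1,1)) prior on the bias:
# after k successes in n periods, the predictive probability of a success
# tomorrow is (k + 1) / (n + 2) (Laplace's rule of succession).
k = np.cumsum(outcomes)
n = np.arange(1, T + 1)
prediction = np.concatenate(([0.5], (k + 1) / (n + 2)))[:T]

# Prediction error against the true data generating process, which always
# forecasts a success with probability p_true.  Merging says this error vanishes.
error = np.abs(prediction - p_true)
print(error[[0, 10, 100, 1000, T - 1]])
```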
Assume now that the agent's belief $\mu$ is stationary, and let $\mu = \int \theta\,\lambda(\mathrm{d}\theta)$ be its ergodic decomposition. Recall that in this decomposition $\theta$ ranges over ergodic beliefs and $\lambda$ represents structural uncertainty. Does the agent learn to make predictions? Using the definition of merging we can ask, does $\mu$ merge with $\theta$? The answer, perhaps surprisingly, is no. I gave an example in my previous post.
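As a reminder of what this notation looks like in the standard example (this is just the de Finetti picture, not anything specific to the paper): when the agent believes the outcomes are i.i.d. coin flips with an unknown bias, his belief decomposes as

$$\mu = \int_0^1 \theta_p\,\lambda(\mathrm{d}p),$$

where $\theta_p$ is the i.i.d. process in which every day's outcome is heads with probability $p$, and the prior $\lambda$ over the bias $p$ is exactly the structural uncertainty.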
Let me now move to a weaker definition of merging, which was first suggested by Lehrer and Smorodinsky. This definition requires the agent to make correct predictions in almost every period.
Definition 2 Let $\tilde\mu, \mu \in \Delta(A^{\mathbb{N}})$. Then $\mu$ weakly merges with $\tilde\mu$ if for $\tilde\mu$-almost every realization $a = (a_0, a_1, \dots)$ it holds that

$$\lim_{n\to\infty,\ n\in T}\left\|\mu(\cdot\mid a_0,\dots,a_{n-1})-\tilde\mu(\cdot\mid a_0,\dots,a_{n-1})\right\| = 0$$

for a set $T \subseteq \mathbb{N}$ of periods of density $1$.
The definition of weak merging is natural: patient agents whose belief weakly merges with the true data generating process will make almost optimal decisions. Kalai, Lehrer and Smorodinsky discuss these notions of merging and also their relationship with Dawid's idea of calibration.
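To unpack the density-$1$ requirement operationally (again my own illustrative sketch, not from the paper): a set $T$ of periods has density $1$ if the fraction of the first $n$ periods that belong to $T$ tends to $1$, so weak merging says that for every tolerance $\epsilon > 0$, the periods in which the one-step prediction error exceeds $\epsilon$ occupy a vanishing fraction of time. In code:

```python
import numpy as np

def good_period_density(errors, eps):
    """Fraction of the first n periods (for each n) in which the one-step
    prediction error is at most eps.  Weak merging with the truth means this
    fraction tends to 1 for every eps > 0, almost surely under the truth."""
    errors = np.asarray(errors, dtype=float)
    good = (errors <= eps).astype(float)
    n = np.arange(1, len(errors) + 1)
    return np.cumsum(good) / n

# Example, reusing the prediction errors from the coin simulation above:
# density = good_period_density(error, eps=0.05)
# print(density[-1])   # close to 1 for large T
```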
I am now in a position to state the theorem I have been talking about for two months:
Theorem 3 Let $\mu$ be stationary, and let $\mu = \int \theta\,\lambda(\mathrm{d}\theta)$ be its ergodic decomposition. Then $\mu$ weakly merges with $\theta$ for $\lambda$-almost every $\theta$.
In words: An agent who has some structural uncertainty about the data generating process will learn to make predictions in most periods as if he knew the data generating process.
Finally, here are the promised open questions. They deal with the two qualifications in the theorem. The first question is about the “$\lambda$-almost every $\theta$” in the theorem. As Larry Wasserman mentioned, this is unsatisfactory in some senses. So,
Question 1 Does there exist a stationary $\mu$ (equivalently, a belief $\lambda$ over ergodic beliefs) such that $\mu$ weakly merges with $\theta$ for every ergodic distribution $\theta$?
The second question is about strengthening weak merging to merging. We already know that this cannot be done for an arbitrary belief over ergodic processes, but what if $\lambda$ is concentrated on some natural family of processes, for example hidden Markov processes with a bounded number of hidden states? Here is the simplest setup for which I don't know the answer.
Question 2 The outcome of the stock market on every day is either U or D (up or down). An agent believes that this outcome is a stochastic function of an unobserved (hidden) state of the economy, which can be either G or B (good or bad): when the hidden state is B the outcome is U with probability $p_B$ (and D with probability $1-p_B$), and when the state is G the outcome is U with probability $p_G$. The hidden state changes according to a Markov process with transition probabilities $\rho_{B\to G}$, $\rho_{G\to B}$. The parameter is $\theta = (p_B, p_G, \rho_{B\to G}, \rho_{G\to B})$ and the agent has some prior $\lambda$ over the parameter. Does the agent's belief about outcomes merge with the truth for $\lambda$-almost every $\theta$?
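To make the setup of Question 2 concrete, here is a minimal simulation sketch (the numerical values, variable names, and the forward-filter code are my own illustration, not from the paper): it simulates the two-state hidden Markov chain and computes the one-step-ahead probability of U for an agent who happens to know the parameter, via the standard forward filter. The open question concerns the agent who does not know $\theta$ and instead holds a prior $\lambda$ over it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameter values; in Question 2 these four numbers form the unknown theta.
rho_BG, rho_GB = 0.1, 0.2   # P(tomorrow G | today B), P(tomorrow B | today G)
p_B, p_G = 0.3, 0.8         # P(outcome U | state B),  P(outcome U | state G)

T = 1000
P = np.array([[1 - rho_BG, rho_BG],   # transition matrix, states ordered (B, G)
              [rho_GB, 1 - rho_GB]])

# Simulate the hidden chain and the observed U/D outcomes.
state = np.zeros(T, dtype=int)        # 0 = B, 1 = G
state[0] = rng.integers(2)
for t in range(1, T):
    state[t] = rng.choice(2, p=P[state[t - 1]])
outcome_is_U = rng.random(T) < np.array([p_B, p_G])[state]

# Forward filter for an agent who knows theta:
# belief_G = P(today's state is G | outcomes of previous days),
# predictions[t] = P(outcome of day t is U | outcomes of previous days).
belief_G = 0.5                        # arbitrary initial belief about the hidden state
predictions = np.empty(T)
for t in range(T):
    predictions[t] = (1 - belief_G) * p_B + belief_G * p_G
    # Bayes update on day t's outcome, then push the belief through the transition matrix.
    lik = np.array([p_B, p_G]) if outcome_is_U[t] else np.array([1 - p_B, 1 - p_G])
    posterior = np.array([1 - belief_G, belief_G]) * lik
    posterior /= posterior.sum()
    belief_G = posterior @ P[:, 1]

print(predictions[:5])
```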