What does it mean to describe a probability distribution over, say, ${\{0,1\}^\mathbb{N}}$ ? I am interested in this question because of the expert testing literature (pdf), in which an expert is supposed to provide a client with the distribution of some stochastic process, but the question is also relevant for Bayesiansts, at least those of us who think that the Bayesian paradigm captures (or should capture) the reasoning process of rational agents, in which case it makes little sense to reason about something you cannot describe.

So, what does it mean to describe your belief/theory about a stochastic process ${X_1,X_2,\dots}$ with values in ${\{0,1\}}$?  Here are some examples of descriptions:

${X_n}$ are I.I.D with ${P(X_n=0)=3/4}$.

${X_0=X_1=1}$ and ${\mathop{\mathbb P}(X_n=0|X_0,\dots,X_{n-1})=1/eX_{n-1}+0.4X_{n-2}}$

${X_n=Y_{2n}\cdot Y_{2n-1}}$ where ${Y_0,Y_1,\dots}$ are I.I.D ${(1/2,1/2)}$.

Everyone agrees that I have just described processes, and that the first and last are different descriptions of the same process. Also everyone probably agrees that there are only countably many processes I can describe, since every description is a sentence in English/Math and there are only countably many such sentences.

Now the standard approach to formalizing the notion of describability is using effective computability. Before explaining how one can do it for distributions let me say what it means to describe a more familiar creature: real number. Again here are a couple of descriptions of real numbers which a good definition of description will capture: ${1/2,~0.12121212...,~ e, ~\sqrt{2+\pi}}$.

Definition 1 A description of a real number ${p}$ is a computable function ${\phi:\mathbb{Q}\rightarrow\mathbb{Q}}$ such that ${|p-\phi(\varepsilon)|<\varepsilon}$ for every rational ${\varepsilon>0}$.

So this definition means that we can effectively approximate the number to arbitrary degree. As an example, consider the number ${e}$. Probably your first calculus professor described this number to you by saying something like:

${e}$ is a shortening to ${\sum_{k\geq 0}1/k!}$.

Hopefully he added that the error ${\sum_{k\geq (n+1)}1/k!}$ is smaller than, say, ${1/n}$. So you can get a sufficiently good approximation of ${e}$ by a sufficiently long partial sum of the infinite series. Note that an approximation to ${e}$ up to a certain precision is what your computer provides when you type ${e}$. Anyway, what’s important for me is not the exact definition but the methodology: We define describability using computability, so to describe something is to tell you an effective way to create it. The notion of computability has a formalization that matches your intuition (don’t trust me, trust Church and Turing).

Now what does it mean to describe a ${\{0,1\}}$-valued random variable ? Again we can take the same approach. To describe the variable, I have to tell you how to effectively create such a creature. Well, let’s assume that you can effectively produce a purely random ${(1/2,1/2)}$ coin toss. As I have often said on this blog, I don’t really know what `purely random’ means, but there is an iphone application that produces such a thing. What if we want a random coin toss with probability ${1/e}$ for success ? No problem, we can start producing a uniformly distributed real number ${\alpha\in[0,1]}$ by producing its binary expansion one bit one after another using the iphone application for random coin tosses. Then, since we can approximate ${1/e}$ to a desired precision, almost surely at some point we will have sufficient information about ${\alpha}$ to decide whether ${\alpha>1/e}$ (fail) or ${\alpha<1/e}$ (success). So we can effectively produce a random coin with probability ${p}$ of success whenever ${p}$ is describable according to Definition 1.

What about generating a stochastic process ${X_0,X_1,\dots}$ ? In order to be able to generate such a thing, all I have to describe to you is the probability that ${X_n=0}$ given ${X_0,\dots,X_{n-1}}$. Then you can create the process one variable after another. Hopefully at this point the following definition is self explanatory:

Definition 2 A description of a ${\{0,1\}}$-valued stochastic process is a computable function ${\phi}$ whose domain is the set ${\{0,1\}^{<\mathbb{N}}}$ of finite sequences of bits, and for every such sequence ${s}$, ${\phi(s)}$ is a description of a real number in ${[0,1]}$