Reference: [Ambegaokar, Troyer, Am. J. Phys. (2010), https://doi.org/10.1119/1.3247985 http://arxiv.org/abs/0906.0943]

The Binning Analysis provides a powerful way of determining error bars for integrals calculated using the Metropolis-Hastings algorithm.

Suppose we have a set of \( N \) correlated samples \( \{ x_i \} \) which we have obtained using a Metropolis-Hastings random walk over a probability measure \( P(x)dx \). We may have already taken one sample in every \( N_\mathrm{sweep} \) in order to decorrelate the samples a bit already.

Suppose our goal is to approximate the integral

\[ \langle f\rangle = \int f\left(x\right)\, P\left(x\right)\,dx \ . \]

We'll do so by calculating the average of the function evaluated at all our samples,

\[ f_\mathrm{MH} = \frac1N\sum_i f\left(x_i\right)\ . \]

The question now is, what is the error bar on this approximation? The naive value, valid for independent samples, is (writing \( f_i := f(x_i) \))

\[ \Delta_\mathrm{naive} = \sqrt{ \frac{ \frac1N\sum_i f_i^2 - \left(\frac1N\sum_i f_i\right)^2 }{N-1} } \]

(see [Ambegaokar/Troyer, Eq. (10)]).

However this formula is wrong if the samples are correlated. For this, we introduce the binning analysis.

Take the sequence of samples \( f_i =: f_i^{(0)} \), and combine them pairwise to take the average of their two values, to generate a new sequence \( f_i^{(1)} \) of half the initial length; this can be repeated to generate new pairwise averaged samples of each previous binning level:

\[ f_i^{(n)} := \frac12\left( f_{2i}^{n-1} + f_{2i+1}^{n-1} \right)\ . \]

(We assume \( i \) counts starting at \( 0 \).)

Intuitively, the samples at each binning level are less and less correlated, and the naive estimate as used in \( \Delta_\mathrm{naive} \) gets better and better. So calculate the "naive" error bars at each level:

\[ \Delta^{(n)} := \sqrt{ \frac{ \frac1{N^{(n)}}\sum_i (f_i^{(n)})^2 - \left(\frac1{N^{(n)}}\sum_i f_i^{(n)}\right)^2 }{N^{(n)}-1} } \]

(with \( N^{(n)} = N / 2^{n} \)).

These errors should converge as the binning level increases (See [Ambegaokar/Troyer, Fig. 5]). If this is the case, the naive errors at each level converge to the true error. If not, there are not enough samples and the error bar hasn't converged.

Make sure that at the last binning level you still have enough samples to get a reliable estimate of \( \Delta^{(n)} \) from those samples.

This analysis is done using the class Tomographer::BinningAnalysis.