2.1 Monte Carlo Integration

The application of probabilistic models to data often leads to inference problems that require integrating complex, high-dimensional distributions. Markov chain Monte Carlo (MCMC) is a computational approach that replaces analytic integration with summation over samples drawn from the distribution; many otherwise intractable problems become solvable using some form of MCMC. In this chapter, we will discuss two forms of MCMC: Metropolis-Hastings and Gibbs sampling.

Many problems in probabilistic inference require the calculation of complex integrals or summations over very large outcome spaces. 

Consider the expectation of a function $g(x)$ under the distribution $p(x)$: $E(g(x))=\int g(x)p(x)\,dx$ in the continuous case, or $E(g(x))=\sum_{x} g(x)p(x)$ in the discrete case. When $g(x)=x$, this is the mean of the random variable $x$ under $p(x)$. The general idea of Monte Carlo integration is to use samples to approximate the expectation under a complex distribution. That is:

(1) Draw $N$ independent samples $x^t,~t=1,2,\dots,N$ from $p(x)$.

(2) Approximate the expectation by $E(g(x))=\int g(x)p(x)\,dx \approx \frac{1}{N}\sum_{t=1}^{N}g(x^t)$.

This replaces analytic integration with a summation over a large set of samples. In general, the approximation can be made as accurate as needed by increasing $N$.
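The two-step procedure above can be sketched in a few lines of Python/NumPy (the function names here are my own illustrative choices, not from the original text):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_expectation(g, sampler, n=100_000):
    """Approximate E[g(x)] by averaging g over n samples drawn from p(x)."""
    samples = sampler(n)          # step (1): draw n independent samples
    return g(samples).mean()      # step (2): average g over the samples

# Example: E[x^2] under the standard normal distribution is 1.
est = mc_expectation(lambda x: x**2, lambda n: rng.standard_normal(n))
```

Increasing `n` shrinks the Monte Carlo error, which decays like $1/\sqrt{N}$ for independent samples.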

Homework:

I. Approximate the mean of a random variable under a $Beta(\alpha, \beta)$ distribution with $\alpha=3,~\beta=4$ using Monte Carlo.

Note that the analytic solution is $\frac{\alpha}{\alpha+\beta} = \frac{3}{7} \approx 0.4286$. Here $g(x)=x$. The MATLAB code is:

N = 100000;                     % number of samples
samples = betarnd(3, 4, 1, N);  % draw N samples from Beta(3,4)
sum(samples) / N                % Monte Carlo estimate of E(x)
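For readers without MATLAB, a rough Python/NumPy equivalent of the same estimate (using `numpy.random.Generator.beta`) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
samples = rng.beta(3, 4, size=N)  # draw N samples from Beta(3, 4)
estimate = samples.mean()         # Monte Carlo estimate of E(x)
# Analytic mean: alpha / (alpha + beta) = 3/7
```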

II. Approximate the variance of a random variable under $Gamma(a,b)$ with $a=1.5,~b=4$ using Monte Carlo.

Note that the analytic solution is $ab^2 = 24$ (with shape $a$ and scale $b$). Here we first compute $E(x)$, then $Var(x)=E[(x-E(x))^2]=\int (x-E(x))^2 p(x)\,dx \approx \frac{1}{N}\sum_{t=1}^{N}(x^t-E(x))^2$. The code is:

N = 100000;
a = 1.5; b = 4;
samples = gamrnd(a, b, 1, N);               % draw N samples from Gamma(a,b)
Ex = sum(samples) / N;                      % Monte Carlo estimate of E(x)
Dx = (samples - Ex) * (samples - Ex)' / N   % Monte Carlo estimate of Var(x)
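The same variance estimate can be cross-checked in Python with `numpy.random.Generator.gamma`, which uses the same shape/scale parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
a, b = 1.5, 4                       # shape a, scale b; analytic variance a*b^2 = 24
samples = rng.gamma(a, b, size=N)   # draw N samples from Gamma(a, b)
Ex = samples.mean()                 # Monte Carlo estimate of E(x)
Dx = ((samples - Ex) ** 2).mean()   # Monte Carlo estimate of Var(x)
```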
Original post: https://www.cnblogs.com/chaseblack/p/5221452.html