12.2 Gibbs Sampling

Gibbs sampling (or Gibbs sampler) of Geman and Geman (1984) and Gelfand and Smith (1990) is perhaps the most popular MCMC method. We introduce the idea of Gibbs sampling by using a simple problem with three parameters. Here the word parameter is used in a very general sense. A missing data point can be regarded as a parameter under the MCMC framework. Similarly, an unobservable variable such as the “true” price of an asset can be regarded as N parameters when there are N transaction prices available. This concept of parameter is related to data augmentation and becomes apparent when we discuss applications of the MCMC methods.

Denote the three parameters by θ1, θ2, and θ3. Let X be the collection of available data and M the entertained model. The goal here is to estimate the parameters so that the fitted model can be used to make inference. Suppose that the likelihood function of the model is hard to obtain, but the three conditional distributions of a single parameter given the others are available. In other words, we assume that the following three conditional distributions are known:

f(\theta_1 \mid X, M, \theta_2, \theta_3), \quad f(\theta_2 \mid X, M, \theta_1, \theta_3), \quad f(\theta_3 \mid X, M, \theta_1, \theta_2), \qquad (12.1)

where f(θi | X, M, θj, θk), with j, k ≠ i, denotes the conditional distribution of the parameter θi given the data, the model, and the other two parameters. In application, we do not need to know the exact forms of the conditional distributions. What is needed is the ability to draw a random number from each of the three conditional distributions.

Let θ_{2,0} and θ_{3,0} be two arbitrary starting values of θ2 and θ3. The Gibbs sampler proceeds as follows:

1. Draw a random sample from f(θ1 | X, M, θ_{2,0}, θ_{3,0}). Denote the random draw by θ_{1,1}.

2. Draw a random sample from f(θ2 | X, M, θ_{1,1}, θ_{3,0}). Denote the random draw by θ_{2,1}.

3. Draw a random sample from f(θ3 | X, M, θ_{1,1}, θ_{2,1}). Denote the random draw by θ_{3,1}.

This completes a Gibbs iteration, and the parameters become θ_{1,1}, θ_{2,1}, and θ_{3,1}.

Next, using the new parameters as starting values and repeating the same sequence of random draws, we complete another Gibbs iteration to obtain the updated parameters θ_{1,2}, θ_{2,2}, and θ_{3,2}. We can repeat the previous iterations m times to obtain a sequence of random draws:

(\theta_{1,1}, \theta_{2,1}, \theta_{3,1}), (\theta_{1,2}, \theta_{2,2}, \theta_{3,2}), \ldots, (\theta_{1,m}, \theta_{2,m}, \theta_{3,m}).

Under some regularity conditions, it can be shown that, for a sufficiently large m, (θ_{1,m}, θ_{2,m}, θ_{3,m}) is approximately equivalent to a random draw from the joint distribution f(θ1, θ2, θ3 | X, M) of the three parameters. The regularity conditions are weak; they essentially require that, for an arbitrary starting value (θ_{1,0}, θ_{2,0}, θ_{3,0}), the Gibbs iterations have a chance to visit the full parameter space. The actual convergence theorem involves Markov chain theory; see Tierney (1994).
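
To make the iteration concrete, the following is a minimal sketch in Python of the three-step scheme above. It uses a hypothetical model chosen only because its full conditionals are easy to sample: observations y_i ~ N(μ, 1/τ) with a flat prior on μ, a Gamma(a, b) prior on the precision τ, and one missing observation treated as a third parameter in the spirit of the data-augmentation comment at the start of this section. The model, the function name gibbs, and all numerical values are illustrative assumptions, not part of the text.

```python
# Hypothetical illustration of the three-parameter Gibbs iteration:
# theta1 = mu, theta2 = tau (precision), theta3 = one missing observation.
import numpy as np

rng = np.random.default_rng(42)
y_obs = rng.normal(loc=1.0, scale=2.0, size=50)   # observed data X
a, b = 2.0, 2.0                                    # Gamma(a, b) prior on tau

def gibbs(n_iter, mu0=0.0, tau0=1.0, ymiss0=0.0):
    """Run n_iter Gibbs iterations; return all draws of (mu, tau, y_miss)."""
    mu, tau, y_miss = mu0, tau0, ymiss0
    draws = np.empty((n_iter, 3))
    for j in range(n_iter):
        y = np.append(y_obs, y_miss)               # augmented data
        n = y.size
        # 1. draw mu from f(mu | X, M, tau, y_miss) = N(ybar, 1/(n*tau))
        mu = rng.normal(y.mean(), 1.0 / np.sqrt(n * tau))
        # 2. draw tau from f(tau | X, M, mu, y_miss) = Gamma(a + n/2, rate b + SS/2)
        ss = np.sum((y - mu) ** 2)
        tau = rng.gamma(a + n / 2.0, 1.0 / (b + ss / 2.0))  # numpy uses scale = 1/rate
        # 3. draw y_miss from f(y_miss | X, M, mu, tau) = N(mu, 1/tau)
        y_miss = rng.normal(mu, 1.0 / np.sqrt(tau))
        draws[j] = (mu, tau, y_miss)
    return draws

draws = gibbs(n_iter=5000)    # theta_{i,j} for j = 1, ..., n with n = 5000
```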

In practice, we use a sufficiently large n and discard the first m random draws of the Gibbs iterations to form a Gibbs sample, say,

(\theta_{1,m+1}, \theta_{2,m+1}, \theta_{3,m+1}), \ldots, (\theta_{1,n}, \theta_{2,n}, \theta_{3,n}). \qquad (12.2)

Since the previous realizations form a random sample from the joint distribution f(θ1, θ2, θ3 | X, M), they can be used to make inference. For example, a point estimate of θi and its variance are

\hat{\theta}_i = \frac{1}{n-m} \sum_{j=m+1}^{n} \theta_{i,j}, \qquad \hat{\sigma}_i^2 = \frac{1}{n-m-1} \sum_{j=m+1}^{n} \left(\theta_{i,j} - \hat{\theta}_i\right)^2. \qquad (12.3)

The Gibbs sample in Eq. (12.2) can be used in many ways. For example, if we are interested in testing the null hypothesis H0: θ1 = θ2 versus the alternative hypothesis Ha: θ1 ≠ θ2, then we can simply obtain the point estimate of θ = θ1 − θ2 and its variance as

\hat{\theta} = \frac{1}{n-m} \sum_{j=m+1}^{n} \left(\theta_{1,j} - \theta_{2,j}\right), \qquad \hat{\sigma}_{\theta}^2 = \frac{1}{n-m-1} \sum_{j=m+1}^{n} \left(\theta_{1,j} - \theta_{2,j} - \hat{\theta}\right)^2.

The null hypothesis can then be tested by using the conventional t-ratio statistic t = \hat{\theta}/\hat{\sigma}_{\theta}.
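
Continuing the hypothetical sketch above, the quantities in Eq. (12.3) and the t ratio might be computed as follows; the burn-in size m = 1000 and the identification of θ1, θ2 with μ and τ are assumptions made purely for illustration.

```python
# Discard the first m draws as burn-in, then form point estimates, variances,
# and the t ratio for H0: theta1 = theta2 from the Gibbs sample of Eq. (12.2).
m = 1000                                    # burn-in size; n = 5000 above
sample = draws[m:]                          # the Gibbs sample of Eq. (12.2)

theta_hat = sample.mean(axis=0)             # point estimates theta_hat_i
sigma2_hat = sample.var(axis=0, ddof=1)     # variances sigma_hat_i^2

diff = sample[:, 0] - sample[:, 1]          # draws of theta = theta1 - theta2
t_ratio = diff.mean() / diff.std(ddof=1)    # conventional t-ratio statistic
```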

Remark

The first m random draws of a Gibbs sampling, which are discarded, are commonly referred to as the burn-in sample. The burn-ins are used to ensure that the Gibbs sample in Eq. (12.2) is indeed close enough to a random sample from the joint distribution f(θ1, θ2, θ3 | X, M).        □

Remark

The method discussed before consists of running a single long chain and keeping all random draws after the burn-ins to obtain a Gibbs sample. Alternatively, one can run many relatively short chains using different starting values and a relatively small n. The random draw of the last Gibbs iteration in each chain is then used to form a Gibbs sample.        □
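
As a sketch of this alternative, again reusing the hypothetical gibbs() function from the earlier example, one could run several short chains from different starting values and keep only the final draw of each chain; the starting values below are arbitrary.

```python
# Several relatively short chains, each started from a different value;
# only the draw from the last Gibbs iteration of each chain is kept.
starts = [(-5.0, 0.1, -5.0), (0.0, 1.0, 0.0), (5.0, 10.0, 5.0)]
short_chain_sample = np.array(
    [gibbs(n_iter=500, mu0=s[0], tau0=s[1], ymiss0=s[2])[-1] for s in starts]
)
```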

From the preceding introduction, Gibbs sampling has the advantage of decomposing a high-dimensional estimation problem into several lower dimensional ones via the full conditional distributions of the parameters. At the extreme, a high-dimensional problem with N parameters can be solved iteratively by using N univariate conditional distributions. This property makes Gibbs sampling simple and widely applicable. However, it is not always efficient to reduce all the Gibbs draws to univariate problems. When parameters are highly correlated, it pays to draw them jointly. Consider the three-parameter illustrative example. If θ1 and θ2 are highly correlated, then one should employ the conditional distributions f(θ1, θ2 | X, M, θ3) and f(θ3 | X, M, θ1, θ2) whenever possible. A Gibbs iteration then consists of (a) drawing (θ1, θ2) jointly given θ3 and (b) drawing θ3 given (θ1, θ2). For more information on the impact of parameter correlations on the convergence rate of a Gibbs sampler, see Liu, Wong, and Kong (1994).
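
Under the same hypothetical normal model used in the earlier sketch, a grouped iteration of this kind can be written as follows: step (a) draws (μ, τ) jointly given the missing observation by first sampling τ with μ integrated out (tractable under the assumed flat prior on μ) and then μ given τ; step (b) updates the missing observation. This is meant only to illustrate the mechanics of steps (a) and (b); in this conjugate example the gain over univariate draws is modest.

```python
# Grouped (blocked) Gibbs iteration for the hypothetical model:
# theta1 = mu and theta2 = tau are drawn jointly given theta3 = y_miss.
def blocked_iteration(y_miss):
    y = np.append(y_obs, y_miss)
    n, ybar = y.size, y.mean()
    # (a) joint draw of (mu, tau) given y_miss:
    #     tau | y ~ Gamma(a + (n-1)/2, rate b + S/2) with mu integrated out,
    #     then mu | tau, y ~ N(ybar, 1/(n*tau))
    s = np.sum((y - ybar) ** 2)
    tau = rng.gamma(a + (n - 1) / 2.0, 1.0 / (b + s / 2.0))
    mu = rng.normal(ybar, 1.0 / np.sqrt(n * tau))
    # (b) draw y_miss given (mu, tau)
    y_miss = rng.normal(mu, 1.0 / np.sqrt(tau))
    return mu, tau, y_miss
```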

In practice, convergence of a Gibbs sample is an important issue. The theory only states that convergence occurs when the number of iterations m is sufficiently large; it provides no specific guidance for choosing m. Many methods have been devised in the literature for checking the convergence of a Gibbs sample, but there is no consensus on which method performs best. In fact, none of the available methods can guarantee that the Gibbs sample under study has converged in all applications. The performance of a checking method often depends on the problem at hand. Care must be exercised in a real application to ensure that there is no obvious violation of the convergence requirement; see Carlin and Louis (2000) and Gelman et al. (2003) for convergence checking methods. In application, it is important to repeat the Gibbs sampling several times with different starting values to ensure that the algorithm has converged.
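
One common diagnostic discussed in the references just cited is the potential scale reduction factor of Gelman and Rubin, which compares between-chain and within-chain variability across chains started from dispersed values. A minimal sketch follows, reusing the hypothetical gibbs() function from the earlier example; the dispersed starting values are arbitrary, and values of the statistic near 1 are consistent with, but do not prove, convergence.

```python
# Potential scale reduction factor computed from several chains of the
# hypothetical sampler, each started from a dispersed value and with the
# first 1000 draws discarded as burn-in.
dispersed = [(-10.0, 0.01, -10.0), (0.0, 1.0, 0.0), (10.0, 100.0, 10.0)]
chains = np.stack([gibbs(n_iter=5000, mu0=s[0], tau0=s[1], ymiss0=s[2])[1000:]
                   for s in dispersed])            # shape (M, N, 3)
M, N, _ = chains.shape
chain_means = chains.mean(axis=1)                  # per-chain means, (M, 3)
W = chains.var(axis=1, ddof=1).mean(axis=0)        # within-chain variance
B = N * chain_means.var(axis=0, ddof=1)            # between-chain variance
var_plus = (N - 1) / N * W + B / N                 # pooled variance estimate
r_hat = np.sqrt(var_plus / W)                      # one value per parameter
```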
