In the case of full particles for importance sampling, we generated particles from another distribution and then, to compensate for the difference, associated a weight with each particle. Similarly, in the case of collapsed particles, we will generate particles for the variables $X_p$ and obtain the following dataset:
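In one common notation (assumed here: $X_p$ are the sampled variables, $X_d$ the variables handled in closed form, $e$ the evidence, $w[m]$ the weight of the $m$-th particle, and $M$ the number of particles), such a dataset can be written as:

```latex
\mathcal{D} = \left\{ \left( x_p[m],\; w[m],\; P(X_d \mid x_p[m], e) \right) \right\}_{m=1}^{M}
```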
Here, the sample $x_p[m]$ is generated from the distribution Q. Now, using this set of particles, we want to find the expectation of a function $f$ relative to the distribution $P(X_p, X_d \mid e)$:
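Assuming the usual notation ($x_p[m]$ for the $m$-th sampled assignment, $w[m]$ for its weight, $X_d$ for the variables kept in closed form, and $e$ for the evidence), the weighted estimate of this expectation is typically written as:

```latex
\hat{E}_{\mathcal{D}}(f) =
  \frac{\sum_{m=1}^{M} w[m]\, E_{P(X_d \mid x_p[m], e)}\!\left[ f(x_p[m], X_d, e) \right]}
       {\sum_{m=1}^{M} w[m]}
```

Each term in the numerator is an exact expectation over $X_d$, so only the variables in $X_p$ contribute sampling noise.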
Let's take an example using the late-for-school model, as shown in Fig 4.22. Let's consider that we have evidence $e$ on some of its variables and that we partition the variables into $X_p$ and $X_d$. So, we will generate particles over the variables $X_p$. Also, each such particle $x_p[m]$ is associated with the distribution $P(X_d \mid x_p[m], e)$. Now, assuming some query (say $P(y \mid e)$ for a value $y$ of a variable $Y$), our indicator function will be $\mathbb{1}\{Y = y\}$. We will now evaluate $P(y \mid x_p[m], e)$ for each particle:
After this, we will compute the weighted average of these probabilities, using the weights of the samples.
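The procedure can be sketched end to end on a small example. The following is a minimal illustration, not the late-for-school model: it assumes a hypothetical binary chain A -> B -> C with made-up probabilities, samples only A from a proposal Q, keeps B and C in closed form, and uses the evidence C = 1. The weight of each particle is taken to be the usual importance-sampling ratio P(a) P(C=1 | a) / Q(a); all names and numbers are assumptions chosen for illustration.

```python
import random

# Hypothetical chain A -> B -> C over binary variables (illustrative numbers only).
P_A1 = 0.3                         # P(A=1)
P_B1_GIVEN_A = {0: 0.2, 1: 0.7}    # P(B=1 | A=a)
P_C1_GIVEN_B = {0: 0.1, 1: 0.8}    # P(C=1 | B=b)
Q_A1 = 0.5                         # proposal Q(A=1), an arbitrary choice


def p_c1_given_a(a):
    """Exact P(C=1 | A=a), obtained by summing out B."""
    pb1 = P_B1_GIVEN_A[a]
    return pb1 * P_C1_GIVEN_B[1] + (1 - pb1) * P_C1_GIVEN_B[0]


def p_b1_given_a_c1(a):
    """Exact P(B=1 | A=a, C=1) via Bayes' rule."""
    return P_B1_GIVEN_A[a] * P_C1_GIVEN_B[1] / p_c1_given_a(a)


def collapsed_estimate(num_particles, seed=0):
    """Estimate P(B=1 | C=1) from collapsed particles over A."""
    rng = random.Random(seed)
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(num_particles):
        a = 1 if rng.random() < Q_A1 else 0        # sample x_p from Q
        p_a = P_A1 if a == 1 else 1 - P_A1
        q_a = Q_A1 if a == 1 else 1 - Q_A1
        weight = p_a * p_c1_given_a(a) / q_a       # P(a, C=1) / Q(a)
        weighted_sum += weight * p_b1_given_a_c1(a)
        weight_total += weight
    return weighted_sum / weight_total


def exact_posterior():
    """Exact P(B=1 | C=1) by enumerating A, for comparison."""
    numerator = 0.0
    denominator = 0.0
    for a in (0, 1):
        p_a = P_A1 if a == 1 else 1 - P_A1
        numerator += p_a * P_B1_GIVEN_A[a] * P_C1_GIVEN_B[1]
        denominator += p_a * p_c1_given_a(a)
    return numerator / denominator


estimate = collapsed_estimate(20000)
exact = exact_posterior()
print(f"collapsed estimate: {estimate:.4f}, exact: {exact:.4f}")
```

Because B is never sampled, the only randomness comes from the draws of A, which is why the estimate settles close to the exact value with a modest number of particles.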
Now, the question is, how do we define the distribution Q and find the weights for the particles? We begin by partitioning the evidence variables into two parts, namely $E_p$ and $E_d$, where $E_p = E \cap X_p$ and $E_d = E \cap X_d$. As collapsed importance sampling is a hybrid process, we deal with the evidence accordingly, using $e_p$ as evidence in importance sampling and $e_d$ as evidence in exact inference.
Let's consider an arbitrary distribution Q:
Using this, we can reformulate as follows:
Let's put this result back into the previous equation:
From the preceding equation we get the following:
Now, computing the mean of importance weights, we get the following estimator:
So, we get the final equation as follows:
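The steps of this derivation can be sketched in one common notation. The symbols are assumptions: $X_p$ and $X_d$ are the sampled and closed-form variables, $e$ is the full evidence, and $Q(X_p)$ is any proposal with $Q(x_p) > 0$ wherever $P(x_p, e) > 0$.

```latex
\begin{aligned}
E_{P(X_p, X_d \mid e)}[f]
  &= \sum_{x_p} P(x_p \mid e)\, E_{P(X_d \mid x_p, e)}\!\left[ f(x_p, X_d, e) \right] \\
  &= \sum_{x_p} Q(x_p)\, \frac{P(x_p \mid e)}{Q(x_p)}\,
     E_{P(X_d \mid x_p, e)}\!\left[ f(x_p, X_d, e) \right] \\
  &= E_{Q(X_p)}\!\left[ \frac{P(X_p \mid e)}{Q(X_p)}\,
     E_{P(X_d \mid X_p, e)}\!\left[ f(X_p, X_d, e) \right] \right], \\[4pt]
w(x_p) &= \frac{P(x_p, e)}{Q(x_p)}, \qquad
E_{Q(X_p)}\!\left[ w(X_p) \right] = \sum_{x_p} P(x_p, e) = P(e), \\[4pt]
\hat{P}(e) &= \frac{1}{M} \sum_{m=1}^{M} w[m].
\end{aligned}
```

Since $P(x_p \mid e) = P(x_p, e) / P(e)$, dividing the weighted sum of the per-particle expectations by the sum of the weights cancels the unknown factor $P(e)$ and gives the normalized estimator.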
In the preceding discussion, we didn't place any restriction on the selection of the distribution Q. The two main points to consider for the selection of the distribution Q are as follows:
In the case of collapsed particles, we will generate particles from the distribution $Q(X_p)$. However, as we saw in the case of full particles, we had to sample a variable's parents before sampling the variable itself. In the case of collapsed particles, it is quite possible that the parents of a variable are not in $X_p$. The simplest solution to this problem is to construct the set $X_p$ in such a way that for every $X \in X_p$, $Pa_X \subseteq X_p$ holds as well. To do this, we can use a simple approach: start with the nodes having no parents, include them in $X_p$, and then work downwards from there.
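The top-down construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is given as a hypothetical `parents` dictionary, the node names are made up, and the helper names `topological_prefix` and `is_upward_closed` are not from any library. It relies on the fact that any prefix of a topological ordering contains the parents of each of its nodes.

```python
from collections import deque


def topological_prefix(parents, size):
    """Return the first `size` nodes of a top-down (topological) ordering.

    `parents` maps each node to an iterable of its parents. Roots are taken
    first, then nodes whose parents have all been taken already.
    """
    nodes = set(parents)
    for node_parents in parents.values():
        nodes.update(node_parents)

    remaining = {n: len(set(parents.get(n, ()))) for n in nodes}
    children = {n: [] for n in nodes}
    for child, node_parents in parents.items():
        for parent in set(node_parents):
            children[parent].append(child)

    # Sorting only makes the order deterministic; it is not required.
    queue = deque(sorted(n for n in nodes if remaining[n] == 0))
    chosen = []
    while queue and len(chosen) < size:
        node = queue.popleft()
        chosen.append(node)
        for child in sorted(children[node]):
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    return chosen


def is_upward_closed(parents, subset):
    """Check that every node in `subset` has all of its parents in `subset`."""
    subset = set(subset)
    return all(set(parents.get(n, ())) <= subset for n in subset)


# Hypothetical network: A and B are roots, C depends on both, D and E on C.
parents = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"], "E": ["C"]}
x_p = topological_prefix(parents, 3)
print(x_p, is_upward_closed(parents, x_p))
```

A set such as `{"A", "D"}` fails the check because the parent of `D` is missing, which is exactly the situation the construction is meant to avoid.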