Applying Gibbs sampling in language processing

In Gibbs sampling, a Markov chain is built by sampling each parameter from its conditional distribution given the current values of all the other parameters. One complete iteration over all of the parameters makes up one cycle of the Gibbs sampler. When it is not possible to sample directly from a conditional distribution, a Metropolis-Hastings step can be used instead; this is referred to as Metropolis within Gibbs. Gibbs sampling can thus be viewed as Metropolis-Hastings with a special proposal distribution: on each iteration, we draw a proposal for a new value of a specific parameter from its full conditional, and such a proposal is always accepted.
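
To make the cycle concrete, here is a minimal sketch of a single Metropolis-within-Gibbs cycle; the mwg_step helper and the stand-in log_target density are hypothetical names introduced only for illustration, not part of the coin example that follows:

import numpy as np

def mwg_step(log_target, theta, j, scale=0.1):
    # Metropolis update for component j of theta. Evaluating the
    # (unnormalized) log joint density is equivalent, up to a constant,
    # to evaluating the full conditional of component j.
    proposal = theta.copy()
    proposal[j] += scale * np.random.randn()   # random-walk proposal for one coordinate
    log_ratio = log_target(proposal) - log_target(theta)
    if np.log(np.random.rand()) < log_ratio:   # accept, or keep the old value
        return proposal
    return theta

# One cycle of the sampler: sweep the Metropolis update over every parameter.
log_target = lambda t: -0.5 * np.sum(t**2)     # stand-in density, for illustration only
theta = np.array([0.5, 0.5])
for j in range(len(theta)):
    theta = mwg_step(log_target, theta, j)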

Consider an example of tossing two coins, where each coin is characterized by its number of heads z and its number of tosses N:

import numpy as np
import matplotlib.pyplot as plt
from functools import partial
from scipy import stats

def bern(theta, z, N):
    """Bernoulli likelihood with N trials and z successes."""
    return np.clip(theta**z * (1 - theta)**(N - z), 0, 1)

def bern2(theta1, theta2, z1, z2, N1, N2):
    """Joint Bernoulli likelihood for two independent coins."""
    return bern(theta1, z1, N1) * bern(theta2, z2, N2)

def make_thetas(xmin, xmax, n):
    """Return the midpoints of the n - 1 intervals spanning [xmin, xmax]."""
    xs = np.linspace(xmin, xmax, n)
    widths = (xs[1:] - xs[:-1]) / 2.0
    thetas = xs[:-1] + widths
    return thetas

def make_plots(X, Y, prior, likelihood, posterior, projection=None):
    """Plot the prior, likelihood, and posterior side by side."""
    fig, ax = plt.subplots(1, 3,
                           subplot_kw=dict(projection=projection, aspect='equal'),
                           figsize=(12, 3))
    if projection == '3d':
        ax[0].plot_surface(X, Y, prior, alpha=0.3, cmap=plt.cm.jet)
        ax[1].plot_surface(X, Y, likelihood, alpha=0.3, cmap=plt.cm.jet)
        ax[2].plot_surface(X, Y, posterior, alpha=0.3, cmap=plt.cm.jet)
    else:
        ax[0].contour(X, Y, prior)
        ax[1].contour(X, Y, likelihood)
        ax[2].contour(X, Y, posterior)
    ax[0].set_title('Prior')
    ax[1].set_title('Likelihood')
    ax[2].set_title('Posterior')
    plt.tight_layout()

thetas1 = make_thetas(0, 1, 101)
thetas2 = make_thetas(0, 1, 101)
X, Y = np.meshgrid(thetas1, thetas2)

For the Metropolis sampler, the following values are used:

a = 2
b = 3

z1 = 11
N1 = 14
z2 = 7
N2 = 14

prior = lambda theta1, theta2: stats.beta(a, b).pdf(theta1) * stats.beta(a, b).pdf(theta2)
lik = partial(bern2, z1=z1, z2=z2, N1=N1, N2=N2)
target = lambda theta1, theta2: prior(theta1, theta2) * lik(theta1, theta2)

theta = np.array([0.5, 0.5])
niters = 10000
burnin = 500
sigma = np.diag([0.2, 0.2])

thetas = np.zeros((niters - burnin, 2), float)
for i in range(niters):
    # Propose a joint move for both parameters from a Gaussian random walk.
    new_theta = stats.multivariate_normal(theta, sigma).rvs()
    # Metropolis acceptance probability for a symmetric proposal.
    p = min(target(*new_theta) / target(*theta), 1)
    if np.random.rand() < p:
        theta = new_theta
    if i >= burnin:
        thetas[i - burnin] = theta

kde = stats.gaussian_kde(thetas.T)
XY = np.vstack([X.ravel(), Y.ravel()])
posterior_metropolis = kde(XY).reshape(X.shape)
make_plots(X, Y, prior(X, Y), lik(X, Y), posterior_metropolis)
make_plots(X, Y, prior(X, Y), lik(X, Y), posterior_metropolis, projection='3d')

For the Gibbs sampler, the same values are used:

a = 2
b = 3

z1 = 11
N1 = 14
z2 = 7
N2 = 14

prior = lambda theta1, theta2: stats.beta(a, b).pdf(theta1) * stats.beta(a, b).pdf(theta2)
lik = partial(bern2, z1=z1, z2=z2, N1=N1, N2=N2)
target = lambda theta1, theta2: prior(theta1, theta2) * lik(theta1, theta2)

theta = np.array([0.5, 0.5])
niters = 10000
burnin = 500

thetas = np.zeros((niters - burnin, 2), float)
for i in range(niters):
    # Sample each parameter from its full conditional; with a conjugate
    # Beta prior this is Beta(a + z, b + N - z), so no rejection is needed.
    theta = [stats.beta(a + z1, b + N1 - z1).rvs(), theta[1]]
    theta = [theta[0], stats.beta(a + z2, b + N2 - z2).rvs()]
    if i >= burnin:
        thetas[i - burnin] = theta

kde = stats.gaussian_kde(thetas.T)
XY = np.vstack([X.ravel(), Y.ravel()])
posterior_gibbs = kde(XY).reshape(X.shape)
make_plots(X, Y, prior(X, Y), lik(X, Y), posterior_gibbs)
make_plots(X, Y, prior(X, Y), lik(X, Y), posterior_gibbs, projection='3d')

The preceding Metropolis and Gibbs code produces 2D contour and 3D surface plots of the prior, the likelihood, and the estimated posterior. Both samplers recover essentially the same posterior, but the Gibbs sampler draws directly from the full conditionals and never rejects a sample.
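
As a quick sanity check, this model is conjugate, so the posterior for each coin is available in closed form as Beta(a + z, b + N - z), and the sample mean of either chain should be close to the analytic posterior mean (a + z) / (a + b + N). A minimal sketch, independent of the runs above:

a, b = 2, 3
z1, N1, z2, N2 = 11, 14, 7, 14

# Analytic posterior means under the conjugate Beta-Bernoulli model.
print((a + z1) / (a + b + N1))   # ~0.684 for coin 1
print((a + z2) / (a + b + N2))   # ~0.474 for coin 2

# Compare these with the empirical means of either chain, e.g. thetas.mean(axis=0).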
