Develop a back-off mechanism for MLE

Katz back-off may be defined as a generative n-gram language model that computes the conditional probability of a token given its preceding tokens in the n-gram. According to this model, in training, if an n-gram has been seen more than k times (for some count threshold k), then the conditional probability of a token given its history is proportional to the (discounted) MLE of that n-gram. Otherwise, the model backs off: the conditional probability is taken from the back-off conditional probability of the (n-1)-gram.

The following is the code for Katz's back-off model in NLTK:

def prob(self, word, context):
    """
    Evaluate the probability of this word in this context using Katz Backoff.

    :param word: the word to get the probability of
    :type word: str
    :param context: the context the word is in
    :type context: list(str)
    """
    context = tuple(context)
    # If the full n-gram was seen in training (or this is the unigram
    # model), use the discounted estimate for this context directly.
    if (context + (word,) in self._ngrams) or (self._n == 1):
        return self[context].prob(word)
    else:
        # Otherwise back off: weight the (n-1)-gram model's estimate
        # by alpha, the mass reserved for unseen continuations.
        return self._alpha(context) * self._backoff.prob(word, context[1:])
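To make the mechanism concrete outside of NLTK's class hierarchy, the following is a minimal, self-contained sketch of Katz-style back-off. It is an illustrative simplification, not NLTK's implementation: it uses absolute discounting with a fixed discount to reserve probability mass, and the function name `katz_backoff_prob` and the `counts` dictionary of n-gram frequencies are assumptions made for this example.

```python
def katz_backoff_prob(word, context, counts, discount=0.5):
    """Simplified Katz-style back-off (illustrative sketch, not NLTK's code).

    counts maps n-gram tuples of any order to observed frequencies.
    A fixed absolute discount reserves mass for unseen continuations.
    """
    context = tuple(context)
    ngram = context + (word,)
    if not context:
        # Base case: plain unigram MLE.
        total = sum(c for g, c in counts.items() if len(g) == 1)
        return counts.get((word,), 0) / total
    context_count = sum(c for g, c in counts.items()
                        if len(g) == len(ngram) and g[:-1] == context)
    if context_count == 0:
        # Context never observed: back off entirely.
        return katz_backoff_prob(word, context[1:], counts, discount)
    if counts.get(ngram, 0) > 0:
        # Seen n-gram: discounted MLE.
        return (counts[ngram] - discount) / context_count
    # Unseen n-gram: redistribute the reserved mass via alpha.
    seen = [g for g in counts if len(g) == len(ngram) and g[:-1] == context]
    reserved = discount * len(seen) / context_count
    backoff_mass = 1 - sum(
        katz_backoff_prob(g[-1], context[1:], counts, discount) for g in seen)
    alpha = reserved / backoff_mass if backoff_mass > 0 else 0
    return alpha * katz_backoff_prob(word, context[1:], counts, discount)
```

For example, with bigram and unigram counts from the toy corpus "a b a b a c", a seen bigram such as ("a", "b") gets its discounted MLE, while the unseen bigram ("a", "a") falls back to the unigram estimate scaled by alpha; the resulting probabilities over the vocabulary still sum to one.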