How does it work?

Here we discuss a general setup for a statistical inference problem. First, we use the observed data to estimate a quantity of interest; there may also be other unknown quantities that we would like to estimate along the way. The quantity could be a response (predicted) variable, a class, a label, or simply a number. If you are familiar with the frequentist approach, you will know that it treats the unknown quantity, say θ, as a fixed (nonrandom) quantity that is to be estimated from the observed data.

In the Bayesian framework, by contrast, the unknown quantity θ is treated as a random variable. More specifically, we assume that we have an initial guess about the distribution of θ, commonly referred to as the prior distribution. After observing some data, we update the distribution of θ to obtain the posterior distribution. This step is usually performed using Bayes' rule (for more details, refer to the next section), which is why the approach is called Bayesian. In short, from the posterior distribution we can then compute predictive distributions for future observations.
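The update from prior to posterior can be made concrete with a small sketch. It assumes a Beta prior on a success probability θ and independent yes/no observations, a conjugate pair chosen here only because the update has a closed form; the text does not prescribe this model. The function names and all numbers are hypothetical.

```python
from fractions import Fraction


def update_beta(prior_a, prior_b, successes, failures):
    """Posterior Beta parameters after observing Bernoulli data.

    With a Beta(a, b) prior and a Bernoulli likelihood, Bayes' rule
    gives a Beta(a + successes, b + failures) posterior.
    """
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, kept exact with Fraction."""
    return Fraction(a, a + b)


# Hypothetical prior belief: theta is probably near 1/2.
prior_a, prior_b = 2, 2

# Hypothetical data: 7 successes and 3 failures.
successes, failures = 7, 3

post_a, post_b = update_beta(prior_a, prior_b, successes, failures)

prior_mean = beta_mean(prior_a, prior_b)
posterior_mean = beta_mean(post_a, post_b)

# Frequentist point estimate (maximum likelihood) for comparison:
# theta is treated as fixed and estimated from the data alone.
mle = Fraction(successes, successes + failures)

# For this model, the predictive probability that the next
# observation is a success equals the posterior mean.
predictive_next_success = posterior_mean

print("prior mean:", prior_mean)
print("posterior:", (post_a, post_b), "mean:", posterior_mean)
print("maximum likelihood estimate:", mle)
```

In this sketch the posterior mean lies between the prior mean and the maximum likelihood estimate, which illustrates how the data pull the initial guess toward what was observed.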

This simple process can be justified as the appropriate methodology for inference under uncertainty by a number of arguments, all of which rest on consistency with clear principles of rationality. Despite this strong mathematical support, many machine learning practitioners are uncomfortable with, and somewhat reluctant to use, the Bayesian approach. Often the reason is that they view the selection of a prior as arbitrary and subjective; in reality, the choice is subjective but not arbitrary, because the prior is meant to express what is actually believed before seeing the data.

Unfortunately, many Bayesians don't really think in true Bayesian terms. One can therefore find many pseudo-Bayesian procedures in the literature, in which the models and priors used cannot be taken seriously as expressions of prior belief. There may also be computational difficulties with the Bayesian approach. Many of these can be addressed using Markov chain Monte Carlo methods, which are another main focus of my research. The details of this approach will become clearer as you go through this chapter.
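To give a rough idea of how Markov chain Monte Carlo sidesteps the computational difficulty, the following sketch uses a random-walk Metropolis sampler, one simple member of that family. It assumes a Beta prior on a success probability θ with yes/no data, so that the sampled result can be compared with a known closed form. The function names, step size, sample counts, and data are all hypothetical choices for illustration.

```python
import math
import random


def log_unnormalized_posterior(theta, successes, failures,
                               prior_a=1.0, prior_b=1.0):
    """Log of prior times likelihood, up to an additive constant.

    The sampler never needs the normalizing constant, which is the
    part that is often hard to compute.
    """
    if not 0.0 < theta < 1.0:
        return -math.inf
    return ((prior_a - 1.0 + successes) * math.log(theta)
            + (prior_b - 1.0 + failures) * math.log(1.0 - theta))


def metropolis(successes, failures, n_samples=20000, step=0.2, seed=0):
    """Random-walk Metropolis sampler for theta (illustrative only)."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_unnormalized_posterior(theta, successes, failures)
    samples = []
    for _ in range(n_samples):
        # Propose a nearby value with a symmetric Gaussian step.
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_unnormalized_posterior(proposal, successes, failures)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


# Hypothetical data: 7 successes and 3 failures, uniform Beta(1, 1) prior.
samples = metropolis(successes=7, failures=3)
kept = samples[2000:]  # discard an initial burn-in portion
mcmc_mean = sum(kept) / len(kept)

# Closed-form posterior mean for this model, Beta(8, 4): 8 / 12.
exact_mean = 8.0 / 12.0
print("MCMC estimate:", round(mcmc_mean, 3), "exact:", round(exact_mean, 3))
```

The closed form exists here only because the model was picked to have one; the same sampler applies unchanged when no such formula is available, which is the situation where these methods are useful.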
