Understanding stacked generalization

Stacked generalization is an ensemble technique that combines a diverse group of models by introducing a meta-learner. A meta-learner is a second-level machine learning algorithm that learns how to optimally combine the predictions of the base learners:

"Stacked generalization is a means of non-linearly combining generalizers to make a new generalizer, to try to optimally integrate what each of the original generalizers has to say about the learning set. The more each generalizer has to say (which isn't duplicated in what the other generalizers have to say), the better the resultant stacked generalization." 
Wolpert (1992), Stacked Generalization

The steps for stacking are as follows:

  1. Split your dataset into a training set and a testing set.
  2. Train several base learners on the training set.
  3. Apply the base learners to the testing set to make predictions.
  4. Use the predictions as inputs and the actual responses as outputs to train a higher-level learner.

Because the predictions from the base learners are blended together, stacking is also referred to as blending. 
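
A minimal sketch of these four steps follows, assuming scikit-learn and a synthetic dataset; the specific base learners and the logistic regression meta-learner are illustrative choices, not prescribed here:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

# Step 1: split the dataset into a training set and a testing set.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=42)

# Step 2: train several base learners on the training set.
base_learners = [DecisionTreeClassifier(random_state=42),
                 KNeighborsClassifier(),
                 GaussianNB()]
for learner in base_learners:
    learner.fit(X_train, y_train)

# Step 3: apply the base learners to the testing set to make predictions.
# Each base learner contributes one column of meta-features.
meta_features = np.column_stack(
    [learner.predict_proba(X_test)[:, 1] for learner in base_learners])

# Step 4: use the predictions as inputs and the actual responses as
# outputs to train a higher-level (meta) learner.
meta_learner = LogisticRegression()
meta_learner.fit(meta_features, y_test)
```

In practice, the meta-learner is often trained on out-of-fold predictions produced by cross-validation rather than on a single hold-out split, which makes better use of the available data.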

The following diagram gives us a conceptual representation of stacking:

It is important for stacked generalization that the predictions from the base learners are not correlated with each other. To obtain uncorrelated predictions, the base learners can be trained with algorithms that use different internal approaches. Stacked generalization is used mainly to minimize the generalization error of the base learners and can be seen as a refined version of cross-validation: it uses a strategy that is more sophisticated than cross-validation's winner-takes-all approach for combining the predictions of the base learners.
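
As one possible illustration of combining base learners that use different internal approaches, scikit-learn's StackingClassifier builds the meta-learner's training inputs from cross-validated base-learner predictions instead of selecting a single winner; the particular estimators below are assumptions made for the example:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Base learners with different internal approaches, so that their
# errors (and therefore their predictions) are less correlated.
base_learners = [
    ('tree', DecisionTreeClassifier(random_state=42)),
    ('knn', KNeighborsClassifier()),
    ('svm', SVC(probability=True, random_state=42)),
]

# The meta-learner is fitted on 5-fold cross-validated predictions of
# the base learners, combining them rather than picking one winner.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(),
                           cv=5)

print(cross_val_score(stack, X, y, cv=5).mean())
```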
