Boosting 

A weak learner is an algorithm that performs relatively poorly; its accuracy is generally just above chance. Weak learners are typically computationally simple, with decision stumps and the 1R algorithm being common examples. Boosting converts weak learners into strong learners. This means that boosting is not itself an algorithm that makes predictions; rather, it works with an underlying weak ML algorithm to obtain better performance.
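
To make the idea concrete, the following is a minimal sketch that fits a decision stump (a depth-1 decision tree) and reports its accuracy. The use of Python's scikit-learn and a synthetic dataset here is an illustrative assumption, not the chapter's own setup:

```python
# A minimal sketch (assuming scikit-learn and synthetic data) showing a decision
# stump as a weak learner: a single split gives accuracy above chance, but
# typically well below what a full ensemble achieves on the same data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

stump = DecisionTreeClassifier(max_depth=1)   # one split only: the decision stump
stump.fit(X_train, y_train)
print("Stump accuracy:", stump.score(X_test, y_test))
```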

A boosting model is a sequence of models learned on subsets of the data, similar to the bagging ensembling technique. The difference lies in how those subsets are created. Unlike bagging, the subsets used for model training are not all created before training starts. Instead, boosting builds a first model with an ML algorithm and makes predictions on the entire dataset. The instances misclassified by this first model form the subset used by the second model, which learns only from this misclassified data curated from the first model's output.

The second model's misclassified instances, in turn, become the input to the third model. This process of building models is repeated until the stopping criterion is met. The final prediction for an observation in the unseen dataset is arrived at by averaging or voting the predictions from all the models for that specific, unseen observation.
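
The toy sketch below mimics the sequential scheme just described: each new stump is trained only on the instances the previous stump misclassified, and the final prediction is a majority vote across all the models. The data, number of rounds, and choice of scikit-learn are illustrative assumptions, not a production algorithm:

```python
# A toy illustration of the sequential correct-and-combine scheme described above.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = []
X_cur, y_cur = X_train, y_train                 # round 1 uses the entire training set
for _ in range(5):                              # stopping criterion: a fixed number of rounds
    stump = DecisionTreeClassifier(max_depth=1).fit(X_cur, y_cur)
    models.append(stump)
    wrong = stump.predict(X_cur) != y_cur       # instances this model got wrong
    if not wrong.any():                         # nothing left to correct, stop early
        break
    X_cur, y_cur = X_cur[wrong], y_cur[wrong]   # next model learns only from these

# Final prediction on the unseen test set: majority vote across all models
votes = np.array([m.predict(X_test) for m in models])
final = (votes.mean(axis=0) >= 0.5).astype(int)
print("Voted accuracy:", (final == y_test).mean())
```

Real boosting algorithms refine this idea, for example by reweighting instances or fitting to residual errors rather than discarding correctly classified data, but the sequential pattern of each model correcting its predecessor is the same.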

There are subtle differences between the numerous algorithms in the boosting family; however, we will not discuss them in detail, as the intent of this chapter is to gain a generalized understanding of ML ensembles rather than in-depth knowledge of each boosting algorithm.

While obtaining better performance is the biggest advantage of the boosting ensemble, difficulty with model interpretability, higher computational times, and model overfitting are some of the issues encountered with boosting. Of course, these problems can be overcome through the use of specialized techniques.
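
As one example of such a technique, the following hedged sketch uses early stopping on an internally held-out validation fraction with scikit-learn's GradientBoostingClassifier to limit overfitting and training time; the synthetic data and parameter values are illustrative assumptions, not recommendations:

```python
# A sketch of early stopping: boosting halts once additional rounds stop
# improving performance on an internal validation split.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)

gbm = GradientBoostingClassifier(
    n_estimators=1000,          # generous upper bound on boosting rounds
    validation_fraction=0.2,    # hold out 20% of the training data internally
    n_iter_no_change=10,        # stop after 10 rounds with no validation improvement
    random_state=1,
)
gbm.fit(X, y)
print("Boosting rounds actually used:", gbm.n_estimators_)
```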

Boosting algorithms are undoubtedly very popular and have been used by winners of many Kaggle and similar competitions. A number of boosting algorithms are available, such as gradient boosting machines (GBMs), adaptive boosting (AdaBoost), gradient tree boosting, extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). In this section, we will learn the theory and implementation of two of the most popular boosting algorithms: GBMs and XGBoost. Prior to learning the theoretical concepts of boosting and its pros and cons, let's first focus on implementing the attrition prediction models with GBMs and XGBoost.
