Multiclass classification

Multiclass (also referred to as multinomial) classification is the task of classifying the elements of a given set into one of three or more groups. Again, the product documentation provides a good example of this: training a model to predict which product category is most likely to interest a customer of an outdoor equipment store.

This example uses the same data as the previous section, but here the goal is to determine a product category (group) rather than a particular purchase decision. The model built in this example will predict which product line is most likely to interest a given customer.

Stepping through the example, we'll use the same training data and the same feature columns as the prior example, but a different label column: PRODUCT_LINE. Also, in this example, rather than picking Automatic, choose Manual so that you can select the specific algorithms the model uses.

So, to train this model, you will specify the preceding label and feature columns and then pick the machine learning technique: Multiclass Classification. Another difference in this exercise is that we want to add two estimators (algorithm choices) for the model to use so that we can compare their performance:

Click Add Estimators to view the estimators (algorithms) that are available to use with the multiclass classification technique in the model builder.
Click the card labeled Naive Bayes and then click on Add.
Click on Add Estimators again.
Click the card labeled Random Forest Classifier and then click on Add.
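
To give a feel for what an estimator such as Naive Bayes does with a multiclass label, the following is a minimal sketch in plain Python. It is not the model builder's implementation. The training rows, the three feature columns (gender, age band, profession), and the function names are all hypothetical and invented for illustration; only the PRODUCT_LINE label comes from the example above. The sketch counts how often each feature value occurs with each product line and then picks the product line with the highest smoothed probability.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training rows: (gender, age_band, profession) -> PRODUCT_LINE.
# These values are invented for illustration; they are not the sample data set.
TRAIN = [
    (("M", "young", "student"), "Camping Equipment"),
    (("F", "young", "student"), "Camping Equipment"),
    (("M", "young", "sales"), "Camping Equipment"),
    (("M", "middle", "executive"), "Golf Equipment"),
    (("F", "middle", "executive"), "Golf Equipment"),
    (("M", "senior", "executive"), "Golf Equipment"),
    (("F", "senior", "retired"), "Personal Accessories"),
    (("F", "senior", "retired"), "Personal Accessories"),
    (("M", "senior", "retired"), "Personal Accessories"),
]


def train_naive_bayes(rows):
    """Count label frequencies and per-feature value frequencies for each label."""
    label_counts = Counter(label for _, label in rows)
    # value_counts[label][feature_index][value] -> number of occurrences
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen for each feature index
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[label][i][value] += 1
            vocab[i].add(value)
    return label_counts, value_counts, vocab


def predict(model, features):
    """Return the label with the highest log-probability (add-one smoothing)."""
    label_counts, value_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # prior probability of the label
        for i, value in enumerate(features):
            count = value_counts[label][i][value]
            # Add-one smoothing keeps unseen values from zeroing the product.
            score += math.log((count + 1) / (n + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TRAIN)
print(predict(model, ("F", "young", "student")))  # Camping Equipment
```

Because the label has three possible values here, the prediction is a choice among several groups rather than a yes/no answer, which is what distinguishes this from the binary case in the previous section.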

A neat feature of the model builder is that, after training completes, you can see evaluations of both algorithm choices (as seen in the following screenshot):

In the preceding screenshot, you can see that the performance of the model using Naive Bayes is rated as Poor, while the performance of the model using random forest classification is rated as Excellent. Consider the following, which I have already mentioned in this chapter and which is also stated in the product documentation:

"To find the best solution for a given machine learning problem, you sometimes have to experiment with your training data, the model design, and/or the algorithms used. With the model builder, you can easily compare the results of different algorithms used (to better understand what the best choice should be)."

This is extremely good and practical advice, especially for those relatively new to machine learning.
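
The same kind of side-by-side comparison can be sketched outside the tool. The following is only an illustration under stated assumptions: the held-out rows are invented, accuracy is assumed as the metric (the text does not say which metric sits behind the Poor and Excellent ratings), and the two estimators are hypothetical stand-ins, a constant guess and a hand-written rule, rather than Naive Bayes and random forest. The point is the pattern: score every candidate on the same held-out data and rank the results.

```python
# Hypothetical held-out rows: (age_band, profession) -> PRODUCT_LINE.
# Invented for illustration; not taken from the sample data set.
HOLDOUT = [
    (("young", "student"), "Camping Equipment"),
    (("young", "sales"), "Camping Equipment"),
    (("middle", "executive"), "Golf Equipment"),
    (("middle", "sales"), "Golf Equipment"),
    (("senior", "retired"), "Personal Accessories"),
    (("senior", "executive"), "Golf Equipment"),
]


def majority_class(features):
    """Stand-in estimator A: always predicts the same product line."""
    return "Camping Equipment"


def rule_based(features):
    """Stand-in estimator B: a hand-written rule on the age band."""
    age_band, _profession = features
    rules = {"young": "Camping Equipment", "middle": "Golf Equipment"}
    return rules.get(age_band, "Personal Accessories")


def accuracy(predict_fn, rows):
    """Fraction of rows for which the estimator predicts the true label."""
    correct = sum(1 for features, label in rows if predict_fn(features) == label)
    return correct / len(rows)


def compare(estimators, rows):
    """Score each estimator on the same rows and sort best-first."""
    scores = {name: accuracy(fn, rows) for name, fn in estimators.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = compare(
    {"majority_class": majority_class, "rule_based": rule_based},
    HOLDOUT,
)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Scoring both candidates on identical rows is what makes the comparison fair; swapping in a different estimator only requires adding another entry to the dictionary passed to compare.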

Let's now move on to the final topic of this chapter: regression.
