The choice of which algorithm to deploy to answer a business question depends on a variety of parameters, and there is no single right answer. It generally depends on the nature of the predictor and output variables, as well as on the overarching nature of the business problem at hand: whether it is a numerical prediction, a classification, or an aggregation problem. Based on these preliminary criteria, one can shortlist a few existing methods to apply to the dataset.
Each method has its own pros and cons, and the final decision should be made with the business context in mind. The decision for the best-suited algorithm is usually taken based on the following two requirements:
The following table summarizes which algorithms to choose, depending on the types of the predictor and outcome variables and on the question to be answered in the business context:
Type of variables | Business contexts/questions | Algorithm/Model |
---|---|---|
A continuous numerical variable as the output variable; a mix of categorical and numerical variables as predictor variables. | Quantifiable questions such as how many, how much, and so on. | Linear regression, polynomial regression, and regression tree. |
A binary or categorical variable as the output variable; a mix of categorical and numerical variables as predictor variables. | Classification problems. Questions with yes/no, fail/success, and 0/1 answers. | Logistic regression. |
No output variable; a mix of categorical and numerical variables as predictor variables. | Grouping/aggregation and targeted marketing. Which data points are similar to each other? How many such groups can be created? These groups do not exist beforehand. | Clustering and segmentation. |
A categorical or numerical variable as the output variable; a mix of categorical and numerical variables as predictor variables. | Classification problems. Classifying data points into already existing groups. | Decision Trees, k-Nearest Neighbor, Bayes' Classifier, Support Vector Machines, and so on. |
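
The table above can be read as a simple lookup from the kind of business question to a shortlist of candidate methods. The following is only a minimal sketch of that idea in plain Python; the name `shortlist_algorithms` and the keys `how_much`, `yes_no`, `new_groups`, and `existing_groups` are hypothetical labels introduced here for illustration and do not come from any library.

```python
from typing import Dict, List

# Hypothetical lookup mirroring the rows of the table above.
# The keys are illustrative labels for the business question, not a standard taxonomy.
SHORTLIST: Dict[str, List[str]] = {
    # Continuous numerical output: "how many", "how much"
    "how_much": ["linear regression", "polynomial regression", "regression tree"],
    # Binary or categorical output: yes/no, fail/success, 0/1
    "yes_no": ["logistic regression"],
    # No output variable: discover groups that do not exist yet
    "new_groups": ["clustering", "segmentation"],
    # Assign data points to already existing groups
    "existing_groups": [
        "decision tree",
        "k-nearest neighbor",
        "Bayes' classifier",
        "support vector machine",
    ],
}


def shortlist_algorithms(question_type: str) -> List[str]:
    """Return the candidate methods listed in the table for a question type."""
    try:
        # Return a copy so callers cannot mutate the lookup table.
        return list(SHORTLIST[question_type])
    except KeyError:
        valid = ", ".join(sorted(SHORTLIST))
        raise ValueError(
            f"Unknown question type {question_type!r}; expected one of: {valid}"
        ) from None


if __name__ == "__main__":
    for question_type in SHORTLIST:
        print(question_type, "->", shortlist_algorithms(question_type))
```

A lookup like this only produces the shortlist; choosing among the shortlisted methods still depends on their individual pros and cons and on the business context.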