Decision trees

Decision trees can be used for both classification and regression. A decision tree answers a sequence of questions, each with a yes/no (true/false) response. Based upon those responses, the tree follows a predetermined path until it reaches a result. More formally, a decision tree is a special case of a directed acyclic graph. Finally, a single decision tree is built using the entire dataset and all of the features.
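To make this concrete, here is a minimal sketch of a decision tree written as plain nested if/else statements. The feature names and thresholds are hypothetical, chosen purely for illustration:

```python
# A decision tree is just a series of yes/no questions over the features.
# Feature names and thresholds below are hypothetical, for illustration only.
def classify(sample: dict) -> str:
    """Walk a tiny hand-built decision tree and return a class label."""
    if sample["age"] < 30:                # root question: the first split
        if sample["income"] > 50_000:     # second question on the "yes" branch
            return "approve"
        return "review"
    return "decline"                      # the "no" branch reaches a leaf directly

print(classify({"age": 25, "income": 60_000}))  # -> approve
```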

Here is an example of a decision tree. You may not recognize it as a decision tree, but you certainly know the process. Anyone for a doughnut?

As you can see, the flow of a decision tree starts at the top and works its way downward until a specific result is reached. The root of the tree is the first decision that splits the dataset. The tree then recursively splits the dataset according to a splitting metric evaluated at each node. Two of the most popular metrics are Gini impurity and information gain.
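As a rough illustration, not tied to any particular library, both metrics can be computed from the class labels that reach a node. Gini impurity measures how often a randomly chosen sample would be mislabeled by the node's class distribution; information gain measures how much a split reduces entropy:

```python
from collections import Counter
import math

def gini(labels):
    """Gini impurity: chance of mislabeling a random sample drawn
    and labeled according to the node's class distribution."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

labels = ["yes"] * 6 + ["no"] * 4
left, right = labels[:6], labels[6:]          # a perfect split on this toy data
print(gini(labels))                           # 0.48
print(information_gain(labels, left, right))  # ~0.971, the full parent entropy
```

At each node, the tree tries candidate splits and keeps the one that scores best under the chosen metric, for example, the lowest weighted Gini impurity or the highest information gain.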

Here is another depiction of a decision tree, albeit without the great doughnuts!

The depth of a decision tree is the maximum number of questions that can be asked before reaching a result, that is, the length of the longest path from the root to a leaf, even if some results are reached with fewer questions. For instance, in the preceding diagram, some results are obtained after one question and some after two. Therefore, the depth of that decision tree is 2.
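A short sketch, assuming scikit-learn is available, shows how depth can be capped and then inspected; the dataset choice here is arbitrary:

```python
# Cap the tree at depth 2, mirroring a diagram where every result
# is reached within two questions.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0)  # at most 2 questions per path
tree.fit(X, y)
print(tree.get_depth())   # 2 -- the longest root-to-leaf path
print(tree.score(X, y))   # training accuracy with only two levels of splits
```

Limiting depth this way is also a common defense against overfitting, since a shallower tree asks fewer, more general questions.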
