Getting ready

To build our model with a decision tree algorithm, we will use the backorders.csv file, which can be downloaded from GitHub.

This dataset has 23 columns. The target variable, went_on_backorder, identifies whether a product has gone on back order. The other 22 variables are the predictor variables. A description of the data is provided in the code that comes with this book.

We will start by importing the required libraries:

# import os for operating system dependent functionalities
import os

# import other required libraries
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix, roc_curve, auc
import itertools
from sklearn import tree

import seaborn as sns
import matplotlib.pyplot as plt

We set our working directory with the os.chdir() command:

# Set your working directory according to your requirement
os.chdir(".../Chapter 4/Decision Tree")

# Check the working directory
os.getcwd()
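
If you would rather not change the working directory, one option is to build the file path explicitly and pass it to the reader later. The following is only a minimal sketch: the data_dir location is a hypothetical placeholder (not the book's actual folder layout) and must be replaced with the folder where you saved the file.

```python
from pathlib import Path

# Hypothetical folder used for illustration only; point this at the
# directory where you saved the downloaded file.
data_dir = Path("Chapter 4") / "Decision Tree"

# Build the full path to the CSV without calling os.chdir()
csv_path = data_dir / "BackOrders.csv"

# Path.exists() lets us confirm the file is where we expect before reading it
print(csv_path, "exists:", csv_path.exists())
```

A path built this way can be passed directly to pd.read_csv(), which accepts Path objects as well as strings.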

Let's read our data. As we have done previously, we are going to prefix the name of the DataFrame with df_ to make it easier to understand:

df_backorder = pd.read_csv("BackOrders.csv")
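
Before going further, it can help to confirm that the DataFrame matches the description given earlier: 23 columns, one of which is the went_on_backorder target. The sketch below is one possible way to do this. The helper check_backorder_frame() is a hypothetical name introduced here for illustration, and the tiny df_toy frame (its column names and Yes/No values included) is made-up stand-in data so that the snippet runs without the CSV file.

```python
import pandas as pd

def check_backorder_frame(df, target="went_on_backorder", expected_cols=23):
    """Return a few basic facts about a freshly loaded DataFrame."""
    # Fail early if the target column is missing
    if target not in df.columns:
        raise KeyError(f"Target column '{target}' not found")
    return {
        "n_rows": len(df),
        "n_columns": df.shape[1],
        "has_expected_columns": df.shape[1] == expected_cols,
        # Class counts for the target variable
        "target_counts": df[target].value_counts().to_dict(),
    }

# Made-up stand-in data for illustration only (not the real dataset)
df_toy = pd.DataFrame({
    "feature_a": [10, 0, 5],
    "feature_b": [1.5, 2.0, 0.0],
    "went_on_backorder": ["No", "Yes", "No"],
})

summary = check_backorder_frame(df_toy, expected_cols=3)
print(summary)
```

With the real data, the call would be check_backorder_frame(df_backorder), leaving expected_cols at its default of 23.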
