There's more...

Earlier in this chapter, we discussed how to use the AnalyzeLocal utility class to find missing values. AnalyzeLocal can also perform extended data analysis. It creates a data analysis object that holds information on each column in the dataset. You create this object by calling analyze(), as we discussed in the previous chapter. If you print the data analysis object, the output looks like the following:

The analysis calculates the standard deviation, mean, and min/max values for every feature in the dataset. It also calculates the count of values for each feature, which helps identify features with missing or invalid values.

Both of the preceding screenshots show the data analysis results returned by calling the analyze() method. For the customer churn dataset, every feature should have a total count of 10,000, because the dataset contains 10,000 records.
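
To make the per-feature statistics concrete, here is a minimal plain-Java sketch of the kind of summary described above: count, mean, standard deviation, and min/max for one numeric column, plus a check of the count against the expected number of records. It is not the DataVec implementation. The ColumnSummary class and its of() and hasMissing() methods are hypothetical names introduced only for this illustration, and the use of the sample standard deviation is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of DataVec.
public class ColumnSummary {
    public final long count;     // number of non-missing values
    public final double mean;
    public final double stdDev;  // sample standard deviation (assumption)
    public final double min;
    public final double max;

    ColumnSummary(long count, double mean, double stdDev, double min, double max) {
        this.count = count;
        this.mean = mean;
        this.stdDev = stdDev;
        this.min = min;
        this.max = max;
    }

    // Summarizes one numeric column in a single pass (Welford's method).
    // null and NaN entries are treated as missing and are not counted.
    public static ColumnSummary of(List<Double> values) {
        long n = 0;
        double mean = 0.0;
        double m2 = 0.0; // running sum of squared deviations from the mean
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (Double v : values) {
            if (v == null || v.isNaN()) {
                continue;
            }
            n++;
            double delta = v - mean;
            mean += delta / n;
            m2 += delta * (v - mean);
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double stdDev = n > 1 ? Math.sqrt(m2 / (n - 1)) : 0.0;
        return new ColumnSummary(n, mean, stdDev, min, max);
    }

    // A count below the number of records means some values are missing.
    public boolean hasMissing(long expectedRecords) {
        return count < expectedRecords;
    }

    public static void main(String[] args) {
        // Five records, one of which has no value for this feature.
        List<Double> age = Arrays.asList(42.0, 41.0, null, 39.0, 43.0);
        ColumnSummary summary = ColumnSummary.of(age);
        System.out.println("count=" + summary.count
                + " mean=" + summary.mean
                + " stdDev=" + summary.stdDev
                + " min=" + summary.min
                + " max=" + summary.max);
        System.out.println("missing values? " + summary.hasMissing(age.size()));
    }
}
```

Applied to the customer churn dataset, the same check would compare each feature's count with 10,000. Any feature with a lower count would need a closer look before training.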
