Summary

In this chapter, we explored the fundamental ideas surrounding data quality: how to categorize quality issues by type and how to tidy up your data.

To compare the performance of the different models that one may create, we went on to establish some fundamental notions of model performance, such as the mean squared error (MSE) for regression and the classification error rate for classification.
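As a quick reminder of what these two measures compute, here is a minimal sketch in plain Python. The function names `mean_squared_error` and `classification_error_rate` are our own illustrative choices, not taken from any particular library.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def classification_error_rate(labels_true, labels_pred):
    """Fraction of observations whose predicted class differs from the true class."""
    if len(labels_true) != len(labels_pred) or len(labels_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mistakes = sum(t != p for t, p in zip(labels_true, labels_pred))
    return mistakes / len(labels_true)
```

For both measures, lower values indicate better performance, so two models can be compared by evaluating each on the same held-out observations.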

We also introduced cross-validation as a generic assessment technique for cases where only a limited amount of data is available.
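The idea can be sketched with k-fold cross-validation, one common variant. The code below is only an illustration under our own assumptions: the helper names `k_fold_splits` and `cross_validated_mse` are hypothetical, and the "model" is a deliberately trivial baseline that predicts the mean of the training targets.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k roughly equal-sized folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validated_mse(y, k=5, seed=0):
    """Average test MSE over k folds for a baseline that predicts the training mean."""
    fold_scores = []
    for train_idx, test_idx in k_fold_splits(len(y), k, seed):
        prediction = sum(y[j] for j in train_idx) / len(train_idx)
        mse = sum((y[j] - prediction) ** 2 for j in test_idx) / len(test_idx)
        fold_scores.append(mse)
    return sum(fold_scores) / k
```

Every observation is used for testing exactly once and for training in the remaining folds, which is what makes the technique attractive when data is scarce.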

Finally, we discussed learning curves as a way to judge a model's ability to learn and to improve its scores.
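One way to produce such a curve is to refit a model on progressively larger portions of the training data and record its training and test error each time. The sketch below assumes that setup and uses a simple one-variable least-squares line as the model; the names `fit_line`, `line_mse`, and `learning_curve` are hypothetical helpers written for this illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        slope = 0.0  # all x values identical: fall back to predicting the mean
    else:
        slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def line_mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given observations."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def learning_curve(x_train, y_train, x_test, y_test, sizes):
    """Return (training size, training MSE, test MSE) for each requested size."""
    curve = []
    for m in sizes:
        slope, intercept = fit_line(x_train[:m], y_train[:m])
        train_error = line_mse(x_train[:m], y_train[:m], slope, intercept)
        test_error = line_mse(x_test, y_test, slope, intercept)
        curve.append((m, train_error, test_error))
    return curve
```

Plotting the two error columns against the training size shows whether the test error is still falling, which suggests the model could benefit from more data, or has flattened out.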

With a firm grounding in the basics of the predictive modeling process, we will look at linear regression in the next chapter.
