Summary

Outliers must be considered when exploring real data. This chapter has presented techniques for spotting them as part of a systematic process that also lets you determine the root cause of each outlier. Handling can be automated, but giving a system complete autonomy is risky because it may delete perfectly good data. A safer approach is usually to automate the checking, so that outliers in unseen data are highlighted and a human can decide what to do with them.
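A minimal sketch of this flag-for-review idea follows. It assumes Tukey's interquartile-range rule as the detection test; the summary does not name a specific rule, so that choice, the function name flag_outliers, the multiplier k, and the sample readings are all illustrative assumptions rather than the chapter's own method.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the indices of values outside Tukey's fences.

    Nothing is removed from `values`; the caller decides what to do
    with each flagged point.
    """
    # Quartiles from the standard library (default 'exclusive' method).
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical unseen data containing one suspicious reading.
readings = [10.1, 9.8, 10.3, 10.0, 55.0, 9.9, 10.2]
flagged = flag_outliers(readings)

# Report the suspects for human review instead of deleting them.
for i in flagged:
    print(f"row {i}: value {readings[i]} flagged for human review")
```

Because the function returns indices and leaves the data untouched, a reviewer can still trace each flagged value back to its source before deciding whether it is a genuine outlier.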

Bear in mind that real data never behaves as well as fake data. What matters is being able to determine quickly which data points could be outliers and then to work out whether they actually are. This chapter has given you some tools for doing so.

Another big issue with real data is missing values. As we shall see in the next chapter, it is important to determine some rules to handle these.
