Sketching our roadmap

Because we will be building a system from real data that is very noisy, this chapter is not for the fainthearted: we will not arrive at the golden solution of a classifier that achieves 100% accuracy. Often, even humans disagree about whether an answer was good or not (just take a look at some of the StackOverflow comments). We will find out that some challenges, such as the one in this chapter, are so hard that we have to adjust our initial goals along the way. We will start with the nearest-neighbor approach that we learned in the previous chapters, find out why it is not very good for this task, switch over to logistic regression, and arrive at a solution that achieves good enough prediction quality, but only on a smaller part of the answers. Finally, we will spend some time looking at how to extract the winning classifier so that it can be deployed on the target system.
