Training the classifier

We pass the feature array X, together with the previously defined label array Y, to the kNN learner to obtain a classifier:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# One feature vector for each post whose ID is in all_answers
X = np.asarray([extract_features_from_body(text)
                for post_id, text in fetch_posts(fn_sample)
                if post_id in all_answers])
knn = KNeighborsClassifier()  # default: n_neighbors=5
knn.fit(X, Y)

Using the default parameters, we just fitted a 5NN (that is, kNN with k=5) to our data. Why 5NN? Well, given what we currently know about the data, we really have no clue what the right k should be. Once we have more insight, we will have a better idea of how to set k.
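
To make "5NN" concrete, here is a minimal sketch of what a kNN classifier does at prediction time: it finds the k training points closest to the query and takes a majority vote over their labels. This is not scikit-learn's implementation. The function knn_predict and the toy data are hypothetical, made up purely for illustration, and the sketch assumes Euclidean distance and Python 3.8 or later (for math.dist).

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=5):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point, paired with its label
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep only the k closest training points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most frequent label among those neighbors wins
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (invented for this sketch): two features per sample, labels 0 and 1
toy_X = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5), (6, 5), (5, 6), (6, 6)]
toy_y = [0, 0, 0, 0, 1, 1, 1, 1]

print(knn_predict(toy_X, toy_y, (0.5, 0.5)))  # four of the five nearest points have label 0
print(knn_predict(toy_X, toy_y, (5.5, 5.5)))  # four of the five nearest points have label 1
```

With k=5 each query here picks up one neighbor from the "wrong" cluster, but the vote still goes to the nearby class. A smaller k follows individual points more closely, while a larger k smooths over them, which is why the right value depends on the data.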
