To obtain a classifier, we pass the feature array together with the previously defined label array Y to the kNN learner:
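import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Build the feature matrix: one row per post whose ID is in all_answers
X = np.asarray([extract_features_from_body(text)
                for post_id, text in fetch_posts(fn_sample)
                if post_id in all_answers])
# Fit a kNN classifier with the default parameters
knn = KNeighborsClassifier()
knn.fit(X, Y)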
Using the default parameters, we just fitted a 5NN classifier (meaning kNN with k=5) to our data. Why 5NN? Well, at the current state of our knowledge about the data, we really have no clue what the right k should be, so we simply accept the default. Once we have more insight, we will have a better idea of how to set k.
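To make the role of k concrete, here is a minimal sketch on made-up data. The arrays X_toy and Y_toy and the variable names are hypothetical and stand in for the real feature matrix and labels. It shows that the default value of n_neighbors is 5 and how a different k could be passed explicitly once we know more about the data.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: two well-separated clusters of six points each
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [0.0, 0.3], [0.3, 0.0], [0.2, 0.2],
                  [5.0, 5.0], [5.1, 5.2], [5.2, 5.1],
                  [5.0, 5.3], [5.3, 5.0], [5.2, 5.2]])
Y_toy = np.array([0] * 6 + [1] * 6)

# Default parameters: k is stored in the n_neighbors attribute
knn_default = KNeighborsClassifier()
knn_default.fit(X_toy, Y_toy)
default_k = knn_default.n_neighbors

# Choosing k explicitly, here k=3
knn_3 = KNeighborsClassifier(n_neighbors=3)
knn_3.fit(X_toy, Y_toy)

# Predict the class of one new point near each cluster
pred = knn_default.predict([[0.1, 0.1], [5.1, 5.1]])
# Class probabilities are the fraction of the k neighbors in each class
proba = knn_3.predict_proba([[0.1, 0.1]])

print(default_k, pred, proba)
```

On clusters this cleanly separated, the choice of k makes no difference to the prediction. On real, noisy data it usually does, which is why k is worth revisiting later.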