Approximating XOR with Multilayer Perceptrons

Let's train a multilayer perceptron to approximate the XOR function. At the time of writing, multilayer perceptrons have been implemented as part of a 2014 Google Summer of Code project, but have not been merged or released. Subsequent versions of scikit-learn are likely to include this implementation of multilayer perceptrons without any changes to the API described in this section. In the interim, a fork of scikit-learn 0.15.1 that includes the multilayer perceptron implementation can be cloned from https://github.com/IssamLaradji/scikit-learn.git.

First, we will create a toy binary classification dataset that represents XOR and split it into training and testing sets:

>>> from sklearn.cross_validation import train_test_split
>>> from sklearn.neural_network import MultilayerPerceptronClassifier
>>> y = [0, 1, 1, 0] * 1000
>>> X = [[0, 0], [0, 1], [1, 0], [1, 1]] * 1000
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

Next, we instantiate MultilayerPerceptronClassifier. We specify the network's architecture through the n_hidden keyword argument, which takes a list of the numbers of hidden units in the hidden layers. Here we create a single hidden layer with two units that use the logistic activation function. The MultilayerPerceptronClassifier class automatically creates two input units, one for each feature, and one output unit. In multi-class problems, the classifier creates one output unit for each of the possible classes.

Selecting an architecture is challenging. There are rules of thumb for choosing the numbers of hidden units and layers, but these tend to be supported only by anecdotal evidence. The optimal number of hidden units depends on the number of training instances, the noise in the training data, the complexity of the function being approximated, the hidden units' activation function, the learning algorithm, and the regularization employed. In practice, architectures can only be evaluated by comparing their performance through cross-validation.
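As a sketch of this approach, the following compares a few candidate architectures with grid search. It assumes that the fork's MultilayerPerceptronClassifier exposes n_hidden as an ordinary estimator parameter that GridSearchCV can set, and it uses the sklearn.grid_search module from scikit-learn 0.15; the candidate architectures are arbitrary examples:

>>> from sklearn.grid_search import GridSearchCV
>>> mlp = MultilayerPerceptronClassifier(activation='logistic',
...                                      algorithm='sgd',
...                                      random_state=3)
>>> # Each candidate value for n_hidden is a list of hidden layer sizes
>>> param_grid = {'n_hidden': [[2], [4], [4, 2]]}
>>> search = GridSearchCV(mlp, param_grid, cv=5)
>>> search.fit(X_train, y_train)
>>> print 'Best architecture:', search.best_params_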

We train the network by calling the fit() method:

>>> clf = MultilayerPerceptronClassifier(n_hidden=[2],
...                                      activation='logistic',
...                                      algorithm='sgd',
...                                      random_state=3)
>>> clf.fit(X_train, y_train)

Finally, we print the network's number of layers and outputs, the model's accuracy on the test set, and some predictions for manual inspection. The network approximates the XOR function perfectly:

>>> print 'Number of layers: %s. Number of outputs: %s' % (clf.n_layers_, clf.n_outputs_)
>>> predictions = clf.predict(X_test)
>>> print 'Accuracy:', clf.score(X_test, y_test)
>>> for i, p in enumerate(predictions[:10]):
...     print 'True: %s, Predicted: %s' % (y_test[i], p)
Number of layers: 3. Number of outputs: 1
Accuracy: 1.0
True: 1, Predicted: 1
True: 1, Predicted: 1
True: 1, Predicted: 1
True: 0, Predicted: 0
True: 1, Predicted: 1
True: 0, Predicted: 0
True: 0, Predicted: 0
True: 1, Predicted: 1
True: 0, Predicted: 0
True: 1, Predicted: 1
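
As a final check, we can predict the four canonical XOR inputs directly. Given the perfect accuracy on the test set, which contains all four input patterns, we expect the network to reproduce the XOR truth table:

>>> print clf.predict([[0, 0], [0, 1], [1, 0], [1, 1]])
[0 1 1 0]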