Neural network in R

Let's load the libraries that we need. We are going to use the neuralnet package, a flexible package for training neural networks using backpropagation, which we discussed earlier in this chapter.

Let's install the package using the following command:

install.packages("neuralnet")

Now, let's load the library:

library(neuralnet)

We need to load some data. We will use the iris dataset, which is available from the UCI website and also ships with your R installation. You can check that you have it by typing iris at the R console. You should get 150 rows of data.

If not, then download the data from the UCI website, and rename the file to iris.csv. Then, use the Import Dataset button in RStudio to import the data.

Now, let's assign the iris data to a variable called data, for example with data <- iris. Then, let's look at the data to see whether it loaded correctly. It's enough to look at the first few rows, and we will do this by using the head command:

head(data)

Let's see how many rows and columns we have in the iris dataset. This will help later, when we decide how many rows we want in the training and test sets:

dim(data)

When we run this command, we see that we have 150 rows and 5 columns. Let's plot the iris data in RStudio so we can see how it looks:

plot(data)

When we run this command, we get a scatterplot matrix that compares every pair of variables. Here is an example:

[Figure: scatterplot matrix of the iris data]

This visualization is quite difficult to read. Let's move on and add more context to the data. The following code creates a new column for each of the iris species, and populates it with TRUE if the row belongs to that species and FALSE otherwise. So, for example, if the species is setosa, then the code puts TRUE in the setosa column:

data$setosa <- data$Species == "setosa"
data$versicolor <- data$Species == "versicolor"
data$virginica <- data$Species == "virginica"

Once we have run these commands, we can use the head command again to see the values. Here is an example result:

[Figure: output of head(data) with the new setosa, versicolor, and virginica columns]
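The same indicator columns can also be built in a loop over the factor levels, which avoids writing one line per species. The following is a minimal sketch, assuming data holds the built-in iris data frame. The names indicator_cols, one_per_row, and species_counts are introduced here for illustration only. The sketch also checks that each row ends up with exactly one TRUE:

```r
# Start from the built-in iris data, as in the text
data <- iris

# Add one logical column per species, named after the species
for (sp in levels(data$Species)) {
  data[[sp]] <- data$Species == sp
}

# Illustrative helper: the names of the new indicator columns
indicator_cols <- levels(data$Species)

# Sanity checks: exactly one TRUE per row, and the number of rows per species
one_per_row <- all(rowSums(data[, indicator_cols]) == 1)
species_counts <- colSums(data[, indicator_cols])
print(one_per_row)
print(species_counts)
```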

We could normalize the data before we use it for the neural net. In theory, we don't always need to standardize the inputs, because any rescaling of an input can be undone by a corresponding change to the weights and biases. In practice, however, standardizing inputs can make training faster. Therefore, normalizing is one technique that you could consider, particularly when handling a lot of data.
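As a minimal sketch of one common option, min-max scaling maps each input column to the range 0 to 1. The function name normalize and the data frame scaled are illustrative names that are not part of the original example, and the rest of this section continues with the unscaled data:

```r
# Illustrative helper: rescale a numeric vector to the range [0, 1]
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Apply it to the four measurement columns of the built-in iris data
scaled <- as.data.frame(lapply(iris[1:4], normalize))

# Keep the species label alongside the scaled inputs
scaled$Species <- iris$Species

# Every measurement column now runs from 0 to 1
summary(scaled[1:4])
```

In a stricter workflow, the minimum and maximum would be computed on the training rows only and then applied to the test rows.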

Let's start producing the training and test datasets. We will count the number of rows, and then create an index that assigns each row to either the training set or the test set.

Before we sample, we will call set.seed, which allows us to reproduce a particular sequence of random numbers. The seed itself carries no inherent meaning; it's simply a way of telling the random number generator where to start. Here, we are going to set the seed to make the partition reproducible:

set.seed(123)
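To see what the seed does, the following short sketch draws the same random sample twice, resetting the seed before each draw. The variable names first_draw and second_draw are illustrative:

```r
# Draw five row numbers out of 150 after setting the seed
set.seed(123)
first_draw <- sample(150, 5)

# Resetting the same seed restarts the generator at the same point
set.seed(123)
second_draw <- sample(150, 5)

# The two draws are identical
identical(first_draw, second_draw)
```

If you run this sketch in the same session, call set.seed(123) again before continuing, so that the random number generator starts from the same point for the partition.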

Next, we will work out the training and test sets. Firstly, we calculate the total number of rows, and then we sample the row numbers so that 75 percent of the data is training data and the remainder is test data:

totalrows <- nrow(data)
totalrows
samplesize <- floor(0.75 * totalrows)
# draw the row numbers that will form the training set
iris_ind <- sample(seq_len(totalrows), size = samplesize)

The sampled row numbers are stored in the index iris_ind. Rows whose numbers appear in iris_ind go to the training dataset, and the rest of the data goes to the test dataset.

Next, we can assign the data to the training set or the test set:

train <- data[iris_ind, ]
test <- data[-iris_ind, ]
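Before training, it can be worth checking the partition. The sketch below repeats a 75/25 split on the built-in iris data so that it runs on its own, then confirms the set sizes and that no row appears in both sets. The name overlap is illustrative:

```r
data <- iris
set.seed(123)

# 75 percent of the rows go to training
totalrows <- nrow(data)
samplesize <- floor(0.75 * totalrows)
iris_ind <- sample(seq_len(totalrows), size = samplesize)

train <- data[iris_ind, ]
test <- data[-iris_ind, ]

# Set sizes: floor(0.75 * 150) = 112 training rows, leaving 38 test rows
nrow(train)
nrow(test)

# Row names identify the original rows, so an empty intersection means no overlap
overlap <- intersect(rownames(train), rownames(test))
length(overlap)

# Species counts in the training set show how balanced the sample is
table(train$Species)
```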

Now, we will call the neuralnet function to create a neural network. We are training the model, so we are going to use the training set. As a starting point, we are going to work with a single hidden layer of three neurons, which is what hidden=3 specifies. The neural network is going to be assigned to the variable nn:

nn <- neuralnet(setosa + versicolor + virginica ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data=train, hidden=3, lifesign='full')

Now that we've created the model, let's try to use it with the test data. To do this, we use the compute function, passing the first four columns of the test set, which hold the input measurements:

predictions <- compute(nn, test[1:4])

In the neuralnet package, compute is the method used to make predictions with objects of class nn, which are typically produced by neuralnet. The result is a list whose net.result element holds the network's output for each row of the test data.

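The package also provides a prediction function that summarizes the results. Firstly, it amends the data frame with a mean response, which is the mean of all responses that share the same input values, and works out the data error between the original response and this mean response. Then, it removes all duplicate rows to give a quick overview of the data.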

Finally, we get our predictions, and we can start to visualize the predicted results of the neural network using Tableau.
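The net.result element returned by compute has one column per output, in the order in which the responses appear in the formula. One way to turn those scores into species labels is to pick the column with the largest value in each row. The sketch below shows this on a small hand-made score matrix, so that it runs without a trained network. The scores, the actual labels, and the helper name predicted_species are all made up for illustration, and are not results from the model above:

```r
# Illustrative helper: pick the label whose output column is largest in each row
predicted_species <- function(net_result, labels) {
  labels[max.col(net_result, ties.method = "first")]
}

# Output order follows the response side of the formula
labels <- c("setosa", "versicolor", "virginica")

# Made-up scores for four flowers (NOT real model output)
scores <- rbind(
  c(0.98, 0.03, -0.01),
  c(0.02, 0.91, 0.07),
  c(0.01, 0.35, 0.64),
  c(0.05, 0.52, 0.43)
)

# Made-up true labels for the same four flowers
actual <- c("setosa", "versicolor", "virginica", "virginica")

predicted <- predicted_species(scores, labels)

# Confusion matrix of actual versus predicted labels
confusion <- table(actual = actual, predicted = predicted)
print(confusion)

# Share of correct predictions
accuracy <- mean(predicted == actual)
print(accuracy)
```

With a trained model, the same helper could be applied to compute(nn, test[1:4])$net.result and the result compared against test$Species.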
