How to do it...

This section shows how to build a neural network using H2O.

  1. Load the occupancy train and test datasets in R:
# Load the occupancy data 
occupancy_train <- read.csv("C:/occupation_detection/datatraining.txt", stringsAsFactors = T)
occupancy_test <- read.csv("C:/occupation_detection/datatest.txt",stringsAsFactors = T)
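Optionally, a quick structure check confirms that the columns loaded as expected (a minimal sanity check using base R; this step is not part of the original recipe):
# Optional sanity check on the loaded data
str(occupancy_train)
table(occupancy_train$Occupancy)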
  2. The following independent (x) and dependent (y) variables will be used to build the neural network model:
# Define input (x) and output (y) variables
x = c("Temperature", "Humidity", "Light", "CO2", "HumidityRatio")
y = "Occupancy"
  3. As required by H2O, convert the dependent variable to a factor as follows:
# Convert the outcome variable into factor
occupancy_train$Occupancy <- as.factor(occupancy_train$Occupancy)
occupancy_test$Occupancy <- as.factor(occupancy_test$Occupancy)
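Note that the conversion in the next step requires a running H2O cluster. If one has not been started yet, a minimal sketch, assuming the h2o package is installed:
# Load the h2o package and start a local H2O cluster
library(h2o)
h2o.init(nthreads = -1)  # -1 uses all available cores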
  4. Then convert the datasets to H2OFrame objects:
# Convert Train and Test datasets into H2O objects
occupancy_train.hex <- as.h2o(x = occupancy_train, destination_frame = "occupancy_train.hex")
occupancy_test.hex <- as.h2o(x = occupancy_test, destination_frame = "occupancy_test.hex")
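As a quick check (illustrative only), the dimensions of the new H2O frames can be inspected to verify the conversion:
# Verify the H2O frames were created with the expected dimensions
h2o.dim(occupancy_train.hex)
h2o.dim(occupancy_test.hex)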
  5. Once the data is loaded and converted to H2OFrame objects, build a multilayer feedforward neural network using the h2o.deeplearning function. In the current setup, the following parameters are used to build the NN model:
    • Single hidden layer with five neurons using hidden
    • 50 iterations using epochs
    • Adaptive learning rate (adaptive_rate) instead of a fixed learning rate (rate)
    • Rectifier activation function based on ReLU
    • Five-fold cross-validation using nfolds
# Train the model with an H2O-based neural network
occupancy.deepmodel <- h2o.deeplearning(x = x,
                                        y = y,
                                        training_frame = occupancy_train.hex,
                                        validation_frame = occupancy_test.hex,
                                        standardize = FALSE,
                                        activation = "Rectifier",
                                        epochs = 50,
                                        seed = 1234567,
                                        hidden = 5,
                                        variable_importances = TRUE,
                                        nfolds = 5,
                                        adaptive_rate = TRUE)
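Once training completes, the fit can be inspected on the validation frame; a minimal sketch using standard h2o accessors (this evaluation step is not part of the original recipe):
# Evaluate the trained model on the validation (test) frame
h2o.performance(occupancy.deepmodel, valid = TRUE)

# Generate class predictions for the test frame
predictions <- h2o.predict(occupancy.deepmodel, newdata = occupancy_test.hex)

# Inspect variable importances (available because variable_importances = TRUE)
h2o.varimp(occupancy.deepmodel)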
  6. In addition to the parameters described in the recipe Performing logistic regression using H2O, you can define other parameters to fine-tune model performance. The following list does not cover all the functional parameters, but covers some of the more important ones; a hedged sketch showing how a few of them might be supplied follows the list. The complete list of parameters is available in the documentation of the h2o package:
    • Option to initialize a model using a pretrained autoencoder model.
    • Provision to fine-tune the adaptive learning rate via an option to modify the time decay factor (rho) and smoothing factor (epsilon). In the case of a fixed learning rate (rate), an option to modify the annealing rate (rate_annealing) and decay factor between layers (rate_decay).
    • Option to initialize weights and biases along with weight distribution and scaling.
    • Stopping criteria based on the error fraction for classification (classification_stop) and the mean squared error for regression (regression_stop), along with an option to perform early stopping.
    • Option to improve distributed model convergence using the elastic averaging method with parameters such as moving rate and regularization strength.
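For illustration, a hedged sketch showing how a few of the parameters above might be passed to h2o.deeplearning; the model name occupancy.tuned and all parameter values are arbitrary examples, not tuned recommendations:
# Illustrative only: fine-tune the adaptive learning rate and add
# early stopping on top of the recipe's base configuration
occupancy.tuned <- h2o.deeplearning(x = x,
                                    y = y,
                                    training_frame = occupancy_train.hex,
                                    validation_frame = occupancy_test.hex,
                                    activation = "Rectifier",
                                    hidden = 5,
                                    epochs = 50,
                                    adaptive_rate = TRUE,
                                    rho = 0.99,                  # time decay factor
                                    epsilon = 1e-8,              # smoothing factor
                                    classification_stop = 0.01,  # stop once classification error <= 1%
                                    stopping_rounds = 3,         # early stopping patience
                                    stopping_metric = "misclassification",
                                    seed = 1234567)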