Introduction to DNNs

With the advent of big data processing infrastructure, GPUs, and GP-GPUs, we are now able to overcome the challenges that once limited us to shallow neural networks, namely overfitting and the vanishing gradient, using various activation functions and L1/L2 regularization techniques. Deep learning can work on large amounts of labeled and unlabeled data easily and efficiently.
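To make these ingredients concrete, here is a minimal base R sketch of two of them; the function names are our own illustrative choices, not a library API: a ReLU activation (which saturates less than the sigmoid and so helps against vanishing gradients) and a loss with L1/L2 penalty terms that curb overfitting.

relu <- function(z) pmax(0, z)     # activation: keeps gradients alive for positive inputs

penalized_loss <- function(residuals, weights, lambda1 = 0, lambda2 = 0.01) {
  mean(residuals^2) +              # data-fitting term (mean squared error)
    lambda1 * sum(abs(weights)) +  # L1 regularization
    lambda2 * sum(weights^2)       # L2 regularization
}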

As mentioned, deep learning is a class of machine learning in which learning happens over multiple levels of neural networks. The standard diagram depicting a DNN is shown in the following figure:

From the analysis of the previous figure, we can notice a remarkable analogy with the neural networks we have studied so far. We can therefore rest assured that, despite appearances, deep learning is simply an extension of the neural network. In this regard, most of what we have seen in the previous chapters still applies. In short, a DNN is a multilayer neural network that contains two or more hidden layers. Nothing very complicated here. By adding more layers and more neurons per layer, we increase the specialization of the model to the training data, but we risk decreasing its performance on the test data; in other words, the model overfits.
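As a concrete, hedged illustration, the following sketch uses the neuralnet package (assuming it is installed) on a toy regression problem of our own; passing two or more values to the hidden argument is all it takes to obtain a deep network.

library(neuralnet)

set.seed(1)
# Toy regression data: y depends nonlinearly on two inputs
train <- data.frame(x1 = runif(200), x2 = runif(200))
train$y <- train$x1^2 + train$x2 + rnorm(200, sd = 0.05)

# Two hidden layers (5 and 3 neurons) make this a deep network
dnn <- neuralnet(y ~ x1 + x2, data = train,
                 hidden = c(5, 3), linear.output = TRUE)

# Predictions on the training data
pred <- compute(dnn, train[, c("x1", "x2")])$net.result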

As we anticipated, DNNs are derivatives of ANNs: by using more than one hidden layer, we build DNNs. There are many variations of DNNs, as illustrated by the different terms shown next:

  • Deep Belief Network (DBN): It is typically a feed-forward network in which data flows from one layer to another without looping back. There is at least one hidden layer, and there can be multiple hidden layers, increasing the complexity.
  • Restricted Boltzmann Machine (RBM): It has a single hidden layer, and there are no connections between the nodes within a group (that is, within a layer). It is one of the simplest neural network models (a short sketch of training an RBM follows this list).
  • Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM): These networks have data flowing in any direction, within groups and across groups.
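As a quick taste of one of these building blocks, here is a hedged sketch of training a single RBM with the deepnet package (assuming it is installed; the binary toy data are our own):

library(deepnet)

set.seed(1)
# Toy binary data: 100 observations of 4 visible units
x <- matrix(rbinom(400, 1, 0.5), nrow = 100, ncol = 4)

# Train an RBM with a single hidden layer of 3 units
rbm <- rbm.train(x, hidden = 3, numepochs = 10, batchsize = 10)

# Project the visible data onto the learned hidden units
h <- rbm.up(rbm, x)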

As with any machine learning algorithm, DNNs also require building, training, and evaluation processes. A basic workflow for deep learning is shown in the following figure:

The workflow we have seen in the previous figure closely resembles the typical workflow of a supervised learning algorithm, and it can be sketched in just a few lines of R, as shown next.
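Here is a minimal sketch of that build/train/evaluate workflow, again with the neuralnet package, on the built-in iris data; the helper column and variable names are illustrative choices of our own.

library(neuralnet)

set.seed(42)
# Build: prepare a binary target and split the data into training and test sets
iris$is_setosa <- as.numeric(iris$Species == "setosa")
idx   <- sample(nrow(iris), 0.7 * nrow(iris))
train <- iris[idx, ]
test  <- iris[-idx, ]

# Train: a small network with two hidden layers
model <- neuralnet(is_setosa ~ Petal.Length + Petal.Width,
                   data = train, hidden = c(4, 2), linear.output = FALSE)

# Evaluate: accuracy on the held-out test set
prob <- compute(model, test[, c("Petal.Length", "Petal.Width")])$net.result
pred <- as.numeric(prob > 0.5)
mean(pred == test$is_setosa)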

But what makes deep learning different from other machine learning algorithms? Almost all machine learning algorithms show their limits when it comes to identifying the characteristics of raw input data, especially when the data are complex and lack an apparent order, as in images. Usually, this limit is overcome with the help of humans, who take care of identifying what the machine cannot do on its own. Deep learning removes this step, relying on the training process to discover the most useful representations from the input examples. Even in this case, human intervention is needed to make choices before training starts, but the automatic discovery of features makes life much easier. What makes neural networks particularly advantageous, compared to the other solutions offered by machine learning, is the great generalization ability of the model.

These features have made deep learning very effective for almost all tasks that require automatic learning, although it is particularly effective in the case of complex, hierarchical data. Its underlying ANN forms highly nonlinear representations; these are usually composed of multiple layers with nonlinear transformations and custom architectures.
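To see why stacking layers produces such nonlinear representations, consider this tiny base R sketch; the weights are arbitrary values chosen only for illustration.

# Each layer applies a linear map followed by a nonlinear activation;
# composing even two such layers already yields a highly nonlinear function
sigmoid <- function(z) 1 / (1 + exp(-z))

layer1 <- function(x) sigmoid(3 * x - 1)    # hypothetical first-layer weights
layer2 <- function(h) sigmoid(-4 * h + 2)   # hypothetical second-layer weights

x <- seq(-2, 2, by = 0.01)
plot(x, layer2(layer1(x)), type = "l",
     xlab = "input", ylab = "output",
     main = "Composition of two nonlinear layers")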

Essentially, deep learning works really well with messy, real-world data, making it a key instrument in several technological fields in the coming years. Until recently, it was a dark and daunting area to get into, but its success has brought many great resources and projects that make it easier than ever to get started.

Now that we know what DNNs are, let's see what tools the R development environment offers us to deal with this particular topic.
