Chapter 4. Building Neural Networks

In this chapter, we will introduce Artificial Neural Networks (ANNs). We will study the basic representation of ANNs and then discuss several ANN models that can be used in both supervised and unsupervised machine learning problems. We will also introduce the Enclog Clojure library, which we will use to build ANNs.

Neural networks are well suited to finding patterns in data and have several practical applications in computing, such as handwriting recognition and machine vision. ANNs are often combined or interconnected to model a given problem. Interestingly, they can be applied to several machine learning problems, such as regression and classification. ANNs have applications in several areas of computing and are not restricted to the scope of machine learning.

Unsupervised learning is a form of machine learning in which the given training data doesn't contain any information about which class a given sample of input belongs to. As the training data is unlabeled, an unsupervised learning algorithm must determine the various categories in the data entirely on its own. Generally, this is done by seeking out similarities between different pieces of data and then grouping the data into several categories. This technique is called cluster analysis, and we will study this methodology in more detail in the following chapters. ANNs are used in unsupervised machine learning techniques mostly due to their ability to quickly recognize patterns in unlabeled data. This specialized form of unsupervised learning exhibited by ANNs is termed competitive learning.

An interesting fact about ANNs is that they are modeled on the structure and behavior of the central nervous systems of higher-order animals that exhibit learning capabilities.

Understanding nonlinear regression

By now, the reader will be aware that the gradient descent algorithm can be used to estimate both linear and logistic regression models for regression and classification problems. An obvious question is: why do we need neural networks when we can use gradient descent to estimate linear and logistic regression models from the training data? To understand the necessity of ANNs, we must first understand nonlinear regression.

Let's assume that we have a single feature variable X and a dependent variable Y that varies with X, as shown in the following plot:

[Figure: a plot of the dependent variable Y varying nonlinearly with the feature variable X]

As illustrated in the preceding plot, it's hard, if not impossible, to model the dependent variable Y as a linear equation of the independent variable X. We could instead model the dependent variable Y as a high-order polynomial equation of the independent variable X, thus converting the problem into the standard form of linear regression. Hence, the dependent variable Y is said to vary nonlinearly with the independent variable X. Of course, there is also a good chance that the data cannot be modeled using a polynomial function either.
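To make the polynomial approach concrete before moving on, here is a minimal sketch in Clojure of how a single feature x can be expanded into polynomial terms, so that an ordinary linear model fitted over those terms captures a nonlinear relationship. The function name polynomial-features and the choice of degree are illustrative assumptions, not part of any library discussed in this chapter.

(defn polynomial-features
  "Expands a single feature value x into the terms [x x^2 ... x^degree],
  so that linear regression over these terms can model a nonlinear
  relationship between x and y."
  [x degree]
  (mapv #(Math/pow x %) (range 1 (inc degree))))

;; For example, expanding x = 2.0 up to the third degree:
(polynomial-features 2.0 3)
;; => [2.0 4.0 8.0]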

It can also be shown that calculating the weights or coefficients of all the terms in a second-order polynomial function of the features using gradient descent has a time complexity of O(n²), where n is the number of features in the training data. Similarly, the algorithmic complexity of calculating the coefficients of all the terms in a third-order polynomial equation is O(n³). It's apparent that the time complexity of gradient descent increases geometrically with the number of features of the model. Thus, gradient descent on its own is not efficient enough to estimate nonlinear regression models with a large number of features.
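The following snippet is an illustration added here (not part of the original example) that counts the number of terms in a polynomial of degree at most d over n features. This count is the binomial coefficient C(n + d, d), which grows on the order of n^d and shows why running gradient descent over every polynomial term quickly becomes impractical as n grows.

(defn binomial
  "Computes the binomial coefficient C(n, k) exactly using Clojure ratios."
  [n k]
  (reduce (fn [acc i] (/ (* acc (+ n 1 (- i))) i))
          1
          (range 1 (inc k))))

(defn polynomial-term-count
  "Number of terms in a polynomial of degree at most d over n features."
  [n d]
  (binomial (+ n d) d))

(polynomial-term-count 10 2)   ;; => 66
(polynomial-term-count 10 3)   ;; => 286
(polynomial-term-count 100 3)  ;; => 176851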

ANNs, on the other hand, are very efficient at estimating nonlinear regression models for data with a large number of features. We will now study the foundational ideas of ANNs and several ANN models that can be used in supervised and unsupervised learning problems.
