Chapter 6. Statistical and Machine Learning

Machine learning lets you build algorithms that learn from data, correct themselves, and improve, so that they can uncover patterns that were previously unknown. You can then extract insights from the patterns found in the data. For instance, one may be interested in teaching a computer how to recognize ZIP code digits in an image. As another example, given a specific task such as detecting spam messages, instead of writing a program that solves the problem directly, in this paradigm you look for methods by which the computer learns from examples and becomes more accurate over time.

Machine learning has become a significant part of artificial intelligence in recent years. With the computing power available today, building intelligent systems with machine learning methods has become far simpler than it was two decades ago. The primary goal of machine learning is to develop algorithms that have promising value in the real world. Besides time and space efficiency, the amount of data that these learning algorithms require is also a challenge. Because machine learning algorithms are driven by data, you can see why so many different algorithms already exist in this subject area. In the following sections of this chapter, we will discuss the following topics with examples:

  • Classification methods: decision tree, linear methods, and k-nearest neighbors
  • Naïve Bayes, linear regression, and logistic regression
  • Support vector machines
  • Tree-based regression and unsupervised learning
  • Principal component analysis
  • Clustering based on similarity
  • Measuring performance for classification

Classification methods

Machine learning algorithms are useful in many real-world applications, for example, making accurate predictions about the climate or diagnosing a disease. The learning is usually based on some known behavior or observations. In other words, machine learning is about improving future performance based on the experience or observations of the past.

Machine learning algorithms are broadly categorized as supervised learning, unsupervised learning, reinforcement learning, and deep learning. The supervised learning method of classification (where the training data is labeled) is similar to a teacher who supervises a class. In supervised learning, the algorithm learns from data for which we specify a target variable. Building an accurate classifier requires the following:

  • A good set of training examples
  • A reasonably good performance on the training set
  • A classifier method that is closely related to prior expectations
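
As a minimal sketch of learning from labeled data with a target variable, the following fits a one-feature threshold rule and reports its performance on the training set. The data (hours of study and a pass/fail target) and the function names `fit_threshold`, `predict`, and `accuracy` are hypothetical and chosen only for illustration; the sketch assumes at least two distinct feature values.

```python
from typing import List


def fit_threshold(features: List[float], targets: List[int]) -> float:
    """Pick the cut point that misclassifies the fewest training examples.

    The learned rule predicts 1 when the feature exceeds the threshold, else 0.
    """
    values = sorted(set(features))
    # Candidate thresholds are the midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t: float) -> int:
        return sum((x > t) != bool(y) for x, y in zip(features, targets))

    return min(candidates, key=errors)


def predict(threshold: float, x: float) -> int:
    return int(x > threshold)


def accuracy(threshold: float, features: List[float], targets: List[int]) -> float:
    correct = sum(predict(threshold, x) == y for x, y in zip(features, targets))
    return correct / len(targets)


# Hypothetical labeled training examples: feature = hours of study,
# target variable = 1 (passed) or 0 (failed).
hours = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

threshold = fit_threshold(hours, passed)
train_accuracy = accuracy(threshold, hours, passed)
print(threshold, train_accuracy)
```

Good accuracy on the training set is necessary but does not by itself guarantee good predictions on new data, which is why a separate test step is normally used as well.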

A binary classifier takes data items and places each of them in one of two classes (in the multiclass case, the data items are placed in one of k classes). An example of a binary classifier is one that determines whether a person's test results indicate a positive or negative diagnosis for some disease. The classifier algorithm is probabilistic: with some margin of error, someone can be diagnosed as either positive or negative. All of these algorithms follow a general approach, which goes in the following order:

  • Collecting data from a reliable source.
  • Preparing or reorganizing data into a specific structure. For a distance-based classifier such as k-nearest neighbors, a distance calculation is required.
  • Analyzing data with any appropriate method.
  • Training (this is not applicable for a lazy classifier such as k-nearest neighbors, which simply stores the examples).
  • Testing (calculating the error rate).
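
The steps above can be sketched with a small k-nearest neighbors classifier, chosen here as an assumption because it needs a distance calculation and has no separate training step. The data points, the labels, and the names `euclidean`, `knn_predict`, and `error_rate` are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter

# 1. Collect: hypothetical labeled points, (feature_1, feature_2) -> class label.
train = [
    ((1.0, 1.0), "negative"), ((1.5, 2.0), "negative"), ((2.0, 1.0), "negative"),
    ((6.0, 6.0), "positive"), ((7.0, 5.5), "positive"), ((6.5, 7.0), "positive"),
]
test = [
    ((1.2, 1.5), "negative"), ((6.8, 6.2), "positive"),
    ((2.2, 1.8), "negative"), ((5.5, 6.5), "positive"),
]


# 2. Prepare: a distance calculation between two feature vectors.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# 3. Analyze: a quick look at how balanced the classes are.
class_counts = Counter(label for _, label in train)


# 4. Train: nothing to fit; the stored examples are used directly at prediction time.
def knn_predict(examples, point, k=3):
    neighbors = sorted(examples, key=lambda ex: euclidean(ex[0], point))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# 5. Test: the error rate is the fraction of test points that are misclassified.
def error_rate(examples, held_out, k=3):
    wrong = sum(knn_predict(examples, p, k) != label for p, label in held_out)
    return wrong / len(held_out)


print(class_counts)
print(error_rate(train, test))
```

On real data the error rate is rarely zero, and the choice of k and of the distance function both affect it.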

This chapter focuses on the tools available to visualize the input and results rather than on machine learning concepts themselves. For greater depth on this subject, you can refer to the appropriate material. Let's take a look at an example and gradually walk through the various options to choose from.
