Reinforcement learning

Reinforcement learning is a case where the machine is trained for a specific outcome with the sole purpose of maximizing efficiency and/or performance. The algorithm is rewarded for making the correct decisions, and penalized for making incorrect ones. Continual training is used to constantly improve performance. The continual learning process means less human intervention. Markov models are an example of reinforcement learning, and self-driving autonomous automobiles are a great example of just such an application. It constantly interacts with its environments, watches for obstacles, speed limits, distance, pedestrians, and so on to (hopefully) make the correct decisions.

Our biggest difference with reinforcement learning is that we do not deal with correct input and output data. The focus here is on performance, meaning somehow finding a balance between unseen data and what the algorithms have already learned.

The algorithm applies an action to its environment, receives a reward or a penalty based upon what it has done, repeats, and so on as shown in the following image. You can just imagine how many times per second this is happening in that nice little autonomous taxi that just picked you up at the hotel.

Table of Contents for Reinforcement learning

Create new playlist

Sign In

Sign Up

Table of Contents for
Reinforcement learning