Multiple linear regression

Multiple linear regression is a technique used to train a linear model that assumes linear relationships between multiple predictor variables ($x_1, x_2, \dots, x_m$) and a continuous target variable ($y$). The general equation for a multiple linear regression with $m$ predictor variables is as follows:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_m x_m + \epsilon$$

Training a linear regression model involves estimating the values of the coefficients, denoted by $\beta$, for each of the predictor variables. In the preceding equation, $\epsilon$ denotes an error term, which is normally distributed with zero mean and constant variance. This is represented as follows:

$$\epsilon \sim N(0, \sigma^2)$$
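As a quick illustration (a minimal sketch; the arrays X and beta and their values are hypothetical, not taken from this recipe), a fitted model's prediction is simply the intercept plus a weighted sum of the predictors:

```python
import numpy as np

# Hypothetical example: 3 observations, m = 2 predictor variables
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5]])
beta_0 = 0.5                    # intercept
beta = np.array([1.2, -0.7])    # one coefficient per predictor variable

# y_hat = beta_0 + beta_1*x_1 + beta_2*x_2 for each row of X
y_hat = beta_0 + X @ beta
print(y_hat)
```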

Various techniques can be used to build a linear regression model. The most frequently used is the ordinary least squares (OLS) estimate. The OLS method produces the regression line that minimizes the sum of squared errors, where an error is the distance from an actual data point to the regression line. The sum of squared errors measures the aggregate of the squared differences between the training instances, which are each of our data points, and the values predicted by the regression line. This can be represented as follows:

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

In the preceding equation, $y_i$ is the actual training instance and $\hat{y}_i$ is the value predicted by the regression line.
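Continuing the toy example from before (the values in y and y_hat are hypothetical), the sum of squared errors is straightforward to compute:

```python
import numpy as np

y = np.array([2.1, 1.8, 3.3])       # actual target values
y_hat = np.array([1.9, 2.0, 3.0])   # values predicted by the regression line

# SSE: sum over all training instances of (y_i - y_hat_i)^2
sse = np.sum((y - y_hat) ** 2)
print(sse)  # approximately 0.17
```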

In the context of machine learning, gradient descent is a common technique for optimizing the coefficients of the predictor variables by minimizing the model's training error over multiple iterations. Gradient descent starts by initializing the coefficients to zero. The coefficients are then updated with the intention of minimizing the error. Updating the coefficients is an iterative process that is repeated until the minimum squared error is achieved.

In the gradient descent technique, a hyperparameter called the learning rate, denoted by $\alpha$, is provided to the algorithm. At each iteration, every coefficient is adjusted in the direction of the negative gradient of the error, scaled by $\alpha$:

$$\beta_j := \beta_j - \alpha \frac{\partial SSE}{\partial \beta_j}$$

This parameter determines how fast the algorithm moves toward the optimal values of the coefficients. If $\alpha$ is very large, the algorithm might skip the optimal solution. If it is too small, however, the algorithm might need too many iterations to converge to the optimal coefficient values. For this reason, it is important to use the right value for $\alpha$.
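The following is a minimal sketch of batch gradient descent for linear regression, not the exact code used later in this recipe; the function name, the learning rate of 0.01, the iteration count, and the sample data are all illustrative choices:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, n_iters=1000):
    """Fit linear regression coefficients by batch gradient descent."""
    n, m = X.shape
    Xb = np.c_[np.ones(n), X]    # prepend a column of ones for the intercept
    beta = np.zeros(m + 1)       # initialize all coefficients to zero

    for _ in range(n_iters):
        error = Xb @ beta - y    # prediction error for each training instance
        # Gradient of the mean squared error with respect to the coefficients
        grad = (2.0 / n) * (Xb.T @ error)
        beta -= alpha * grad     # step toward lower error, scaled by alpha
    return beta

# Hypothetical data generated from y = 1 + 2*x1 - 3*x2 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1 + 2 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=0.1, size=100)
print(gradient_descent(X, y))    # roughly [1, 2, -3]
```

Raising alpha speeds up convergence but risks overshooting the minimum; lowering it makes each step safer at the cost of more iterations.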

In this recipe, we will use the gradient descent method to train our linear regression model.
