4.2 Simple Linear Regression

In any regression model, there is an implicit assumption (which can be tested) that a relationship exists between the variables. There is also some random error that cannot be predicted. The underlying simple linear regression model is

$Y = β_{0} + β_{1} X + ϵ$ (4-1)

where

$\begin{array}{lcl} Y & = & \text{dependent variable (response variable)} \\ X & = & \text{independent variable (predictor variable or explanatory variable)} \\ β_{0} & = & \text{intercept (value of } Y \text{ when } X = 0\text{)} \\ β_{1} & = & \text{slope of the regression line} \\ ϵ & = & \text{random error} \end{array}$

The true values for the intercept and slope are not known, and therefore they are estimated using sample data. The regression equation based on sample data is given as

$\hat{Y} = b_{0} + b_{1} X$ (4-2)

where

$\begin{array}{lcl} \hat{Y} & = & \text{predicted value of } Y \\ b_{0} & = & \text{estimate of } β_{0} \text{, based on sample results} \\ b_{1} & = & \text{estimate of } β_{1} \text{, based on sample results} \end{array}$

In the Triple A Construction example, we are trying to predict the sales, so the dependent variable (Y) would be sales. The variable we use to help predict sales is the Albany area payroll, so this is the independent variable (X). Although any number of lines can be drawn through these points to show a relationship between X and Y in Figure 4.1, the line that will be chosen is the one that in some way minimizes the errors. Error is defined as

$\begin{array}{lcl} \text{Error} & = & \text{(Actual value)} - \text{(Predicted value)} \\ e & = & Y - \hat{Y} \end{array}$ (4-3)

Since errors may be positive or negative, the average error could be zero even though there are extremely large errors—both positive and negative. To eliminate the difficulty of negative errors canceling positive errors, the errors can be squared. The best regression line will be defined as the one with the minimum sum of the squared errors. For this reason, regression analysis is sometimes called least squares regression.
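
To make the cancellation argument concrete, here is a minimal Python sketch. The four data points and the two candidate lines are hypothetical and are not part of the Triple A Construction example; the helper names `errors` and `sse` are likewise illustrative.

```python
# Hypothetical data, chosen only to illustrate error cancellation.
xs = [1, 2, 3, 4]
ys = [2, 8, 2, 8]


def errors(b0, b1):
    """Return the errors e = Y - Y_hat for the line Y_hat = b0 + b1 * X."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def sse(b0, b1):
    """Return the sum of squared errors for the line Y_hat = b0 + b1 * X."""
    return sum(e ** 2 for e in errors(b0, b1))


# Candidate 1: a flat line at Y_hat = 5. Every error is +3 or -3.
flat_errors = errors(5, 0)
sum_flat = sum(flat_errors)

# Candidate 2: a sloped line, Y_hat = 2 + 1.2 X.
sum_sloped = sum(errors(2, 1.2))

print("flat line errors:", flat_errors)
print("sum of errors (flat):", sum_flat)
print("sum of errors (sloped), rounded:", round(sum_sloped, 9))
print("sum of squared errors (flat):", sse(5, 0))
print("sum of squared errors (sloped):", sse(2, 1.2))
```

Both candidate lines have errors that sum to zero (up to floating-point rounding), so the plain sum of errors cannot tell them apart. The sum of squared errors is smaller for the sloped line, which is why it is the quantity being minimized.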

Statisticians have developed formulas that we can use to find the equation of a straight line that would minimize the sum of the squared errors. The simple linear regression equation is

$\hat{Y} = b_{0} + b_{1} X$

The following formulas can be used to compute the slope and the intercept:

$\begin{array}{lcl} \bar{X} & = & \frac{\sum X}{n} = \text{average (mean) of } X \text{ values} \\ \bar{Y} & = & \frac{\sum Y}{n} = \text{average (mean) of } Y \text{ values} \\ b_{1} & = & \frac{\sum (X - \bar{X}) (Y - \bar{Y})}{\sum {(X - \bar{X})}^{2}} \end{array}$ (4-4)

$b_{0} = \bar{Y} - b_{1} \bar{X}$ (4-5)
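
Equations 4-4 and 4-5 translate fairly directly into code. The following is a sketch in plain Python rather than the software used later in the chapter; the function name `fit_simple_linear` and the demonstration points are assumptions made for illustration.

```python
def fit_simple_linear(xs, ys):
    """Return (b0, b1) for Y_hat = b0 + b1 * X, following Equations 4-4 and 4-5."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")

    n = len(xs)
    x_bar = sum(xs) / n  # mean of X
    y_bar = sum(ys) / n  # mean of Y

    # Numerator and denominator of the slope formula (4-4).
    sum_cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_sq = sum((x - x_bar) ** 2 for x in xs)
    if sum_sq == 0:
        raise ValueError("all X values are identical, so the slope is undefined")

    b1 = sum_cross / sum_sq   # slope, Equation 4-4
    b0 = y_bar - b1 * x_bar   # intercept, Equation 4-5
    return b0, b1


# Hypothetical points that lie exactly on Y = 1 + 2X.
b0, b1 = fit_simple_linear([0, 1, 2, 3], [1, 3, 5, 7])
print("intercept:", b0, "slope:", b1)
```

Because the demonstration points fall exactly on a line, the function should recover that line's intercept and slope. With real data the fitted line will generally pass through none of the points exactly.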

The preliminary calculations are shown in Table 4.2. There are other “shortcut” formulas that are helpful when doing the computations on a calculator, and these are presented in Appendix 4.1. They will not be shown here, as computer software will be used for most of the other examples in this chapter.

Table 4.2 Regression Calculations for Triple A Construction

Y	X	${(X - \bar{X})}^{2}$	$(X - \bar{X}) (Y - \bar{Y})$
6	3	${(3 - 4)}^{2} = 1$	$(3 - 4) (6 - 7) = 1$
8	4	${(4 - 4)}^{2} = 0$	$(4 - 4) (8 - 7) = 0$
9	6	${(6 - 4)}^{2} = 4$	$(6 - 4) (9 - 7) = 4$
5	4	${(4 - 4)}^{2} = 0$	$(4 - 4) (5 - 7) = 0$
4.5	2	${(2 - 4)}^{2} = 4$	$(2 - 4) (4.5 - 7) = 5$
9.5	5	${(5 - 4)}^{2} = 1$	$(5 - 4) (9.5 - 7) = 2.5$
$\sum Y = 42$, $\bar{Y} = 42 / 6 = 7$	$\sum X = 24$, $\bar{X} = 24 / 6 = 4$	$\sum {(X - \bar{X})}^{2} = 10$	$\sum (X - \bar{X}) (Y - \bar{Y}) = 12.5$

Computing the slope and the intercept of the regression equation for the Triple A Construction Company example, we have

$\begin{array}{lcl} \bar{X} & = & \frac{\sum X}{6} = \frac{24}{6} = 4 \\ \bar{Y} & = & \frac{\sum Y}{6} = \frac{42}{6} = 7 \\ b_{1} & = & \frac{\sum (X - \bar{X}) (Y - \bar{Y})}{\sum {(X - \bar{X})}^{2}} = \frac{12.5}{10} = 1.25 \\ b_{0} & = & \bar{Y} - b_{1} \bar{X} = 7 - (1.25) (4) = 2 \end{array}$

The estimated regression equation therefore is

$\hat{Y} = 2 + 1.25 X$

$\text{Sales} = 2 + 1.25\,(\text{Payroll})$
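
The hand calculation can be checked with a short script. This sketch uses the six observations listed in Table 4.2; the variable names are illustrative, and the printed layout only approximates the table.

```python
# Sales (Y) and payroll (X) as listed in Table 4.2.
sales = [6, 8, 9, 5, 4.5, 9.5]
payroll = [3, 4, 6, 4, 2, 5]

n = len(sales)
x_bar = sum(payroll) / n
y_bar = sum(sales) / n

# The two computed columns of Table 4.2.
sq_dev = [(x - x_bar) ** 2 for x in payroll]
cross = [(x - x_bar) * (y - y_bar) for x, y in zip(payroll, sales)]

print("Y     X   (X-Xbar)^2   (X-Xbar)(Y-Ybar)")
for y, x, s, c in zip(sales, payroll, sq_dev, cross):
    print(f"{y:<5} {x:<3} {s:<12} {c}")

# Slope and intercept from Equations 4-4 and 4-5.
b1 = sum(cross) / sum(sq_dev)
b0 = y_bar - b1 * x_bar
print("X mean:", x_bar, "Y mean:", y_bar)
print("column sums:", sum(sq_dev), sum(cross))
print("b1 =", b1, "b0 =", b0)
```

The column sums and the resulting coefficients should agree with the values worked out by hand above.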

If the payroll next year is $600 million $(X = 6)$, then the predicted value would be

$\hat{Y} = 2 + 1.25 (6) = 9.5$

or $950,000.

One of the purposes of regression is to understand the relationship among variables. This model tells us that each time the payroll increases by $100 million (one unit of X), we would expect sales to increase by $125,000, since $b_{1} = 1.25$ and sales are measured in units of $100,000. This model helps Triple A Construction see how the local economy and company sales are related.
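
The prediction and the interpretation of the slope can both be sketched in a few lines of Python. The unit constants below are assumptions inferred from the worked example ($600 million corresponding to X = 6, and a predicted value of 9.5 corresponding to $950,000); the function name `predict_sales_dollars` is hypothetical.

```python
B0 = 2.0    # estimated intercept from the text
B1 = 1.25   # estimated slope from the text

PAYROLL_UNIT = 100_000_000  # assumption: one unit of X is $100 million
SALES_UNIT = 100_000        # assumption: one unit of Y is $100,000


def predict_sales_dollars(payroll_dollars):
    """Predict sales in dollars from Albany area payroll in dollars."""
    x = payroll_dollars / PAYROLL_UNIT  # convert to the model's X units
    y_hat = B0 + B1 * x                 # Y_hat = b0 + b1 * X
    return y_hat * SALES_UNIT           # convert back to dollars


prediction = predict_sales_dollars(600_000_000)
print("predicted sales for $600 million payroll:", prediction)

# Effect of one more unit of X ($100 million of payroll) on predicted sales.
increase = predict_sales_dollars(700_000_000) - prediction
print("increase per extra $100 million of payroll:", increase)
```

Keeping the unit conversions in named constants makes the scaling explicit: it is easy to misread the slope of 1.25 as dollars per dollar when it is really 1.25 sales units per payroll unit.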
