2 Location and Simple Linear Models

In the statistical literature, many methods are available for estimating the parameters of a given model, most of them based on the least squares estimator (LSE) or the maximum likelihood estimator (MLE) principle. However, when uncertain prior information about the parameters is available, the estimation technique changes. Methods that incorporate such uncertain prior information are of considerable importance in the current statistical literature. Within the classical approach, the preliminary test (PT) and Stein‐type (S) estimation methods dominate the modern literature, side by side with the Bayesian methods.

In this chapter, we consider the simple linear model, estimate its parameters by the classical estimators and their shrinkage versions, and study the properties of these estimators when the errors are normally distributed.

2.1 Introduction

Consider the simple linear model with slope $\beta$ and intercept $\theta$, given by

(2.1) $y_i = \theta + \beta x_i + \varepsilon_i, \quad i = 1, \ldots, n.$

If images, the model (2.1) reduces to

(2.2) $y_i = \theta + \varepsilon_i, \quad i = 1, \ldots, n,$

where $\theta$ is the location parameter of a distribution.

In the following sections, we consider the estimation and test of the location model, i.e. the model of (2.2), followed by the estimation and test of the simple linear model.

2.2 Location Model

In this section, we introduce two basic penalty estimators for the location parameter of a distribution, namely, the ridge regression estimator (RRE) and the least absolute shrinkage and selection operator (LASSO) estimator. Penalty estimators have become very popular in the statistical literature. The subject evolved as a solution to the ill‐posed problems studied by Tikhonov (1963) in mathematics. In 1970, Hoerl and Kennard applied the Tikhonov method to obtain the RRE for linear models. Further, we compare these estimators with the LSE in terms of the $L_2$‐risk function.

2.2.1 Location Model: Estimation

Consider the simple location model,

(2.3)equation

where images, imagesimages‐tuple of 1's, and images images‐vector of i.i.d. random errors such that images and images, images is the identity matrix of rank images (images), images is the location parameter, and, in this case, images may be unknown.

The LSE of images is obtained by

(2.4)equation

Alternatively, when the errors are normally distributed, it is possible to maximize the log‐likelihood function:

equation

giving the same solution (2.4) as in the case of LSE. It is known that the images is unbiased, i.e. images and the variance of images is given by

equation

The unbiased estimator of images is given by

(2.5)equation

The mean squared error (MSE) of images, any estimator of images, is defined as

equation
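As a minimal sketch of the quantities in this subsection, assume that the LSE in (2.4) is the sample mean and that the unbiased variance estimator in (2.5) uses the divisor $n - 1$. The function name `location_lse` and the sample values are illustrative only.

```python
from statistics import fmean


def location_lse(y):
    """Return (theta_tilde, s2, var_hat) for the location model y_i = theta + e_i.

    theta_tilde : least squares estimate of theta (assumed to be the sample mean)
    s2          : unbiased estimate of the error variance (divisor n - 1)
    var_hat     : estimated variance of theta_tilde, s2 / n
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    theta_tilde = fmean(y)
    s2 = sum((yi - theta_tilde) ** 2 for yi in y) / (n - 1)
    return theta_tilde, s2, s2 / n


theta_tilde, s2, var_hat = location_lse([1.0, 2.0, 3.0, 4.0])
print(theta_tilde, s2, var_hat)
```

Since the LSE is unbiased, its MSE coincides with its variance, which `var_hat` estimates.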

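Test for $\theta = 0$ when $\sigma^2$ is known:

For the test of the null‐hypothesis $H_o: \theta = 0$ vs. $H_A: \theta \ne 0$, we use the test statistic

(2.6) $Z_n = \sqrt{n}\,\tilde{\theta}_n / \sigma,$

where $\tilde{\theta}_n$ is the LSE of $\theta$. Under the assumption of normality of the errors, $Z_n \sim \mathcal{N}(\Delta, 1)$, where $\Delta = \sqrt{n}\,\theta/\sigma$. Hence, we reject $H_o$ whenever $|Z_n|$ exceeds the threshold value from the null distribution. An interesting threshold value is $\sqrt{2\log 2} \approx 1.177$.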
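The test can be sketched as follows, assuming the statistic in (2.6) is $\sqrt{n}$ times the sample mean divided by the known $\sigma$, and assuming a two‐sided rejection region at level $\alpha$. The function name `z_test_location` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_test_location(y, sigma, alpha=0.05):
    """Two-sided test of theta = 0 with known sigma (illustrative sketch).

    Returns the statistic, the critical value, and the rejection decision.
    """
    n = len(y)
    z = sqrt(n) * fmean(y) / sigma
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 normal quantile
    return z, crit, abs(z) > crit


z, crit, reject = z_test_location([0.2, -0.1, 0.4, 0.3], sigma=1.0)
print(round(z, 3), round(crit, 3), reject)
```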

For large samples, when the distribution of errors has zero mean and finite variance images, under a sequence of local alternatives,

(2.7)equation

and assuming images and images (images), images, the asymptotic distribution of images is images. Then the test procedure remains the same as before.

2.2.2 Shrinkage Estimation of Location

In this section, we consider a shrinkage estimator of the location parameter images of the form

(2.8)equation

where images. The bias and the MSE of images are given by

(2.9)equation

Minimizing images w.r.t. images, we obtain

(2.10)equation

So that

(2.11)equation

Thus, images is an increasing function of images and the relative efficiency (REff) of images compared to images is

(2.12)equation

Further, the MSE difference is

(2.13)equation

Hence, images outperforms the images uniformly.
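A small numerical check of this argument, assuming the shrinkage estimator in (2.8) is the LSE multiplied by a constant $c$ with $0 \le c \le 1$. Under that assumption its MSE, in units of $\sigma^2/n$, is $c^2 + (1-c)^2\Delta^2$ with $\Delta^2 = n\theta^2/\sigma^2$. The function names are illustrative.

```python
def scaled_mse(c, delta2):
    """MSE of c * LSE divided by sigma^2 / n: variance part plus squared bias."""
    return c * c + (1.0 - c) ** 2 * delta2


def optimal_c(delta2):
    """Minimizer of scaled_mse over c (set the derivative in c to zero)."""
    return delta2 / (1.0 + delta2)


def reff_optimal(delta2):
    """Relative efficiency of the optimally shrunk estimator vs. the LSE (delta2 > 0)."""
    return 1.0 / scaled_mse(optimal_c(delta2), delta2)


for delta2 in (0.1, 0.5, 1.0, 2.0):
    print(delta2, round(reff_optimal(delta2), 3))
```

Under these assumptions the relative efficiency equals $1 + 1/\Delta^2$, which exceeds 1 for every finite $\Delta^2$. The printed values 11, 3, 2, and 1.5 agree with the entries 11.000, 3.000, 2.000, and 1.500 in Table 2.1.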

2.2.3 Ridge Regression–Type Estimation of Location Parameter

Consider the problem of estimating $\theta$ when one suspects that $\theta$ may be 0. Following Hoerl and Kennard (1970), we define

(2.14)equation

and obtain the ridge regression–type estimate of $\theta$ as

(2.15)equation

or

(2.16)equation

Note that it is the same as taking images in (2.8).

Hence, the bias and MSE of images are given by

(2.17)equation

and

(2.18)equation

It may be seen that the optimum value of images is images and MSE at (2.18) equals

(2.19)equation

Further, the MSE difference equals

(2.20)equation

which shows images uniformly dominates images.

The REff of images is given by

equation
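The steps of this subsection can be written out as a worked derivation. This is a sketch under one assumption that is not confirmed by the text: the penalty is scaled as $nk\theta^2$, so that the estimator takes the form LSE$/(1+k)$. The symbols $\tilde{\theta}_n$ (LSE), $k$ (ridge parameter), and $\Delta^2 = n\theta^2/\sigma^2$ are notation chosen here.

```latex
% Assumption: ridge objective with penalty n k theta^2 (scaling chosen for illustration).
\begin{align*}
\hat{\theta}_n^{RR}(k)
  &= \arg\min_{\theta}\Big\{\sum_{i=1}^{n}(y_i-\theta)^2 + nk\theta^2\Big\}
   = \frac{\tilde{\theta}_n}{1+k}, \qquad k \ge 0,\\
\operatorname{bias}\big(\hat{\theta}_n^{RR}(k)\big)
  &= -\frac{k}{1+k}\,\theta,\\
\operatorname{MSE}\big(\hat{\theta}_n^{RR}(k)\big)
  &= \frac{\sigma^2}{n(1+k)^2} + \frac{k^2\theta^2}{(1+k)^2}
   = \frac{\sigma^2}{n}\,\frac{1+k^2\Delta^2}{(1+k)^2},\\
\frac{\partial}{\partial k}\,\frac{1+k^2\Delta^2}{(1+k)^2}
  &= \frac{2\,(k\Delta^2-1)}{(1+k)^3} = 0
  \;\Longrightarrow\; k^{*} = \Delta^{-2},\\
\operatorname{MSE}\big(\hat{\theta}_n^{RR}(k^{*})\big)
  &= \frac{\sigma^2}{n}\,\frac{\Delta^2}{1+\Delta^2},
  \qquad
  \operatorname{REff} = 1 + \frac{1}{\Delta^2}.
\end{align*}
```

With this scaling, the ridge estimator corresponds to the choice $c = 1/(1+k)$ of a multiplicative shrinkage constant, and the MSE difference from the LSE is $(\sigma^2/n)/(1+\Delta^2) > 0$.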

2.2.4 LASSO for Location Parameter

In this section, we define the LASSO estimator of images introduced by Tibshirani (1996) in connection with the regression model.

Donoho and Johnstone (1994) defined this estimator as the “soft threshold estimator” (STE).
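A minimal sketch of soft thresholding, assuming the LASSO estimator of the location parameter is the LSE soft‐thresholded at $\lambda\sigma/\sqrt{n}$. The threshold scaling and the function names are assumptions made for illustration.

```python
from math import copysign, sqrt


def soft_threshold(x, t):
    """Soft-threshold operator: sign(x) * max(|x| - t, 0)."""
    return copysign(max(abs(x) - t, 0.0), x)


def lasso_location(theta_tilde, lam, sigma, n):
    """LASSO-type estimate of a location parameter (assumed threshold lam*sigma/sqrt(n))."""
    return soft_threshold(theta_tilde, lam * sigma / sqrt(n))


print(lasso_location(2.0, 1.0, 2.0, 4))   # threshold is 1.0, estimate shrinks toward 0
print(lasso_location(0.9, 1.0, 2.0, 4))   # |LSE| below the threshold, estimate is set to 0
```

Unlike a ridge‐type estimator, which only rescales the LSE, the soft‐threshold rule sets small estimates exactly to zero, so it performs selection as well as shrinkage.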

2.2.5 Bias and MSE Expression for the LASSO of Location Parameter

In order to derive the bias and MSE of LASSO estimators, we need the following lemma.

Using Lemma 2.1, we can find the bias and MSE expressions of images.
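The following sketch evaluates the MSE of the LASSO numerically. It assumes that the expression obtained from Lemma 2.1 coincides with the standard soft‐threshold risk of Donoho and Johnstone (1994) for a unit‐variance normal observation with mean $\Delta$. The default threshold $\lambda = \sqrt{2\log 2}$ is also an assumption, chosen because it reproduces the LASSO entries of Table 2.1.

```python
from math import log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def lasso_scaled_mse(delta, lam):
    """Soft-threshold risk for X ~ N(delta, 1), i.e. MSE in units of sigma^2 / n."""
    Phi, phi = _STD_NORMAL.cdf, _STD_NORMAL.pdf
    return (
        1.0
        + lam ** 2
        + (delta ** 2 - lam ** 2 - 1.0) * (Phi(lam - delta) - Phi(-lam - delta))
        - (lam - delta) * phi(lam + delta)
        - (lam + delta) * phi(lam - delta)
    )


def lasso_reff(delta, lam=sqrt(2.0 * log(2.0))):
    """Relative efficiency of the LASSO compared to the LSE."""
    return 1.0 / lasso_scaled_mse(delta, lam)


for delta in (0.0, 1.0, 7.071):
    print(delta, round(lasso_reff(delta), 3))
```

For large $\Delta$ the risk tends to $1 + \lambda^2$, so the relative efficiency levels off below 1, in line with the constant tail of the LASSO column in Table 2.1.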

2.2.6 Preliminary Test Estimator, Bias, and MSE

Based on Saleh (2006), the preliminary test estimators (PTEs) of images under normality assumption of the errors are given by

(2.27)equation
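As an illustration of the preliminary test rule, the sketch below assumes that the PTE keeps the LSE when a two‐sided level‐$\alpha$ normal test rejects $\theta = 0$ (with $\sigma$ known) and returns 0 otherwise. The function name `pte_location` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pte_location(theta_tilde, sigma, n, alpha=0.05):
    """Preliminary test estimate of theta under the suspicion theta = 0 (sketch)."""
    z = sqrt(n) * theta_tilde / sigma
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Keep the LSE only if the preliminary test rejects the null hypothesis.
    return theta_tilde if abs(z) > crit else 0.0


print(pte_location(0.2, 1.0, 4))   # test does not reject, estimate is 0.0
print(pte_location(1.5, 1.0, 4))   # test rejects, estimate is the LSE
```

The estimate takes only one of two values, which is the discreteness that Stein‐type estimators are designed to smooth out.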

Thus, we have the following theorem about bias and MSE.

2.2.7 Stein‐Type Estimation of Location Parameter

The PTE depends heavily on the critical value of the test of the hypothesis that $\theta$ may be zero. Thus, to overcome the discreteness of the PTE, we define the Stein‐type estimator of $\theta$, assuming the error variance $\sigma^2$ is known, as

(2.28)equation

The bias of images is images, and the MSE of images is given by

(2.29)equation

The value of images that minimizes images is images, which is a decreasing function of images with a maximum at images and maximum value images. Hence, the optimum value of MSE is

(2.30)equation

The REff compared to LSE, images is

(2.31)equation

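In general, the REff of the Stein‐type estimator decreases from $(1 - 2/\pi)^{-1} \approx 2.752$ at $\Delta^2 = 0$, crosses the 1‐line at $\Delta^2 = 2\log 2$, and for $0 \le \Delta^2 < 2\log 2$, the Stein‐type estimator performs better than the LSE.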
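The behavior of the relative efficiency can be checked numerically. The sketch assumes that, with the shrinkage constant fixed at its maximum value $\sqrt{2/\pi}$, the MSE of the Stein‐type estimator in units of $\sigma^2/n$ is $1 - (2/\pi)(2e^{-\Delta^2/2} - 1)$. This form is an assumption, but it reproduces the SE entries of Table 2.1.

```python
from math import exp, log, pi, sqrt


def stein_reff(delta):
    """Relative efficiency of the Stein-type estimator vs. the LSE (assumed MSE form)."""
    scaled_mse = 1.0 - (2.0 / pi) * (2.0 * exp(-delta ** 2 / 2.0) - 1.0)
    return 1.0 / scaled_mse


for delta in (0.0, 1.0, sqrt(2.0 * log(2.0)), 7.071):
    print(round(delta, 3), round(stein_reff(delta), 3))
```

The efficiency starts at about 2.752, equals 1 at $\Delta = \sqrt{2\log 2} \approx 1.177$, and settles near 0.611 for large $\Delta$.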

2.2.8 Comparison of LSE, PTE, Ridge, SE, and LASSO

We know the following MSE from previous sections:

equation

Hence, the REff expressions are given by

equation

Table 2.1 Table of relative efficiency.

$\Delta$ LSE RRE PTE SE LASSO
0.000 1.000 $\infty$ 4.184 2.752 9.932
0.316 1.000 11.000 2.647 2.350 5.694
0.548 1.000  4.333 1.769 1.849 3.138
0.707 1.000  3.000 1.398 1.550 2.207
1.000 1.000  2.000 1.012 1.157 1.326
1.177 1.000  1.721 0.884 1.000 1.046
1.414 1.000  1.500 0.785 0.856 0.814
2.236 1.000  1.200 0.750 0.653 0.503
3.162 1.000  1.100 0.908 0.614 0.430
3.873 1.000  1.067 0.980 0.611 0.421
4.472 1.000  1.050 0.996 0.611 0.419
5.000 1.000  1.040 0.999 0.611 0.419
5.477 1.000  1.033 1.000 0.611 0.419
6.325 1.000  1.025 1.000 0.611 0.419
7.071 1.000  1.020 1.000 0.611 0.419

It is seen from Table 2.1 that the RRE dominates all other estimators uniformly and that the LASSO dominates the LSE, PTE, and SE in an interval near 0. From Table 2.1, we find images in the interval images while outside this interval images. Figure 2.1 confirms this.

Figure 2.1 Relative efficiencies of the LSE, RRE, PTE, SE, and LASSO‐type estimators.

2.3 Simple Linear Model

In this section, we consider the model (2.1) and define the PT, ridge, and LASSO‐type estimators when it is suspected that the slope may be zero.

2.3.1 Estimation of the Intercept and Slope Parameters

First, we consider the LSE of the parameters. Using the model (2.1) and the sample information from the normal distribution, we obtain the LSEs of images as

(2.32)equation

where

(2.33)equation

The exact distribution of images is a bivariate normal with mean images and covariance matrix

(2.34)equation

An unbiased estimator of the variance images is given by

(2.35)equation

which is independent of images, and images follows a central chi‐square distribution with images degrees of freedom (DF).
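A minimal sketch of these estimators, assuming the usual closed forms: slope equal to the centered cross‐product divided by $Q = \sum (x_i - \bar{x})^2$, intercept equal to $\bar{y}$ minus the slope times $\bar{x}$, and variance estimate equal to the residual sum of squares divided by $n - 2$. The name `simple_lse` and the data are illustrative.

```python
from statistics import fmean


def simple_lse(x, y):
    """Least squares fit of y_i = theta + beta * x_i + e_i.

    Returns (theta_tilde, beta_tilde, s2, q), where q is the centered sum of
    squares of x and s2 is the residual variance estimate with n - 2 DF.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    xbar, ybar = fmean(x), fmean(y)
    q = sum((xi - xbar) ** 2 for xi in x)
    if q == 0.0:
        raise ValueError("x must not be constant")
    beta = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / q
    theta = ybar - beta * xbar
    rss = sum((yi - theta - beta * xi) ** 2 for xi, yi in zip(x, y))
    return theta, beta, rss / (n - 2), q


theta_hat, beta_hat, s2_hat, q = simple_lse([0.0, 1.0, 2.0, 3.0], [1.0, 3.1, 4.9, 7.0])
print(round(theta_hat, 3), round(beta_hat, 3), round(s2_hat, 4), q)
```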

2.3.2 Test for Slope Parameter

Suppose that we want to test the null‐hypothesis images vs. images. Then, we use the likelihood ratio (LR) test statistic

(2.36)equation

where images follows a noncentral chi‐square distribution with 1 DF and noncentrality parameter images and images follows a noncentral images‐distribution with images, where images is DF and also the noncentral parameter is

equation

Under images, images follows a central chi‐square distribution and images follows a central images‐distribution. At the images‐level of significance, we obtain the critical value images or images from the distribution and reject images if images or images; otherwise, we accept images.
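The test can be sketched as follows, assuming the statistic has the form $(\tilde{\beta} - \beta_o)^2 Q/\sigma^2$, with $\sigma^2$ replaced by its unbiased estimate when the variance is unknown. Only the known‐variance critical value is computed here, using the fact that the upper‐$\alpha$ point of a central chi‐square with 1 DF is the square of the upper‐$\alpha/2$ standard normal point. In the unknown‐variance case the critical value would come from an $F$ table with $(1, n-2)$ DF. All names are hypothetical.

```python
from statistics import NormalDist, fmean


def slope_statistic(x, y, beta0=0.0, sigma2=None):
    """Test statistic for the hypothesis beta = beta0 (illustrative sketch).

    If sigma2 is None, the residual variance estimate with n - 2 DF is used.
    """
    n = len(x)
    xbar, ybar = fmean(x), fmean(y)
    q = sum((xi - xbar) ** 2 for xi in x)
    beta = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / q
    if sigma2 is None:
        theta = ybar - beta * xbar
        sigma2 = sum((yi - theta - beta * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return (beta - beta0) ** 2 * q / sigma2


def chi2_1_critical(alpha=0.05):
    """Upper-alpha critical value of the central chi-square distribution with 1 DF."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2


stat = slope_statistic([0.0, 1.0, 2.0, 3.0], [1.0, 3.1, 4.9, 7.0], sigma2=1.0)
crit = chi2_1_critical(0.05)
print(round(stat, 3), round(crit, 3), stat > crit)
```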

2.3.3 PTE of the Intercept and Slope Parameters

This section deals with the problem of estimation of the intercept and slope parameters images when it is suspected that the slope parameter images may be images.

From (2.32), we know that the LSE of $\theta$ is given by

(2.37)equation

If we know images to be images exactly, then the restricted least squares estimator (RLSE) of images is given by

(2.38)equation

In practice, the prior information that images is uncertain. The doubt regarding this prior information can be removed using Fisher's recipe of testing the null‐hypothesis images against the alternative images. As a result of this test, we choose images or images based on the rejection or acceptance of images. Accordingly, in case of the unknown variance, we write the estimator as

(2.39)equation

called the PTE, where images is the images‐level upper critical value of a central images‐distribution with images DF and images is the indicator function of the set images. For more details on PTE, see Saleh (2006), Ahmed and Saleh (1988), Ahsanullah and Saleh (1972), Kibria and Saleh (2012) and, recently Saleh et al. (2014), among others. We can write PTE of images as

(2.40)equation

If images, images is always chosen; and if images, images is chosen. Since images, images in repeated samples, this will result in a combination of images and images. Note that the PTE procedure leads to the choice of one of the two values, namely, either images or images. Also, the PTE procedure depends on the level of significance images.

Clearly, images is the unrestricted estimator of images, while images is the restricted estimator. Thus, the PTE of images is given by

(2.41)equation

Now, if images, images is always chosen; and if images, images is always chosen.
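The selection rule can be sketched in a few lines. The sketch assumes that the restricted estimator of the intercept is $\tilde{\theta} + (\tilde{\beta} - \beta_o)\bar{x}$, which is the least squares solution when the slope is fixed at $\beta_o$, and that the test statistic and its critical value are supplied by the caller. The function name and argument names are hypothetical.

```python
def pte_simple_linear(theta_tilde, beta_tilde, xbar, stat, crit, beta0=0.0):
    """Preliminary test estimates (theta, beta) of intercept and slope (sketch).

    stat : observed value of the test statistic for the hypothesis beta = beta0
    crit : critical value at the chosen level of significance
    """
    if stat < crit:
        # Null hypothesis not rejected: use the restricted estimators.
        return theta_tilde + (beta_tilde - beta0) * xbar, beta0
    # Null hypothesis rejected: use the unrestricted LSEs.
    return theta_tilde, beta_tilde


print(pte_simple_linear(1.03, 1.98, 1.5, stat=19.6, crit=3.84))
print(pte_simple_linear(1.03, 1.98, 1.5, stat=2.0, crit=3.84))
```

A larger level of significance lowers the critical value, so the unrestricted estimators are chosen more often.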

Since our interest is to compare the LSE, RLSE, and PTE of images and images with respect to bias and the MSE, we obtain the expression of these quantities in the following theorem. First we consider the bias expressions of the estimators.

Next, we consider the expressions for the MSEs of images, images, and images along with the images, images, and images.

2.3.4 Comparison of Bias and MSE Functions

Since the bias and MSE expressions are known to us, we may compare them for the three estimators, namely, images, images, and images as well as images, images, and images. Note that all the expressions are functions of images, which is the noncentrality parameter of the noncentral images‐distribution. Also, images is the standardized distance between images and images. First, we compare the bias functions as in Theorem 2.4, when images is unknown.

For images or under images,

equation

Otherwise, for all images and images,

equation

The absolute bias of images is linear in images, while the absolute bias of images increases to the maximum as images moves away from the origin, and then decreases toward zero as images. Similar conclusions hold for images.

Now, we compare the MSE functions of the restricted estimators and PTEs with respect to the traditional estimator, images and images, respectively. The REff of images compared to images may be written as

(2.42)equation

The efficiency is a decreasing function of images. Under images (i.e. images), it has the maximum value

(2.43)equation

and images, accordingly, as images. Thus, images performs better than images whenever images; otherwise, images performs better than images.

The REff of images compared to images may be written as

(2.44)equation

where

(2.45)equation

Under the images, it has the maximum value

(2.46)equation

and images according as

(2.47)equation

Hence, images performs better than images if images; otherwise, images is better than images. Since

(2.48)equation

we obtain

(2.49)equation

Figure 2.2 Graph of images and images for images and images.

As for the PTE of images, it is better than images, if

(2.50)equation

Otherwise, images is better than images. The

(2.51)equation

Under images,

(2.52)equation

See Figure 2.2 for visual comparison between estimators.

2.3.5 Alternative PTE

In this subsection, we provide the alternative expressions for the estimator of PT and its bias and MSE. To test the hypothesis images vs. images, we use the following test statistic:

(2.53)equation

The PTE of images is given by

(2.54)equation

where images.

Hence, the bias of images equals images, and the MSE is given by

(2.55)equation

Next, we consider the Stein‐type estimator of images as

(2.56)equation

The bias and MSE expressions are given respectively by

(2.57)equation

As a consequence, we may define the PT and Stein‐type estimators of images given by

(2.58)equation

Then, the bias and MSE expressions of images are

(2.59)equation

where

equation

Similarly, the bias and MSE expressions for images are given by

(2.60)equation

2.3.6 Optimum Level of Significance of Preliminary Test

Consider the REff of images compared to images. Denoting it by images, we have

(2.61)equation

where

(2.62)equation

The graph of images, as a function of images for fixed images, is decreasing crossing the 1‐line to a minimum at images (say); then it increases toward the 1‐line as images. The maximum value of images occurs at images with the value

equation

for all images, the set of possible values of images. The value of images decreases as images‐values increase. On the other hand, if images and images vary, the graphs of images and images intersect at images. In general, images and images intersect within the interval images; the value of images at the intersection increases as images‐values increase. Therefore, for two different images‐values, images and images will always intersect below the 1‐line.

In order to obtain a PTE with a minimum guaranteed efficiency, images, we adopt the following procedure: If images, we always choose images, since images in this interval. However, since in general images is unknown, there is no way to choose an estimate that is uniformly best. For this reason, we select an estimator with minimum guaranteed efficiency, such as images, and look for a suitable images from the set, images. The estimator chosen maximizes images over all images and images. Thus, we solve the following equation for the optimum images:

(2.63)equation

The solution images obtained this way gives the PTE with minimum guaranteed efficiency images, which may increase toward images given by (2.61), and Table 2.2. For the following given data, we have computed the maximum and minimum guaranteed REff for the estimators of images and provided them in Table 2.2.

equation

Table 2.2 Maximum and minimum guaranteed relative efficiency.

images
0.05 0.10 0.15 0.20 0.25 0.50
images
images 4.825 2.792 2.086 1.726 1.510 1.101
images 0.245 0.379 0.491 0.588 0.670 0.916
images 8.333 6.031 5.005 4.429 4.004 3.028
images
images 4.599 2.700 2.034 1.693 1.487 1.097
images 0.268 0.403 0.513 0.607 0.686 0.920
images 7.533 5.631 4.755 4.229 3.879 3.028
images
images 4.325 2.587 1.970 1.652 1.459 1.091
images 0.268 0.403 0.513 0.607 0.686 0.920
images 6.657 5.180 4.454 4.004 3.704 2.978
images
images 4.165 2.521 1.933 1.628 1.443 1.088
images 0.319 0.452 0.557 0.644 0.717 0.928
images 6.206 4.955 4.304 3.904 3.629 2.953

2.3.7 Ridge‐Type Estimation of Intercept and Slope

In this section, we consider the ridge‐type shrinkage estimation of images when it is suspected that the slope images may be 0. In this case, we minimize the objective function with a solution as given here:

(2.64)equation

which yields two equations

equation

Hence,

(2.65)equation
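A minimal sketch of the ridge‐type fit, assuming the objective is the residual sum of squares plus a penalty $kQ\beta^2$ on the slope only, with $Q = \sum (x_i - \bar{x})^2$. Under that assumption the two normal equations give a slope equal to the LSE divided by $1 + k$ and an intercept equal to $\bar{y}$ minus the ridge slope times $\bar{x}$. The penalty scaling and the function name are illustrative choices.

```python
from statistics import fmean


def ridge_simple_linear(x, y, k):
    """Ridge-type estimates (theta, beta) with the slope shrunk toward 0 (sketch)."""
    if k < 0:
        raise ValueError("ridge parameter k must be nonnegative")
    xbar, ybar = fmean(x), fmean(y)
    q = sum((xi - xbar) ** 2 for xi in x)
    beta_tilde = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / q
    beta_rr = beta_tilde / (1.0 + k)      # slope LSE shrunk by the factor 1 / (1 + k)
    theta_rr = ybar - beta_rr * xbar      # intercept follows from the first normal equation
    return theta_rr, beta_rr


x_obs = [0.0, 1.0, 2.0, 3.0]
y_obs = [1.0, 3.1, 4.9, 7.0]
print(ridge_simple_linear(x_obs, y_obs, 0.0))   # k = 0 recovers the LSE
print(ridge_simple_linear(x_obs, y_obs, 1.0))   # k = 1 halves the slope
```

As $k$ grows, the slope estimate tends to 0 and the intercept estimate tends to $\bar{y}$, which is the restricted estimator under a zero slope.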

2.3.7.1 Bias and MSE Expressions

From (2.65), it is easy to see that the bias expression of images and images, respectively, are given by

(2.66)equation

Similarly, MSE expressions of the estimators are given by

(2.67)equation

where images and

(2.68)equation

Hence, the REff of these estimators are given by

(2.69)equation

Note that the optimum value of images is images. Hence,

(2.70)equation

2.3.8 LASSO Estimation of Intercept and Slope

In this section, we consider the LASSO estimation of images when it is suspected that images may be 0. For this case, the solution is given by

equation

Explicitly, we find

equation

where images and images.

According to Donoho and Johnstone (1994), and results of Section 2.2.5, the bias and MSE expressions for images are given by

(2.71)equation

where

(2.72)equation

Similarly, the bias and MSE expressions for images are given by

(2.73)equation

Then the REff is obtained as

(2.74)equation

For the following given data, we have computed the REff for the estimators of images and images and provided them in Tables 2.3 and 2.4 and in Figures 2.3 and 2.4, respectively.

equation

It is seen from Tables 2.3 and 2.4 and Figures 2.3 and 2.4 that the RRE uniformly dominates all other estimators except the restricted estimator, and that the LASSO dominates the LSE, PTE, and SE, but not the RRE and RLSE, in a subinterval images.

Table 2.3 Relative efficiency of the estimators for $\theta$.

$\Delta^2$ LSE RLSE PTE RRE LASSO SE
0.000 1.000 $\infty$ 2.987 9.426 5.100 2.321
0.100 1.000 10.000 2.131 5.337 3.801 2.056
0.300 1.000 3.333 1.378 3.201 2.558 1.696
0.500 1.000 2.000 1.034 2.475 1.957 1.465
1.000 1.000 1.000 0.666 1.808 1.282 1.138
1.177 1.000 0.849 0.599 1.696 1.155 1.067
2.000 1.000 0.500 0.435 1.424 0.830 0.869
5.000 1.000 0.200 0.320 1.175 0.531 0.678
10.000 1.000 0.100 0.422 1.088 0.458 0.640
15.000 1.000 0.067 0.641 1.059 0.448 0.638
20.000 1.000 0.050 0.843 1.044 0.447 0.637
25.000 1.000 0.040 0.949 1.036 0.447 0.637
30.000 1.000 0.033 0.986 1.030 0.447 0.637
40.000 1.000 0.025 0.999 1.022 0.447 0.637
50.000 1.000 0.020 1.000 1.018 0.447 0.637

Table 2.4 Relative efficiency of the estimators for $\beta$.

$\Delta^2$ LSE RLSE PTE RRE LASSO SE
0.000 1.000 $\infty$ 3.909 $\infty$ 9.932 2.752
0.100 1.000 10.000 2.462 10.991 5.694 2.350
0.300 1.000 3.333 1.442 4.330 3.138 1.849
0.500 1.000 2.000 1.039 2.997 2.207 1.550
1.000 1.000 1.000 0.641 1.998 1.326 1.157
1.177 1.000 0.849 0.572 1.848 1.176 1.075
2.000 1.000 0.500 0.407 1.499 0.814 0.856
5.000 1.000 0.200 0.296 1.199 0.503 0.653
10.000 1.000 0.100 0.395 1.099 0.430 0.614
15.000 1.000 0.067 0.615 1.066 0.421 0.611
20.000 1.000 0.050 0.828 1.049 0.419 0.611
25.000 1.000 0.040 0.943 1.039 0.419 0.611
30.000 1.000 0.033 0.984 1.032 0.419 0.611
40.000 1.000 0.025 0.999 1.024 0.419 0.611
50.000 1.000 0.020 1.000 1.019 0.419 0.611

Figure 2.3 Relative efficiency of the LSE, RLSE, PTE, RRE, LASSO, and SE estimators for $\theta$.

Figure 2.4 Relative efficiency of the LSE, RLSE, PTE, RRE, LASSO, and SE estimators for $\beta$.

2.4 Summary and Concluding Remarks

This chapter considers the location model and the simple linear regression model when the errors of the models are normally distributed. We consider the LSE, RLSE, PTE, SE, and two penalty estimators, namely, the RRE and the LASSO estimator, for the location parameter of the location model and for the intercept and slope parameters of the simple linear regression model. We found that the RRE uniformly dominates the LSE, PTE, SE, and LASSO. However, the RLSE dominates all estimators near the null hypothesis. The LASSO dominates the LSE, PTE, and SE only in a subinterval near the null hypothesis.

Problems

  2.1 Derive the estimate in (2.4) using the least squares method.
  2.2
    1. Consider the simple location model images and show that for testing the null‐hypothesis images against images, the test statistic is
      equation
      and
      equation
      where images is the unbiased estimator of images.
    2. What will be the distribution for images and images under null and alternative hypotheses?
  2.3 Show that the optimum value of ridge parameter images is images and
    equation
  2.4 Consider LASSO for location parameter images and show that
    equation
  2.5 Prove Theorem 2.3.
  2.6 Consider the simple linear model and derive the estimates given in (2.32) using the least squares method.
  2.7 Consider the simple linear model and show that, to test the null‐hypothesis images vs. images, we use the LR test statistic
    equation

    where images follows a noncentral chi‐square distribution with 1 DF and noncentrality parameter images and images follows a noncentral images‐distribution with images, images DF and noncentral parameter,

  2.8 Prove Theorems 2.4 and 2.5.
  2.9 Consider the LASSO estimation of the intercept and slope models and show that the bias and MSE of images are, respectively,
    equation
  2.10 Show that when images is known, LASSO outperforms the PTE whenever
    equation