In Chapter 6 we described some methods of finding exact distributions of sample statistics and their moments. While these methods work in some cases, such as sampling from a normal population when the sample statistic of interest is X̄ or S², often the statistic of interest, say Tn, is either too complicated or its exact distribution is not simple to work with. In such cases we are interested in the convergence properties of Tn. We want to know what happens when the sample size is large. What is the limiting distribution of Tn? When the exact distribution of Tn (and its moments) is unknown or too complicated, we will often use asymptotic approximations for large n.
In this chapter, we discuss some basic elements of statistical asymptotics. In Section 7.2 we discuss various modes of convergence of a sequence of random variables. In Sections 7.3 and 7.4 the laws of large numbers are discussed. Section 7.5 deals with limiting moment generating functions, and in Section 7.6 we discuss one of the most fundamental theorems of classical statistics, the central limit theorem. In Section 7.7 we consider some statistical applications of these methods.
The reader may find some parts of this chapter a bit difficult on a first reading. Such discussions are indicated with a †.
In this section we consider several modes of convergence and investigate their interrelationships. We begin with the weakest mode of convergence.
It must be remembered that it is quite possible for a given sequence of DFs to converge to a function that is not a DF.
We next give an example to show that weak convergence of distribution functions does not imply the convergence of the corresponding PMFs or PDFs.
The following result is easy to prove.
In the continuous case we state the following result of Scheffé [100] without proof.
The following result is easy to establish.
A slightly stronger concept of convergence is defined by convergence in probability.
Remark 1. We emphasize that the definition says nothing about the convergence of the RVs Xn to the RV X in the sense in which it is understood in real analysis. Thus Xn →P X does not imply that, given ε > 0, we can find an N such that |Xn − X| < ε for n ≥ N. Definition 2 speaks only of the convergence of the sequence of probabilities P{|Xn − X| > ε} to 0.
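A minimal simulation sketch may help fix ideas (the choice of Uniform(0, 1) averages and the sample sizes are ours, purely for illustration); the probabilities in Definition 2 are estimated by Monte Carlo:

```python
# Sketch (illustrative, not from the text): X_n = mean of n iid
# Uniform(0,1) draws converges in probability to 1/2, i.e.,
# P{|X_n - 1/2| > eps} -> 0 as n grows.
import numpy as np

rng = np.random.default_rng(0)
eps = 0.05
for n in [10, 100, 1000, 10000]:
    # Estimate P{|X_n - 1/2| > eps} from 2000 replications of X_n.
    reps = rng.random((2000, n)).mean(axis=1)
    prob = np.mean(np.abs(reps - 0.5) > eps)
    print(f"n = {n:6d}   P(|X_n - 1/2| > {eps}) ~ {prob:.3f}")
```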
The following statements can be verified.
Note that Y is an RV so that, given ε > 0, there exists a k > 0 such that P{|Y| > k} < ε. Thus
We remark that a more general result than Theorem 4 is true and state it without proof (see Rao [88, p. 124]): if Xn →P X and g is continuous on ℝ, then g(Xn) →P g(X).
The following two theorems explain the relationship between weak convergence and convergence in probability.
Remark 2. We emphasize that we cannot improve the above result by replacing k by an RV; that is, in general Xn →L X does not imply Xn →P X. For let X, X1, X2… be identically distributed RVs, and let the joint distribution of (Xn, X) be as follows:
Clearly, Xn →L X. But P{|Xn − X| > ε} does not tend to 0 as n → ∞. Hence Xn →L X, but Xn does not converge in probability to X.
Remark 3. Example 3 shows that Xn →P X does not imply E Xn^k → E X^k for any k ≥ 1, k integral.
We get, in addition, that Xn →r X implies E|Xn|^r → E|X|^r.
Proof. The proof is left to the reader.
As a simple consequence of Theorem 8 and its corollary we see that Xn →L X and Yn →P k together imply Xn + Yn →L X + k and XnYn →L kX.
Remark 4. Clearly the converse to Theorem 10 cannot hold, since Xn →P X does not imply Xn →r X.
Remark 5. In view of Theorem 9, it follows that Xn →r X implies Xn →s X for 0 < s < r.
The following result elucidates Definition 4.
Remark 6. Thus Xn →a.s. X means that, for ε > 0 and δ > 0 arbitrary, we can find an n0 = n0(ε, δ) such that
P{|Xn − X| < ε for all n ≥ n0} > 1 − δ.
Indeed, we can write, equivalently, that
P{sup(n≥n0) |Xn − X| > ε} < δ.
That the converse of Theorem 12 does not hold is shown in the following example.
Remark 7. In Theorem 7.4.3 we prove a result which is sometimes useful in proving a.s. convergence of a sequence of RVs.
Since ε > 0 is arbitrary and x is a continuity point of F, we get the result by letting ε → 0.
Later on we will see that the condition that the Xi's be N(0, 1) is not needed. All we need is that the Xi's be iid with finite mean and variance.
Find the limiting distribution of Zn.
(Prochaska [82]).
(The above remarkable result, due to Gnedenko [36], exhausts all limiting distributions of X(n) with suitable norming and centering.)
Show the following:
Let {Xn} be a sequence of RVs. Write Sn = X1 + X2 + ··· + Xn. In this section we answer the following question in the affirmative: Do there exist sequences of constants An and Bn > 0, Bn ↑ ∞, such that the sequence of RVs Bn⁻¹(Sn − An) converges in probability to 0 as n → ∞?
Remark 1. Since condition (1) applies not to the individual variables but to their sum, Theorem 2 is of limited use. We note, however, that all weak laws of large numbers obtained as corollaries to Theorem 1 follow easily from Theorem 2 (Problem 6).
Let X1, X2,… be an arbitrary sequence of RVs, and let Sn = X1 + X2 + ··· + Xn. Let us truncate each Xi at c > 0, that is, let
Write
Inequality (6) yields the following important theorem.
We emphasize that in Theorem 3 we require only that E|X1| be finite; nothing is said about the variance. Theorem 3 is due to Khintchine.
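A quick simulation sketch of this point (the Pareto choice is ours, for illustration): Pareto variables with shape 1.5 have a finite mean but infinite variance, yet the sample mean still settles near the population mean.

```python
# Sketch: Khintchine's WLLN requires only E|X1| < infinity.  We use
# Pareto(1.5) variables (mean 3, infinite variance) and watch the
# sample mean approach 3 as n grows.
import numpy as np

rng = np.random.default_rng(1)
alpha = 1.5                               # shape; mean = alpha/(alpha-1) = 3
for n in [100, 10_000, 1_000_000]:
    x = rng.pareto(alpha, size=n) + 1.0   # Pareto with x_m = 1
    print(f"n = {n:8d}   sample mean = {x.mean():.4f}   (EX1 = 3)")
```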
Show that .
Yes;
In this section we obtain a stronger form of the law of large numbers discussed in Section 7.3. Let X1, X2,… be a sequence of RVs defined on some probability space (Ω, 𝒮, P).
We will obtain sufficient conditions for a sequence {Xn} to obey the SLLN. In what follows, we will be interested mainly in the case Bn = n. Indeed, when we speak of the SLLN we will assume that we are speaking of the norming constants Bn = n, unless specified otherwise.
We start with the Borel-Cantelli lemma. Let {Aj} be any sequence of events in 𝒮. We recall that
lim sup(n→∞) An = ∩(n≥1) ∪(j≥n) Aj.
We will write A = lim sup(n→∞) An. Note that A is the event that infinitely many of the An occur. We will sometimes write
A = {An i.o.},
where “i.o.” stands for “infinitely often.” In view of Theorem 7.2.11 and Remark 7.2.6 we have Xn →a.s. X if and only if P{|Xn − X| > ε i.o.} = 0 for all ε > 0.
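As a quick illustration (ours) of the first Borel-Cantelli lemma (if Σ P(An) < ∞, then P{An i.o.} = 0):

```latex
% Illustration (ours): take events with P(A_n) = 1/n^2.  Then
\[
  \sum_{n=1}^{\infty} P(A_n) = \sum_{n=1}^{\infty} \frac{1}{n^{2}}
  = \frac{\pi^{2}}{6} < \infty ,
\]
% so the first Borel--Cantelli lemma gives P\{A_n \text{ i.o.}\} = 0:
% with probability 1 only finitely many of the A_n occur.
```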
We next prove some important lemmas that we will need subsequently.
As a corollary we get a version of the SLLN for nonidentically distributed RVs which subsumes Theorem 2.
Remark 1. Kolmogorov’s SLLN is much stronger than Corollaries 1 and 4 to Theorem 4. It states that if {Xn} is a sequence of iid RVs, then Sn/n →a.s. μ for some finite constant μ if and only if E|X1| < ∞, and then μ = EX1. The proof requires more work and will not be given here. We refer the reader to Billingsley [6], Chung [15], Feller [26], or Laha and Rohatgi [58].
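A simulation sketch of Kolmogorov's SLLN (the Exp(1) choice and sample sizes are ours): a single sample path of the running mean Sn/n converges to EX1.

```python
# Sketch: one sample path of the running mean S_n/n for iid Exp(1) RVs.
# Kolmogorov's SLLN says this single path converges to EX1 = 1 a.s.
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(1.0, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
for n in [10, 1000, 100_000]:
    print(f"n = {n:7d}   S_n/n = {running_mean[n - 1]:.4f}")
```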
Does the converse also hold?
[Hint: Let , and . Apply the Borel-Cantelli lemma to .]
(Xn’s are independent in each case.)
Let X1, X2,… be a sequence of RVs. Let Fn be the DF of Xn, and suppose that the MGF Mn(t) of Fn exists. What happens to Mn(t) as n → ∞? If it converges, does it always converge to an MGF?
Next suppose that Xn has MGF Mn and Xn →L X, where X is an RV with MGF M. Does Mn(t) → M(t) as n → ∞? The answer to this question is in the negative.
The following result is a weaker version of the continuity theorem due to Lévy and Cramér. We refer the reader to Lukacs [69, p. 47], or Curtiss [19], for details of the proof.
Remark 1. The following notation on orders of magnitude is quite useful. We write xn = o(rn) if, given ε > 0, there exists an N such that |xn/rn| < ε for all n ≥ N, and xn = O(rn) if there exist an N and a constant c > 0 such that |xn/rn| ≤ c for all n ≥ N. We write xn = O(1) to express the fact that xn is bounded for large n, and xn = o(1) to mean that xn → 0 as n → ∞.
This notation is extended to RVs in an obvious manner. Thus Xn = op(rn) if, for every ε > 0 and δ > 0, there exists an N such that P{|Xn/rn| > ε} < δ for n ≥ N, and Xn = Op(rn) if, for every δ > 0, there exist a c > 0 and an N such that P{|Xn/rn| > c} < δ for n ≥ N. We write Xn = op(1) to mean Xn →P 0. This notation can be easily extended to the case where rn itself is an RV.
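For instance (our illustration), Chebyshev's inequality shows that the sample mean of iid RVs with finite variance satisfies:

```latex
% Illustration (ours): if X_1, X_2, \ldots are iid with mean \mu and
% variance \sigma^2 < \infty, Chebyshev's inequality gives
\[
  P\left\{ \left| \bar{X} - \mu \right| > \frac{c\,\sigma}{\sqrt{n}} \right\}
  \le \frac{1}{c^{2}} ,
\]
% which is < \delta for c large; hence \bar{X} - \mu = O_p(n^{-1/2}),
% and a fortiori \bar{X} - \mu = o_p(1).
```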
The following lemma is quite useful in applications of Theorem 1.
For more examples see Section 7.6.
Remark 2. As pointed out earlier, working with MGFs has the disadvantage that the existence of an MGF is a very strong condition. Working with CFs, which always exist, on the other hand, permits a much wider application of the continuity theorem. Let ϕn be the CF of Fn. Then Fn →w F if and only if ϕn(t) → ϕ(t) as n → ∞ for every t ∈ ℝ, where ϕ is continuous at t = 0. In this case ϕ, the limit function, is the CF of the limit DF F.
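A numeric sketch of the continuity theorem (the standardized Poisson family is our choice of example): the MGF of (X − λ)/√λ for X distributed P(λ) is exp{λ(e^(t/√λ) − 1) − t√λ}, which tends to e^(t²/2), the MGF of N(0, 1), as λ → ∞.

```python
# Numeric check: MGF of the standardized Poisson at t = 1 approaches
# exp(1/2), the N(0,1) MGF value, as lambda grows.
import numpy as np

t = 1.0
target = np.exp(t**2 / 2)
for lam in [1, 10, 100, 10_000]:
    m = np.exp(lam * (np.exp(t / np.sqrt(lam)) - 1) - t * np.sqrt(lam))
    print(f"lambda = {lam:6d}   M(1) = {m:.5f}   (target {target:.5f})")
```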
here .
Let X1, X2,… be a sequence of RVs, and let Sn = X1 + X2 + ··· + Xn. In Sections 7.3 and 7.4 we investigated the convergence of the sequence of RVs Bn⁻¹(Sn − An) to a degenerate RV. In this section we examine the convergence of Bn⁻¹(Sn − An) to a nondegenerate RV. Suppose that, for a suitable choice of constants An and Bn > 0, the RVs Bn⁻¹(Sn − An) →L Y. What are the properties of this limit RV Y? The question as posed is far too general and is not of much interest unless the RVs Xi are suitably restricted. For example, if we take X1 with DF F and X2, X3,… to be 0 with probability 1, choosing An = 0 and Bn = 1 leads to F as the limit DF.
We recall (Example 7.5.6) that, if X1, X2,…, Xn are iid RVs with common law C(1, 0), then X̄ = Sn/n is also C(1, 0). Again, if X1, X2,…, Xn are iid N(0, 1) RVs, then Sn/√n is also N(0, 1) (Corollary 2 to Theorem 5.3.22). We note thus that for certain sequences of RVs there exist sequences An and Bn such that Bn⁻¹(Sn − An) →L Y. In the Cauchy case An = 0, Bn = n, and in the normal case An = 0, Bn = √n. Moreover, we see that Cauchy and normal distributions appear as limiting distributions in these two cases because of the reproductive nature of the distributions. Cauchy and normal distributions are examples of stable distributions.
Let X1, X2,… be iid RVs with common DF F. We remark without proof (see Loève [66, p. 339]) that only stable distributions occur as limits. To make this statement more precise we make the following definition.
In view of the statement after Definition 1, we see that only stable distributions possess domains of attraction. From Definition 1 we also note that each stable law belongs to its own domain of attraction. The study of stable distributions is beyond the scope of this book. We shall restrict ourselves to seeking conditions under which the limit law V is the normal distribution. The importance of the normal distribution in statistics is due largely to the fact that a wide class of distributions F belongs to the domain of attraction of the normal law. Let us consider some examples.
These examples suggest that if we take iid RVs with finite variance and take An = nEX1 and Bn = √(n var(X1)), then Bn⁻¹(Sn − An) →L Z, where Z is N(0, 1). This is the central limit result, which we now prove. The reader should note that in both Examples 1 and 2 we used more than just the existence of E|X|². Indeed, the MGF exists, and hence moments of all orders exist. The existence of the MGF is not a necessary condition.
Remark 1. In the proof above we could have used the Taylor series expansion of M to arrive at the same result.
Remark 2. Even though we proved Theorem 1 for the case when the MGF of the Xn's exists, we will use the result whenever EX1² < ∞. The use of CFs would have provided a complete proof of Theorem 1. Let ϕ be the CF of Xn. Assuming again, without loss of generality, that EXn = 0 and EXn² = 1, we can write
ϕ(t) = 1 − t²/2 + o(t²) as t → 0.
Thus the CF of Sn/√n is
[ϕ(t/√n)]ⁿ = [1 − t²/(2n) + o(t²/n)]ⁿ,
which converges to e^(−t²/2), which is the CF of a N(0, 1) RV. The devil is in the details of the proof.
The following converse to Theorem 1 holds.
For nonidentically distributed RVs we state, without proof, the following result due to Lindeberg.
Feller [24] has shown that condition (2) is necessary as well in the following sense: for independent RVs {Xk} for which (3) holds and Sn/sn →L Z, where Z is N(0, 1), (2) holds for every ε > 0.
If sn does not tend to ∞, then sn → s < ∞, say, as n → ∞. For fixed k, we can find εk > 0 such that P{|Xk − EXk| > εk s} > 0, and then P{|Xk − EXk| > εk sn} ≥ P{|Xk − EXk| > εk s} > 0 for all n. For n ≥ k, we have
(1/sn²) Σ(j=1..n) E[(Xj − EXj)² I{|Xj − EXj| > εk sn}] ≥ (1/s²) E[(Xk − EXk)² I{|Xk − EXk| > εk s}] > 0,
so that the Lindeberg condition does not hold. Indeed, if X1, X2,… are independent RVs such that there exists a constant A with |Xn| ≤ A < ∞ for all n, the Lindeberg condition (2) is satisfied if sn → ∞ as n → ∞. To see this, suppose that sn → ∞. Since the Xk's are uniformly bounded, so are the RVs Xk − EXk. It follows that for every ε > 0 we can find an Nε such that, for n ≥ Nε, the events {|Xk − EXk| > εsn}, k = 1, 2,…, n, are empty. The Lindeberg condition follows immediately. The converse also holds, for, if sn does not tend to ∞ and the Lindeberg condition holds, there exists a constant s < ∞ such that sn → s. For any fixed j, we can find an ε > 0 such that P{|Xj − EXj| > εs} > 0. Then, for n ≥ j,
(1/sn²) Σ(k=1..n) E[(Xk − EXk)² I{|Xk − EXk| > εsn}] ≥ (1/s²) E[(Xj − EXj)² I{|Xj − EXj| > εs}] > 0,
and the Lindeberg condition does not hold. This contradiction shows that sn → ∞ is also a necessary condition; that is, for a sequence of uniformly bounded independent RVs, a necessary and sufficient condition for the central limit theorem to hold is sn → ∞ as n → ∞.
Remark 3. Both the central limit theorem (CLT) and the (weak) law of large numbers (WLLN) hold for a large class of sequences of RVs {Xn}. If the {Xn} are independent uniformly bounded RVs, that is, if P{|Xn| ≤ A} = 1 for all n, the WLLN (Theorem 7.3.1) holds; the CLT holds provided that sn → ∞ (Example 5).
If the RVs {Xn} are iid, then the CLT is a stronger result than the WLLN in that the former provides an estimate of the probability P{|Sn/n − μ| ≤ ε}. Indeed,
P{|Sn/n − μ| ≤ ε} = P{|Sn − nμ|/(σ√n) ≤ ε√n/σ} ≈ P{|Z| ≤ ε√n/σ} → 1 as n → ∞,
where Z is N(0, 1), and the law of large numbers follows. On the other hand, we note that the WLLN does not require the existence of a second moment.
Remark 4. If {Xn} are independent RVs, it is quite possible that the CLT may apply to the Xn’s, but not the WLLN.
We conclude this section with some remarks concerning the application of the CLT. Let X1, X2,… be iid RVs with common mean μ and variance σ². Let us write
Zn = (Sn − nμ)/(σ√n),
and let z1, z2 be two arbitrary real numbers with z1 < z2. If Fn is the DF of Zn, then
P{z1 < Zn ≤ z2} = Fn(z2) − Fn(z1) → P{Z ≤ z2} − P{Z ≤ z1},
where Z is N(0, 1); that is,
P{nμ + z1σ√n < Sn ≤ nμ + z2σ√n} ≈ P{z1 < Z ≤ z2}. (4)
It follows that the RV Sn is asymptotically normally distributed with mean nμ and variance nσ². Equivalently, the RV X̄ = Sn/n is asymptotically N(μ, σ²/n). This result is of great importance in statistics.
In Fig. 1 we show the distribution of Sn in sampling from P(λ) and G(1, 1). We have also superimposed, in each case, the graph of the corresponding normal approximation.
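A simulation in the spirit of Fig. 1 (ours; the choices λ = 1 and n = 30 are illustrative):

```python
# Sketch: compare the simulated distribution of the standardized sum
# Z_n = (S_n - n*mu)/(sigma*sqrt(n)) with N(0, 1) for P(1) and G(1, 1)
# samples.  P{Z_n <= 1} should be near 0.8413 if the approximation works.
import numpy as np

rng = np.random.default_rng(3)
n, reps = 30, 50_000
samples = {
    "P(1)":    (rng.poisson(1.0, (reps, n)),     1.0, 1.0),  # mean, var
    "G(1, 1)": (rng.exponential(1.0, (reps, n)), 1.0, 1.0),
}
for name, (x, mu, var) in samples.items():
    z = (x.sum(axis=1) - n * mu) / np.sqrt(n * var)
    print(f"{name}:  P(Z_n <= 1) ~ {np.mean(z <= 1):.4f}   (target 0.8413)")
```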
How large should n be before we apply approximation (4)? Unfortunately the answer is not simple. Much depends on the underlying distribution, the corresponding speed of convergence, and the accuracy one desires. There is a vast amount of literature on the speed of convergence and error bounds. We will content ourselves with some examples. The reader is referred to Rohatgi [90] for a detailed discussion.
In the discrete case when the underlying distribution is integer-valued, approximation (4) is improved by applying the continuity correction. If X is integer-valued, then for integers x1, x2
P{x1 ≤ X ≤ x2} = P{x1 − 1/2 < X < x2 + 1/2},
and we apply the normal approximation to the interval (x1 − 1/2, x2 + 1/2),
which amounts to making the discrete space of values of X continuous by considering intervals of length 1 with midpoints at integers.
Next suppose that . Then from binomial tables . Using normal approximation, without continuity correction
and with continuity correction
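The numerical values of the example above are not reproduced here; as a stand-in, here is a sketch with assumed values n = 20, p = 0.5 (ours, purely illustrative) showing how the continuity correction improves the approximation:

```python
# Hedged example (assumed n = 20, p = 0.5): compare the exact binomial
# P{X <= 12} with the normal approximation, with and without the
# continuity correction.
from math import comb, erf, sqrt

n, p, x = 20, 0.5, 12
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def phi(z):                       # standard normal DF
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sd = n * p, sqrt(n * p * (1 - p))
print(f"exact                {exact:.4f}")
print(f"no correction        {phi((x - mu) / sd):.4f}")
print(f"continuity corrected {phi((x + 0.5 - mu) / sd):.4f}")
```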
The rule of thumb is to use the continuity correction, to use the normal approximation whenever np(1 − p) is sufficiently large, and to use the Poisson approximation with λ = np when n is large and p is small.
0.0926; 1.92
[Hint: Use (5.3.61).]
In many applications of probability one needs the distribution of a statistic or some function of it. The methods of Section 7.3, when applicable, lead to the exact distribution of the statistic under consideration. If not, it may be sufficient to approximate this distribution, provided the sample size is large enough.
Let {Xn} be a sequence of RVs which converges in law to N(μ, σ²). Then (Xn − μ)/σ converges in law to N(0, 1), and conversely. We will say, alternatively and equivalently, that {Xn} is asymptotically normal with mean μ and variance σ². More generally, we say that Xn is asymptotically normal with “mean” μn and “variance” σn², and write Xn is AN(μn, σn²), if (Xn − μn)/σn →L Z as n → ∞, where Z is N(0, 1).
Here μn is not necessarily the mean of Xn, and σn² is not necessarily its variance. In this case we can approximate, for sufficiently large n, P{Xn ≤ x} by P{Z ≤ (x − μn)/σn}, where Z is N(0, 1).
The most common method to show that Xn is AN(μn, σn²) is the central limit theorem of Section 7.6. Thus, according to Theorem 7.6.1, X̄ is AN(μ, σ²/n) as n → ∞, where X̄ is the sample mean of n iid RVs with mean μ and variance σ². The same result applies to the kth sample moment mk = (1/n) Σ(i=1..n) Xi^k, provided EX1^(2k) < ∞. Thus
mk is AN(EX1^k, var(X1^k)/n).
In many large sample approximations an application of the CLT along with Slutsky’s theorem suffices.
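For instance (an illustration of ours, with exponential data and cutoff 1.645): since the sample standard deviation S converges in probability to σ, Slutsky's theorem justifies Studentizing the sample mean.

```python
# Sketch of a typical CLT + Slutsky application: since S ->P sigma,
# sqrt(n)(Xbar - mu)/S ->L N(0, 1) even though sigma is replaced by
# its estimate S.
import numpy as np

rng = np.random.default_rng(4)
n, reps, mu = 50, 40_000, 2.0
x = rng.exponential(mu, (reps, n))                  # iid with mean mu = 2
t = np.sqrt(n) * (x.mean(axis=1) - mu) / x.std(axis=1, ddof=1)
print(f"P(T_n <= 1.645) ~ {np.mean(t <= 1.645):.4f}   (target ~0.95)")
```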
Often we need to approximate the distribution of g(Yn) given that Yn is AN(μ, σ2).
Remark 1. Suppose g in Theorem 1 is differentiable k times, k ≥ 2, at μ, and g⁽ʲ⁾(μ) = 0 for j = 1, 2,…, k − 1, while g⁽ᵏ⁾(μ) ≠ 0. Then a similar argument using Taylor's theorem shows that
n^(k/2) [g(Yn) − g(μ)] →L (g⁽ᵏ⁾(μ)/k!) σᵏ Zᵏ,
where Z is a N(0, 1) RV. Thus in Example 2, when p = 1/2, g′(1/2) = 0 and g″(1/2) = −2. It follows that
n[X̄(1 − X̄) − 1/4] →L (g″(1/2)/2) σ² Z² = −(1/4)Z²,
since σ² = p(1 − p) = 1/4 when p = 1/2.
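A numeric check of the one-term delta method of Theorem 1 (the choices Exp(1) data and g(y) = y² are ours, for illustration):

```python
# Sketch: with Y_n = Xbar from Exp(1) samples (mu = sigma^2 = 1) and
# g(y) = y^2, Theorem 1 gives sqrt(n)(g(Xbar) - g(mu)) ->L
# N(0, [g'(mu)]^2 sigma^2) = N(0, 4).
import numpy as np

rng = np.random.default_rng(5)
n, reps = 200, 40_000
xbar = rng.exponential(1.0, (reps, n)).mean(axis=1)
w = np.sqrt(n) * (xbar**2 - 1.0)
print(f"sample var of sqrt(n)(g(Xbar) - g(mu)) ~ {w.var():.3f}   (theory: 4)")
```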
Remark 2. Theorem 1 can be extended to the multivariate case but we will not pursue the development. We refer the reader to Ferguson [29] or Serfling [102].
Remark 3. In general the asymptotic variance of g(Yn) will depend on the parameter μ. In problems of inference it will often be desirable to use a transformation g such that the approximate variance var g(Yn) is free of the parameter. Such transformations are called variance-stabilizing transformations. Let us write σ² = σ²(μ). Then finding a g such that var g(Yn) is free of μ is equivalent to finding a g such that
g′(μ) σ(μ) = c
for all μ, where c is a constant independent of μ. It follows that
g(μ) = c ∫ dμ/σ(μ).
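A standard illustration (ours): for iid P(λ) RVs, X̄ is AN(λ, λ/n), so σ(μ) = √μ and the recipe above yields the square-root transformation.

```latex
% Variance-stabilizing transformation for the Poisson case:
% g'(\mu) = c/\sqrt{\mu} gives
\[
  g(\mu) = c \int \mu^{-1/2} \, d\mu = 2c\sqrt{\mu} .
\]
% With c = 1/2, g(\bar{X}) = \sqrt{\bar{X}} is AN(\sqrt{\lambda},\, 1/(4n)),
% whose asymptotic variance is free of \lambda.
```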
Remark 4. In Section 6.3 we computed exact moments of some statistics in terms of population parameters. Approximations for moments of g(X̄) can also be obtained from series expansions of g. Suppose g is twice differentiable at μ = EX1. Then
g(X̄) ≈ g(μ) + (X̄ − μ)g′(μ) + ((X̄ − μ)²/2)g″(μ) (6)
and
[g(X̄) − g(μ)]² ≈ (X̄ − μ)²[g′(μ)]² (7)
by dropping remainder terms. The case of most interest is to approximate E g(X̄) and var g(X̄). In this case, under suitable conditions, one can show that
E g(X̄) ≈ g(μ) + (σ²/2n)g″(μ) (8)
and
var g(X̄) ≈ [g′(μ)]²σ²/n (9)
where μ = EX1 and σ² = var(X1).
In Example 2, when the Xi’s are iid b(1, p), μ = p and σ² = p(1 − p), so that
E[X̄(1 − X̄)] ≈ p(1 − p) − p(1 − p)/n
and
var[X̄(1 − X̄)] ≈ (1 − 2p)² p(1 − p)/n.
In this case we can compute E g(X̄) and var g(X̄) exactly. We have
E[X̄(1 − X̄)] = (n − 1)p(1 − p)/n,
so that (8) is exact. Also, since X̄(1 − X̄) = (n − 1)S²/n, using Theorem 6.3.4 we have
var[X̄(1 − X̄)] = ((n − 1)/n)² var(S²) = ((n − 1)²/n³) pq(1 − 3pq) − ((n − 1)(n − 3)/n³) p²q², q = 1 − p.
Thus the error in approximation (9) is of order 1/n².
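A numeric check of approximations (8) and (9) in this example (our sketch; n = 25 and p = 0.3 are illustrative), computing the exact moments of g(X̄) = X̄(1 − X̄) by summing over the binomial distribution of nX̄:

```python
# Check (8) and (9) for g(x) = x(1 - x) with X_i iid b(1, p): the exact
# moments of g(Xbar) follow from the binomial distribution of S_n = n*Xbar.
from math import comb

n, p = 25, 0.3
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
g = [(k / n) * (1 - k / n) for k in range(n + 1)]
eg = sum(w * v for w, v in zip(pmf, g))
vg = sum(w * (v - eg)**2 for w, v in zip(pmf, g))

q = 1 - p
print(f"E g:   exact {eg:.5f}    approx (8) {p*q*(1 - 1/n):.5f}")   # equal
print(f"var g: exact {vg:.6f}   approx (9) {(1 - 2*p)**2*p*q/n:.6f}")
```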
Remark 5. Approximations (6) through (9) do not assert the existence of E g(X̄) or var g(X̄).
Remark 6. It is possible to extend (6) through (9) to two (or more) variables by using Taylor series expansion in two (or more) variables.
Finally, we state the following result, which gives the asymptotic distribution of the rth order statistic, X(r), in sampling from a population with an absolutely continuous DF F with PDF f. For a proof see Problem 4.
Remark 7. The sample quantile of order p, Zp, is AN(zp, p(1 − p)/(n[f(zp)]²)), where zp is the corresponding population quantile and f is the PDF of the population distribution function. It also follows that Zp →P zp.
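A simulation sketch of Remark 7 for the median (our choice of Exp(1) data): here p = 1/2, z½ = ln 2, and f(z½) = 1/2, so the sample median is AN(ln 2, 1/n).

```python
# Sketch: sample medians of Exp(1) samples should have mean near ln 2
# and variance near 1/n, per Remark 7.
import numpy as np

rng = np.random.default_rng(6)
n, reps = 400, 20_000
med = np.median(rng.exponential(1.0, (reps, n)), axis=1)
print(f"mean of sample medians ~ {med.mean():.4f}   (ln 2 = 0.6931)")
print(f"n * var of sample medians ~ {n * med.var():.3f}   (theory: 1)")
```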
both when and when .
[Hint: Let Y be an RV with mean μ, and let ϕ be a Borel function such that Eϕ(Y) exists. Expand ϕ(Y) about the point μ by a Taylor series expansion, and use the fact that E(Y − μ) = 0.]