Chapter 11
The Law of Large Numbers
Let us move further away from the frequency distribution and look at probability distributions. The frequency distribution that we have seen in Chapter 10 is an empirical pattern; what we are now going to see in this chapter and the rest of the book are mathematical expressions.
The mathematical sciences particularly exhibit order, symmetry,
and limitation; and these are the greatest forms of the beautiful.
Aristotle
BOX 11.1 BIRTH OF PROBABILITY
Before the middle of the 17th century, the term probable meant approvable and was applied in that sense, univocally, to opinion and to action. A probable action or opinion was one such as sensible people would undertake or hold in the circumstances. However, the term probable could also apply to propositions for which there was good evidence, especially in legal contexts. In Renaissance times, betting was discussed in terms of odds such as ten to one, and maritime insurance premiums were estimated based on intuitive risks. However, there was no theory on how to calculate such odds or premiums.
The mathematical methods of probability arose in the correspondence of Pierre de Fermat and Blaise Pascal (1654) on such questions as the fair division of the stake in an interrupted game of chance.
Life Is a Random Variable
Results, in general, are random in nature; some could be in our favor and some not. Process results do not remain favorable all the time, nor do they remain unfavorable all the time. Results toggle between favor and disfavor, randomly.
The measure of the probability of an event is the ratio of the number of cases favorable to that event, to the total number of cases.
Pierre-Simon Laplace
The discovery of probability goes back to Renaissance times (see Box 11.1). A process that toggles between favor and disfavor is called a Bernoulli process, named after its inventor. Mathematically, a Bernoulli process takes randomly only two values, 1 and 0. Repeatedly flipping a coin is a Bernoulli process; we get a head or a tail, success or failure, "1 or 0." Every toss is a Bernoulli experiment. The Bernoulli random variable was invented by Jacob Bernoulli, a Swiss mathematician (see Box 11.2 for a short biography).
Results from trials converge to the "expected value" as the number of trials increases. For an unbiased coin, the expected value of the probability of success (the probability of heads appearing) is 0.5. The larger the number of trials, the closer the observed proportion of successes comes to the probability of success. This is known as the law of large numbers. Using this law, we can predict stable long-term behavior. It took Bernoulli more than 20 years to develop a sufficiently rigorous mathematical proof. He named this his golden theorem, but it became generally known as Bernoulli's theorem. The theorem was applied to predict how much one would expect to win playing various games of chance.
BOX 11.1 (continued)

Fermat and Pascal helped lay the fundamental groundwork for the theory of probability. From this brief but productive collaboration on the problem of points, they are now regarded as joint founders of probability theory. Fermat is credited with carrying out the first ever rigorous probability calculation. In it, a professional gambler asked him why, if he bet on rolling at least one six in four throws of a die, he won in the long term, whereas betting on throwing at least one double-six in 24 throws of two dice resulted in his losing. Fermat subsequently proved mathematically why this was the case (http://en.wikipedia.org/wiki/Problem_of_points).

Christiaan Huygens (1657) gave a comprehensive treatment of the subject. Jacob Bernoulli's Ars Conjectandi (posthumous, 1713) put probability on a sound mathematical footing, showing how to calculate a wide range of complex probabilities.
Given sufficient data from real-life events, we can arrive at a probability of success (p) and trust that the future can be predicted based on it.
Prediction means estimating two values: the mean and the variance (which denote central tendency and dispersion, respectively).
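As a quick illustration of the law of large numbers at work, here is a minimal Python sketch (the 0.5 success probability and the trial counts are made-up illustrative values, not data from the book): the running proportion of successes in a simulated Bernoulli process settles toward the underlying probability of success as trials accumulate, which is why an estimate of p drawn from sufficient data can be trusted.

```python
import random

random.seed(42)          # reproducible illustration
true_p = 0.5             # assumed probability of success (an unbiased coin)

successes = 0
checkpoints = {10, 100, 1_000, 10_000, 100_000}
for trial in range(1, 100_001):
    successes += random.random() < true_p   # one Bernoulli trial: 1 = success, 0 = failure
    if trial in checkpoints:
        print(f"{trial:>7} trials: observed proportion = {successes / trial:.4f}")
```

The observed proportion wanders for small samples and settles near 0.5 as trials accumulate, which is exactly the stable long-term behavior the law promises.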
In this chapter, we consider four distributions, each describing the dispersion pattern in a different way:
1. Binomial distribution
The probability of getting exactly k successes in n trials is given by the following binomial expression:

P(X = k) = C(n, k) p^k (1 − p)^(n − k)  (11.1)

where n is the number of trials, p is the probability of success (the same for each trial), and k is the number of successes observed in n trials. The mean and variance are calculated as follows:
Mean = np (11.2)
Variance = np(1 − p) (11.3)
Equation 11.1 is a paradigm for a wide range of contexts. In service management, success is replaced by arrival, and the Bernoulli process is called an arrival-type process. In software development processes, we prefer to use the term success. The coefficient C(n, k) is the binomial coefficient, hence the name binomial distribution.
Software development processes may consist of two components:
a. An inherent Bernoulli component that complies with the law of large
numbers
b. Influences from spurious noise factors
The Bernoulli distribution is used in statistical process control. The spurious noise factors must be identified, analyzed, and eliminated. For example, in service-level agreement (SLA) compliance data, one may find both these components. If the process is restricted to the Bernoulli type, the process is said to be under statistical control. (Shewhart called this variation due to "common causes" and ascribed spurious influences to "special causes.")
Equation 11.1 is used in the quality control of discrete events.
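As a minimal sketch of Equations 11.1 through 11.3 in Python (the function names are ours, chosen for illustration), using the standard library's binomial coefficient:

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Equation 11.1: probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_mean(n: int, p: float) -> float:
    """Equation 11.2: mean of the binomial distribution."""
    return n * p

def binomial_variance(n: int, p: float) -> float:
    """Equation 11.3: variance of the binomial distribution."""
    return n * p * (1 - p)

# A small illustrative check (made-up numbers): 10 trials with p = 0.5
print(binomial_pmf(5, 10, 0.5))      # ≈ 0.2461
print(binomial_mean(10, 0.5))        # 5.0
print(binomial_variance(10, 0.5))    # 2.5
```

The same computation is available in spreadsheet form through the Excel function BINOM.DIST, used in Example 11.1 below.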
Example 11.1: Binomial Distribution of SLA Compliance
QUESTION
From the previous year's deliveries, it has been estimated that the probability of meeting the SLA in an enhancement project is 90%. Find the probability of meeting the SLA in at least 10 of the 120 deliveries scheduled in the current year. Plot the related binomial distribution.
BOX 11.2 JACOB BERNOULLI (1654–1705)
Nature always tends to act in the simplest way.
Jacob Bernoulli
Jacob Bernoulli gave a mathematical footing to the theory of probability. The term Bernoulli process is named after him. A well-known name in the world of mathematics, the Bernoulli family has been known for their advancement of mathematics. Originally from the Netherlands, Nicolaus Bernoulli, Jacob's father, moved his spice business to Basel, Switzerland. Jacob graduated from the University of Basel with a master's degree in philosophy in 1671 and a master's degree in theology in 1676. While working toward his master's degrees, he also studied mathematics and astronomy. In 1681, Jacob Bernoulli met the mathematician Hudde. Bernoulli continued to study mathematics and met world-renowned scientists such as Boyle and Hooke.
Jacob Bernoulli saw the power of calculus and is known as one of the
fathers of calculus. He also wrote a book called Ars Conjectandi, published in
1713 (8 years after his death).
Bernoulli built upon Cardano's idea of the law of large numbers. He asserted that if a repeatable experiment had a theoretical probability p of turning out in a certain "favorable" way, then for any specified margin of error, the ratio of favorable to total outcomes over some (large) number of repeated trials of that experiment would be within that margin of error of p. By this principle, observational data can be used to estimate the probability of events in real-world situations. This is what is now known as the law of large numbers. Interestingly, when he wrote the book, he named this idea the Golden theorem.
Bernoulli received several honors. One of them was a lunar crater named after him. In Paris, there is a street named after the Bernoulli family: the Rue Bernoulli.
ANSWER
You can solve Equation 11.1 by keeping p = 0.9, n = 120, and k = 10. This will give the probability of exactly 10 deliveries meeting the SLA; we are assessing the chance of getting exactly 10 successes. Alternatively, use the MS Excel function BINOM.DIST.
The answer is practically zero; the number is too small, 4.04705E-97. The probability of meeting the SLA in at least 10 deliveries, by contrast, is obtained by summing such terms from k = 10 to 120 (or as 1 minus the cumulative form of BINOM.DIST at k = 9); with a mean of 108 compliant deliveries expected, that probability is practically 1.
Figure 11.1 shows the binomial probability distribution of SLA compliance. This may be taken as a process model. One can notice the upper and lower boundaries of the density function, approximately from 91 to 118 successes; one can also note the central tendency, which is exactly the mean.

Figure 11.1 Binomial probability of SLA compliance (p = 0.9, n = 120).
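For readers who prefer code to a spreadsheet, here is a hedged cross-check of Example 11.1 in Python (assuming the scipy library is available; the figures match the Excel route described above):

```python
from scipy.stats import binom

n, p, k = 120, 0.9, 10

print(binom.pmf(k, n, p))      # P(exactly 10 compliant deliveries) ≈ 4.05e-97
print(binom.sf(k - 1, n, p))   # P(at least 10 compliant deliveries) ≈ 1.0
print(binom.mean(n, p))        # central tendency: 108 deliveries
print(binom.std(n, p))         # spread: ≈ 3.29 deliveries

# Plotting binom.pmf(range(n + 1), n, p) against k reproduces the shape of Figure 11.1.
```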
BOX 11.3 TESTING RELIABILITY USING NEGATIVE BINOMIAL DISTRIBUTION

To test reliability, we randomly select and run test cases covering usage. Executing a complete test library is costly, so we resort to sampling. We can choose inverse sampling: choose and execute test cases randomly until a preset number of defects is found (an unacceptable defect level). If this level is reached, the software is rejected. Using regular sampling under the binomial distribution, we can do an acceptance test, but we might need to execute a significantly larger number of test cases to arrive at an equivalent decision. Inverse sampling under the negative binomial distribution (NBD) is more efficient in user acceptance testing.

With this information, we can construct the negative binomial distribution of defects. The salient overall point of the comparison is that, unless the software is nearly perfect, the negative binomial mode of sampling brings about large reductions in the average number of executions over the binomial mode of sampling for identical false rejection and false acceptance risks [1].
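A rough simulation of the inverse-sampling scheme described in Box 11.3 (a sketch only; the 5% per-test-case defect probability and the rejection threshold of 3 defects are made-up illustrative values):

```python
import random

def executions_until_r_defects(p_defect: float, r: int, max_tests: int = 10_000) -> int:
    """Inverse sampling: execute randomly chosen test cases until r defects are found.

    Returns the number of executions needed (capped at max_tests)."""
    defects = 0
    for executed in range(1, max_tests + 1):
        if random.random() < p_defect:   # this test case happens to expose a defect
            defects += 1
            if defects == r:
                return executed
    return max_tests

random.seed(1)
runs = [executions_until_r_defects(p_defect=0.05, r=3) for _ in range(10_000)]
print(sum(runs) / len(runs))   # on average close to r / p = 3 / 0.05 = 60 executions
```

On average, roughly r/p executions are needed to reach the rejection threshold; the box's point is that this stopping rule typically requires far fewer executions than a fixed binomial sample offering the same false rejection and false acceptance risks.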