15

Blood Testing

15.1 THE PROBLEM

Imagine that a very large number of people, N, are to each have their blood tested for a possible dangerous infection. Think, for example, of the millions of men inducted into the armed services during World War II, and the then not-so-uncommon sexually transmitted infection of syphilis. The obvious way to do this is simply to test each person, for a total of N tests. During World War II a different, quite clever alternative approach was actually used, based on the fact that while syphilis wasn’t uncommon, most men were actually not infected.

The underlying idea was simple: for some number k, pool the blood samples from k men and test the mixture. If all k men are free of infection, then the test will be negative, and one test clears all k men at once. If the test is positive, however, at least one man is infected, and so then all of them are individually tested, for a total of k + 1 tests. Taking the probability of being infected as p, this leads to the following two questions: (1) What is the value of k that minimizes the number of expected tests? (2) For that value of k, what is the expected number of tests (and how does it compare with N, the number of tests if each man is individually tested)?

15.2 THEORETICAL ANALYSIS

Since p is the probability a man will test positive, then 1 − p is the probability he will test negative. Thus, (1 − p)k is the probability a mix of the blood samples from k people will test negative, and 1 − (1 − p)k is the probability the test will be positive (and so k additional, individual tests will need to be done). So, what we have is

1 test with probability (1 − p)k

and

k + 1 tests with probability 1 − (1 − p)k

for each blood mixture formed from k people. Since N people will create N/k blood pools, then, if T (a random variable) is the total number of tests required for N people, the average value of T is

Images

or,

Images

To find that k that minimizes E(T), we’ll next solve the equation d E(T)/dk = 0, a calculation that is valid if k is a continuous variable—which of course it isn’t. But I’ll go ahead with the calculation anyway, as the objection is more a purist’s one than anything else. Since

(1 − p)k = eln{(1 − p)k} = ekln(1 − p),

we have

Images

or

Images

For the case of p Ü 1 (during the years leading up to World War II a typical value of p for syphilis was something like 0.001 to 0.01), and we can use approximations valid for this situation:

(1 − p)k ≈ 1 − kp

and

ln(1 − p) ≈ −p.

So,

Images

or

Images

Now, since p Ü 1, then p2 ÜÜ 1, and so we can drop the p2k term. This gives us the answer to our first question:

Images

For example, if p = 0.01, then k = 10, while if p = 0.001, then k = 32.

To answer the second question, we’ll put Images into the expression for E(T) to get

Images

or, using the approximation

Images

we have

Images

So, if p = 0.01, then E(T) = 0.2N = N/5, while if p = 0.001, then E(T) = 0.063N = N/16. These are dramatic reductions in the expected number of tests required.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset