Bootstrap resampling
consists of three basic steps: 1) resampling, 2) replication of analyses,
and 3) summarizing the results of the analyses (e.g., via a CI). There
are many good references on bootstrap and other resampling techniques.
The brief overview here is not meant to be exhaustive, but rather
to give enough information for you to understand the rest of the chapter.
For more information about these methods, please see Davison &
Hinkley (1997).
The bootstrapping process
begins by resampling from your original sample. You take your existing
sample (say, of 50 participants) and randomly draw (with replacement)
a number of related samples, each of N=50, from those original
50 participants. The procedure is called “resampling” because
it treats the original sample as fodder for an unlimited number of
new samples. By resampling with replacement, we can get three copies
of the 14th person in the sample, none of the 15th, and one copy of
the 16th person. Perhaps in the next sample there will be one copy
of both the 14th and 15th persons, but none of the 16th. Thus, the
samples are related, in that they all derive from the same master
sample, but they are not identical, because each individual can appear
once, several times, or not at all in any given resample.
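
To make the idea of sampling with replacement concrete, here is a minimal sketch in Python. It assumes NumPy is available; the function name draw_resample and the seed are our own illustrative choices, not part of any particular package. The counts array shows how often each of the 50 original participants turns up in one resample: some appear several times, others not at all.

```python
import numpy as np

def draw_resample(n, rng):
    """Draw one bootstrap resample: n row indices sampled with replacement."""
    return rng.integers(0, n, size=n)

rng = np.random.default_rng(seed=1)  # arbitrary seed, only for repeatability
n = 50                               # size of the original sample

idx = draw_resample(n, rng)

# Number of times each original participant appears in this resample.
# Zeros mark participants who were left out; values above 1 mark repeats.
counts = np.bincount(idx, minlength=n)
print(counts)
```

Calling draw_resample again gives a different set of indices, which is what makes the resamples related but not identical.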
Next, the analysis of
interest is repeated in each sample. As with the replication analyses
discussed in the previous chapter, it is important that every step
of the analysis is repeated exactly in each resample. This will produce
separate results for each resample. The distribution of a statistic
across the resamples is known as the bootstrap distribution. The idea
behind this method is that the resamples can be viewed as thousands
of potential samples from the population (Thompson, 2004). Together,
the estimates from the resamples represent the possible range of the
estimates in the population. The average estimate in the bootstrap
distribution is a rough approximation of the estimate in the population.
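
The replication step can be sketched as a simple loop. The example below is only an illustration under stated assumptions: the data are simulated, the statistic (a Pearson correlation) stands in for whatever analysis you actually care about, and the names bootstrap_distribution and correlation are hypothetical helpers we made up for this sketch.

```python
import numpy as np

def bootstrap_distribution(data, statistic, n_resamples, rng):
    """Apply the same statistic to each of n_resamples bootstrap resamples."""
    n = data.shape[0]
    estimates = np.empty(n_resamples)
    for b in range(n_resamples):
        idx = rng.integers(0, n, size=n)    # resample rows with replacement
        estimates[b] = statistic(data[idx])  # identical analysis every time
    return estimates

def correlation(sample):
    """Pearson correlation between the two columns of the sample."""
    return np.corrcoef(sample[:, 0], sample[:, 1])[0, 1]

# Simulated stand-in for an original sample of 50 participants
rng = np.random.default_rng(seed=2)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(size=50)
data = np.column_stack([x, y])

boot = bootstrap_distribution(data, correlation, n_resamples=2000, rng=rng)
print("estimate in original sample:", correlation(data))
print("mean of bootstrap distribution:", boot.mean())
```

The array boot holds one estimate per resample; taken together, these values are the bootstrap distribution described above.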
Finally, the analyses
must be summarized. A 95% CI can be calculated from the bootstrap
distribution. The easiest way to do this is to identify the values
at the 2.5th and 97.5th percentiles of the distribution. This is known
as the percentile interval method of estimating a CI. Other methods
exist to estimate bootstrapped CIs, some of which might be more robust
to bias. Please see Davison & Hinkley (1997) for more about these
methods.
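
A percentile interval takes only a few lines to compute. The sketch below assumes NumPy, uses simulated scores, and bootstraps a simple mean purely for illustration; percentile_interval is a hypothetical helper name, and the 2,000 resamples are an arbitrary choice.

```python
import numpy as np

def percentile_interval(boot_estimates, level=0.95):
    """Percentile-method CI: cut off equal tails of the bootstrap distribution."""
    alpha = (1.0 - level) / 2.0  # 0.025 in each tail for a 95% CI
    lower, upper = np.quantile(boot_estimates, [alpha, 1.0 - alpha])
    return float(lower), float(upper)

# Simulated stand-in for an original sample of 50 scores
rng = np.random.default_rng(seed=3)
sample = rng.normal(loc=100, scale=15, size=50)

# Bootstrap distribution of the mean
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(2000)
])

ci = percentile_interval(boot_means)
print("sample mean:", sample.mean())
print("95% percentile interval:", ci)
```

Changing the level argument gives other intervals (for example, level=0.90 uses the 5th and 95th percentiles).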
Most scholars familiar
with bootstrap resampling will agree with what we have said thus far—bootstrap
resampling is beneficial for estimating CI—but they will likely
stop agreeing at this point. There is a wide range of opinions on
what bootstrap resampling is good for and what it is not good for.
You will get our opinion on that in this chapter, but be aware that
there are strong passions around this issue (much like principal components
analysis).