Chapter 10. Power analysis

 

This chapter covers

  • Determining sample size requirements
  • Calculating effect sizes
  • Assessing statistical power

 

As a statistical consultant, I am often asked the question, “How many subjects do I need for my study?” Sometimes the question is phrased this way: “I have x number of people available for this study. Is the study worth doing?” Questions like these can be answered through power analysis, an important set of techniques in experimental design.

Power analysis allows you to determine the sample size required to detect an effect of a given size with a given degree of confidence. Conversely, it allows you to determine the probability of detecting an effect of a given size with a given level of confidence, under sample size constraints. If the probability is unacceptably low, you’d be wise to alter or abandon the experiment.

In this chapter, you’ll learn how to conduct power analyses for a variety of statistical tests, including tests of proportions, t-tests, chi-square tests, balanced one-way ANOVA, tests of correlations, and linear models. Because power analysis applies to hypothesis testing situations, we’ll start with a brief review of null hypothesis significance testing (NHST). Then we’ll review conducting power analyses within R, focusing primarily on the pwr package. Finally, we’ll consider other approaches to power analysis available with R.

10.1. A quick review of hypothesis testing

To help you understand the steps in a power analysis, we’ll briefly review statistical hypothesis testing in general. If you have a statistical background, feel free to skip to section 10.2.

In statistical hypothesis testing, you specify a hypothesis about a population parameter (your null hypothesis, or H0). You then draw a sample from this population and calculate a statistic that’s used to make inferences about the population parameter. Assuming that the null hypothesis is true, you calculate the probability of obtaining the observed sample statistic or one more extreme. If the probability is sufficiently small, you reject the null hypothesis in favor of its opposite (referred to as the alternative or research hypothesis, H1).

An example will clarify the process. Say you’re interested in evaluating the impact of cell phone use on driver reaction time. Your null hypothesis is H0: µ1 – µ2 = 0, where µ1 is the mean response time for drivers using a cell phone and µ2 is the mean response time for drivers who are cell phone free (here, µ1 – µ2 is the population parameter of interest). If you reject this null hypothesis, you’re left with the alternate or research hypothesis, namely H1: µ1 – µ2 ≠ 0. This is equivalent to µ1 ≠ µ2: the mean reaction times for the two conditions are not equal.

A sample of individuals is selected and randomly assigned to one of two conditions. In the first condition, participants react to a series of driving challenges in a simulator while talking on a cell phone. In the second condition, participants complete the same series of challenges but without a cell phone. Overall reaction time is assessed for each individual.

Based on the sample data, you can calculate the statistic

    t = (x̄1 − x̄2) / (s √(2/n))

where x̄1 and x̄2 are the sample reaction time means in the two conditions, s is the pooled sample standard deviation, and n is the number of participants in each condition. If the null hypothesis is true and you can assume that reaction times are normally distributed, this sample statistic will follow a t distribution with 2n − 2 degrees of freedom. Using this fact, you can calculate the probability of obtaining a sample statistic this large or larger. If the probability (p) is smaller than some predetermined cutoff (say p < .05), you reject the null hypothesis in favor of the alternate hypothesis. This predetermined cutoff (0.05) is called the significance level of the test.
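
To make the process concrete, here’s a small simulated version of this experiment; the group means, standard deviation, and sample sizes are invented for illustration:

set.seed(1234)                              # for reproducibility
phone   <- rnorm(20, mean=5.5, sd=1.25)     # simulated reaction times, cell phone group
control <- rnorm(20, mean=4.5, sd=1.25)     # simulated reaction times, control group
t.test(phone, control, var.equal=TRUE)      # pooled two-sample t-test, 2n-2 = 38 df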

Note that you use sample data to make an inference about the population it’s drawn from. Your null hypothesis is that the mean reaction time of all drivers talking on cell phones isn’t different from the mean reaction time of all drivers who aren’t talking on cell phones, not just those drivers in your sample. The four possible outcomes from your decision are as follows:

  • If the null hypothesis is false and the statistical test leads us to reject it, you’ve made a correct decision. You’ve correctly determined that reaction time is affected by cell phone use.
  • If the null hypothesis is true and you don’t reject it, again you’ve made a correct decision. Reaction time isn’t affected by cell phone use.
  • If the null hypothesis is true but you reject it, you’ve committed a Type I error. You’ve concluded that cell phone use affects reaction time when it doesn’t.
  • If the null hypothesis is false and you fail to reject it, you’ve committed a Type II error. Cell phone use affects reaction time, but you’ve failed to discern this.

Each of these outcomes is illustrated in the following table.

                     Reject H0            Fail to reject H0

H0 true              Type I error         Correct decision
H0 false             Correct decision     Type II error

Controversy surrounding null hypothesis significance testing

Null hypothesis significance testing is not without controversy, and detractors have raised numerous concerns about the approach, particularly as practiced in the field of psychology. They point to a widespread misunderstanding of p values, reliance on statistical significance over practical significance, the fact that the null hypothesis is never exactly true and will always be rejected for sufficiently large sample sizes, and a number of logical inconsistencies in NHST practices.

An in-depth discussion of this topic is beyond the scope of this book. Interested readers are referred to Harlow, Mulaik, and Steiger (1997).

 

In planning research, the researcher typically pays special attention to four quantities: sample size, significance level, power, and effect size (see figure 10.1).

Figure 10.1. Four primary quantities considered in a study design power analysis. Given any three, you can calculate the fourth.

Specifically:

  • Sample size refers to the number of observations in each condition/group of the experimental design.
  • The significance level (also referred to as alpha) is defined as the probability of making a Type I error. The significance level can also be thought of as the probability of finding an effect that is not there.
  • Power is defined as one minus the probability of making a Type II error. Power can be thought of as the probability of finding an effect that is there.
  • Effect size is the magnitude of the effect under the alternate or research hypothesis. The formula for effect size depends on the statistical methodology employed in the hypothesis testing.

Although the sample size and significance level are under the direct control of the researcher, power and effect size are affected more indirectly. For example, as you relax the significance level (in other words, make it easier to reject the null hypothesis), power increases. Similarly, increasing the sample size increases power.

Your research goal is typically to maximize the power of your statistical tests while maintaining an acceptable significance level and employing as small a sample size as possible. That is, you want to maximize the chances of finding a real effect and minimize the chances of finding an effect that isn’t really there, while keeping study costs within reason.

The four quantities (sample size, significance level, power, and effect size) have an intimate relationship. Given any three, you can determine the fourth. We’ll use this fact to carry out various power analyses throughout the remainder of the chapter. In the next section, we’ll look at ways of implementing power analyses using the R package pwr. Later, we’ll briefly look at some highly specialized power functions that are used in biology and genetics.

10.2. Implementing power analysis with the pwr package

The pwr package, developed by Stéphane Champely, implements power analysis as outlined by Cohen (1988). Some of the more important functions are listed in table 10.1. For each function, the user can specify three of the four quantities (sample size, significance level, power, effect size) and the fourth will be calculated.

Table 10.1. pwr package functions

Function            Power calculations for

pwr.2p.test()       Two proportions (equal n)
pwr.2p2n.test()     Two proportions (unequal n)
pwr.anova.test()    Balanced one-way ANOVA
pwr.chisq.test()    Chi-square test
pwr.f2.test()       General linear model
pwr.p.test()        Proportion (one sample)
pwr.r.test()        Correlation
pwr.t.test()        t-tests (one sample, two sample, paired)
pwr.t2n.test()      t-test (two samples with unequal n)

Of the four quantities, effect size is often the most difficult to specify. Calculating effect size typically requires some experience with the measures involved and knowledge of past research. But what can you do if you have no clue what effect size to expect in a given study? You’ll look at this difficult question in section 10.2.7. In the remainder of this section, you’ll look at the application of pwr functions to common statistical tests. Before invoking these functions, be sure to install and load the pwr package.
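
If the package isn’t already on your machine, it can be installed from CRAN and loaded like this:

install.packages("pwr")    # one-time installation
library(pwr)               # load for the current session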

10.2.1. t-tests

When the statistical test to be used is a t-test, the pwr.t.test() function provides a number of useful power analysis options. The format is

pwr.t.test(n=, d=, sig.level=, power=, type=, alternative=)

where

  • n is the sample size.
  • d is the effect size, defined as the standardized mean difference

        d = (µ1 − µ2) / σ

    where µ1 = mean of group 1
          µ2 = mean of group 2
          σ² = common error variance (so σ is the common standard deviation)
  • sig.level is the significance level (0.05 is the default).
  • power is the power level.
  • type indicates a two-sample t-test ("two.sample"), a one-sample t-test ("one.sample"), or a dependent sample t-test ("paired"). A two-sample test is the default.
  • alternative indicates whether the statistical test is two-sided ("two.sided") or one-sided ("less" or "greater"). A two-sided test is the default.

Let’s work through an example. Continuing the cell phone use and driving reaction time experiment from section 10.1, assume that you’ll be using a two-tailed independent sample t-test to compare the mean reaction time for participants in the cell phone condition with the mean reaction time for participants driving unencumbered.

Let’s assume that you know from past experience that reaction time has a standard deviation of 1.25 seconds. Also suppose that a 1-second difference in reaction time is considered an important difference. You’d therefore like to conduct a study in which you’re able to detect an effect size of d = 1/1.25 = 0.8 or larger. Additionally, you want to be 90 percent sure to detect such a difference if it exists, and 95 percent sure that you won’t declare a difference to be significant when it’s actually due to random variability. How many participants will you need in your study?

Entering this information in the pwr.t.test() function, you have the following:

> library(pwr)
> pwr.t.test(d=.8, sig.level=.05, power=.9, type="two.sample",
   alternative="two.sided")

     Two-sample t test power calculation

              n = 34
              d = 0.8
      sig.level = 0.05
          power = 0.9
    alternative = two.sided

 NOTE: n is number in *each* group

The results suggest that you need 34 participants in each group (for a total of 68 participants) in order to detect an effect size of 0.8 with 90 percent certainty and no more than a 5 percent chance of erroneously concluding that a difference exists when, in fact, it doesn’t.

Let’s alter the question. Assume that in comparing the two conditions you want to be able to detect a 0.5 standard deviation difference in population means. You want to limit the chances of falsely declaring the population means to be different to 1 out of 100. Additionally, you can only afford to include 40 participants in the study. What’s the probability that you’ll be able to detect a difference between the population means that’s this large, given the constraints outlined?

Assuming that an equal number of participants will be placed in each condition, you have

> pwr.t.test(n=20, d=.5, sig.level=.01, type="two.sample",
  alternative="two.sided")

     Two-sample t test power calculation

              n = 20
              d = 0.5
      sig.level = 0.01
          power = 0.14
    alternative = two.sided

 NOTE: n is number in *each* group

With 20 participants in each group, an a priori significance level of 0.01, and a dependent variable standard deviation of 1.25 seconds, you have only about a 14 percent chance of declaring a difference of 0.625 seconds or less significant (d = 0.5 = 0.625/1.25). Conversely, there’s an 86 percent chance that you’ll miss the effect you’re looking for. You may want to seriously rethink putting the time and effort into the study as it stands.

The previous examples assumed that there are equal sample sizes in the two groups. If the sample sizes for the two groups are unequal, the function

pwr.t2n.test(n1=, n2=, d=, sig.level=, power=, alternative=)

can be used. Here, n1 and n2 are the sample sizes, and the other parameters are the same as for pwr.t.test(). Try varying the values input to the pwr.t2n.test() function and note the effect on the output.
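
For example, a minimal sketch with arbitrary unequal group sizes (n1 = 10 and n2 = 30 are illustrative values), solving for power:

pwr.t2n.test(n1=10, n2=30, d=.5, sig.level=.01,
             alternative="two.sided")    # power is returned for these constraints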

10.2.2. ANOVA

The pwr.anova.test() function provides power analysis options for a balanced one-way analysis of variance. The format is

pwr.anova.test(k=, n=, f=, sig.level=, power=)

where k is the number of groups and n is the common sample size in each group.

For a one-way ANOVA, effect size is measured by f, where

    f = √( Σ pi × (µi − µ)² / σ² )

where pi = ni/N
      ni = number of observations in group i
      N  = total number of observations
      µi = mean of group i
      µ  = grand mean
      σ² = error variance within groups

and the summation runs from i = 1 to k, the number of groups.

Let’s try an example. For a one-way ANOVA comparing five groups, calculate the sample size needed in each group to obtain a power of 0.80, when the effect size is 0.25 and a significance level of 0.05 is employed. The code looks like this:

> pwr.anova.test(k=5, f=.25, sig.level=.05, power=.8)

     Balanced one-way analysis of variance power calculation

              k = 5
              n = 39
              f = 0.25
      sig.level = 0.05
          power = 0.8

 NOTE: n is number in each group

The total sample size is therefore 5 × 39, or 195. Note that this example requires you to estimate what the means of the five groups will be, along with the common variance. When you have no idea what to expect, the approaches described in section 10.2.7 may help.
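
As a sketch of that estimation step, here the five group means and the within-group standard deviation are made-up values chosen for illustration:

means <- c(4.0, 4.5, 5.0, 5.5, 6.0)    # hypothesized group means (illustrative)
sigma <- 2.5                           # guessed within-group standard deviation
p     <- rep(1/5, 5)                   # equal group sizes, so pi = 1/5
grand <- sum(p * means)                # grand mean
f <- sqrt(sum(p * (means - grand)^2) / sigma^2)
f                                      # plug this value into pwr.anova.test()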

10.2.3. Correlations

The pwr.r.test() function provides a power analysis for tests of correlation coefficients. The format is as follows:

pwr.r.test(n=, r=, sig.level=, power=, alternative=)

where n is the number of observations, r is the effect size (as measured by a linear correlation coefficient), sig.level is the significance level, power is the power level, and alternative specifies a two-sided ("two.sided") or a one-sided ("less" or "greater") significance test.

For example, let’s assume that you’re studying the relationship between depression and loneliness. Your null and research hypotheses are

H0: ρ ≤ 0.25 versus H1: ρ > 0.25

where ρ is the population correlation between these two psychological variables. You’ve set your significance level to 0.05 and you want to be 90 percent confident that you’ll reject H0 if it’s false. How many observations will you need? This code provides the answer:

> pwr.r.test(r=.25, sig.level=.05, power=.90, alternative="greater")

     approximate correlation power calculation (arctangh transformation)

              n = 134
              r = 0.25
      sig.level = 0.05
          power = 0.9
    alternative = greater

Thus, you need to assess depression and loneliness in 134 participants in order to be 90 percent confident that you’ll reject the null hypothesis if it’s false.
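
Had the number of participants been fixed in advance instead (say n = 50, an assumed constraint), the same function would solve for power:

pwr.r.test(n=50, r=.25, sig.level=.05,
           alternative="greater")    # omitting power= returns the achievable power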

10.2.4. Linear models

For linear models (such as multiple regression), the pwr.f2.test() function can be used to carry out a power analysis. The format is

pwr.f2.test(u=, v=, f2=, sig.level=, power=)

where u and v are the numerator and denominator degrees of freedom and f2 is the effect size, defined as either

    f2 = R² / (1 − R²)

where R² = population squared multiple correlation, or as

    f2 = (R²AB − R²A) / (1 − R²AB)

where R²A  = variance accounted for in the population by variable set A
      R²AB = variance accounted for in the population by variable sets A and B together

The first formula for f2 is appropriate when you’re evaluating the impact of a set of predictors on an outcome. The second formula is appropriate when you’re evaluating the impact of one set of predictors above and beyond a second set of predictors (or covariates).

Let’s say you’re interested in whether a boss’s leadership style impacts workers’ satisfaction above and beyond the salary and perks associated with the job. Leadership style is assessed by four variables, and salary and perks are associated with three variables. Past experience suggests that salary and perks account for roughly 30 percent of the variance in worker satisfaction. From a practical standpoint, it would be interesting if leadership style accounted for at least 5 percent above this figure. Assuming a significance level of 0.05, how many subjects would be needed to identify such a contribution with 90 percent confidence?

Here, sig.level=0.05, power=0.90, u=3 (total number of predictors minus the number of predictors in set B), and the effect size is f2 = (.35-.30)/(1-.35) = 0.0769. Entering this into the function yields the following:

> pwr.f2.test(u=3, f2=0.0769, sig.level=0.05, power=0.90)

     Multiple regression power calculation

              u = 3
              v = 184.2426
             f2 = 0.0769
      sig.level = 0.05
          power = 0.9

In multiple regression, the denominator degrees of freedom equals N − k − 1, where N is the number of observations and k is the number of predictors. Here, k = 7 and v rounds up to 185, so the required sample size is N = 185 + 7 + 1 = 193.
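
That arithmetic can be scripted directly; the 7 below is the total number of predictors in this example:

result <- pwr.f2.test(u=3, f2=0.0769, sig.level=0.05, power=0.90)
ceiling(result$v) + 7 + 1    # N = v + k + 1 = 193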

10.2.5. Tests of proportions

The pwr.2p.test() function can be used to perform a power analysis when comparing two proportions. The format is

pwr.2p.test(h=, n=, sig.level=, power=)

where h is the effect size and n is the common sample size in each group. The effect size h is defined as

    h = 2 arcsin(√p1) − 2 arcsin(√p2)

and can be calculated with the function ES.h(p1, p2).

For unequal ns the desired function is

pwr.2p2n.test(h =, n1 =, n2 =, sig.level=, power=).

The alternative= option can be used to specify a two-tailed ("two.sided") or one-tailed ("less" or "greater") test. A two-tailed test is the default.

Let’s say that you suspect that a popular medication relieves symptoms in 60 percent of users. A new (and more expensive) medication will be marketed if it improves symptoms in 65 percent of users. How many participants will you need to include in a study comparing these two medications if you want to detect a difference this large?

Assume that you want to be 90 percent confident in a conclusion that the new drug is better and 95 percent confident that you won’t reach this conclusion erroneously. You’ll use a one-tailed test because you’re only interested in assessing whether the new drug is better than the standard. The code looks like this:

> pwr.2p.test(h=ES.h(.65, .6), sig.level=.05, power=.9,
              alternative="greater")

     Difference of proportion power calculation for binomial
     distribution (arcsine transformation)

              h = 0.1033347
              n = 1604.007
      sig.level = 0.05
          power = 0.9
    alternative = greater

 NOTE: same sample sizes

Based on these results, you’ll need to conduct a study with 1,605 individuals receiving the new drug and 1,605 receiving the existing drug in order to meet the criteria.
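
If the two arms can’t be the same size (say only 1,500 patients can receive the existing drug, an assumed constraint), pwr.2p2n.test() will solve for the size of the other group:

pwr.2p2n.test(h=ES.h(.65, .6), n1=1500, sig.level=.05, power=.9,
              alternative="greater")    # omitting n2= returns the required n2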

10.2.6. Chi-square tests

Chi-square tests are often used to assess the relationship between two categorical variables. The null hypothesis is typically that the variables are independent versus a research hypothesis that they aren’t. The pwr.chisq.test() function can be used to evaluate the power, effect size, or requisite sample size when employing a chi-square test. The format is

pwr.chisq.test(w =, N = , df = , sig.level =, power =)

where w is the effect size, N is the total sample size, and df is the degrees of freedom. Here, effect size w is defined as

    w = √( Σ (p0i − p1i)² / p0i )

where p0i = cell probability in the ith cell under H0
      p1i = cell probability in the ith cell under H1

The summation goes from 1 to m, where m is the number of cells in the contingency table. The function ES.w2(P) can be used to calculate the effect size corresponding to the alternative hypothesis in a two-way contingency table. Here, P is a hypothesized two-way probability table.

As a simple example, let’s assume that you’re looking at the relationship between ethnicity and promotion. You anticipate that 70 percent of your sample will be Caucasian, 10 percent will be African American, and 20 percent will be Hispanic. Further, you believe that 60 percent of Caucasians tend to be promoted, compared with 30 percent for African Americans, and 50 percent for Hispanics. Your research hypothesis is that the probability of promotion follows the values in table 10.2.

Table 10.2. Proportion of individuals expected to be promoted based on the research hypothesis

Ethnicity           Promoted    Not promoted

Caucasian           0.42        0.28
African American    0.03        0.07
Hispanic            0.10        0.10

For example, you expect that 42 percent of the population will be promoted Caucasians (.42 = .70 × .60) and 7 percent of the population will be nonpromoted African Americans (.07 = .10 × .70). Let’s assume a significance level of 0.05 and the desired power level is 0.90. The degrees of freedom in a two-way contingency table are (r-1)*(c-1), where r is the number of rows and c is the number of columns. You can calculate the hypothesized effect size with the following code:

> prob <- matrix(c(.42, .28, .03, .07, .10, .10), byrow=TRUE, nrow=3)
> ES.w2(prob)

[1] 0.1853198

Using this information, you can calculate the necessary sample size like this:

> pwr.chisq.test(w=.1853, df=2, sig.level=.05, power=.9)

     Chi squared power calculation

              w = 0.1853
              N = 368.5317
             df = 2
      sig.level = 0.05
          power = 0.9

 NOTE: N is the number of observations

The results suggest that a study with 369 participants will be adequate to detect a relationship between ethnicity and promotion given the effect size, power, and significance level specified.
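
Conversely, if only a smaller sample were available (N = 200 is an arbitrary number for illustration), you could solve for the power you’d have instead:

pwr.chisq.test(w=.1853, N=200, df=2, sig.level=.05)    # omitting power= returns power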

10.2.7. Choosing an appropriate effect size in novel situations

In power analysis, the expected effect size is the most difficult parameter to determine. It typically requires that you have experience with the subject matter and the measures employed. For example, the data from past studies can be used to calculate effect sizes, which can then be used to plan future studies.

But what can you do when the research situation is completely novel and you have no past experience to call upon? In the area of behavioral sciences, Cohen (1988) attempted to provide benchmarks for “small,” “medium,” and “large” effect sizes for various statistical tests. These guidelines are provided in table 10.3.

Table 10.3. Cohen’s effect size benchmarks

Statistical method       Effect size measure    Small    Medium    Large

t-test                   d                      0.20     0.50      0.80
ANOVA                    f                      0.10     0.25      0.40
Linear models            f2                     0.02     0.15      0.35
Test of proportions      h                      0.20     0.50      0.80
Chi-square               w                      0.10     0.30      0.50

When you have no idea what effect size may be present, this table may provide some guidance. For example, what’s the probability of rejecting a false null hypothesis (that is, finding a real effect), if you’re using a one-way ANOVA with 5 groups, 25 subjects per group, and a significance level of 0.05?

Using the pwr.anova.test() function and the f row of table 10.3, the power would be 0.118 for detecting a small effect, 0.574 for detecting a moderate effect, and 0.957 for detecting a large effect. Given the sample size limitations, you’re only likely to find an effect if it’s large.
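
A compact way to run all three checks at once is to wrap pwr.anova.test() in sapply(); a minimal sketch:

library(pwr)
f <- c(small=0.10, medium=0.25, large=0.40)    # Cohen's ANOVA benchmarks from table 10.3
sapply(f, function(x)
    pwr.anova.test(k=5, n=25, f=x, sig.level=.05)$power)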

It’s important to keep in mind that Cohen’s benchmarks are just general suggestions derived from a range of social research studies and may not apply to your particular field of research. An alternative is to vary the study parameters and note the impact on such things as sample size and power. For example, again assume that you want to compare five groups using a one-way ANOVA and a 0.05 significance level. The following listing computes the sample sizes needed to detect a range of effect sizes and plots the results in figure 10.2.

Figure 10.2. Sample size needed to detect various effect sizes in a one-way ANOVA with five groups (assuming a power of 0.90 and significance level of 0.05)

Listing 10.1. Sample sizes for detecting significant effects in a one-way ANOVA
library(pwr)
es <- seq(.1, .5, .01)     # range of effect sizes to evaluate
nes <- length(es)

samsize <- NULL
for (i in 1:nes){
    result <- pwr.anova.test(k=5, f=es[i], sig.level=.05, power=.9)
    samsize[i] <- ceiling(result$n)    # required per-group n, rounded up
}

plot(samsize, es, type="l", lwd=2, col="red",
     ylab="Effect Size",
     xlab="Sample Size (per cell)",
     main="One Way ANOVA with Power=.90 and Alpha=.05")

Graphs such as these can help you estimate the impact of various conditions on your experimental design. For example, there appears to be little bang for the buck in increasing the sample size above 200 observations per group. We’ll look at another plotting example in the next section.

10.3. Creating power analysis plots

Before leaving the pwr package, let’s look at a more involved graphing example. Suppose you’d like to see the sample size necessary to declare a correlation coefficient statistically significant for a range of effect sizes and power levels. You can use the pwr.r.test() function and for loops to accomplish this task, as shown in the following listing.

Listing 10.2. Sample size curves for detecting correlations of various sizes
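
The following is a sketch of such a script; the specific effect size range, power levels, colors, and grid styling are illustrative choices:

library(pwr)
r <- seq(.1, .5, .01)                       # effect sizes (correlations under H1)
nr <- length(r)
p <- seq(.4, .9, .1)                        # power levels to plot
np <- length(p)

samsize <- array(numeric(nr*np), dim=c(nr, np))
for (i in 1:np){                            # loop over power levels
  for (j in 1:nr){                          # loop over effect sizes
    result <- pwr.r.test(r=r[j], sig.level=.05,
                         power=p[i], alternative="two.sided")
    samsize[j, i] <- ceiling(result$n)      # required n, rounded up
  }
}

xrange <- range(r)                          # set up an empty plot region
yrange <- round(range(samsize))
colors <- rainbow(np)
plot(xrange, yrange, type="n",
     xlab="Correlation Coefficient (r)",
     ylab="Sample Size (n)")

for (i in 1:np)                             # one power curve per power level
  lines(r, samsize[, i], type="l", lwd=2, col=colors[i])

abline(h=seq(0, yrange[2], 50), lty=2, col="gray89")           # horizontal grid
abline(v=seq(xrange[1], xrange[2], .02), lty=2, col="gray89")  # vertical grid
title("Sample Size Estimation for Correlation Studies\nSig=0.05 (Two-tailed)")
legend("topright", title="Power", legend=as.character(p), fill=colors)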

Listing 10.2 uses the seq() function to generate a range of effect sizes r (correlation coefficients under H1) and power levels p. It then uses two for loops to cycle through these effect sizes and power levels, calculating the corresponding sample sizes required and saving them in the array samsize. The graph is set up with the appropriate horizontal and vertical axes and labels. Power curves are added using lines rather than points. Finally, a grid and legend are added to aid in reading the graph. The resulting graph is displayed in figure 10.3.

Figure 10.3. Sample size curves for detecting a significant correlation at various power levels

As you can see from the graph, you’d need a sample size of approximately 75 to detect a correlation of 0.20 with 40 percent power. You’d need approximately 185 additional observations (n = 260) to detect the same correlation with 90 percent power. With simple modifications, the same approach can be used to create sample size and power curve graphs for a wide range of statistical tests.

We’ll close this chapter by briefly looking at other R functions that are useful for power analysis.

10.4. Other packages

There are several other packages in R that can be useful in the planning stages of studies. Some contain general tools, whereas others are highly specialized.

The piface package (see figure 10.4) provides a Java GUI for sample-size methods that interfaces with R. The GUI allows the user to vary study parameters interactively and see their impact on other parameters.

Figure 10.4. Sample dialog boxes from the piface program

Although the package is described as Pre-Alpha, it’s definitely worth checking out. You can download the package source and binaries for Windows and Mac OS X from http://r-forge.r-project.org/projects/piface/. In R, enter the code

install.packages("piface", repos="http://R-Forge.R-project.org")
library(piface)
piface()

The package is particularly useful for exploring the impact of changes in sample size, effect size, significance levels, and desired power on the other parameters.

Other packages related to power analysis are described in table 10.4. The last five are particularly focused on power analysis in genetic studies. Genome-wide association studies (GWAS) are studies used to identify genetic associations with observable traits. For example, such a study might investigate why some people develop a specific type of heart disease.

Table 10.4. Specialized power analysis packages

Package                 Purpose

asypow                  Power calculations via asymptotic likelihood ratio methods
PwrGSD                  Power analysis for group sequential designs
pamm                    Power analysis for random effects in mixed models
powerSurvEpi            Power and sample size calculations for survival analysis in epidemiological studies
powerpkg                Power analyses for the affected sib pair and the TDT (transmission disequilibrium test) design
powerGWASinteraction    Power calculations for interactions for GWAS
pedantics               Functions to facilitate power analyses for genetic studies of natural populations
gap                     Functions for power and sample size calculations in case-cohort designs
ssize.fdr               Sample size calculations for microarray experiments

Finally, the MBESS package contains a wide range of functions that can be used for various forms of power analysis. The functions are particularly relevant for researchers in the behavioral, educational, and social sciences.

10.5. Summary

In chapters 7, 8, and 9, we explored a wide range of R functions for statistical hypothesis testing. In this chapter, we focused on the planning stages of such research. Power analysis helps you determine the sample sizes needed to discern an effect of a given size with a given degree of confidence. It can also tell you the probability of detecting such an effect for a given sample size. You can directly see the tradeoff between limiting the likelihood of wrongly declaring an effect significant (a Type I error) and the likelihood of rightly identifying a real effect (power).

The bulk of this chapter has focused on the use of functions provided by the pwr package. These functions can be used to carry out power and sample size determinations for common statistical methods (including t-tests, chi-square tests, tests of proportions, ANOVA, and regression). Pointers to more specialized methods were provided in the final section.

Power analysis is typically an interactive process. The investigator varies the parameters of sample size, effect size, desired significance level, and desired power to observe their impact on each other. The results are used to plan studies that are more likely to yield meaningful results. Information from past research (particularly regarding effect sizes) can be used to design more effective and efficient future research.

An important side benefit of power analysis is the shift that it encourages away from a singular focus on binary hypothesis testing (that is, does an effect exist or not) toward an appreciation of the size of the effect under consideration. Journal editors are increasingly requiring authors to include effect sizes as well as p values when reporting research results. This helps readers judge the practical implications of the research and provides information that can be used to plan future studies.

In the next chapter, we’ll look at additional and novel ways to visualize multivariate relationships. These graphic methods can complement and enhance the analytic methods that we’ve discussed so far and prepare you for the advanced methods covered in part 3.
