In
multiple regression texts, some authors (e.g., Pedhazur, 1997, p.
207) suggest subject to variable ratios of 15:1 or 30:1 when generalization
is critical. But there are few explicit guidelines such as this for
EFA (Baggaley, 1983). Two different approaches have been taken: suggesting
a minimum total sample size, or examining the ratio of parameters
such as subjects to variables, as in multiple regression.
Comrey and Lee (1992) suggest that “the
adequacy of sample size might be evaluated very roughly on the following
scale: 50 – very poor; 100 – poor; 200 – fair;
300 – good; 500 – very good; 1000 or more – excellent”
(p. 217). Guadagnoli & Velicer (1988) review several studies that
conclude that absolute minimum sample sizes, rather than subject to
item ratios, are more relevant. These studies range in their recommendations
from an N of 50 (Barrett & Kline, 1981) to 400 (Aleamoni, 1976).
In our view, some of these recommendations are ridiculous, as they
could result in analyses that estimate far more parameters than there
are subjects available.
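As a rough illustration, the Comrey and Lee scale quoted above can be written as a simple lookup. This is only a sketch: the function name `comrey_lee_rating` is hypothetical, and the handling of sample sizes that fall between the quoted anchors (they take the label of the highest anchor reached) is our assumption, since the original scale gives only the anchor points.

```python
from bisect import bisect_right

# Anchor points and labels as quoted from Comrey and Lee (1992, p. 217).
THRESHOLDS = [50, 100, 200, 300, 500, 1000]
LABELS = ["very poor", "poor", "fair", "good", "very good", "excellent"]


def comrey_lee_rating(n):
    """Return the label of the highest anchor point that n reaches.

    Assumption (not in the source): values between anchors take the label
    of the lower anchor, and values under 50 are reported as "below scale".
    """
    idx = bisect_right(THRESHOLDS, n)  # number of anchors that are <= n
    if idx == 0:
        return "below scale"
    return LABELS[idx - 1]


if __name__ == "__main__":
    for n in (40, 50, 250, 1000):
        print(n, comrey_lee_rating(n))
```

Note that the lookup depends on total N alone; it takes no account of how many items or factors are being analyzed.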
The case for ratios. There
are few scholars writing from the multiple regression camp who would
argue that total N is a superior guideline to the ratio of subjects
to variables. Yet authors focusing on EFA occasionally defend this position
vehemently. This is interesting precisely because the general
goal for both analyses is similar: to take individual variables and
create optimally weighted linear composites that will generalize to
other samples or to the population. Although the mathematics and procedures
differ in the details, the essence and the pitfalls are the same.
Both EFA and multiple regression risk overfitting of the estimates
to the data (Bobko & Schemmer, 1984), and both suffer from lack
of generalizability when sample size is too small.
Absolute sample sizes seem simplistic
given the range of complexity factor analyses can exhibit—each
scale differs in the number of factors or components, the number of
items on each factor, the magnitude of the item to factor correlations,
and the correlation between factors, for example. This has led some
authors to focus on the ratio of subjects to items or, more recently,
the ratio of subjects to parameters (as each item will have a loading
for each factor or component extracted). In discussing guidelines for
EFA, these authors thus follow the practice in regression of using
ratios rather than absolute sample sizes.
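The difference between the two ratios can be made concrete with a short sketch. The function names and the example figures (100 subjects, 40 items, 5 factors) are hypothetical, chosen only for illustration. Following the parenthetical above, the parameter count here includes loadings only (one per item per factor); other estimated quantities, such as factor correlations, are deliberately left out, which is a simplifying assumption.

```python
def loading_count(n_items, n_factors):
    # One loading per item for each factor or component extracted.
    # Assumption: loadings are the only parameters counted.
    return n_items * n_factors


def sample_ratios(n_subjects, n_items, n_factors):
    """Return the subject to item and subject to loading ratios."""
    return {
        "subjects_per_item": n_subjects / n_items,
        "subjects_per_loading": n_subjects / loading_count(n_items, n_factors),
    }


if __name__ == "__main__":
    # Hypothetical study: 100 subjects, 40 items, 5 factors extracted.
    print(sample_ratios(100, 40, 5))
```

In this hypothetical case there are 2.5 subjects per item but only 0.5 subjects per loading, that is, twice as many loadings as subjects, even though an N of 100 meets some of the absolute minimums cited earlier.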
Gorsuch (1983, p. 332) and Hatcher (1994,
p. 73) recommend a subject
to item ratio of at least 5:1 in EFA, but they also describe stringent
guidelines for when this ratio is acceptable, and they both note that
higher ratios are generally better. There is a widely cited rule of
thumb from Nunnally (1978, p. 421) that the subject to item ratio
for exploratory factor analysis should be at least 10:1, but that
recommendation was not supported by empirical research. Authors such
as Stevens (2002) have provided recommendations ranging from 5 to
20 participants per scale item, with Jöreskog & Sörbom
(1996) encouraging at least 10 participants per
parameter estimated.
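To see how far these rules of thumb can diverge, the sketch below computes the minimum N each one implies for a given scale. The dictionary keys and the function name are hypothetical labels. For the per-parameter rule, "parameters" is taken to mean loadings only (items times factors); that is our simplifying assumption, not a definition given by the cited authors.

```python
# Subjects required per item under each per-item rule cited in the text.
PER_ITEM_RULES = {
    "gorsuch_hatcher_min": 5,   # Gorsuch (1983); Hatcher (1994)
    "nunnally": 10,             # Nunnally (1978)
    "stevens_low": 5,           # Stevens (2002), lower end of range
    "stevens_high": 20,         # Stevens (2002), upper end of range
}

# Subjects required per parameter, per Joreskog & Sorbom (1996).
PER_PARAMETER_RULE = 10


def implied_minimum_n(n_items, n_factors):
    """Return the minimum N implied by each rule of thumb."""
    result = {name: ratio * n_items for name, ratio in PER_ITEM_RULES.items()}
    # Assumption: parameters are counted as loadings only (items x factors).
    result["joreskog_sorbom"] = PER_PARAMETER_RULE * n_items * n_factors
    return result


if __name__ == "__main__":
    # Hypothetical scale: 20 items, 3 factors.
    for rule, n in implied_minimum_n(20, 3).items():
        print(rule, n)
```

For a hypothetical 20-item, 3-factor scale, the implied minimums range from 100 under the 5:1 rule to 600 under the per-parameter rule as counted here.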
There is no one ratio
that will work in all cases; the number of items per factor, the
communalities, and the magnitude of the item loadings can make any
particular ratio either overkill or hopelessly insufficient
(MacCallum, Widaman, Preacher, & Hong, 2001).