Matrix of associations. The basic unit
of analysis in an EFA is a matrix of associations—either a
correlation or a covariance matrix. If you supply a raw data set, the
program will compute this matrix as its first step. Alternatively, you
can supply the correlation or covariance matrix directly, reading it
in as the input data in place of raw observations. This can be useful when trying to replicate someone’s
analyses based on published results or when wanting to analyze ordinal
or dichotomous variables through a corrected correlation matrix (i.e.,
polychoric or tetrachoric). In either case, the extraction methods
above will yield slightly different results depending on the matrix of
associations being analyzed. By default, PROC FACTOR analyzes the simple
correlation matrix, the type of association most commonly used in EFA.
Correlations are favored because they are influenced only by the
magnitude of the association between the two variables. By contrast,
covariances are influenced by the association as well as by the variance
of each of the two variables in question (Thompson, 2004). To analyze
the covariance matrix instead, use the COVARIANCE option in PROC FACTOR.
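The contrast between correlations and covariances can be checked with a few lines of arithmetic. The sketch below is written in Python purely for illustration (it is not PROC FACTOR code), and the item scores are hypothetical. It rescales one variable by a factor of 10 and shows that the covariance changes by that factor while the correlation is unaffected.

```python
from statistics import mean


def covariance(x, y):
    """Sample covariance, using the n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Pearson correlation: the covariance divided by both standard deviations."""
    return covariance(x, y) / (covariance(x, x) ** 0.5 * covariance(y, y) ** 0.5)


# Hypothetical scores on two items (made up for this illustration).
x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 2.0, 6.0, 8.0]

# Rescale x, for example from a 1-10 response scale to a 10-100 scale.
x_scaled = [10.0 * value for value in x]

cov_raw, cov_scaled = covariance(x, y), covariance(x_scaled, y)
cor_raw, cor_scaled = correlation(x, y), correlation(x_scaled, y)

print("covariance: ", cov_raw, "->", cov_scaled)   # grows with the variance of x
print("correlation:", cor_raw, "->", cor_scaled)   # unchanged by the rescaling
```

Because the covariance carries the scale of each variable, an analysis of the covariance matrix gives more weight to variables with larger variances, which is one reason the correlation matrix is the usual choice.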
Communalities. In
EFA, the communalities are the estimates of the shared variance in
each variable, or the variance that will be accounted for by all the
factors. They are computed from the matrix of associations, and their
decomposition and partitioning is the goal of all subsequent analysis.
The estimation of the communalities is a defining characteristic of
EFA that distinguishes it from PCA. In EFA, the communalities are
always less than 1.00 for each variable because EFA seeks to decompose
the shared variance, whereas in PCA they are initially 1.00 because
there is no distinction between shared and unique variance.
Although the different extraction methods generally yield different
estimates of communalities, each method typically starts with the same
initial estimates. The initial estimate aims to get a quick and simple
idea of the shared variance in each variable. In PROC FACTOR, the
default process for the EFA techniques is
to estimate the initial communalities as the squared multiple correlation
of a variable with all other variables. They are called the “Prior
Communality Estimates” in the output, and they should appear
as one of the first tables. Starting with the initial estimates, the
communalities are then iteratively re-estimated via the selected extraction
method to produce the final estimates.
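To make the squared multiple correlation concrete, the following minimal sketch computes it for each variable from a small correlation matrix, using the standard identity SMC_i = 1 - 1 / (R^-1)_ii. It is written in Python for illustration only; the correlation matrix and the helper names (invert, squared_multiple_correlations) are assumptions made for this example, not part of PROC FACTOR.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity; the right half becomes the inverse.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [value / p for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def squared_multiple_correlations(corr):
    """SMC of each variable with all the others: 1 - 1 / (i-th diagonal of R inverse)."""
    inverse = invert(corr)
    return [1.0 - 1.0 / inverse[i][i] for i in range(len(corr))]


# Hypothetical correlation matrix for three variables.
R = [
    [1.00, 0.56, 0.48],
    [0.56, 1.00, 0.42],
    [0.48, 0.42, 1.00],
]

priors = squared_multiple_correlations(R)
for name, value in zip(["v1", "v2", "v3"], priors):
    print(name, round(value, 4))
```

Each value is the proportion of a variable's variance that can be predicted from the remaining variables, which is why it serves as a quick first estimate of the shared variance.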
The communalities can
be thought of as a row statistic. When looking at a table of factor
loadings, with variables as the rows and factor loadings in columns,
the communality for a variable is a function of its factor loadings.
For orthogonal factors, the sum of a variable’s squared factor loadings
should equal its extracted communality (within reasonable rounding error).
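The row calculation can be sketched in a few lines. The loadings below are hypothetical values for four variables on two orthogonal factors, chosen only to illustrate the arithmetic; Python is used here for illustration and is not PROC FACTOR output.

```python
# Hypothetical loadings: one row per variable, one column per (orthogonal) factor.
loadings = [
    [0.80, 0.10],
    [0.70, 0.20],
    [0.15, 0.75],
    [0.05, 0.60],
]

# Communality = sum of squared loadings across the row for that variable.
communalities = [sum(loading ** 2 for loading in row) for row in loadings]

for i, h2 in enumerate(communalities, start=1):
    print(f"variable {i}: communality = {h2:.4f}")
```

For the first variable, for example, the calculation is 0.80 squared plus 0.10 squared, or 0.64 + 0.01 = 0.65.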
Eigenvalues. Eigenvalues
are a representation of the aggregated item-level variance associated
with a factor. They can be viewed as a column statistic—again
imagining a table of factor loadings. If you square each factor loading
and sum them all within a column, you should get an approximation
of the eigenvalue for that factor (again within rounding error). Thus,
eigenvalues are higher when there are at least some variables with
high factor loadings, and lower when there are mostly low loadings.
You will notice that eigenvalues (and communalities) change from the
initial statistics, which are estimates and should be identical regardless
of extraction method, to the extracted statistics, which vary depending
on the mathematics of the extraction method. Rotation, by contrast, does
not change the cumulative percent of variance accounted for by the
extracted factors (to be discussed later); it changes only how that
variance is distributed across factors, as the factor loadings change.
Thus, if the extracted eigenvalues account for a cumulative
45% of the variance overall, the cumulative variance accounted for
will still be 45% after the factors are rotated, but that 45% might
have a slightly different distribution across factors after rotation.
This will become clearer shortly, as we look at some example
data.
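The column calculation, and the way rotation redistributes variance without changing the total, can be sketched as follows. The loadings are hypothetical, and the 20-degree orthogonal rotation is an arbitrary angle chosen for illustration rather than the result of any named rotation method. Python is used only to show the arithmetic.

```python
import math

# Hypothetical loadings: one row per variable, one column per factor.
loadings = [
    [0.80, 0.10],
    [0.70, 0.20],
    [0.15, 0.75],
    [0.05, 0.60],
]


def column_sums_of_squares(table):
    """Sum of squared loadings down each column (one value per factor)."""
    return [sum(row[j] ** 2 for row in table) for j in range(len(table[0]))]


def rotate(table, degrees):
    """Apply an orthogonal rotation of the two factor axes by the given angle."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in table]


before = column_sums_of_squares(loadings)
after = column_sums_of_squares(rotate(loadings, 20))

print("per factor before rotation:", [round(v, 4) for v in before])
print("per factor after rotation: ", [round(v, 4) for v in after])
print("total before:", round(sum(before), 4), " total after:", round(sum(after), 4))
```

The per-factor values differ before and after rotation, but their total is the same. With four standardized variables the total variance is 4, so dividing the total by 4 gives the cumulative proportion of variance accounted for, which is unchanged by the rotation.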
Iterations and convergence. The
majority of the methods described below rely on an iterative procedure
to “converge” on a final solution. Convergence occurs
when the change between one model’s communalities and the next
model’s communalities is less than .001. This can essentially
be interpreted to mean that the two models are yielding the same results.
Although we have not yet come across a good reason to change the convergence
criterion, it is possible to do so through the CONVERGE option.
If an EFA fails to “converge,” that means that the communality
estimates failed to stabilize: they were still changing by more than the
convergence criterion when the iteration limit was reached. This is most
commonly due to inappropriately small sample sizes. One potential
solution to this problem is to increase the maximum number of
iterations. The default is 30 in PROC FACTOR, and it can be reset with
the MAXITER option.
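The logic of iterating until the communalities stabilize can be sketched with a deliberately simplified example: a one-factor iterated principal axis extraction written in Python. This is an illustration under stated assumptions, not a description of how PROC FACTOR is implemented. The correlation matrix is hypothetical, the function names are made up for this example, the prior communalities are taken as each variable's largest absolute correlation (a simple stand-in for the squared multiple correlation), and the largest eigenvalue is found by power iteration. The 0.001 criterion and the limit of 30 iterations mirror the defaults described above.

```python
def top_eigenpair(matrix, sweeps=200):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    n = len(matrix)
    vector = [1.0 / n ** 0.5] * n
    for _ in range(sweeps):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in product) ** 0.5
        vector = [x / norm for x in product]
    value = sum(vector[i] * matrix[i][j] * vector[j]
                for i in range(n) for j in range(n))
    return value, vector


def one_factor_paf(corr, max_iter=30, criterion=0.001):
    """Iterate one-factor communalities until they change by less than the criterion."""
    n = len(corr)
    # Assumed prior: the largest absolute off-diagonal correlation for each variable.
    h2 = [max(abs(corr[i][j]) for j in range(n) if j != i) for i in range(n)]
    for iteration in range(1, max_iter + 1):
        # Reduced correlation matrix: communalities replace the 1.00s on the diagonal.
        reduced = [[h2[i] if i == j else corr[i][j] for j in range(n)]
                   for i in range(n)]
        value, vector = top_eigenpair(reduced)
        # Squared loading on the single factor = eigenvalue * squared eigenvector element.
        new_h2 = [value * x * x for x in vector]
        change = max(abs(a - b) for a, b in zip(new_h2, h2))
        h2 = new_h2
        if change < criterion:
            return h2, iteration, True   # converged
    return h2, max_iter, False           # iteration limit reached without converging


# Hypothetical correlations among four variables.
R = [
    [1.00, 0.56, 0.48, 0.40],
    [0.56, 1.00, 0.42, 0.35],
    [0.48, 0.42, 1.00, 0.30],
    [0.40, 0.35, 0.30, 1.00],
]

communalities, iterations, converged = one_factor_paf(R)
print("converged:", converged, "after", iterations, "iterations")
print("communalities:", [round(h, 3) for h in communalities])
```

Calling one_factor_paf(R, max_iter=1) shows the non-convergent case: the function stops at the iteration limit and reports that the criterion was not met, which is the situation in which raising the iteration limit may help.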