Two basic procedures exist for producing factor scores in SAS: 1) a
predefined SAS option and 2) a program-it-yourself option. As you might
imagine, the predefined SAS option is by far the easier of the two. When
you include the OUT=output-data-set-name option on the PROC FACTOR
statement (SAS, 2015), SAS
will automatically output a data set that contains all of your original
data along with an estimate of weighted factor scores (e.g., Factor1,
Factor2, etc.). SAS computes the factor scores as a linear combination
of a standardized version of our variables (i.e., with a mean of 0
and a standard deviation of 1) and something called the standardized
scoring coefficients. The scoring coefficients are essentially the regression
coefficients used in the computation of the factor scores. Note that
these are different from the pattern matrix loadings, which are viewed
as the regression coefficients for computing the communalities. It
is important that you specify the desired number of factors to extract
and the extraction and rotation methods when using this method; if
you do not, SAS will use the default options of extracting the minimum
number of factors recommended by the proportion of variance and the
minimum eigenvalue criteria (see Factor Extraction Criteria) and
performing principal components
analysis with no rotation. An example of the syntax to estimate factor
scores, using this method for the engineering data, is presented below.
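* OUT= saves the original data plus the estimated factor scores;
proc factor data=engdata nfactors=2 method=prinit priors=smc
            rotate=oblimin out=factor_scores1;
   * the colon selects every variable whose name starts with the prefix;
   var EngProb: INTERESTeng: ;
run;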
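As a quick check on the data set created above, the following sketch
summarizes the estimated factor scores. It assumes that the
factor_scores1 data set from the preceding step exists and that the
scores carry the default names Factor1 and Factor2. The use of PROC MEANS
and PROC CORR here is only an illustration, not a required step.

```sas
* Summarize the estimated factor scores (assumes factor_scores1 exists);
proc means data=factor_scores1 n mean std min max;
   var Factor1 Factor2;
run;

* Correlation between the two sets of scores after the oblique rotation;
proc corr data=factor_scores1;
   var Factor1 Factor2;
run;
```

Because the scores are built from standardized variables, their means
should be close to 0. Their standard deviations need not be exactly 1
when a common factor extraction method is used.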
The other way of estimating factor scores, the program-it-yourself way,
is often more complex. You will need to write your own code, which might
contain a number of procedures or DATA steps. Two options that might be
useful for this are the SCORE and OUTSTAT=output-data-set-name options.
Both options are added to the PROC FACTOR statement. The SCORE option
prints the standardized scoring coefficients, and the OUTSTAT= option
outputs a data set that contains the various results (e.g., pattern
loadings, communalities, etc.) from the analysis. The standardized
scoring coefficients are also included in this data set when the SCORE
option is used in conjunction with the OUTSTAT= option.
This information can be useful in the various calculations of factor
scores. A simple example of the syntax to estimate equally weighted
improper factor scores (discussed in the next section) is presented
below.
data factor_scores2;
   set engdata;
   * mean of all items in the engineering problem-solving scale;
   Factor1 = mean(of EngProbSolv:);
   * mean of all items in the interest in engineering scale;
   Factor2 = mean(of INTERESTeng:);
run;
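The SCORE and OUTSTAT= options described above can also feed a separate
scoring step. The sketch below is one possible arrangement, assuming the
same engdata data set and variable prefixes used in the earlier examples.
The data set names factor_stats and factor_scores3 are hypothetical. The
first step saves the analysis results, including the standardized scoring
coefficients. The second step passes them to PROC SCORE, which applies
the coefficients to the variables.

```sas
* Save the results and scoring coefficients (factor_stats is a hypothetical name);
proc factor data=engdata nfactors=2 method=prinit priors=smc
            rotate=oblimin score outstat=factor_stats;
   var EngProb: INTERESTeng: ;
run;

* Apply the saved scoring coefficients to the raw data;
proc score data=engdata score=factor_stats out=factor_scores3;
   var EngProb: INTERESTeng: ;
run;
```

For the same model, the scores in factor_scores3 would be expected to
agree with those written by the OUT= option. The advantage of this route
is that the saved coefficients can be inspected, modified, or applied to
a different data set that contains the same variables.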