Example Syntax and Output

Theory. The first criterion summarized above is unfortunately not something that SAS can help you with. This requires combing the literature to better understand the theoretical constructs that might underlie your set of items. The literature might identify one specific structure (e.g., a two-factor model) or multiple structures (e.g., a two-factor model under one framework and a five-factor model under another). It is your job to understand these models and then test them with your data.
If we think back to the engineering data that we used in Chapter 2, the scale used was designed to evaluate two factors: engineering problem-solving and interest in engineering. We could do some additional research to see whether anyone has ever used this set of items or a similar set to evaluate a different set of constructs. If we did not find anything else in the literature, theory would tell us this data contains two factors.
Kaiser Criterion, scree plot, minimum eigenvalue, and proportion of variance. The next four criteria are relatively easy to use and evaluate in SAS. As we reviewed in Chapter 2, the various eigenvalue estimates are automatically produced by the FACTOR procedure. However, the scree plot is not automatically output. We can request it by adding the SCREE option to the PROC FACTOR statement or by requesting it through the ODS graphics system. In general, the Output Delivery System (ODS) offers prettier graphics, so that is our preferred choice, but we will show you both below.
Using the engineering data, we can examine these four criteria with the syntax provided below. The first set of syntax shows how to use the SCREE option to produce the scree plot, and the second shows how to use ODS to produce the scree plot. We use the iterated PAF extraction method (method=PRINIT priors=SMC) for this data because it seemed to be one of the many appropriate methods based on the results in Chapter 2. We are also omitting the number of factors to extract (NFACTORS = ) option, for two reasons. First, if we remember the example output for this data set provided in Chapter 2, SAS will produce the initial eigenvalues as well as the final extracted eigenvalues, so we will get the important output no matter what. Second, by default SAS chooses the number of factors using either the proportion of variance or the minimum eigenvalue criterion, so we will be able to confirm our interpretation of one of the criteria with the SAS interpretation. It is always better to have SAS check us if possible!
*Using the scree option;
proc factor data = engdata  method = PRINIT  priors = SMC  SCREE;
   var EngProbSolv: INTERESTeng: ;
run;
*Using the ODS system;
ods graphics on;
proc factor data = engdata  method = PRINIT  priors = SMC  plots = SCREE;
   var EngProbSolv: INTERESTeng: ;
run;
ods graphics off;
Both of the PROC FACTOR statements above will produce the following table related to the Kaiser Criterion, minimum eigenvalue, and proportion of variance. Figure 3.1 Initial eigenvalue estimates displays the initial eigenvalue estimates produced before the extraction method is used to iteratively converge on a solution. We use the initial estimates because these are generally identical across extraction methods.[3] In this example, the Kaiser Criterion would tell us to extract two factors because only the first two rows (which represent potential factors to extract) have eigenvalues that are greater than 1. Based on the minimum eigenvalue criterion, we would retain only factors that account for approximately .76 of an eigenvalue or more (i.e., the average eigenvalue, or the total extracted variance divided by the number of items: 10.6068/14). This would also have us retain two factors. Finally, the proportion of variance criterion would recommend we retain two factors because 100% of the common variance is explained by two factors. The text at the bottom of the figure tells us that SAS plans to retain two factors based on the proportion of variance criterion.
Figure 3.1 Initial eigenvalue estimates
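The minimum eigenvalue threshold quoted above is simple arithmetic, and it can be checked with a short DATA step. The sketch below is only an illustration: the two input values are the ones given in the text (total variance of 10.6068 across 14 items), and the variable names are our own.

```sas
*Average initial eigenvalue = total extracted variance / number of items;
*Both inputs are the values quoted in the text. Variable names are illustrative;
data _null_;
   total_variance = 10.6068;   /* sum of the initial eigenvalues      */
   n_items        = 14;        /* number of items in the analysis     */
   avg_eigenvalue = total_variance / n_items;
   put avg_eigenvalue= 6.4;    /* writes the threshold to the SAS log */
run;
```

The value written to the log, about 0.7576, is the cutoff that each initial eigenvalue is compared against under this criterion.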
The scree plots produced by the SCREE option and ODS are presented in Figure 3.2 Scree plot from SCREE option and Figure 3.3 Scree plot from ODS, respectively. They show essentially the same results, the plotted initial eigenvalue estimates, just in slightly different formats. The SCREE option produces a simplified text-based figure (that can actually be copied and pasted as text), but ODS produces a figure in a graphical format. ODS also includes a plot that combines a graphical representation of the eigenvalues with the proportion of variance explained.
Based on these scree plots, we can see an “elbow” or inflection point at factor 3, suggesting that two factors should be retained. There is also an additional “elbow” that appears at factor 4. This is much less pronounced than the previous inflection point but, based on these results, we can consider exploring a three-factor solution in addition to a two-factor solution.
Figure 3.2 Scree plot from SCREE option
Figure 3.3 Scree plot from ODS
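If we want to follow up on the two candidate solutions suggested by the scree plots, one option is to request each of them explicitly. The syntax below is a minimal sketch, assuming the same engdata data set and variable lists used earlier. It simply adds the NFACTORS = option that we omitted before, which should direct SAS to extract the stated number of factors rather than rely on its default criteria.

```sas
*Two-factor solution, suggested by the elbow at factor 3;
proc factor data = engdata  method = PRINIT  priors = SMC  nfactors = 2;
   var EngProbSolv: INTERESTeng: ;
run;
*Three-factor solution, suggested by the weaker elbow at factor 4;
proc factor data = engdata  method = PRINIT  priors = SMC  nfactors = 3;
   var EngProbSolv: INTERESTeng: ;
run;
```

Comparing the two sets of output side by side is one way to judge whether the third factor is interpretable or merely an artifact.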
MAP and parallel analysis. The last two extraction criteria are a little trickier to use. As we mentioned above, one barrier to researchers using MAP and parallel analysis is that these procedures are not widely implemented in most common statistical software, including SAS. Fortunately, O’Connor (2000) developed SAS syntax to perform these analyses. These can currently be downloaded from https://people.ok.ubc.ca/brioconn/boconnor.html. In addition, a macro version of this syntax is included in the example code for this book and is available from the book website.
To run these analyses from the macros that are available from the book website, you need to include the macro syntax in your current SAS file using a %INCLUDE statement. This loads the macros defined in the external file into your current session so that you can call them. You can then run the analyses by calling the respective macro and entering the necessary arguments. The MAP macro has only one argument: datafile, the data set name. Thus, you can run the MAP macro with %map(datafile). The parallel macro has more arguments: datafile, the data set name; ndatsets, the number of random data sets to use; percent, the percentile to use in determining whether the eigenvalues are significantly above the mean; kind, the type of parallel analysis (1=PCA and 2=factor analysis); randtype, the type of random data to be used (1=from normally distributed random data and 2=random permutations of the raw data); and seed, the seed value to use in computations. You can then run the parallel macro with %parallel(datafile,ndatsets,percent,kind,randtype,seed). The syntax to do this for the engineering data is presented below.
*Include MAP and PARALLEL ANALYSIS macro syntax for use below;
filename parallel 'C:\Location of File\parallel_macro.sas';
filename map 'C:\Location of File\map_macro.sas';
%include parallel map;
*Run MAP and Parallel Analysis; 
%map(engdata);
%parallel(engdata,100,95,2,2,99);
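To make the order of the positional arguments easier to see, here is a hedged variant of the parallel analysis call. It assumes the macros have been loaded as shown above and that they accept their arguments exactly as described in the text. The only changes from the call above are kind=1 (PCA) and randtype=1 (normally distributed random data), chosen purely for illustration.

```sas
*Argument order: datafile, ndatsets, percent, kind, randtype, seed;
*Illustrative variant: PCA-based parallel analysis using normally distributed random data;
%parallel(engdata,100,95,1,1,99);
```

Holding the seed constant, as in both calls, is what makes the randomly generated comparison data reproducible from one run to the next.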
The results produced by the O’Connor (2000) MAP analysis are presented in Figure 3.4 MAP analysis results below. The results show the eigenvalues extracted from the data, the average squared partial correlations, and the average partial correlations to the fourth power. Recall that, using MAP analysis, we want to choose the number of factors at which the average partial correlation reaches a minimum (the average squared partial correlation under the original 1976 criterion, and the average partial correlation to the fourth power under the revised criterion). According to the results, two factors should be retained.
Figure 3.4 MAP analysis results
Another way to visualize these results is to plot them. Figure 3.5 Plot of average partial correlations for MAP test of the engineering data presents a plot of the average squared partial correlations. As you can see, the inflection point (minimum) on the plot is at 2.
Figure 3.5 Plot of average partial correlations for MAP test of the engineering data
The results for the parallel analysis are presented in Figure 3.6 Parallel analysis results below. The parallel analysis computed the eigenvalues from the data and then generated a series of comparable sets of random data. The mean eigenvalues across the sets of random data were computed along with the 95th percentile of those values. These values are presented on the right of the figure, in the columns labeled Raw Data, Means, and Prcntyle. Remember, the goal is to select the number of factors whose observed eigenvalues exceed those produced from random data. The current results would recommend two factors be extracted since the raw data eigenvalue goes below the generated mean eigenvalue and 95th percentile eigenvalue among the random data sets at factor 3. Again, we can visualize these results by plotting the mean eigenvalues in the random data against the eigenvalues in the raw data. (See Figure 3.7 Parallel analysis plot of engineering data.)
Figure 3.6 Parallel analysis results
Figure 3.7 Parallel analysis plot of engineering data
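The decision rule used to read the parallel analysis table can also be written out as a short DATA step. This is a sketch under stated assumptions, not part of the O’Connor (2000) macros: pa_results is a hypothetical data set that you would create yourself, with one row per root in order and two hypothetical variables, raw (the raw data eigenvalue) and prcntyle (the percentile eigenvalue from the random data).

```sas
*Count the leading roots whose raw eigenvalue exceeds the random-data percentile;
*pa_results, raw, and prcntyle are hypothetical names, not macro output;
data pa_retain;
   set pa_results end = last;
   retain stop 0;
   if not stop then do;
      if raw > prcntyle then n_retain + 1;   /* root beats random data: keep it */
      else stop = 1;                         /* first failure ends the count    */
   end;
   if last then output;                      /* one row holding the final count */
   keep n_retain;
run;
proc print data = pa_retain noobs;
run;
```

The count stops at the first root that fails the comparison, which mirrors the way we read down the table until the raw data eigenvalue drops below the random data values.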