Best Practices for EFA

This journey started over a decade ago when one of the first author’s students (Blandy Costello) walked into his office and shared her frustration over conflicting advice and directives from different doctoral committee members. That initial discussion highlighted the lack of consensus on best practices in exploratory factor analysis, and it ended with us deciding that these questions could be explored empirically. After many simulations and further discussions, two articles were published on these issues: Osborne & Costello (2004) and Costello & Osborne (2005). Both have received significant attention in the literature, and the latter has been cited about 3,600 times as we write this. That reception convinced us there is real utility in attempting to explicate best practices in quantitative methods, and it has led to other articles, books, and finally, to this project.
EFA is such a confounding, controversial, and misused (yet valuable and interesting) technique that it has provided lots of fun and fodder for this type of endeavor. We hope you agree it has been worthwhile. Our goal is to collect and elaborate on evidence-based best practices that were published previously, to put them in a single place that is easily accessible, and to model how to implement them within SAS. For those of you who have persevered and have reached this part of the book, we hope that you have drawn the following conclusions:
  1. Keep the “E” in EFA! Many researchers attempt to perform “confirmatory” analyses through exploratory analyses, using confirmatory language and drawing confirmatory conclusions from an exploratory technique. This is not appropriate. EFA is a fun and important technique, but we need to use confirmatory techniques (e.g., CFA) when we want to draw those types of conclusions.
  2. EFA is a large-sample technique. We hope that through the course of the book you have become convinced that the best results from EFA come when the sample is appropriately large. There are examples in this book and elsewhere in the literature of the volatile and nonrepresentative results that can happen in small-sample EFA. A reasonable rule of thumb, if one is intent on robust analyses, would be a minimum of 20 cases for each variable in the analysis. We have had students and colleagues show us analyses that had fewer cases than variables. That is rarely a good state of affairs, in our opinion.
  3. Clean your data, and deal with missing data appropriately. Garbage in, garbage out. We won’t belabor this point, but we hope you take it seriously: if we don’t see you address whether you checked your data, tested assumptions, and dealt appropriately with missing data, we might wonder whether anything else you report matters. (A short screening sketch appears after this list.)
  4. Useful results are those that are precise and generalizable. To our minds, the most useful results are those that we can generalize to other samples, or use to draw good inferences about the population as a whole. Conversely, the worst use of anyone’s time is to publish or present results that are not replicable, or that are so imprecise that we cannot draw any conclusions. Large samples and clean data (along with strong factor loadings and larger numbers of strongly loading variables per factor) contribute to this mission. Small samples and weak loadings (and few variables per factor) make for messy, conflicting, and useless results.
  5. Principal components analysis is not exploratory factor analysis. We have seen endless debate amongst a small number of partisans regarding PCA versus EFA. Almost nobody else seems to care about this debate; most researchers simply want to know whether they can trust their results and interpret them sensibly. If you feel some compelling reason to use PCA (and we do not see one at present), then we hope this book can guide you as well. Most of the best practices we have covered in this book also apply to PCA. If you insist on using PCA, at least do it with large samples, clean data, and with the limitations of the procedure clearly and overtly acknowledged.
  6. If you use EFA, don’t use the defaults! If you want to model and explore latent variables in the best way possible, use ML, iterated PAF, or ULS extraction (depending on whether your data meet the assumptions of ML), and we think you want an oblique rotation (either oblimin or Promax works fine in most cases; if one doesn’t work, try the other). A brief PROC FACTOR sketch of these non-default options appears after this list. Scholars in this area spend a great deal of energy arguing about which extraction or rotation technique is best, but keep our mantra in mind: this is just an exploration, and thus a low-stakes endeavor. Whatever you find from EFA should subsequently be confirmed in a large-sample confirmatory analysis.
  7. Use multiple decision rules when deciding how many factors to extract. Another point of constant argument in this field is which decision rule best guides the choice of how many factors to extract. We reviewed several, and none are perfect. Just in our three examples, one had a clearly uninterpretable scree plot, one parallel analysis produced what we consider questionable guidance, and one MAP analysis was (to our eye, anyway) ambiguous and unhelpful. The other criteria were also at times confusing and problematic. The best guide is theory; beyond that, choose whatever provides the results that make the most sense. If you cannot make sense of the results, that is, if you cannot easily explain to someone what each factor means, then you need to go back to exploring. Because any model you produce has to be confirmed with CFA in a new sample, this seems to us the most sensible approach. Thanks to Brian O’Connor, we have easily accessible ways of trying out modern decision criteria (MAP, parallel analysis); a bare-bones illustration of the parallel-analysis idea also appears after this list. Use these tools, but realize that no one decision rule will be perfect in all situations.
  8. Replicate your results. If you have two good samples, you can present replication statistics like those we reviewed in Chapter 5, or you can put a single sample to work in bootstrap analyses like those we explored in Chapter 6 (a minimal bootstrap sketch also follows this list). It’s not easy, nor is it automatic, but with the syntax and macros we share, it is not too difficult, and we think it provides invaluable perspective on your results. We wish this mandate to replicate results would permeate every research lab and statistics class, regardless of the statistical techniques in use. The lessons contained in these chapters are equally valid if you are performing ANOVA, regression, hierarchical linear modeling, or nonparametric techniques. Replicate your results, bootstrap your analyses, and report (and interpret) confidence intervals for important effects so that we, as readers, can get more out of the hard work you put into your research.
  9. Have fun! The ability and training to perform research like this is a wonderful gift. The first author has been lucky enough to spend the last 25 years doing quantitative research, and the second author is at the beginning of her journey, but we have enjoyed every minute of it. Those of us who perform data analysis[1] are the ones who are present at the moment each tiny bit of knowledge is created. We create knowledge—we ask questions and find answers. Sometimes those answers are not what we expect, which is an opportunity to ask better questions or learn something unexpected. We cannot think of a more rewarding way to spend our career, and we hope each one of you experiences the same joy and thrill from your research.
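To close, here are a few minimal SAS sketches tied to the points above. First, the data screening mentioned in point 3. This is a sketch under assumptions, not code from our examples: the data set work.items, the variable list item1-item20, and the choice of five imputations are placeholders, and multiple imputation is only one defensible way to handle missing data.

/* Sketch only: data set, variables, and settings are placeholders. */
/* Step 1: check ranges, means, and the extent of missingness for each item. */
proc means data=work.items n nmiss min max mean std;
  var item1-item20;
run;

/* Step 2: one option (among several) for missing data is multiple imputation,
   which here produces five completed data sets flagged by _imputation_. */
proc mi data=work.items nimpute=5 seed=20150 out=items_mi;
  var item1-item20;
run;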
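Next, the non-default extraction and rotation choices from point 6, expressed as PROC FACTOR options. Again a sketch under assumptions: the data set, item list, and three-factor solution are illustrative; METHOD=ML could be replaced with METHOD=PRINIT (iterated principal axis) or METHOD=ULS if the data do not meet ML’s assumptions, and ROTATE=PROMAX could be replaced with ROTATE=OBLIMIN.

/* Sketch only: data set, variable list, and NFACTORS value are assumptions. */
proc factor data=work.items
            method=ml        /* maximum likelihood extraction, not the default */
            priors=smc       /* squared multiple correlations as initial communalities */
            rotate=promax    /* oblique rotation */
            nfactors=3       /* chosen from theory plus multiple decision rules */
            scree;           /* request a scree plot of the eigenvalues */
  var item1-item20;
run;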
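Point 7 recommends O’Connor’s macros for MAP and parallel analysis, and they remain the easiest route. The bare-bones PROC IML sketch below is not that macro; it only illustrates the core idea of parallel analysis, comparing observed eigenvalues with the 95th percentile of eigenvalues from uncorrelated random data of the same dimensions. It assumes complete data in work.items containing only the items as numeric variables, and it uses 100 random data sets.

/* Bare-bones parallel-analysis illustration; assumptions as noted above. */
proc iml;
  use work.items;
    read all var _num_ into X;            /* items as an n x p matrix */
  close work.items;
  n = nrow(X);  p = ncol(X);
  obsEig = eigval(corr(X));               /* eigenvalues of the observed correlation matrix */

  nreps = 100;                            /* number of random data sets */
  randEig = j(nreps, p, .);
  call randseed(20150);
  do r = 1 to nreps;
    Z = randnormal(n, j(1, p, 0), i(p));  /* uncorrelated normal data, same n and p */
    randEig[r, ] = t(eigval(corr(Z)));
  end;

  cutoff = j(p, 1, .);
  do k = 1 to p;
    call qntl(q, randEig[, k], 0.95);     /* 95th percentile of the random eigenvalues */
    cutoff[k] = q;
  end;

  /* Retain factors whose observed eigenvalue exceeds the random-data cutoff. */
  print obsEig[label="Observed eigenvalue"] cutoff[label="95th percentile, random data"];
quit;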
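Finally, the bootstrap idea from point 8 can be prototyped without our macros by pairing PROC SURVEYSELECT with a BY-group PROC FACTOR run. The data set, item list, factor count, and 200 resamples below are assumptions. One caution: factor order and sign can flip from resample to resample, so the loadings written to the OUTSTAT= data set need to be aligned to a common orientation before you summarize them.

/* Sketch only: names and settings are placeholders, not the macros from the book. */
/* Draw 200 bootstrap resamples (with replacement, same n as the original sample). */
proc surveyselect data=work.items out=work.boot seed=20150
                  method=urs samprate=1 outhits reps=200;
run;

/* Fit the same EFA in every resample; loadings are written to work.bootload. */
proc factor data=work.boot method=prinit priors=smc rotate=promax
            nfactors=3 noprint outstat=work.bootload;
  by replicate;
  var item1-item20;
run;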
Thank you for taking time to read our work. We always welcome feedback or communication from readers. The best way to reach us is through email at: [email protected]. We hope you find the ancillary materials on the book website (http://jwosborne.com or http://support.sas.com/publishing/authors) useful. Happy researching!