Item-Level Factor Analysis

Cautions Regarding Item-Level Factor Analyses

Applying familiar factor analysis procedures to item-level data often produces misleading or uninterpretable results. What follows is (a) a brief description of the problems, (b) expert recommendations on alternative analytic procedures for item-level factor analyses, (c) a brief listing of programs for conducting the recommended alternative analyses, (d) a brief discussion of parallel analysis for item-level data, and (e) some useful references. If it all seems unexpected and a bit complicated, resist the urge to ignore the issues and forge ahead with familiar factor analyses in the hope that the problems do not apply to your data. The problems with familiar methods are very real and common, the alternatives are not that complicated, and your results will almost certainly make more sense if you follow the expert recommendations.

Problems With Factor Analyses of Item-Level Data

Familiar factor analysis procedures, such as common factor analysis, maximum likelihood factor analysis, and principal components analysis, produce meaningful results only if the data are truly continuous and multivariate normal. Item-level data in psychological research almost never meet these requirements. Dichotomous items (e.g., true or false) are obviously problematic, but item-level data based on Likert scales (e.g., with 4-10 ordered response options) are often just as bad, if not worse (Bernstein & Teng, 1989).

The correlation between any two items is affected both by their substantive (content-based) similarity and by the similarity of their statistical distributions (Bernstein, 1988, p. 398). Items with similar distributions tend to correlate more strongly with one another than they do with items that have dissimilar distributions. Easy or commonly endorsed items tend to form factors that are distinct from difficult or less commonly endorsed items, even when all of the items measure the same unidimensional latent variable (Nunnally & Bernstein, 1994, p. 318). Item-level factor analyses using traditional methods are almost guaranteed to produce at least some factors that are based solely on item distribution similarity. The items may appear multidimensional when in fact they are not. Conceptual interpretations of the nature of item-based factors will often be erroneous.
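The distribution effect can be illustrated with a small simulation. The following is only a sketch under stated assumptions: six dichotomous items all measure one latent variable with the same loading (0.7, a hypothetical value), and they differ only in how easy they are to endorse (hypothetical cut points of -1 and +1 on the latent scale). The function names are illustrative and are not taken from any of the programs listed on this page.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def simulate_items(n_cases=10000, loading=0.7,
                   thresholds=(-1, -1, -1, 1, 1, 1), seed=1):
    """Dichotomous items driven by ONE latent variable (assumed values)."""
    rng = random.Random(seed)
    noise_sd = sqrt(1 - loading ** 2)
    items = [[] for _ in thresholds]
    for _ in range(n_cases):
        theta = rng.gauss(0, 1)  # the single latent trait
        for column, cut in zip(items, thresholds):
            latent = loading * theta + noise_sd * rng.gauss(0, 1)
            column.append(1 if latent > cut else 0)  # endorse if above cut
    return items

def mean_correlation(items, pairs):
    """Average Pearson correlation over the listed item pairs."""
    return sum(pearson(items[i], items[j]) for i, j in pairs) / len(pairs)

items = simulate_items()
easy, hard = [0, 1, 2], [3, 4, 5]
# Pairs of items with the same difficulty (easy-easy and hard-hard).
within = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
# Pairs that mix an easy item with a hard item.
between = [(i, j) for i in easy for j in hard]

within_r = mean_correlation(items, within)
between_r = mean_correlation(items, between)
print("mean r, same difficulty:     ", round(within_r, 3))
print("mean r, different difficulty:", round(between_r, 3))
```

Every pair of items has the same latent correlation by construction, yet the Pearson correlations should come out noticeably higher for pairs with matching difficulty. A factor analysis of such a matrix would tend to split the items into an "easy" factor and a "hard" factor.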

Bernstein (1988) states that the following simple examination should be mandatory: "When you have identified the salient items (variables) defining factors, compute the means and standard deviations of the items on each factor. If you find large differences in means, e.g., if you find one factor includes mostly items with high response levels, another with intermediate response levels, and a third with low response levels, there is strong reason to attribute the factors to statistical rather than to substantive bases" (p. 398).
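Bernstein's check needs nothing more than descriptive statistics. The sketch below assumes the responses are held in a dictionary keyed by item name and that the salient items for each factor have already been identified. The item names, factor labels, and scores are made up for illustration.

```python
from statistics import mean, stdev

def item_stats_by_factor(responses, factor_items):
    """Return {factor: {item: (mean, sd)}} for the salient items of each factor.

    responses    -- dict mapping item name to a list of scores
    factor_items -- dict mapping factor label to its salient item names
    """
    summary = {}
    for factor, names in factor_items.items():
        summary[factor] = {
            name: (mean(responses[name]), stdev(responses[name]))
            for name in names
        }
    return summary

def average_item_mean(summary):
    """Average of the item means within each factor, for a quick comparison."""
    return {
        factor: mean(m for m, _sd in stats.values())
        for factor, stats in summary.items()
    }

# Hypothetical true/false data (1 = endorsed) for four items.
responses = {
    "q1": [1, 1, 1, 0],
    "q2": [1, 1, 0, 1],
    "q3": [0, 0, 1, 0],
    "q4": [0, 1, 0, 0],
}
# Hypothetical factor solution: q1 and q2 on F1, q3 and q4 on F2.
factor_items = {"F1": ["q1", "q2"], "F2": ["q3", "q4"]}

summary = item_stats_by_factor(responses, factor_items)
factor_means = average_item_mean(summary)
print(summary)
print(factor_means)
```

In this made-up case the F1 items are endorsed far more often than the F2 items, which is the pattern Bernstein describes as strong reason to suspect a statistical rather than a substantive basis for the factors.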

Expert Recommendations on Alternative Analytic Procedures

The following recommendations are from the top experts working in this area (see the list of References below). For dichotomous data, conduct a familiar factor analysis on the matrix of tetrachoric inter-item correlations rather than on the matrix of Pearson correlations. When the items are based on ordered categories (e.g., Likert scales), familiar factor analyses should be conducted on the matrix of polychoric inter-item correlations rather than on the matrix of Pearson correlations. Tetrachoric and polychoric correlations are based on the assumption that the response categories (dichotomous or Likert scale) are actually proxies for unobserved, normally distributed variables. Factor analyses of tetrachoric or polychoric correlation matrices are essentially factor analyses of the relations among latent response variables that are assumed to underlie the data and that are assumed to be continuous and normally distributed (Panter, Swygert, Dahlstrom, & Tanaka, 1997, pp. 570-571).
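To make the latent-variable assumption concrete, here is a minimal sketch of a tetrachoric estimate for one pair of dichotomous items. It assumes a simple two-step approach: the thresholds are taken from the marginal proportions, and the latent correlation is then found by bisection so that a standard bivariate normal distribution reproduces the observed proportion endorsing both items. The function names and the 2 x 2 counts are hypothetical, the code assumes no empty margins, and the dedicated programs listed below use more refined estimation.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution

def bvn_density(h, k, r):
    """Standard bivariate normal density at (h, k) with correlation r."""
    one_minus_r2 = 1.0 - r * r
    exponent = -(h * h - 2.0 * r * h * k + k * k) / (2.0 * one_minus_r2)
    return exp(exponent) / (2.0 * pi * sqrt(one_minus_r2))

def bvn_cdf(h, k, rho, steps=200):
    """P(X <= h, Y <= k), using Phi(h)Phi(k) plus the integral of the
    density over the correlation from 0 to rho (Simpson's rule)."""
    width = rho / steps
    total = bvn_density(h, k, 0.0) + bvn_density(h, k, rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * bvn_density(h, k, i * width)
    return _STD.cdf(h) * _STD.cdf(k) + total * width / 3.0

def tetrachoric(n11, n10, n01, n00, tol=1e-8):
    """Two-step tetrachoric estimate from 2 x 2 counts.

    n11 = both items endorsed, n10 = first only, n01 = second only,
    n00 = neither endorsed.
    """
    n = n11 + n10 + n01 + n00
    h = _STD.inv_cdf((n11 + n10) / n)  # from P(first item endorsed)
    k = _STD.inv_cdf((n11 + n01) / n)  # from P(second item endorsed)
    target = n11 / n                   # observed P(both endorsed)
    lo, hi = -0.999, 0.999
    while hi - lo > tol:               # bvn_cdf increases with rho
        mid = (lo + hi) / 2.0
        if bvn_cdf(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def phi_coefficient(n11, n10, n01, n00):
    """Pearson correlation between two dichotomous items (phi)."""
    numerator = n11 * n00 - n10 * n01
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return numerator / denominator

# Hypothetical counts for two true/false items answered by 600 people.
counts = (200, 100, 100, 200)
print("phi:        ", round(phi_coefficient(*counts), 3))
print("tetrachoric:", round(tetrachoric(*counts), 3))
```

For this table the Pearson (phi) correlation is 1/3, whereas the tetrachoric estimate is close to .50, which is the correlation between the assumed latent response variables. The same logic extends to polychoric correlations, where each item has several thresholds instead of one.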

Even better: Conduct a full-information factor analysis, which is a factor analytic technique based on item response theory. This approach completely bypasses the problems associated with correlation matrices and uses all of the information contained in the response category pattern frequencies (McLeod, Swygert, & Thissen, 2001; Swygert, McLeod, & Thissen, 2001).

Other common recommendations are to (a) combine the items into mini-scales and then factor the mini-scales, and/or (b) conduct extension and/or higher-order factor analyses.

Programs for Conducting the Recommended Alternative Analyses

paramap, an R package for factor analysis that has options for polychoric correlations

mirt, an R package that conducts full-information factor analysis

POLYMAT-C: SPSS program for computing the polychoric correlation matrix, by Lorenzo-Seva & Ferrando

An SPSS R-Menu for ordinal factor analysis, by Basto & Pereira

The Factor program, by Lorenzo-Seva & Ferrando

A SAS macro (for versions of SAS before 9.4)

Niels Waller's WinMFACT 2.0 for Windows

Introduction to the Tetrachoric and Polychoric Correlation Coefficients, by John Uebersax

SPSS, SAS, & Matlab programs for extension and higher-order factor analyses

Parallel Analysis For Item-Level Data

The issues described above must be considered before using any procedures for determining the number of factors/components in item-level data sets, including parallel analysis.

Parallel analysis programs typically involve data sets produced by computer-generated, normally distributed random numbers. The eigenvalues in item-level raw data based on dichotomous or Likert response scales cannot be meaningfully compared to the eigenvalues from parallel analyses based on normally distributed random numbers. Instead, determine the number of factors or components by first finding the eigenvalues for the raw-data matrix of tetrachoric or polychoric correlations, and then compare these eigenvalues to those that are produced by the computer-generated random data. And, of course, any subsequent factor analysis should be conducted on the raw-data matrix of tetrachoric or polychoric correlations and not on Pearson correlations.
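The comparison described above can be sketched as follows. This is an illustration under several assumptions rather than a replacement for the programs listed earlier: it uses NumPy, it expects a tetrachoric or polychoric correlation matrix that has already been computed elsewhere, it works with eigenvalues of the full (unreduced) correlation matrix, and it uses the 95th percentile of the random-data eigenvalues as the comparison value (a common convention, not something prescribed on this page). The function names and the example matrix are hypothetical.

```python
import numpy as np

def random_data_eigenvalues(n_cases, n_items, n_sets=100,
                            percentile=95, seed=0):
    """Percentile of eigenvalues from normally distributed random data sets
    with the same number of cases and items as the real data."""
    rng = np.random.default_rng(seed)
    eigs = np.empty((n_sets, n_items))
    for s in range(n_sets):
        data = rng.standard_normal((n_cases, n_items))
        corr = np.corrcoef(data, rowvar=False)
        eigs[s] = np.sort(np.linalg.eigvalsh(corr))[::-1]  # descending
    return np.percentile(eigs, percentile, axis=0)

def n_factors_to_retain(observed_corr, n_cases, **kwargs):
    """Count the leading eigenvalues of the observed (tetrachoric or
    polychoric) matrix that exceed the random-data eigenvalues."""
    observed = np.sort(np.linalg.eigvalsh(np.asarray(observed_corr)))[::-1]
    threshold = random_data_eigenvalues(n_cases, observed.size, **kwargs)
    exceeds = observed > threshold
    if exceeds.all():
        return int(observed.size)
    return int(np.argmin(exceeds))  # index of the first eigenvalue that fails

# Hypothetical polychoric matrix: two clusters of three items each,
# correlating .60 within a cluster and .00 between clusters.
block = np.full((3, 3), 0.6)
np.fill_diagonal(block, 1.0)
poly_corr = np.block([[block, np.zeros((3, 3))],
                      [np.zeros((3, 3)), block]])

retained = n_factors_to_retain(poly_corr, n_cases=500)
print("factors/components to retain:", retained)
```

In this constructed example the two largest eigenvalues of the observed matrix are 2.2 each and the remaining four are 0.4, so only the first two exceed the random-data values.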

Parallel analyses can also be based on random permutations of a raw data set, in which the item/variable distributions are identical to those of the real, raw data. This method of parallel analysis should not be used for item-level data because distribution-similarity factors can still emerge.

Useful References

Basto, M., & Pereira, J. M. (2012). An SPSS R-Menu for ordinal factor analysis. Journal of Statistical Software, 46(4), 1-29.

Bernstein, I. H., Garbin, C., & Teng, G. (1988). Applied Multivariate Analysis. New York: Springer-Verlag. (see especially Chapter 12)

Bernstein, I. H., & Teng, G. (1989). Factoring items and factoring scales are different: Spurious evidence for multidimensionality due to item categorization. Psychological Bulletin, 105, 467-477.

Choi, J., Peters, M., & Mueller, R. O. (2010). Correlational analysis of ordinal data: From Pearson's r to Bayesian polychoric correlation. Asia Pacific Education Review, 11, 459-466.

Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, 7, 286-299.

Gorsuch, R. L. (1988). Exploratory factor analysis. In J. R. Nesselroade & R. B. Cattell (Eds.), Handbook of multivariate experimental psychology (2nd ed., pp. 231-258). New York, NY: Plenum Press.

Holgado-Tello, F. P., Chacón-Moscoso, S., Barbero-García, I., & Vila-Abad, E. (2010). Polychoric versus Pearson correlations in exploratory and confirmatory factor analysis of ordinal variables. Quality & Quantity, 44(1), 153-166. doi:10.1007/s11135-008-9190-y

Lorenzo-Seva, U., & Ferrando, P. (2014). POLYMAT-C: A comprehensive SPSS program for computing the polychoric correlation matrix. Behavior Research Methods. doi:10.3758/s13428-014-0511-x. https://link.springer.com/article/10.3758/s13428-014-0511-x

McLeod, L. D., Swygert, K. A., & Thissen, D. (2001). Factor analysis for items scored in two categories. In D. Thissen & H. Wainer (Eds.), Test scoring (pp. 189-216). Mahwah, NJ: Lawrence Erlbaum.

Nunnally, J., & Bernstein, I. (1994). Psychometric Theory. New York: McGraw-Hill. (see especially pp. 316-318)

Panter, A. T., Swygert, K. A., Dahlstrom, W. G., & Tanaka, J. S. (1997). Factor analytic approaches to personality item-level data. Journal of Personality Assessment, 68, 561-589.

Reise, S. P., Waller, N. G., & Comrey, A. L. (2000). Factor analysis and scale revision. Psychological Assessment, 12, 287-297.

Swygert, K. A., McLeod, L. D., & Thissen, D. (2001). Factor analysis for items or testlets scored in more than two categories. In D. Thissen & H. Wainer (Eds.), Test scoring (pp. 217-249). Mahwah, NJ: Lawrence Erlbaum.

Waller, N. G., Tellegen, A., McDonald, R. P., & Lykken, D. T. (1996). Exploring nonlinear models in personality assessment: Development and preliminary validation of a negative emotionality scale. Journal of Personality, 64, 545-576.

Waller, N. G. (1999). Searching for structure in the MMPI. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every psychologist and educator should know (pp. 185-217). Mahwah, NJ: Lawrence Erlbaum.

Brian P. O'Connor
Department of Psychology
University of British Columbia - Okanagan
Kelowna, British Columbia, Canada
brian.oconnor@ubc.ca