Factor Analysis: A Gentle Introduction
Posted by Jeffrey Henning on Tue, Nov 23, 2010
Ray Poynter, organizer of The New MR Virtual Festival (coming to a web browser near you the week of December 6), recently presented a wonderful introduction to factor analysis. This statistical technique, in his words, “reduces the dimensionality of a space by finding latent factors.” Which is easier to understand than it sounds:
- Reducing the dimensionality – As an analogy, Ray discussed displaying the globe as a map, which moves from three dimensions to two, in the process distorting the picture, making countries like Greenland and the United Kingdom much larger and compressing countries around the equator. Factor analysis reduces the number of dimensions being studied.
- Finding latent factors – Charles Spearman found that students’ test scores across a wide range of subjects were highly correlated. He postulated that a general intelligence factor, g, was responsible: the latent factor that IQ tests attempt to estimate. Latent factors can’t be measured directly; “the observable data doesn’t create the underlying factor, the underlying factor creates the observable data.”
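To make the “factor creates the data” idea concrete, here is a minimal simulation sketch in Python. It is not from Ray’s webinar: the subject names, the sample size and the 0.8 / 0.6 weights are all assumptions chosen for illustration. A single hidden value per student drives every test score, and the scores end up correlated with each other even though the hidden value is never observed.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)          # fixed seed so the run is repeatable
n_students = 2000               # assumed sample size

# The latent factor: one hidden value per student, never observed directly.
g = [rng.gauss(0, 1) for _ in range(n_students)]

# Each observed score = (assumed weight 0.8) * g + subject-specific noise.
subjects = ["maths", "english", "music"]   # hypothetical subjects
scores = {
    s: [0.8 * gi + 0.6 * rng.gauss(0, 1) for gi in g]
    for s in subjects
}

# Correlations between every pair of observed scores.
correlations = {
    (a, b): pearson(scores[a], scores[b])
    for i, a in enumerate(subjects)
    for b in subjects[i + 1:]
}

for pair, r in correlations.items():
    print(pair, round(r, 2))
```

With these assumed weights, each pair of subjects should correlate at roughly 0.8 × 0.8 = 0.64. Factor analysis works in the opposite direction: it starts from correlations like these and infers the hidden factor.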
What are the potential uses for factor analysis?
- Removing attributes from a study – If you are asking respondents to rate service as “fast”, “speedy” and “slow”, these attributes all reflect a latent factor of service speed. Many questionnaires feature such redundancy; a factor analysis can shorten the survey.
- As a first step to regression – When conducting key driver analysis, you want to use latent factors instead of attributes so that you are focusing on distinct drivers rather than on several attributes that measure the same thing.
- As a first step to clustering – As with regression, you want to conduct factor analysis first when conducting cluster analysis. Unlike with regression, you don’t want to cluster on the factor scores themselves; instead, for each factor, use the attribute with the greatest correlation to that factor’s score. Cluster analysis works best with variability and lumpiness, which the factor scores would smooth out.
- Creating factor scores – Rather than report on customer satisfaction, likelihood to recommend and likelihood to continue purchasing, for instance, track and trend the factor score: Advocacy Loyalty.
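The last two uses can be sketched with a few lines of Python. This is a rough illustration only: the ratings are made up, and the “score” here is a unit-weighted composite (the average of the standardized attributes), which is a simple stand-in for a true factor score from a fitted factor model. The sketch builds the composite and then finds the single attribute that correlates most strongly with it, which is the one you would carry into a cluster analysis.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of ratings to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def pearson(xs, ys):
    """Pearson correlation, computed as the mean product of z-scores."""
    return mean([a * b for a, b in zip(zscores(xs), zscores(ys))])

# Hypothetical 1-5 ratings from eight respondents.
ratings = {
    "satisfaction": [5, 4, 2, 5, 3, 1, 4, 2],
    "recommend":    [5, 4, 1, 5, 3, 1, 4, 3],
    "repurchase":   [4, 5, 2, 5, 2, 2, 3, 3],
}

# Stand-in for a factor score: average the standardized attributes.
standardized = {name: zscores(vals) for name, vals in ratings.items()}
n_respondents = len(ratings["satisfaction"])
composite = [
    mean([standardized[name][i] for name in ratings])
    for i in range(n_respondents)
]

# How strongly does each attribute track the composite score?
correlation_with_score = {
    name: pearson(vals, composite) for name, vals in ratings.items()
}

# The attribute to use as the factor's representative when clustering.
best_attribute = max(correlation_with_score, key=correlation_with_score.get)

for name, r in correlation_with_score.items():
    print(name, round(r, 2))
print("use for clustering:", best_attribute)
```

The composite is what you would track and trend over time; the best-correlated raw attribute keeps the respondent-level lumpiness that clustering needs.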
When designing a questionnaire that will use factor analysis, require respondents to answer each attribute that will be used; otherwise you will have to delete incomplete records or impute missing answers. You should also avoid “Don’t Know” and “Not Applicable” responses in this case. If you can’t avoid them, then run two separate factor analyses, one that excludes attributes with such responses and one that doesn’t, then run a correlation analysis with the attributes you removed.
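The two fallbacks for missing answers can be sketched as follows. The records and attribute names are invented, and mean imputation is just one simple choice among several imputation methods, used here only to show the trade-off: deletion shrinks the sample, while imputation keeps every respondent but fills in values nobody actually gave.

```python
# Hypothetical survey records; None marks an unanswered attribute.
records = [
    {"fast": 5,    "friendly": 4, "clean": 5},
    {"fast": None, "friendly": 3, "clean": 4},
    {"fast": 2,    "friendly": 2, "clean": None},
    {"fast": 4,    "friendly": 5, "clean": 3},
]
attributes = ["fast", "friendly", "clean"]

# Option 1: delete incomplete records (listwise deletion).
complete = [r for r in records if all(r[a] is not None for a in attributes)]

# Option 2: impute each missing answer with that attribute's observed mean.
observed_means = {}
for a in attributes:
    answered = [r[a] for r in records if r[a] is not None]
    observed_means[a] = sum(answered) / len(answered)

imputed = [
    {a: (r[a] if r[a] is not None else observed_means[a]) for a in attributes}
    for r in records
]

print("records kept after deletion:", len(complete), "of", len(records))
print("records kept after imputation:", len(imputed))
```

Requiring an answer to every attribute avoids having to make this choice at all.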
When redesigning a questionnaire to remove attributes, look at the verbatim comments to see what attributes people discuss. Don’t just remove attributes based on how they load to the factor score. “In a perfect world, with a normal distribution of values around the mean, the worst attribute is where nearly everyone rates it a 4 or 5,” Ray said. “I’ve even seen some studies where an attribute is 5 of 5 for almost everyone.” Do look at the correlation coefficients, but remove attributes with a low standard deviation. And, of course, which attributes to remove can be political: “Customers [of market researchers] have some questions that they are more in love with than others, and some annual compensation might be linked to specific attributes.”
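A quick way to spot the “nearly everyone rates it a 5” attributes is to compute each attribute’s standard deviation. The sketch below uses made-up ratings, and the 0.5 cutoff is an arbitrary assumption for illustration, not a recommendation from the webinar.

```python
from statistics import pstdev

# Hypothetical 1-5 ratings from eight respondents.
ratings = {
    "fast_service":   [1, 3, 5, 2, 4, 5, 2, 3],
    "friendly_staff": [5, 5, 5, 5, 4, 5, 5, 5],   # almost everyone gives a 5
    "clean_store":    [2, 4, 3, 5, 1, 4, 3, 5],
}

MIN_STDEV = 0.5   # assumed cutoff; choose one that suits your scale

# Standard deviation of each attribute's ratings.
spread = {name: pstdev(vals) for name, vals in ratings.items()}

# Attributes with too little variation to tell respondents apart.
candidates_to_remove = [name for name, sd in spread.items() if sd < MIN_STDEV]

for name, sd in spread.items():
    print(name, round(sd, 2))
print("low-variation attributes:", candidates_to_remove)
```

An attribute flagged this way is only a candidate for removal; the verbatim comments and the politics described above still apply.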
For an example of how to conduct a factor analysis, please see the recording of Ray’s excellent (and short!) webinar, “Factor Analysis: An Introductory Webinar”. Market researchers are like cartographers: creating maps that simplify a market in order to better navigate it. Factor analysis is a great mapmaking tool.
And do consider signing up to attend The New MR Virtual Festival the week of December 6 – there are plenty of excellent sessions to choose from.