Acquiescence Bias & Agree-Disagree Scale Best Practices
Posted by Vovici Blog on Mon, Sep 28, 2009
Nothing is easier for the survey author than to dash off a questionnaire asking respondents to rate a bunch of items on an agree-disagree scale (also known as the Likert scale). For instance:
For each of the following statements, please indicate if you: Completely disagree, Disagree, Somewhat disagree, Neither agree nor disagree, Somewhat agree, Agree, Completely agree.
- My overall job satisfaction is very high.
- The issue of excessive executive compensation is very important to me personally.
- I rarely feel discouraged with my work.
- I am very likely to seek employment elsewhere in the next six months.
You can easily add many other statements to this list for respondents to rate. In fact, I have seen questionnaires with 80 to 100 items, all to be rated on this agreement scale.
Unfortunately, in such batteries of questions, respondents exaggerate their actual agreement. More than 100 studies have now demonstrated acquiescence response bias: some respondents will agree to almost any assertion. Saris, Krosnick and Shaeffer identify three reasons for this in their paper "Comparing Questions with Agree/Disagree Response Options to Questions with Construct-Specific Response Options":
- Some respondents are simply agreeable, and indicate agreement out of politeness.
- Other respondents expect that the researchers agree with the listed items and defer to their judgment.
- Most respondents engage in survey satisficing and find that agreeing takes less effort than carefully weighing each optional level of disagreement and agreement.
The standard solution to acquiescence response bias has been to have a balanced battery of items, where each item has a negated counterpart somewhere else in the questionnaire. For instance, averaging the agreement level to "I am generally a satisfied employee" with "I am not generally a satisfied employee" was thought, in theory, to produce a rating that factored out the acquiescence response bias. Saris, Krosnick and Shaeffer put that to the test and found three ways it leads to lower data quality: having to answer twice as many questions, which leads to satisficing; processing negations, which are more complex cognitively; and placing respondents who acquiesce in the middle of the scale.
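To see why acquiescers land in the middle of the scale, here is a minimal Python sketch of how a balanced pair is typically scored. It assumes the seven-point scale above is coded 1 to 7 and that the negated item is reverse-coded before averaging; the coding and the `balanced_score` helper are illustrative assumptions, not taken from the Saris, Krosnick and Shaeffer paper.

```python
# Seven-point agreement scale from the example above, coded 1..7 (assumed coding).
SCALE = [
    "Completely disagree", "Disagree", "Somewhat disagree",
    "Neither agree nor disagree",
    "Somewhat agree", "Agree", "Completely agree",
]
SCORE = {label: i + 1 for i, label in enumerate(SCALE)}


def balanced_score(positive_answer, negated_answer):
    """Average a statement with its negated counterpart (hypothetical helper).

    The negated item is reverse-coded (1 <-> 7, 2 <-> 6, ...) so that a
    higher number always means more of the underlying construct.
    """
    pos = SCORE[positive_answer]
    neg_reversed = len(SCALE) + 1 - SCORE[negated_answer]
    return (pos + neg_reversed) / 2


# A consistent, satisfied respondent: agrees with "I am generally a satisfied
# employee" and disagrees with "I am not generally a satisfied employee".
consistent = balanced_score("Agree", "Disagree")

# An acquiescent respondent: agrees with both statements.
acquiescent = balanced_score("Agree", "Agree")

print(consistent, SCALE[round(consistent) - 1])    # 6.0 Agree
print(acquiescent, SCALE[round(acquiescent) - 1])  # 4.0 Neither agree nor disagree
```

The averaging cancels the acquiescence, but only by pushing that respondent to the midpoint, which says nothing about how satisfied they actually are.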
The solution with the highest data quality does lead to more work for the survey author. Each question needs to be asked with what Saris et al. call "construct-specific response options": in other words, a rating scale that directly measures the item in question. Applying this recommendation to the four questions above yields:
- How would you rate your job satisfaction overall? Not at all satisfied, Slightly satisfied, Moderately satisfied, Very satisfied, Completely satisfied?
- How important is the issue of excessive executive compensation to you personally? Not at all important, Slightly important, Moderately important, Very important, Extremely important?
- How often do you feel discouraged with your work? Never, Rarely, Sometimes, Often, Always?
- How likely are you to seek employment elsewhere in the next six months? Not at all likely, Slightly likely, Moderately likely, Very likely, Completely likely?
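If you build questionnaires programmatically, the four rewritten questions might be represented roughly as follows. This is only a sketch: the `QUESTIONS` structure and the `score` helper are hypothetical names chosen for illustration, and coding each answer by its position in the list is an assumption.

```python
# Each question carries its own construct-specific response options,
# listed from lowest to highest (hypothetical structure for illustration).
QUESTIONS = [
    {
        "text": "How would you rate your job satisfaction overall?",
        "options": ["Not at all satisfied", "Slightly satisfied",
                    "Moderately satisfied", "Very satisfied",
                    "Completely satisfied"],
    },
    {
        "text": ("How important is the issue of excessive executive "
                 "compensation to you personally?"),
        "options": ["Not at all important", "Slightly important",
                    "Moderately important", "Very important",
                    "Extremely important"],
    },
    {
        "text": "How often do you feel discouraged with your work?",
        "options": ["Never", "Rarely", "Sometimes", "Often", "Always"],
    },
    {
        "text": ("How likely are you to seek employment elsewhere "
                 "in the next six months?"),
        "options": ["Not at all likely", "Slightly likely",
                    "Moderately likely", "Very likely",
                    "Completely likely"],
    },
]


def score(question, answer):
    """Code an answer as its 1-based position on that question's own scale."""
    return question["options"].index(answer) + 1


# Example: "Sometimes" is the third of five frequency options.
print(score(QUESTIONS[2], "Sometimes"))  # 3
```

Because every scale here is unipolar with five points, the scores stay comparable in range even though the labels differ from question to question.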
At first glance, this looks like more work for the respondent, who must read the choice list for each question. While there is more to read, there is less to think about. For the statement, "I am generally a satisfied employee", respondents might have come up with four reasons to disagree:
- They are generally dissatisfied.
- They are neither dissatisfied nor satisfied.
- They are often satisfied, but not often enough to classify it as "generally".
- They are always satisfied, which is more often than "generally".
The agreement/disagreement scale gives the respondent too much to think about for each item and too many potential reasons to disagree. If this particular question were reworded to use a satisfaction scale, respondents would only be rating their satisfaction. Less work mentally, and a more accurate answer.
The best practices for agree-disagree scales are therefore simple:
- Avoid them if at all possible, rephrasing each question to use a common rating scale where possible, otherwise using a custom rating scale.
- When the executives or customer sponsoring the research dictate that you must use an agreement scale, use the seven-item bipolar scale:
Completely disagree, Disagree, Somewhat disagree, Neither agree nor disagree, Somewhat agree, Agree, Completely agree. An alternative wording is: "Strongly disagree, Disagree, Slightly disagree, Neither agree nor disagree, Slightly agree, Agree, Strongly agree."
The Likert scale has a long and illustrious history, having been invented in 1932. Sadly, it is now as obsolete as the cathode-ray tube (which, coincidentally, RCA first demonstrated could be used to receive TV transmissions in 1932). Construct-specific response options are the HDTV of the survey world.