Survey Software, Web Survey, Online Surveys, and Enterprise Feedback Management solutions from Vovici


Welcome to the Listening Post!

Your single source for everything Voice of the Customer (VoC) and Customer Experience (CxP). And, don’t forget you can follow us on twitter @vovici, or come check us out on Facebook and join the Vovici Network on LinkedIn.

 


On a Scale of 0 to 10, Numeric Ratings are a 6 Pack of Suckitude

 

I can't seem to convince people to abandon numeric rating scales and use fully labeled scales, despite research into rating scale best practices showing that numeric scales 1) are less reliable, 2) have lower predictive validity, 3) confuse the less educated, 4) have poor interrater reliability, 5) are artificial, and 6) suffer mode bias in IVR surveys.

So I am going to repackage my advice to be more in keeping with current market research fads:

Avoid numeric rating scales because they are not conversational.

Recall the old way: "On a scale of 0 to 10, where 0 is not at all likely and 10 is completely likely, how likely are you to recommend us?"

Now imagine the follow-up question: "Why did you rate us a 9?"

"Because I don't give out 10s."

Wow, thanks. (But hardly atypical: questions that re-state ratings often generate comments about the question itself.)

Instead, phrase it this way. "How likely are you to recommend us?" Short and sweet, no mention of scales, no pleas to "rate us".

OK, you do have to prompt "Not at all likely, Slightly likely, Moderately likely, Very likely, Completely likely." Long-winded and not very conversational, I'm afraid, but in an online survey people skim such choice lists as they quickly look for the appropriate answer, and research has shown that respondents prefer fully labeled scales. And the follow-up questions are now much more natural:

"Why are you not completely likely to recommend us?"

Much better than "Why did you rate us a 9?" 

Avoiding numeric scales is going to provide you information with greater reliability, validity and consistency, with less bias by mode and demographic. And you're going to get to avoid "survey speak" in favor of more natural wording and question flow.

Anyway, it's something to talk about.

Comments

My feelings exactly.
Posted @ Wednesday, May 05, 2010 10:34 AM by Lori Langone
Hi Jeffrey - I agree that fully labeled scales are best. However, since we plan to switch at least our "willingness to recommend" question from our prior 7-point scale to the 11-point Net Promoter Score scale, that will be a lot of words. Questions: 
 
1) Does this change your recommendation? Such as label just some? 
 
2) What labels would you use for all 11 points? 
 
PS: I know you don't recommend the NPS scale, but for a variety of reasons (mainly requirement that we have a number that is comparable to other companies/industries), we are feeling compelled to make this switch. Thanks! 
 
Posted @ Wednesday, May 05, 2010 11:01 AM by Pam Snodgrass
But, but, but... 
 
Then I cannot calculate means and what would we do without means? 
 
In all seriousness, the statisticians do not like this approach because it affects the types of analysis that can be conducted.
Posted @ Wednesday, May 05, 2010 11:09 AM by Bob Fichtner
But you can assign numeric report values to the verbal responses behind the scenes that allow you to run analytics, without the survey respondents needing to see those numbers.
Posted @ Wednesday, May 05, 2010 11:12 AM by Lori Langone
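A minimal sketch of the behind-the-scenes coding described in the comment above, using the five labels from the post. The 1-to-5 report values and the names `REPORT_VALUES` and `mean_score` are illustrative assumptions, not anything a particular survey package prescribes, and averaging them presumes equal spacing between the labels.

```python
from statistics import mean

# Hypothetical report values: the 1-5 coding is an assumption for illustration.
# Respondents see only the labels; the numbers are used in reporting.
REPORT_VALUES = {
    "Not at all likely": 1,
    "Slightly likely": 2,
    "Moderately likely": 3,
    "Very likely": 4,
    "Completely likely": 5,
}


def mean_score(responses):
    """Return the mean report value for a list of verbal responses."""
    return mean(REPORT_VALUES[label] for label in responses)


if __name__ == "__main__":
    sample = ["Very likely", "Completely likely", "Moderately likely", "Very likely"]
    print(mean_score(sample))
```

Keeping the mapping in one place means the wording shown to respondents can change without touching the analysis code.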
Yes, I would definitely assign the numbers in the background, and since it would be an 11-point scale, there would still be a good mean score. Still don't love the idea of an 11-point scale, but since so many other companies seem to still use it, we don't have a lot of choice. Although we will continue to use 7-point scales for other questions in our annual survey and 5-point scales for our transactional surveys.
Posted @ Wednesday, May 05, 2010 11:16 AM by Pam Snodgrass
Lori - I agree. But that presumes a linear relationship and equi-distant spacing between the response choices. Is the distance between "Extremely sat" and "Very sat" the same as the distance between "Very sat" and "Somewhat sat?" The stat guys argue that they are not and it can foul up some of the advanced analytics. 
 
Most people do it anyway, but that doesn't mean there isn't a tradeoff.
Posted @ Wednesday, May 05, 2010 11:16 AM by Bob Fichtner
No statistician worth their salt should prefer a numeric scale over a labeled scale. They are both effectively ordinal scales - not even interval scales. The study of magnitude estimation looks at what people mean when they give it a 9 instead of a 3: it does not mean 3 times as much, nor does it mean twice the gap from 3 to 6. 
 
The psychometricians established that labeled scales were more robust (e.g. the Likert scale). 
 
The 11-point NPS scale is an interesting odd one out. It is a 3 point scale, Promote, Don't Know/Care, and Detract. The layout of the scale is simply to aid people putting themselves into one of the three boxes. 
 
The essential problem is that humans are not numeric processing machines, we are at best ordinal, and often only capable of paired choices.
Posted @ Wednesday, May 05, 2010 11:42 AM by Ray Poynter
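The three-box reading of the 11-point scale can be sketched as follows. The cutoffs are the standard Net Promoter ones (9-10 promoters, 7-8 passives, 0-6 detractors), and the score is the percentage of promoters minus the percentage of detractors. The function names `nps_box` and `net_promoter_score` are hypothetical.

```python
def nps_box(rating):
    """Collapse a 0-10 rating into one of the three Net Promoter boxes."""
    if not 0 <= rating <= 10:
        raise ValueError("rating must be between 0 and 10")
    if rating >= 9:
        return "promoter"
    if rating >= 7:
        return "passive"
    return "detractor"


def net_promoter_score(ratings):
    """Return percent promoters minus percent detractors (-100 to 100)."""
    boxes = [nps_box(r) for r in ratings]
    promoters = boxes.count("promoter")
    detractors = boxes.count("detractor")
    return 100.0 * (promoters - detractors) / len(boxes)


if __name__ == "__main__":
    print(net_promoter_score([10, 10, 9, 8, 3]))
```

Note that an 8 and a 7 contribute the same amount to the score, as do a 6 and a 0, which is why the scale behaves like three categories in reporting.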
Good discussion. Many statisticians/analysts prefer 11-point scales as they enable a finer degree of discrimination (assuming the scale is appropriately and consistently interpreted by the sample, etc.). However, fully-labeled scales are superior as they more concisely define the ratings for the respondent. Ultimately it would seem that fully-labeled scales will provide higher quality data - ergo would be the superior option if one is forced/able to choose one or the other.
Posted @ Wednesday, May 05, 2010 12:25 PM by Brandon Watts
Thanks to everyone for the great comments! 
 
Pam, when using benchmark questions, I prefer to keep the question exactly as is, even if I find the scale suboptimal. As Lori points out, don't feel that how you show the scale to respondents needs to be the same as how you show it when reporting: 
Standardization of Scales in Survey Analysis
 
Ray, thanks for commenting. I would argue that the 11-point scale is being analyzed wrong: presented as unipolar to respondents, yet interpreted as bipolar (NPS is a Misnomer). 
 
Brandon, a 0-to-100 point scale offers even greater discrimination and is easy to understand: unfortunately, respondents don't need greater discrimination and can't use it consistently. Statisticians should be talked out of the 0-to-10 point scale for similar reasons.
 
Bob, you've given me an idea for another post. :-)
Posted @ Wednesday, May 05, 2010 12:55 PM by Jeffrey Henning
By abandoning numeric scales it does not mean that you can’t post a numerical score. We use descriptors with absolutes at 1 and 10.
Posted @ Thursday, May 06, 2010 3:38 AM by Stuart Lamb
Good discussion. I usually prefer doing "frequency" so that I don't need to worry about the distance between different options. For more advanced analysis, it may not be that handy, but it is fine for basic reporting.
Posted @ Thursday, May 06, 2010 12:01 PM by YCC
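A rough sketch of the frequency approach, which reports counts and percentages per label and so makes no assumption about the distance between options. The function name `frequency_table` and the sample data are hypothetical; the labels are the five from the post.

```python
from collections import Counter

SCALE = [
    "Not at all likely",
    "Slightly likely",
    "Moderately likely",
    "Very likely",
    "Completely likely",
]


def frequency_table(responses, scale=SCALE):
    """Return (label, count, percent) rows in scale order, including zero counts."""
    counts = Counter(responses)
    total = len(responses)
    return [(label, counts[label], 100.0 * counts[label] / total) for label in scale]


if __name__ == "__main__":
    sample = ["Very likely", "Very likely", "Completely likely", "Slightly likely"]
    for label, count, percent in frequency_table(sample):
        print(f"{label:<20} {count:>3} {percent:6.1f}%")
```

Listing every label in scale order, even those nobody chose, keeps tables comparable from one survey wave to the next.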
I have to agree with Brandon about the NPS. I find the scale useful and will continue to use it. However elsewhere I always try and use the fully labeled scales
Posted @ Monday, May 10, 2010 11:08 AM by alistair
I think we also need to consider where the research is being fielded. English is very rich in adjectives, but this is not true of many languages. It can be difficult to obtain the same degree of granularity with verbal scales that we can in English.
Posted @ Monday, May 10, 2010 11:52 PM by Kevin Gray
Jeffrey, Nice post. Definitely food for thought. Two questions: 
 
1. How do you recommend setting up labeled scales in an online environment? My survey software barely allows for end-point labels (readability and presentation concerns), let alone labels for a full 11-point scale. 
2. What kind of labels would you use? Most articles I see deal with 5- or 7-point scales only.
Posted @ Wednesday, May 12, 2010 11:06 AM by Lynne
I am also feeling compelled to use the numeric NPS scale, due to a need to have a standard comparable to industry measures and a need for good data for advanced analytics (although I agree that even numeric scales could be perceived as ordinal rather than interval or ratio). I have not looked at the research but I would be surprised to find that people are not good at using 0-10 scales to rate their likelihood to recommend. Although it does seem silly to ask people "why did you give a 9?" However, I work in a business-to-business context so my respondents are usually fairly educated. Does anyone know if Professor Don Dillman is part of this network and would comment on the use of the 11-point NPS scale?
Posted @ Wednesday, May 12, 2010 1:53 PM by Theresa Ditton
Lynne, since numbers with labels can confuse respondents, use a standard choose-one or select-one question instead of a numeric scale question type. For writing scales, I use two common patterns, which I describe in my post Custom Scale Development (http://blog.vovici.com/blog/bid/18620/Custom-Scale-Development). 
 
Professor Dillman does not read this blog, nor has he published any papers on NPS, but according to his published writings he cites Krosnick and Fabrigar 1997 for use of fully labeled scales and has used five-point scales in his own research.
Posted @ Saturday, May 22, 2010 4:48 PM by Jeffrey Henning