Survey Research & Enterprise Feedback Management
Posted by Jeffrey Henning on Fri, Jun 19, 2009
If you are simply developing a list of choices for a choose-one question, and those choices have no relative relationship to one another, then you are not developing a rating scale. When you report on these choices, you will simply report on the frequency with which each choice was selected and highlight the most frequently selected choices. Use these choose-one best practices to come up with the appropriate choice list.
For a rating scale, on the other hand, you want the labels to be spaced at standard intervals from one another. You plan on reporting on the arithmetic mean of the answers, not just choice frequencies, and you may try to discover correlations between the numeric rating and other variables in the survey.
How to label the scale depends in part on whether the scale is unipolar (ranging from 0% to 100% of a property) or bipolar (where the zero point is in the middle and the end points are opposites, such as “completely dissatisfied” and “completely satisfied”).
If you are developing a unipolar rating scale, use a five-point numeric scale such as 0 to 4 or 1 to 5, choosing a label for each point. A common approach to unipolar scales follows this wording:
- Not at all cromulent
- Slightly cromulent
- Somewhat cromulent
- Moderately cromulent
- Extremely cromulent
For a bipolar rating scale, use a seven-point scale ranging from -3 to 3, choosing a label for each point. Bipolar rating scales are easier to write, as the wording should be in parallel for positive and negative items with the same absolute value. For instance, for measuring satisfaction, a good bipolar scale is:
- Completely dissatisfied
- Mostly dissatisfied
- Somewhat dissatisfied
- Neither satisfied nor dissatisfied
- Somewhat satisfied
- Mostly satisfied
- Completely satisfied
Purists typically insist that the midpoint take the form “Neither satisfied nor dissatisfied” but others prefer to label the midpoint “Neutral” for succinctness.
If you want to use a different word or phrase for each label, take care that the words are approximately equally spaced. For instance, Jon Krosnick and Leandre Fabrigar in “Designing rating scales for effective measurement in surveys” summarize the results of four studies into scale values for labels assessing liking. Clearly, the scale “Very Poor, Poor, Fair, Good, Excellent” does a good (but not excellent!) job of spacing out each label.
Developing an original scale such as this requires pre-testing and is probably inappropriate for most business researchers to attempt. In those cases where no existing rating scale will do, instead use a five-point unipolar scale with just the endpoints labeled or a seven-point bipolar scale with the endpoints and midpoint labeled. While not ideal, and against best practices, you are less likely to go wrong using such an approach than simply making up a scale of your own.
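To make the reporting difference concrete, here is a minimal Python sketch of how the bipolar satisfaction scale above might be coded from -3 to 3 so that you can report both choice frequencies and the arithmetic mean. The function name `summarize` and the sample responses are hypothetical, invented purely for illustration; your survey software will likely do this coding for you.

```python
from collections import Counter
from statistics import mean

# Bipolar satisfaction scale from the post, coded -3..3 (zero point in the middle).
BIPOLAR_SCALE = {
    "Completely dissatisfied": -3,
    "Mostly dissatisfied": -2,
    "Somewhat dissatisfied": -1,
    "Neither satisfied nor dissatisfied": 0,
    "Somewhat satisfied": 1,
    "Mostly satisfied": 2,
    "Completely satisfied": 3,
}


def summarize(responses):
    """Return (frequency table, arithmetic mean) for a list of label responses."""
    frequencies = Counter(responses)
    scores = [BIPOLAR_SCALE[label] for label in responses]
    return frequencies, mean(scores)


# Hypothetical responses, made up for this example only.
sample_responses = [
    "Mostly satisfied",
    "Completely satisfied",
    "Somewhat dissatisfied",
    "Neither satisfied nor dissatisfied",
    "Mostly satisfied",
]

frequencies, average = summarize(sample_responses)
print(frequencies.most_common(1))  # most frequently selected label
print(average)                     # mean rating on the -3..3 scale
```

For a plain choose-one list with no inherent order, only the frequency table is meaningful; the mean only makes sense when the labels are spaced at roughly equal intervals.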
Posted by Jeffrey Henning on Tue, Jun 09, 2009
Over the past 22 years, I have helped many clients write surveys, and more often than not, novice authors listed the questions in the questionnaire simply in the order in which the questions occurred to them. Good surveys have a natural flow to them, but to achieve that flow typically requires editing and reordering.
Rearranging your questions to follow a standard order gives your respondents a better experience. I often use an inverted pyramid approach, drilling down:
- Screener questions
- Open-ended questions
- General questions
- Specific questions
- Demographics or Firmographics
- Follow-up questions
You could follow the links and read each of these posts separately, but for a more cohesive view of structuring questionnaires, I'd recommend downloading a copy of the ebook Survey Software Success and checking out chapter 5.
Posted by Jeffrey Henning on Fri, Jun 05, 2009
Your boss says that the questionnaire you're writing should ask your customers "What is your overall satisfaction with our product?" and should use a five-point scale. Sounds simple, you think to yourself, then go to write the question. Which format do you use?

Like many other tactical issues regarding questionnaire design, scale formats have been carefully studied. According to the summary of available research by Jon Krosnick and Leandre Fabrigar in "Designing rating scales for effective measurement in surveys":
- Respondents prefer rating scales with more verbal labels
- Respondents believe such scales provide more valid measurement
- Choosing a labeled choice is a more natural mental activity (not to mention more conversational) than selecting a number within a range
- Longitudinal reliability is greater when using fully labeled scales instead of partially-labeled scales
- Validity, especially inter-rater validity, is greater using fully labeled scales
- Using fully labeled scales provides greater reliability and greater validity from respondents with low to moderate education
- Because numeric values can confuse respondents and affect the choices they make, it is better to omit numeric labels altogether.
Given the importance of fully labeling a rating scale, choose an existing common scale where possible, rather than writing your own scale. Reword the question if necessary to fit a common scale. And, of course, take care when deciding how many points to use within the scale.
Posted by Jeffrey Henning on Tue, Jun 02, 2009
Having looked at the order effects of choices in web surveys a few weeks ago, I thought it appropriate to look at the order effects of questions themselves. Does re-arranging the order of the questions affect responses?
The market researcher would prefer that the respondent consider each question in isolation, unrelated to any questions that have been asked before. Of course, respondents are not robots, and earlier questions will unfortunately bring topics to mind that can "contaminate" later answers.
The example of such contamination that I have seen in my own surveys is the order of general questions vs. specific questions. If the general question is an open-ended question, many survey authors I've worked with prefer to put it after the closed-ended question, since open-ends are harder to answer (requiring thinking and typing rather than thinking and clicking a button). But when asking the verbatim question second, you will definitely get a greater percentage of respondents talking about the previous questions.
In the paper "Effects of Question Order on Survey Responses" by Sam McFarland, some respondents were asked general questions (describing their interest in politics and religion) and then specific questions (evaluating the state of the economy and the energy market) while others were asked the specific questions first. Asking the specific questions first increased the likelihood that respondents would report an interest in the general questions.
| | Test A | Test B |
|---|---|---|
| Question Order | 1. General 2. Specific | 1. Specific 2. General |
| General Results | Control | Greater interest in specific items |
| Specific Results | No change | No change |
As a result, my preference continues to be to ask an open-ended question first about how an organization can improve a product or service, then follow up with a closed-ended question presenting a range of items to be rated.
Because early questions can contaminate later questions, sometimes one question order for every respondent is the wrong approach. When asking a respondent to rate two or more contrasting items (typically products, services or organizations), it is customary to rotate the order of the items, so that the consistent assessment of one item before another doesn't introduce any bias into the results. In survey software, this is typically accomplished by setting up page rotations that randomly rotate pages or other blocks of questions. This is analogous to randomizing choices in a choice list.
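For readers curious what a page rotation amounts to, here is a rough Python sketch. It is not how any particular survey package implements the feature; the function `rotate_blocks` and the example questions are hypothetical. The idea is simply that the blocks are shuffled per respondent while the questions inside each block keep their order.

```python
import random


def rotate_blocks(blocks, rng=None):
    """Return a randomly ordered copy of the blocks.

    Questions inside each block keep their original order; only the
    order of the blocks themselves changes. The input is not modified.
    """
    rng = rng or random.Random()
    rotated = list(blocks)
    rng.shuffle(rotated)
    return rotated


# Hypothetical question blocks, one per product being rated.
blocks = [
    ["Rate Product A on quality", "Rate Product A on price"],
    ["Rate Product B on quality", "Rate Product B on price"],
]

# Each respondent gets an independently drawn block order.
# Seeding by respondent id here is only to make the example repeatable.
for respondent_id in range(3):
    order = rotate_blocks(blocks, random.Random(respondent_id))
    print(respondent_id, [block[0] for block in order])
```

Across many respondents, each item is assessed first about equally often, so no item systematically benefits or suffers from its position.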
A free subscription to this blog to the first person with an RSS reader who can tie the photo to this topic! (Clearly I need to hire the MR blogger Zebra Bites as a photo consultant.)
Posted by Jeffrey Henning on Thu, May 28, 2009
In March, Scott Blacker joined us as our new senior director of product management. In his career, Scott has held multiple product management positions, most recently with Rosetta Stone, the leader in language-learning software. Appropriately, then, Scott's first post is on the perils of translation and surveys.
In the race for global market share, many organizations with an international customer base require surveys to be deployed in multiple languages. Typically, the survey is written in the native language of the survey author, translated into the target language(s), and then deployed. Unfortunately, this process misses a critical, often-overlooked step: back-translating a survey into the native language of the survey author.
The reasons for skipping this step are easy enough to understand. Translation is expensive, and paying to both translate and back-translate a survey doubles the cost. Additionally, time demands on survey deployment are often intense, and back-translating can add days to the survey deployment timeline. However, skipping this step can have serious consequences when ultimately analyzing survey response data, effectively killing the survey ROI.
The original survey author is the subject matter expert on the topic at hand. The nuance of how a question is posed - and the specific word choices involved - matters greatly in determining the nature and validity of the final data collected. Translators may have several linguistically correct options to choose from during the translation process, but may choose a nuance that misses the original intention of the survey author. Back-translating through a second translator (who has no affiliation with the original translator) greatly reduces the likelihood of this type of error. Back-translating allows the original survey author to:
- Validate the quality of the initial translation
- Ensure that the nuance of the translation matches the original intent
- Open up a dialogue with multiple translators to build a consensus around the best possible translation.
This actually happened to an associate of mine just last year. In deploying a satisfaction survey into the Japanese market, he paid a premium for a top-of-the-line translator. While the survey was linguistically correct and made perfect sense in Japanese, the translator had chosen a word for "satisfaction" that was closely aligned with "happy" in the Japanese language. When the results came in, it appeared as if the product was a success - nearly 80% of Japanese respondents indicated that they were "very happy" or "somewhat happy" with the product.
However, when looking over some of the open-ended qualitative feedback, he realized that something was amiss. Japanese respondents had interpreted the word not as "happy", but as "fun". So yes...80% of Japanese respondents had indicated that the product was "fun" (which it was - it was a learning game), but fun in this case bore little correlation to satisfaction. Other elements, such as the ability to achieve a learning objective (after all, it was a learning game), turned out to be much more relevant to the customer's overall satisfaction.
In this case, the problem was caught, but not before the results had been presented to the CEO. The survey had to be re-run, incurring additional costs and delaying decisions on whether a major marketing campaign should be run. It also didn't reflect well on the market research department that was responsible for deploying the survey. More seriously though, had the problem not been caught, the company might have invested millions of marketing dollars into a product that had no chance of succeeding in that market.
As a survey author, you spend hours agonizing over diction when constructing questions in your native language. Not investing the same time and resources into ensuring that the same nuance is appropriately reflected in your globally deployed surveys could cause you to fail just inches shy of the finish line.
Posted by Jeffrey Henning on Wed, May 20, 2009
We last looked at respondent behavior with the post Long Surveys Turn Respondents into Liars. Well, similarly, long choice lists turn respondents into satisficers, selecting a satisfactory answer rather than the optimal answer.
Jon Krosnick and Duane Alwin in the report "An evaluation of a cognitive theory of response order effects in survey measurement" provide an excellent summary of the past research that documented this behavior:
Studies of impression formation [1], the impact of persuasive communications [2], sequential processing of performance information [3], and the serial position effect [4] all suggest that when items are presented visually on "show cards," primacy effects are to be expected. This occurs for two main reasons.
- Items presented early may establish a cognitive framework or standard of comparison that guides interpretation of later items. Because of their role in establishing the framework, early items may be accorded special significance in subsequent judgments.
- Items presented early in a list are likely to be subjected to deeper cognitive processing; by the time a respondent considers the final alternative, his or her mind is likely to be cluttered with thoughts about previous alternatives that inhibit extensive consideration of it. Research on problem-solving suggests that the deeper processing accorded to early items is likely to be dominated by generation of cognitions that justify selection of these early items [5]. Later items are less likely to stimulate generation of such justifications (because they are less carefully considered) and may therefore be selected less frequently.
So, now that we know that our respondents do this, how do we address this issue when constructing choice lists?
- If a long choice list can be structured into an outline, present the choices as a hierarchical question instead.
- Consolidate the long choice list into a shorter list that makes fewer distinctions.
- For long lists that can't be modified, use randomization. While it would be too costly in a paper survey to have multiple versions of the questionnaire, each presenting choice lists in different orders, for a web survey the ability to randomize choice lists is a built-in capability of most survey software and has no added cost to use. Such randomization isn't needed for long lists that respondents don't have to read; for instance, alphabetized lists of states or countries, where the respondent knows the answer without reading the choice list and is simply finding the choice in the list. Nor is randomization appropriate for rating scales. Instead, randomize the choices for any long list that lacks an inherent order.
- Finally, Krosnick and Alwin advise attempting "to increase respondent motivation in order to increase concentration and decrease satisficing. Motivation may be increased by adding special instructions informing respondents that the question they are about to answer is relatively difficult and requires extra concentration."
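The randomization advice above can be sketched in a few lines of Python. This is an illustration only, not the behavior of any specific survey tool; the function `present_choices` and its `inherent_order` flag are hypothetical names. Lists with an inherent order (rating scales, alphabetized states or countries) are left alone, and everything else is shuffled per respondent.

```python
import random


def present_choices(choices, inherent_order=False, rng=None):
    """Return the choices in the order a respondent should see them.

    Lists with an inherent order (rating scales, alphabetized lookups)
    are returned unchanged; all other lists are randomly reordered.
    """
    if inherent_order:
        return list(choices)
    rng = rng or random.Random()
    return rng.sample(list(choices), k=len(choices))


# Hypothetical long list with no inherent order: randomize it.
features = ["Reporting", "Branching", "Email invitations", "Translation", "Panels"]
print(present_choices(features))

# A rating scale has an inherent order: never randomize it.
scale = ["Very Poor", "Poor", "Fair", "Good", "Excellent"]
print(present_choices(scale, inherent_order=True))
```

Because each respondent sees a different order, any primacy effect is spread evenly across the choices instead of inflating whichever items happen to be listed first.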
[1] Asch, 1946; Nisbett & Ross, 1980, p. 172-175; Anderson & Hubert, 1963; Sherif, 1935; 1936; Lingle & Ostrom, 1981; Anderson & Barrios, 1961; Dreben, Fiske, & Hastie, 1979.
[2] Miller & Campbell, 1959; Ronis et al., 1977; Crano, 1977; Hovland et al., 1957; Insko, 1964.
[3] Jones et al., 1968.
[4] Bruce & Papay, 1970; Crowder, 1969; Rundus, 1971.
[5] Koriat, Lichtenstein, & Fischhoff, 1980; Hoch, 1984; Klayman & Ha, 1984; Tschirgi, 1980; Wason & Johnson-Laird, 1972.
Posted by Jeffrey Henning on Mon, May 11, 2009
The saying Garbage In, Gospel Out reflects our willingness to believe computer output, even if it was generated from bad input. Survey researchers are no more immune to this tendency than computer scientists, as poorly worded questions can lead to suspect results and erroneous conclusions.
We recognize bad questions when we see them in other people's surveys, but we don't recognize them as easily when we write them ourselves. Each of the following examples is adapted from public surveys:
- "How likely is it that you will attend the 2009 Expo at our new, low entry prices?"
- "The 2009 Forum will be held July 24-26 in central Iowa. Please check all the reasons why you may choose not to attend."
- "What type of involvement would you like to have with the Celebration of Nations program?"
- "What do you think of those titles? Yes, we are clearly not creative, but that's why we are asking for your help."
So how do you write objective questions that don't bias the results one way or the other?
- Your questions should use nonjudgmental wording and neutral terms. Respondents should not be able to determine where you stand on any topic. (See this post for an example of how to research attitudes towards abortion.)
- Don't presuppose anything (one type of leading question). For the third example above, for instance, ask instead "What type of involvement, if any, would you like to have with the Celebration of Nations program?"
- Don't ask other types of leading questions. (See the links for details.)
- Avoid so-called "double barreled questions" by splitting them into two. Instead of "How would you rate our price and service?" ask "How would you rate our price?... How would you rate our service?"
- Remove ambiguity in use of words and grammatical structures. The question "Will a bimonthly schedule make you more or less likely to renew your subscription?" is useless, since bimonthly can mean either "twice a month" or "every other month".
- Avoid industry jargon and acronyms; too often in my own surveys I've assumed people know what I mean by CRM or TQM when I should have defined and described them.
- Specify how you use general terms. If you need to map results back to industry figures, make sure you are using industry definitions. Instead of "Have you purchased a new big-screen television in the past year?" make sure you define the cut-off screen size that maps to your data (e.g., "Since January 1, 2008, have you purchased a new big screen (40-inch or larger) television?").
- Open-ended questions should specify a unit of measure. Instead of "How far do you live from the nearest Acme store?" ask "How many minutes does it typically take you to travel to the Acme store that you typically shop at?"
In general, try to write from the respondent's perspective rather than your perspective. Don't make subtle distinctions that would be obvious to a coworker but not a customer. You spend 40, 50, 60 or more hours a week thinking about your company and its products and services; your customers don't. Have others outside the organization proofread your questions for clarity. For strategic surveys, pre-test your survey with a segment of your audience.
Bad questions are the Garbage In, with their analysis being the Garbage Out. A question like "How likely is it that you will attend the 2009 Expo at our new, low entry prices?" will end up overstating actual likelihood to attend, while a question like the following will understate the likelihood to attend: "The 2009 Forum will be held July 24-26 in central Iowa. Please check all the reasons why you may choose not to attend."
If you don't get the question right, you won't get the analysis right.
Posted by Jeffrey Henning on Thu, May 07, 2009
Sometimes I hear clients use the words ranking and rating interchangeably, even though there is a distinction. The difference is simple: a rating question asks you to compare different items using a common scale (e.g., "Please rate each of the following items on a scale of 1-10, where 1 is 'not at all important' and 10 is 'very important'") while a ranking question asks you to compare different items directly to one another (e.g., "Please rank each of the following items in order of importance, from the #1 most important item through the #10 least important item"). Both types of questions have their strengths and weaknesses.
Ranking questions:
- Guarantee that each item ranked has a unique value
- Take on average three times longer to answer than rating questions (Munson and McIntyre, 1979)
- Mentally tax respondents, requiring them to compare multiple items against one another
- Increase the difficulty of answering disproportionately as choices are added
- Limit the range of statistical analysis available.
Rating questions:
- Lead to less differentiation among items, with the possibility that a respondent rates every item identically
- Often have a narrow distribution of ratings, which typically fall into an upper band
- Accept great personal variations in rating styles (e.g., respondents who never assign the highest rating)
- Produce possibly spurious positive correlations due to individuals' personal variations.
(See "The measurement of values in surveys: A comparison of ratings and rankings" by Duane Alwin and Jon Krosnick for a more technical review.)
The mental effort required to answer a rating question is linear: the same effort is involved per item. The mental effort for a rank-order question grows much faster, roughly with the square of the number of items, since each of the N items has to be compared to every other item, for N*(N-1)/2 comparisons in all. Because the effort grows rapidly as more items are added, it is commonly advised to only use ranking questions when there are seven or fewer items to compare. For longer lists, use a rating question instead or ask respondents to select the three most important items from the list.
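To see how quickly the N*(N-1)/2 pairwise count outpaces the one-judgment-per-item effort of a rating question, here is a small Python sketch. The function names are hypothetical, and treating "effort" as a simple count of judgments is an assumption made for illustration.

```python
def rating_judgments(n_items):
    """A rating question needs one judgment per item."""
    return n_items


def ranking_comparisons(n_items):
    """A full ranking implies comparing every item with every other item."""
    return n_items * (n_items - 1) // 2


# Compare the two question types for lists of increasing length.
for n in (3, 5, 7, 10, 20):
    print(f"{n:>2} items: {rating_judgments(n):>2} ratings vs "
          f"{ranking_comparisons(n):>3} pairwise comparisons")
```

At seven items the ranking already implies 21 comparisons, and at ten items it implies 45, which helps explain the common seven-item guideline.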
For other cases, think about whether the items should be similar in comparison or should be very different. For instance, when asking people to rate the importance of items about why they did business with your organization or why they purchased a product, many attributes are of similar importance, making a rating scale appropriate. Alternatively, when asking people what features you should work on next, where you need to build a priority list for your development team, a ranking question is more appropriate.
When do you prefer to use ranking questions instead of rating questions, or vice versa?
Posted by Jeffrey Henning on Tue, May 05, 2009
Add to the list of ways to corrupt people with surveys - offering incentives to respondents and bonuses to employees - boring people to death with long questionnaires. Actually, you just bore them to the point where they pay less and less attention to the quality of their answers, until finally they're cheating.
When respondents first begin the questionnaire, they're trying to give you the most appropriate answer to each question (optimizing). A little ways into the survey and they are engaging in what Krosnick calls weak satisficing: selecting the first choice that appears reasonable and answering "Yes" to be agreeable. Further into the survey and they are strong satisficing: failing to differentiate between ratings and selecting "don't know" rather than giving an opinion. After subjecting them to many pages of a questionnaire you've lost them and their good will; if the questions are required, they are just randomly selecting responses to be done with the damn thing. What can you do about this?
- Remind respondents in the introduction of the importance of accurate answers
- Keep the questionnaire length appropriate to the purpose of the survey
- Follow our six strategies for shortening questionnaires
- Rotate the order of sections of the questionnaire, so that no section suffers disproportionately from having respondents satisfice
- Minimize the use of required questions, so that the truly uninterested respondent can skip the question without fabricating an answer
- Keep the questions engaging and relevant.
What other tips can you think of to keep respondents engaged for the duration of a long questionnaire?
Posted by Jeffrey Henning on Fri, May 01, 2009
Quick post today, contrasting good and bad sampling techniques and linking to past posts on these topics.
Good Sampling Techniques
- Make sure there is an equal probability of selecting any member of the target population (the first requirement of random samples)
- Send scheduled email reminders to recipients you invited, so as to maximize external selection (the second requirement of random samples)
- If you have more than 1,000 potential respondents, do not invite them all in an attempted census, but leave names available for future surveys
- Gather enough responses to meet the recommended sample size
- Take care to describe your survey results as representative of the population for which you have email addresses, rather than the entire population
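As a rough illustration of the first and third points, here is a Python sketch that draws a simple random sample, giving every contact an equal chance of selection, and holds back the remaining names for future surveys. The function `draw_sample`, the contact list, and the sample size of 1,000 are all hypothetical; substitute the recommended sample size for your own study. The sketch also assumes each contact appears only once in the list.

```python
import random


def draw_sample(population, sample_size, rng=None):
    """Split a contact list into (invited, held_back).

    Every member has an equal probability of being invited. Members
    who are not invited are kept available for future surveys.
    Assumes the population contains no duplicate entries.
    """
    rng = rng or random.Random()
    pool = list(population)
    invited = rng.sample(pool, k=min(sample_size, len(pool)))
    invited_set = set(invited)
    held_back = [person for person in pool if person not in invited_set]
    return invited, held_back


# Hypothetical contact list, generated for this example only.
contacts = [f"customer{i}@example.com" for i in range(5000)]

invited, held_back = draw_sample(contacts, 1000, random.Random(42))
print(len(invited), len(held_back))
```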
Bad Sampling Techniques