By Shiv Raman, M.A., M.B.A.,
Senior Vice President and Chief Marketing Scientist, GfK V2
I have a terrific idea for an online
business in the brave new world of Web 2.0. Let’s find out what
busy, career-driven singles really want in a partner, what’s really
important to them, and create compatibility metrics to match couples.
People would pay anything for such a service, right? It’s foolproof!
OK. Reality check. According to Wikipedia, there were at least 844 lifestyle
and dating services at the end of 2004. That number is likely closer
to a thousand now. So, not a great business idea in the summer of 2007,
but I bring this up for another reason. A quick run through some of
the major dating sites reveals that they all require prospective singles
to fill out an online questionnaire. The questionnaire is presumably
used to develop profiles of the respondents and thereby identify the
kind of people they are likely to be compatible with. Many of these
questions measure the importance of traits such as intelligence, looks,
sense of humor and so on. But, just how useful are such measures of
importance? Would anyone rate any of these attributes as being less
important relative to the others?
My contrived example illustrates two well-known issues in the measurement
of stated attribute importance. First is the obvious point that if our
attributes are all of the “mom and apple pie” variety, we
are unlikely to detect variation in responses. This is the reason online
dating sites want to know whether your idea of a good time involves
dodging kinkajous and agoutis in the rainforest or sipping mojitos at
a candle-lit table. Second, even if attributes cover a wide spectrum
of issues, are such ratings-based exercises capable of identifying the
top few attributes that truly matter? This latter point is the premise
of this column – that typical, scale-based questions to measure
stated importance are seldom a source of great insight. There are many
reasons for the poor information content of such tasks, not the least
of which are a) variations in how people interpret and use
scales (think Grumpy Gordon with his fours and Sunny Sally with her
tens) and b) a general unwillingness on our part to admit that
some things really are unimportant (Bigger! Better! Faster! More!).
So what’s a poor researcher to do? She can’t just stop measuring
attribute importance – if for nothing else, out of compassion
for the legions of research agencies that churn out glossy quadrant
maps by the dozen, plotting stated and derived importance with circles
and arrows and a paragraph under each one. No, the solution must lie
in thwarting the twin evils that plague ratings-based importance measurement,
namely, scale usage bias and lack of differentiation.
Such a solution does exist in the form of a technique known as MaxDiff.
In the MaxDiff exercise, respondents do not rate attributes on a scale;
instead, they evaluate attributes in smaller groups (typically four
to six at a time) and pick the most and the least important in each
group. This is an “information-rich” task, since for any
subset of four attributes, identifying best and worst allows us to infer
the ordering for five out of the six possible comparisons; that is,
if attribute A is best and D is worst, we automatically know that A>B,
A>C, A>D, B>D and C>D. An experimental design, known as
the Balanced Incomplete Block Design, is utilized to ensure
that all attributes appear in every position the same number of times
to avoid bias arising from context and order effects.
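To make the "five out of six" arithmetic concrete, here is a minimal sketch in Python. The function name `implied_comparisons` and the attribute labels are made up for illustration; they are not part of any MaxDiff software. It lists the pairwise orderings implied by a single best/worst choice and compares that count with the number of possible pairs.

```python
from itertools import combinations

def implied_comparisons(subset, best, worst):
    """Return the (more important, less important) pairs implied by one choice."""
    pairs = set()
    for item in subset:
        if item != best:
            pairs.add((best, item))    # the best item beats everything else
        if item != worst:
            pairs.add((item, worst))   # everything else beats the worst item
    return pairs

subset = ["A", "B", "C", "D"]
known = implied_comparisons(subset, best="A", worst="D")
total = len(list(combinations(subset, 2)))

print(sorted(known))
print(len(known), "of", total, "comparisons are known")
```

The only pair left undetermined is B versus C, which is why one pick of best and worst in a group of four settles five of the six comparisons.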
The respondent evaluates several such subsets. For example, with 15 attributes,
a reasonably optimal design is possible with 15 subsets of five attributes
each. The resulting data is analogous to discrete choice or conjoint data
and is analyzed using a special form of the Multinomial Logit Model. The
end result is a complete scoring of the relative value (or importance)
of each of the attributes.
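The 15-by-5 layout can be sketched in Python as follows, under some loud assumptions. The base block `(0, 1, 3, 7, 12)` is a hypothetical choice that is simply shifted around a circle; it is not necessarily the design used in practice. The respondent is invented, with attribute 5 as the top pick. The scoring is a simple best-minus-worst count, a rough stand-in for the Multinomial Logit estimation described above and not that model itself.

```python
from collections import Counter
from itertools import combinations

N_ATTRS = 15
BASE_BLOCK = (0, 1, 3, 7, 12)  # hypothetical base block, for illustration only

def cyclic_design(n_attrs, base_block):
    """Shift the base block around a circle to build n_attrs subsets."""
    return [sorted((b + shift) % n_attrs for b in base_block)
            for shift in range(n_attrs)]

def count_scores(design, choices):
    """(times best - times worst) / times shown, for each attribute."""
    shown, best, worst = Counter(), Counter(), Counter()
    for subset, (b, w) in zip(design, choices):
        shown.update(subset)
        best[b] += 1
        worst[w] += 1
    return {a: (best[a] - worst[a]) / shown[a] for a in shown}

design = cyclic_design(N_ATTRS, BASE_BLOCK)

# Balance checks: how often each attribute, and each pair, is shown.
appearances = Counter(a for subset in design for a in subset)
pair_counts = Counter(p for subset in design for p in combinations(subset, 2))

# Made-up respondent with a fixed hidden ordering: attribute 5 is valued
# most and attribute 6 least. Real respondents are noisier than this.
true_value = {a: (a - 6) % N_ATTRS for a in range(N_ATTRS)}
choices = [(max(s, key=true_value.get), min(s, key=true_value.get))
           for s in design]

scores = count_scores(design, choices)
ranking = sorted(scores, key=scores.get, reverse=True)

print("times each attribute is shown:", set(appearances.values()))
print("pair co-occurrence range:", min(pair_counts.values()), max(pair_counts.values()))
print("top attribute:", ranking[0], "score:", scores[ranking[0]])
```

Every attribute is shown the same number of times, and the pair counts indicate how evenly attributes meet one another. The counting scores run from -1 (always picked as worst) to +1 (always picked as best), which gives a quick sanity check before any model is fitted.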
The charts below illustrate stated importance measurements from a traditional
ratings-based exercise and those from a MaxDiff exercise. In the ratings-based
results, there is little differentiation among the top attributes, and the
rankings are likely to jump around from study to study (i.e., attribute 5, the
most important attribute, may or may not be the winner in a subsequent study).
In contrast, the MaxDiff output demonstrates a clearly differentiated ordering
of the attributes and provides solid evidence for attribute 5’s preeminent
role in influencing decisions.

MaxDiff provides unambiguous results as to the attributes that respondents
feel are truly important. This is the nature of the exercise: By forcing
respondents to identify the most and least important attributes, we are
able to avoid the “everything is important” syndrome that
afflicts the typical rating task. Another advantage is that the stability
of these model-based estimates of importance makes them amenable to tracking
over time.
So, is there anything at all that mars this rosy picture? As with most
things, the power and value of this approach are not free. First, though the
exercise is simple, MaxDiff takes longer to complete than a rating exercise.
Second, the analysis is more demanding and requires specialized
design and model estimation. Typically, however, this additional complexity
is on the scale of the “value-add” analytics that add richness
to standard landscaping or ATU studies. In short, it is a small investment
that pays rich dividends immediately and over time.
Helping people find their perfect partners may be beyond the scope of
any statistical tool. However, if you simply want to maximize the value
of information from your next market research study, count on Dr. Max
Diff!