May 2007


Is 30 Really the Magic Number? Determining Sample Size in Primary Marketing Research

By Paul Teta, Ph.D.,
Executive Vice President, GfK V2


There is a dirty little secret behind sample size decisions: They’re subjective. You could spend a lifetime developing theorems, doing power analyses and programming Monte Carlo simulations, but it will never remove the fundamental subjectivity that underlies determining the “right” sample size for a given project.

I mean no offense to my colleagues, but few of us yearned in our youth to become sampling statisticians. The field is necessarily steeped in complex mathematical formulae, the results of which are difficult for the practitioner to apply when evaluating the fundamental question of how much is enough. As a result, we often…far too often…rely on project budgets to determine how many respondents to include in research.

Some marketing research suppliers fuel this reliance by providing multiple sampling alternatives in their proposals without an explanation of the trade-offs associated with each. Let’s face it, providing a menu of sample sizes is more often motivated by a desire to hedge bets on winning work than by a desire to help clients make informed research decisions. Absent the detail on margin of error, the decision maker has nothing to go on but price. Why would anyone pay more for the “optimum n” alternative when the “minimum n” will suffice?

Mind you, budget is a perfectly legitimate part of the equation, but it must be considered within the context of other design factors, including the size of the population of interest, the variability of what you wish to measure and the importance of the business decisions riding on the research outcomes.

The Principle of Aggregation

Despite statisticians’ impenetrable lexicon, the logic underlying sampling is simple. The essential concept is given by the “principle of aggregation.” Every piece of data that you collect in primary marketing research consists of some part “truth” and some part “error.” Error arises from a number of causes. Poor recollection, respondent fatigue, overstating prescribing intent and misentering responses are all common examples of measurement error. Some errors will be positive, as when the respondent overestimates use of your product; others will be negative, as when the respondent underestimates it. In theory, if you have an infinite number of respondents, positive and negative random errors will fully cancel each other out and you will be left with pure truth. The implication of this principle in practice is that larger sample sizes will contain less error than smaller sample sizes. Because random error shrinks with the square root of the sample size, an n of 50 will contain roughly three times as much error as an n of 500.
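
The principle can be illustrated with a small simulation. This is only a sketch under assumed conditions: the "truth" of 10.0, the noise level of 4.0 and the function name are invented for illustration, and every respondent's error is assumed to be random with no systematic bias.

```python
import random
import statistics

def mean_abs_error(n, trials=1000, truth=10.0, noise_sd=4.0, seed=0):
    """Average distance between the sample mean and the true value.

    Each simulated response is truth plus random noise (an assumption;
    real respondents may also be biased in one direction).
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        sample = [truth + rng.gauss(0.0, noise_sd) for _ in range(n)]
        errors.append(abs(statistics.fmean(sample) - truth))
    return statistics.fmean(errors)

# Compare the typical error of a small study with that of a large one.
for n in (50, 500):
    print(n, round(mean_abs_error(n), 3))
```

Under these assumptions the typical error at n = 50 should come out at roughly three times the error at n = 500, in line with the square-root relationship.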

The Central Limit Theorem

The principle of aggregation is operationalized through the central limit theorem (CLT), which many regard as the heart of probability theory. Statisticians use the CLT to justify mathematically the assumption of normality that underlies most of statistical testing. The normality assumption is critical because it provides the probabilities (p-values) that allow us to infer statistical significance. The CLT says that if we have a sufficient number of observations, the averages we compute from the sample will be approximately normally distributed, even when the individual responses are not, so we can be comfortable that the assumption of normality is not violated.

So what number is sufficient for the CLT? Over nearly 300 years of conducting probability simulations, statisticians have concluded that a sample size of 30 is enough for the theorem to hold (okay, not all statisticians, but that’s a topic for another article). But common sense dictates that simply because 30 respondents are enough to “make the math work,” it does not mean that we can get away with this minimum in applied research. Would you use a novel cardiovascular agent if it were tested in clinical trials with just 30 subjects?
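
To get a feel for what "enough for the theorem to hold" means, the sketch below draws responses from a deliberately lopsided distribution and measures how lopsided the resulting sample averages are. The choice of an exponential distribution, the number of trials and the function names are assumptions made for illustration only. A skewness near zero indicates a symmetric, normal-looking shape.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: 0 for a symmetric shape, positive for a right tail."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

def skew_of_sample_means(n, trials=3000, seed=1):
    """Skewness of the averages from many simulated studies of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        # Exponential responses are strongly right-skewed (an assumed example).
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return skewness(means)

for n in (1, 30, 300):
    print(n, round(skew_of_sample_means(n), 2))
```

In a run like this, the averages at n = 30 should look far closer to normal than the raw responses do, while still retaining some skew, which is consistent with treating 30 as a rule of thumb rather than a guarantee.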

Informing Sample Size Decisions with a Power Analysis

Sample size decisions must be informed with the aid of a power analysis. Statistical power is broadly defined as our ability to detect an effect, given that the effect actually exists. For example, if a new pharmaceutical product is truly superior in efficacy to the existing gold standard, a power analysis will guide how many patients we need in the clinical trials to detect this effect.

There are several elements that drive a power analysis. First, consider the anticipated variability in response, because higher variability will require larger samples. If you want to understand the prescribing dynamics in Alzheimer’s disease, you would need to talk to far fewer physicians than if you want to understand the same dynamics in hyperlipidemia. There are fewer approved agents in Alzheimer’s disease than there are in hyperlipidemia, so there will be less variability in prescribing decisions. Similarly, generalists tend to be more heterogeneous than specialists, so you should sample more heavily from primary care physicians.

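Next, decide on the level of confidence with which you are comfortable. Level of confidence refers to the amount of risk you are willing to tolerate in your study. For example, a risk of 5 percent (a 95 percent confidence level) means there is a 1-in-20 chance you will declare an effect that does not actually exist. The opposite risk, failing to find an effect that does exist, is governed by statistical power: A study with 80 percent power carries a 1-in-5 chance of missing a real effect.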

Finally, estimate the effect size. The more subtle the effect, the more difficult it will be to detect. Isolating small anticipated effects will require more respondents.
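
The three elements above can be combined in a back-of-the-envelope calculation. The sketch below uses the common normal-approximation formula for comparing the means of two equal-sized groups; it is one standard textbook approach, not necessarily the method any particular supplier uses. The function name and the default values (5 percent risk, 80 percent power) are assumptions. The effect size is expressed in standard deviation units, which folds the anticipated variability into a single number.

```python
import math
from statistics import NormalDist

def respondents_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group comparison of means.

    effect_size: expected difference divided by the standard deviation.
    alpha:       tolerated risk of declaring an effect that is not there.
    power:       desired chance of detecting an effect that is there.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)

# A moderate effect versus a subtle one.
print(respondents_per_group(0.5))  # 63
print(respondents_per_group(0.2))  # 393
```

Under this approximation, cutting the expected effect from half a standard deviation to a fifth raises the requirement from about 63 to about 393 respondents per group, which is why subtle effects are expensive to isolate.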

The power analysis results in estimates of the margin of error associated with various sample sizes. Pollsters frequently report margin of error to communicate the statistical reliability of their findings, e.g., “this poll has a margin of error of plus or minus 4 percent.” This means that if the poll were conducted 100 times, the reported percentage would fall within four points of the true population value in about 95 of them.
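
For a simple percentage, the margin of error at a given sample size can be approximated directly. The sketch below uses the standard normal-approximation formula at the 95 percent confidence level and assumes a simple random sample from a large population; the function name and the worst-case proportion of 50 percent are illustrative assumptions.

```python
import math
from statistics import NormalDist

def margin_of_error(n, proportion=0.5, confidence=0.95):
    """Approximate margin of error for a reported percentage.

    Assumes a simple random sample from a large population.
    proportion=0.5 is the worst case and gives the widest margin.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(proportion * (1 - proportion) / n)

# Margin of error, in percentage points, for several sample sizes.
for n in (30, 100, 600):
    print(n, round(100 * margin_of_error(n), 1))
```

Under these assumptions it takes about 600 respondents to reach plus or minus 4 points, while an n of 30 leaves a margin of nearly 18 points in either direction.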

Once you have the results of the power analysis, the resultant margin of error will help you with the most important question: What is riding on this project? If the research is exploratory, then your tolerance for risk will be high, so a higher margin of error is acceptable. If, on the other hand, critical business decisions are on the line, you are well advised to demand a low margin of error and budget accordingly.


