Determining the Drivers of Prescribing: Part Two
– Attacking the Halo Effect
By Jeff Cartwright-Smith, Ph.D., Senior Vice President
and Andrew Douglas, Vice President
In the February issue of Pipeline, we explored some of the challenges inherent in determining what drives prescribing. This month we will introduce two techniques GfK has pioneered to address these challenges.
When dozens of variables reflect potential causes of prescribing, they will almost invariably correlate strongly with one another, a problem called “multicollinearity,” or more familiarly, the “halo effect.” It would be wonderful if the Principal Component Regression (PCR) approach we discussed last month solved the problem, but it usually fails to do so, for two major reasons:
- The principal component solution extracts factors that load on all your attributes. For example, the “efficacy” component is composed not only of efficacy-related attributes but of bits and pieces of every other attribute, including side effects, cost, safety, managed care and so on. Particularly in the typical case of substantial halo effects, it may not be possible to interpret the solution, or for clients to implement it.
- The principal component approach looks only at the relationships among the independent or causal variables; there is no consideration of the dependent or effect variable(s). The four to six components extracted may not load heavily on the attributes that relate best to the dependent variable, and so the results of the second step, that of predicting the dependent variable, may be disappointing.
You may have assumed that the world of statistics is a sleepy backwater of academia, where nothing has changed since long before your freshman year in Stats 101. Not so! GfK has pioneered the application of two new statistical techniques to solve the stubborn halo effect problem.
Shapley Value Regression
The Shapley Value (SV) was developed to provide an ordering of the worth of players in a multiplayer cooperative game. GfK’s Dr. Stan Lipovetsky is a mathematician and physicist who, with his colleagues, brought this tool to GfK. (SV derives from the same game theory mathematics as the Nash equilibrium, familiar to those who read or saw A Beautiful Mind, the story of Nobel laureate John Nash.)
SV represents the worth of each player over all possible combinations of players. We can use it to calculate the incremental value of each independent cause among a set of causes, even if the set is highly correlated. As illustrated in the figure below, what does Variable A contribute alone to predicting the dependent variable (DV)? What does it contribute when combined with Variable B? With B and C? And so on, for every combination of all variables.

With SV measures, we can then apportion the total predictive information in the set of causal variables (all the variables’ collective ability to drive prescribing) into the unique predictive contribution value of each individual variable. An example is shown in the table below.
| Attribute | Correlation with Rx | Shapley Value |
| --- | --- | --- |
| Reduces pain and inflammation | 0.39 | 0.03 |
| Improves physical function | 0.39 | 0.07 |
| Reduces the number of swollen/tender joints | 0.39 | 0.02 |
| Improves mobility/range of motion | 0.39 | 0.04 |
| Protects joints from further degradation | 0.39 | 0.04 |
| Inhibits disease progression | 0.37 | 0.04 |
| Improves quality of life for patients | 0.37 | 0.07 |
| Induces remission | 0.35 | 0.01 |
| Controls signs and symptoms of rheumatoid arthritis | 0.34 | 0.04 |
| Sustained efficacy over time | 0.32 | 0.04 |
| Lower out-of-pocket Medicare cost | 0.39 | 0.01 |
| Use not widely restricted by managed care | 0.34 | 0.05 |
| Patients have few cost complaints | 0.31 | 0.04 |
| Low risk of infection | 0.32 | 0.01 |
| Low potential to cause hepatic problems | 0.32 | 0.14 |
| Low level of required patient monitoring | 0.31 | 0.01 |
| Low incidence of drug interactions | 0.30 | 0.03 |
| Fewer side effects/tolerability | 0.29 | 0.04 |
| Safe for long-term therapy | 0.28 | 0.02 |
| Total for all attributes together (R²) | | ∑ = 0.75 |
We can see that while all attributes correlate similarly with prescribing, “Low potential to cause hepatic problems” makes the highest single contribution to predicting prescribing (0.14), with quality of life and improved physical function next in importance (0.07 each). Other attributes (induces remission, risk of infection, patient monitoring) appear to be relatively unimportant (0.01 each).
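The apportioning described above can be sketched in a few lines of Python. This is a minimal illustration of the textbook Shapley decomposition of R², not necessarily GfK's implementation. The function names (`solve`, `r_squared`, `shapley_r2`) and the three-attribute correlations in the example are hypothetical. It assumes standardized variables, so that the R² of any subset of attributes can be computed from correlations alone.

```python
from itertools import combinations
from math import factorial


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(rxx, rxy, subset):
    """R^2 of regressing the outcome on the attributes in `subset`.

    rxx: correlation matrix among attributes; rxy: correlations with the outcome.
    """
    if not subset:
        return 0.0
    a = [[rxx[i][j] for j in subset] for i in subset]
    b = [rxy[i] for i in subset]
    beta = solve(a, b)  # standardized regression coefficients
    return sum(coef * r for coef, r in zip(beta, b))


def shapley_r2(rxx, rxy):
    """Average each attribute's incremental R^2 over all subsets of the others."""
    n = len(rxy)
    values = []
    for k in range(n):
        others = [j for j in range(n) if j != k]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of `size` other players.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                gain = r_squared(rxx, rxy, combo + (k,)) - r_squared(rxx, rxy, combo)
                total += weight * gain
        values.append(total)
    return values


# Hypothetical correlations for three highly correlated attributes.
rxx = [[1.0, 0.8, 0.7],
       [0.8, 1.0, 0.6],
       [0.7, 0.6, 1.0]]
rxy = [0.39, 0.39, 0.30]

values = shapley_r2(rxx, rxy)
print("Shapley Values:", [round(v, 3) for v in values])
print("Sum:", round(sum(values), 3), "Full-model R^2:", round(r_squared(rxx, rxy, (0, 1, 2)), 3))
```

The values sum to the R² of the full model, which is the property that lets the total in the table be split across attributes. Because every subset is evaluated, the work doubles with each added attribute, so a list of 19 attributes would call for a much faster implementation than this sketch.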
If there has been a thoughtful choice of variables to include as potential causes, Shapley can be an excellent solution to the halo effect problem. Notice that SV measures the power of each variable to influence prescribing, so all scores are positive. It is common for all variables to be worded positively, but were there a negative driver (“High risk of infection”), we would need to point that out. Note also that attributes can be summed to form components or factors (“Efficacy,” “Low cost”). Finally, the more subtly or trivially differentiated attributes there are, the more fragmented the effect of each alone, which is a penalty paid by using redundant attribute lists.
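As a worked example of summing attributes into components, the snippet below adds up the Shapley Values from the table above. The assignment of attributes to three groups, and the group names other than “Efficacy” and “Low cost,” are assumptions made for illustration, not a grouping stated in the analysis.

```python
# Shapley Values copied from the table above, in table order.
# The group labels are an illustrative assumption.
rows = [
    ("Efficacy", "Reduces pain and inflammation", 0.03),
    ("Efficacy", "Improves physical function", 0.07),
    ("Efficacy", "Reduces the number of swollen/tender joints", 0.02),
    ("Efficacy", "Improves mobility/range of motion", 0.04),
    ("Efficacy", "Protects joints from further degradation", 0.04),
    ("Efficacy", "Inhibits disease progression", 0.04),
    ("Efficacy", "Improves quality of life for patients", 0.07),
    ("Efficacy", "Induces remission", 0.01),
    ("Efficacy", "Controls signs and symptoms of rheumatoid arthritis", 0.04),
    ("Efficacy", "Sustained efficacy over time", 0.04),
    ("Low cost", "Lower out-of-pocket Medicare cost", 0.01),
    ("Low cost", "Use not widely restricted by managed care", 0.05),
    ("Low cost", "Patients have few cost complaints", 0.04),
    ("Safety", "Low risk of infection", 0.01),
    ("Safety", "Low potential to cause hepatic problems", 0.14),
    ("Safety", "Low level of required patient monitoring", 0.01),
    ("Safety", "Low incidence of drug interactions", 0.03),
    ("Safety", "Fewer side effects/tolerability", 0.04),
    ("Safety", "Safe for long-term therapy", 0.02),
]

# Sum the attribute-level values within each group.
totals = {}
for group, _attribute, value in rows:
    totals[group] = totals.get(group, 0.0) + value

for group, total in totals.items():
    print(f"{group}: {total:.2f}")
print(f"All attributes: {sum(totals.values()):.2f}")
```

Under this grouping the component totals still add up to the overall R² reported in the table, so nothing is lost by reporting at the component level.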
Partial Least Squares
An alternative solution comes out of the field of chemometrics, where it is used in the isolation of unknown compounds. Principal component regression (PCR) reduces the attributes to their underlying components without reference to the dependent variable (e.g., prescribing). Partial least squares regression (PLSR), by contrast, summarizes both the independent variables and their relationship to the dependent variable. In other words, PLSR brings the dependent variable(s) right into the tent at the beginning of the analysis, so the relationship between the predictors and the dependent variable is not lost.
As in PCR, the analysis creates underlying components, orthogonal to each other, that summarize the variance contributed by the set of attributes. In essence, it parses out the variation much like a partial correlation. For example, if component No. 1 accounts for some percentage of the variation in the dependent variable (Y), component No. 2 adds a further percentage of information, independent of No. 1.

In the graphic above, suppose three components were derived from the independent variables. The first factor or component is created to best predict prescribing; the second is the best component that can be derived independent of the first one; and so on. More detail is provided in the following table.
| Attribute | Component 1 (Halo) | Component 2 (Cost vs. Efficacy) | Component 3 (Side Effects) | PLSR Beta Weight |
| --- | --- | --- | --- | --- |
| Reduces pain and inflammation | 0.23 | -0.02 | -0.05 | 0.08 |
| Improves physical function | 0.28 | -0.05 | -0.07 | 0.08 |
| Reduces the number of swollen/tender joints | 0.32 | -0.03 | -0.00 | 0.15 |
| Improves mobility/range of motion | 0.21 | -0.06 | -0.07 | 0.04 |
| Protects joints from further degradation | 0.25 | -0.08 | -0.08 | 0.05 |
| Inhibits disease progression | 0.23 | -0.05 | -0.15 | 0.02 |
| Improves quality of life for patients | 0.29 | -0.01 | 0.04 | 0.16 |
| Induces remission | 0.34 | -0.19 | -0.29 | -0.07 |
| Controls signs and symptoms of rheumatoid arthritis | 0.29 | -0.21 | -0.10 | -0.01 |
| Sustained efficacy over time | 0.21 | -0.22 | -0.21 | -0.09 |
| Lower out-of-pocket Medicare cost | 0.10 | 0.24 | 0.02 | 0.18 |
| Use not widely restricted by managed care | 0.16 | 0.48 | 0.09 | 0.37 |
| Patients have few cost complaints | 0.12 | 0.43 | 0.10 | 0.33 |
| Low risk of infection | 0.24 | 0.01 | 0.24 | 0.25 |
| Low potential to cause hepatic problems | 0.25 | 0.10 | 0.28 | 0.32 |
| Low level of required patient monitoring | 0.28 | 0.04 | 0.22 | 0.27 |
| Low incidence of drug interactions | 0.21 | -0.04 | 0.14 | 0.16 |
| Fewer side effects/tolerability | 0.20 | -0.08 | 0.23 | 0.18 |
| Safe for long-term therapy | 0.25 | -0.14 | 0.14 | 0.13 |
| % Y Predicted = 75% | 50% | 20% | 5% | |
Here we see Component 1 is positively associated with all the attributes—a classic example of the common halo effect: If physicians rate a product highly on some attributes, they are apt to do so on all attributes, somewhat indiscriminately. The creation of further components allows us to find which attributes have the potential to increase prescribing intent beyond that of the halo. Component 2 appears to be strongly associated with cost and negatively associated with efficacy measures. This might allow us to conclude that in relation to prescribing, cost is traded off with efficacy. Finally, the third component contributes an additional 5 percent beyond generalized overall appeal (halo) and cost.
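The component-by-component logic described above can be sketched with the textbook NIPALS algorithm for a single dependent variable. This is a minimal illustration under stated assumptions, not necessarily the procedure behind the table: the function name `pls1`, the ratings, and the prescribing-intent scores are all hypothetical. Each component is built to covary as strongly as possible with what remains of the dependent variable, and its contribution is then removed before the next component is extracted.

```python
def pls1(X, y, n_components):
    """Textbook NIPALS PLS for one dependent variable (illustrative sketch).

    Returns component scores, unit-length weight vectors, and each
    component's share of the variance in y.
    """
    n, p = len(X), len(X[0])
    # Center the attributes and the dependent variable.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]
    total_ss = sum(v * v for v in f)

    scores, weights, shares = [], [], []
    for _ in range(n_components):
        # Weights: covariance of each (residual) attribute with residual y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
        # Component scores for each respondent.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Regress the attributes and y on the component, then remove it.
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        scores.append(t)
        weights.append(w)  # later weights refer to the residual attributes
        shares.append(q * q * tt / total_ss)
    return scores, weights, shares


# Hypothetical data: six physicians rate a product on three attributes;
# `intent` is their stated intent to prescribe.
ratings = [[7, 6, 3], [5, 5, 4], [6, 7, 2], [3, 4, 6], [4, 3, 5], [2, 2, 7]]
intent = [8, 6, 7, 4, 5, 2]

scores, weights, shares = pls1(ratings, intent, 2)
for number, share in enumerate(shares, start=1):
    print(f"Component {number}: {share:.1%} of the variation in intent")
```

Because each component is removed before the next is extracted, the component scores are uncorrelated and their shares of the variation in the dependent variable simply add, which is how a row such as “% Y Predicted” can be read.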
A positive feature of the PLSR is the ability to predict more than one dependent variable within a single analysis. This allows for more complicated scenarios to be investigated. For example, best in class may be a measure of perceived drug performance while intent to prescribe reflects behavioral intent. While these variables may be related, we have found they also tell a very different story: Drivers of one are not drivers of the other. By using both variables, we gain granularity in understanding brand perception and Rx movement.
SV analysis and PLSR address the “importance of attributes” issue in different ways.
- SV gives us the contribution each predictor makes to R², considering both its direct effect (i.e., its correlation with the criterion) and its effects when combined with the other attributes. PLSR somewhat downplays the halo effect if present, and focuses on the incremental effects coming from the other components.
- SV focuses on each attribute alone—no underlying components are derived. PLSR can give an understanding of the attributes’ individual contributions, but only through the derived components. However, PLSR can be very helpful in breaking down the structure of the relationship between attributes and what they are predicting into their underlying driving components.
- In PLSR, attributes’ net contributions can often be negative for predictors which may correlate positively with the outcome variable. If they are small in absolute size, we usually interpret them as passive, unwelcome concomitants of other, driving attributes (e.g., elevated side effects are concomitants of high efficacy in dermatologic products). However, if they are sizably negative, PLSR can bring surprises. By isolating a halo effect like “overall liking” or “scale use,” performance attributes can come to the forefront to be usefully exploited. We have seen examples of these negative coefficients replicate across time and category. For example, a consistent PLSR finding from sales force effectiveness research has shown that rating a representative as “articulate” or “enthusiastic” does not necessarily make for a successful detail.
Although the decision to prescribe one agent over another involves dozens or even hundreds of measurable attributes, new tools like SV and PLSR enable us to cut through the complexity and help our clients understand what drives the decision.