Cambridge Analytica’s external validation is a PNAS paper by Matz et al. that Dean Eckles, Brett Gordon, and Garrett Johnson (Eckles et al.) just showed is NOT valid in a letter to PNAS (yes, that is how science is done: with a letter to the same journal that published the original!). The crazy thing is that Eckles et al. did not even get to the most interesting part. Their letter shows that Matz et al. did not correct for an unbalanced sample of respondents, which is enough to invalidate the results within each treatment. What this means: within a given treatment, the statistically significant differences are an artifact of different respondents seeing different ads, not the result of true randomization. What Eckles et al. did not bother to address is the wildly different baselines between treatments. The differences within each treatment are more than an order of magnitude smaller than the differences between the treatments.

Matz et al. tested two psychometric traits and showed that people clicked through or purchased (the conversion rate) at higher rates when shown a version of the ad that matched their trait. You can see the key chart below. They had two treatment conditions: introverted v. extroverted, and low openness v. high openness. Notice that within each treatment they hit each type of person with each type of ad, and the conversion rate was always higher when people got hit with their type of ad. Introverted people liked introverted ads better, and extroverted people liked extroverted ads better. What Eckles et al. show is that people were not randomly assigned to their treatment (and maybe not uniquely assigned either); Facebook optimized who saw each ad, so these results are not valid.


But notice a small problem. Yes, high openness people did not respond to low openness ads as well as low openness people did, and low openness people did not respond to high openness ads as well as high openness people did. But high openness people responded better to low openness ads than they did to high openness ads! That means the advertiser would have been better off showing the high openness people the low openness ad.

It gets worse: even the worst-converting respondents in the “openness” treatment had a conversion rate about 25x higher than the highest-converting respondents in the “introverted” treatment. Thus, if they had just hit everyone with an “openness” ad, either of them, without any targeting, their conversion rate would have been about 25x higher than any time they used the “introversion” ads.
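
To make the arithmetic concrete, here is a minimal Python sketch. The numbers are HYPOTHETICAL: they are not the figures from Matz et al., only made-up rates chosen to reproduce the pattern described above (a roughly 25x gap between treatments, and high openness people preferring the low openness ad). The names CAMPAIGNS, within_spread, between_ratio, and best_ad_for are invented for illustration.

```python
# Conversions per 100,000 impressions, keyed by (audience, ad).
# HYPOTHETICAL numbers for illustration only, not data from Matz et al.
CAMPAIGNS = {
    "extraversion": {
        ("introvert", "introvert_ad"): 12,
        ("introvert", "extrovert_ad"): 8,
        ("extrovert", "introvert_ad"): 9,
        ("extrovert", "extrovert_ad"): 11,
    },
    "openness": {
        ("low", "low_ad"): 400,
        ("low", "high_ad"): 300,
        ("high", "low_ad"): 360,
        ("high", "high_ad"): 320,
    },
}


def within_spread(cells):
    """Ratio of the best cell to the worst cell inside one treatment."""
    return max(cells.values()) / min(cells.values())


def between_ratio(better, worse):
    """Ratio of the WORST cell in one treatment to the BEST cell in another."""
    return min(better.values()) / max(worse.values())


def best_ad_for(cells, audience):
    """The ad with the highest conversion rate for a given audience."""
    options = {ad: rate for (aud, ad), rate in cells.items() if aud == audience}
    return max(options, key=options.get)


if __name__ == "__main__":
    for name, cells in CAMPAIGNS.items():
        print(name, "within-treatment spread:", round(within_spread(cells), 2))
    print("between-treatment ratio:",
          between_ratio(CAMPAIGNS["openness"], CAMPAIGNS["extraversion"]))
    print("best ad for high openness people:",
          best_ad_for(CAMPAIGNS["openness"], "high"))
```

With numbers shaped like these, the spread inside each treatment is well under 2x, while the gap between treatments is 25x, and the best ad for high openness people is the low openness ad. That is the whole argument in three function calls.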

Of course, these are not all the same product, etc., but when baselines differ by over an order of magnitude more than the key differences in the study, that is a big red flag. And that does not even get into the mislabeling of psychometric traits that comes with any real-world implementation (i.e., it is hard to get everyone’s categories right, so some people will get the wrong treatment). Or the real counterfactual: was there much more improvement to be had by targeting on gender, age, geography, etc. (traditional observed demographics) than on these psychometric values?
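
The mislabeling point can be sketched the same way. Assume, purely for illustration, that a person converts at one rate when shown the matching ad and a lower rate when shown the mismatched ad, and that the trait classifier is right some fraction of the time. The function name targeted_rate and every number below are hypothetical.

```python
def targeted_rate(matched, mismatched, accuracy):
    """Expected conversion rate when targeting with an imperfect trait label.

    matched / mismatched: conversion rates (any consistent unit) when a person
    sees the ad that fits / does not fit their true trait.
    accuracy: fraction of people whose trait label is correct (0.0 to 1.0).
    All inputs are HYPOTHETICAL illustration values, not study data.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return accuracy * matched + (1.0 - accuracy) * mismatched


if __name__ == "__main__":
    # Conversions per 100,000 impressions, made up for illustration.
    for acc in (1.0, 0.75, 0.5):
        print(f"label accuracy {acc:.2f}:", targeted_rate(12, 8, acc))
```

Under these assumptions, a classifier that is right 75% of the time gives up a quarter of the gap between the matched and mismatched rates, and a coin-flip classifier gives up half of it, which is no better than showing ads at random.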

Psychometrics are worth following: they may help improve messaging and targeting, for some people, under some conditions. But this study shows that we should consider the impact of different messages on the respondents as a whole before we worry about micro-targeting on demographics (which are easier to identify) and/or psychometrics (which are harder to identify).

In other words: don’t get trapped by the TYRANNY OF THE MARGINAL RETURN. Sometimes we get so excited by a statistically significant difference between A & B that we do not realize there is an order-of-magnitude difference between either A or B and C. For instance: this fall there will certainly be Democratic consultants studying the conversion rate between attack ad A and attack ad B, without realizing that pro-healthcare ad C performs 10 times better than either. Hopefully this serves as a reminder to explore a little further outside their comfort zone.

Note: this criticism was generated from conversations with numerous people.