Polling: Random Device Engagement (RDE) with Organic Samples (Part 4 of 4)

This is the fourth and final piece of our series, Evolution of Polling Samples from RDD to RDE.

In part 1, we describe a polling industry that is ripe for transformation as Random Digit Dialing (RDD) becomes increasingly tenuous.
In part 2, we discuss issues involved with online panels, which are increasingly replacing RDD.
In part 3, we discuss assisted crowdsourcing, a brand-new form of polling.

Part 4: Random Device Engagement

Random Device Engagement (RDE): a platform organically engages users through the devices they are already using. Survey researchers buy a set number of responses randomly generated from the target population. Ideally, the survey software works natively within the device. The mode depends on the type of device that is engaged. Coverage is reasonably high, insofar as smartphones (a key device type today) have very high penetration (~70%). Response rate is decent.

We outlined the bleak future of RDD in part 1: with single-digit response rates, many of the assumptions behind RDD are broken, such as the assumption that non-response is idiosyncratic. Yet some of the tenets of RDD are commendable: calling respondents in their homes means that respondents are reached in an organic location for gathering opinions. Pollsters reach respondents where they spend time organically. That is to say, people engage in their quotidian tasks at home, get information at home, and interact with friends and family there. In short, RDD gathers opinions in that natural environment. Can we fix what is broken with RDD while maintaining its strengths? Let us introduce Random Device Engagement (RDE); it is the natural successor of RDD in terms of orthography, philosophy and quality.

Random device engagement (RDE) polling relies on advertising networks, or other portals on devices, to engage random people where they are. One of the most common versions of this is within advertising modules on smartphones, but it can easily be placed in gaming, virtual reality, etc. Respondents are asked to participate in a poll in exchange for an incentive token that stays true to the philosophy of the app in which they are engaged. For example, respondents contacted via the popular mobile gaming app “Harry Potter: Hogwarts Mystery” can be rewarded for survey participation with energy points, a crucial currency of the game. Direct monetary incentives are also possible, such as the chance to win an Amazon gift certificate. The key here is that by monitoring the unique identifier of the device world, the ad ID, survey firms can prevent fraud originating from SUMAs (single users, multiple accounts). And RDE samples are both random and organic. This is the natural successor to random digit dialing, which aims to randomly engage with landline (and now cell) phones. In many ways, it simply makes RDD generic for the future: random, device (rather than digit), engagement (rather than dialing). It addresses RDD’s greatest problem: technology is always changing. Because the future of phones is uncertain, RDE targets a respondent’s unique ID, which can be tracked across changing devices. In addition, RDE brings a plethora of telemetry data, or paradata, to the table that is amenable to bias correction, from location history to application usage.
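To make the SUMA point concrete, here is a minimal sketch of how responses could be de-duplicated on the ad ID. It is an illustration, not any provider's actual pipeline: the `Response` record, its field names, and the rule of keeping the earliest submission per device are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    # All field names are hypothetical; real RDE platforms define their own schema.
    ad_id: str         # device advertising identifier
    survey_id: str     # which poll the response belongs to
    submitted_at: int  # submission time, epoch seconds
    answer: str


def drop_repeat_devices(responses):
    """Keep only the earliest response per (survey_id, ad_id) pair."""
    seen = set()
    kept = []
    # Sort by submission time so the first attempt from each device wins.
    for r in sorted(responses, key=lambda r: r.submitted_at):
        key = (r.survey_id, r.ad_id)
        if key in seen:
            continue  # same device already answered this poll
        seen.add(key)
        kept.append(r)
    return kept


responses = [
    Response("ad-111", "poll-1", 100, "yes"),
    Response("ad-222", "poll-1", 105, "no"),
    Response("ad-111", "poll-1", 110, "no"),   # repeat attempt, same device
    Response("ad-111", "poll-2", 120, "yes"),  # different poll, allowed
]
clean = drop_repeat_devices(responses)
```

A sample source that never sees the ad ID has no stable key to group on, so it cannot apply a check like this one.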

This method has a number of advantages:

1) Fast: RDE can be extremely fast. RDD takes days (and weeks in some cases). Polling through social networks (assisted crowdsourcing) can be a little faster than RDD, but still lags behind RDE. Online panels are comparable in speed to RDE only if you pay for extra respondents from a merged panel (online panels charge extra to bring in respondents from other panels to increase speed).

2) Cost-effective: RDE is extremely inexpensive compared with other sampling options. The major RDE providers, like Pollfish, Dalia or Tap Research, charge 10% of the cost of RDD, 20% of the cost of assisted crowdsourcing, and 25% of the cost of online panels.

3) Coverage is good and growing: Accuracy is good because coverage is good. The major RDE providers mentioned above easily reach 5,000,000 unique respondents in the US market alone. And while RDE is still behind RDD in coverage at this time, it will reach parity soon. Coverage is similar to social media-based assisted crowdsourcing, and much better than with online panels. Online panels have a very small footprint, which also limits their ability to reach deep into subgroups of the population.

4) Response rate is solid: Pollfish reports a reasonable response rate (much higher than RDD), measured from being targeted for a poll through to completion of the survey. Online panels have low sign-up rates and high dropout, but do not post comparable response rates. Social media-based polling, as in assisted crowdsourcing, is reliant on ads that suffer from very low click-through rates.

5) Flexible: RDE is meant to be flexible with the growth of devices. It should provide a seamless experience across device types. RDD is stuck with telephones, by definition. And, RDD is subject to interviewer effects (albeit to a smaller extent than in-person surveys), meaning that tone of voice can influence considerations of the respondent, or trigger undesired interviewer-respondent interactions, ultimately introducing measurement error. RDE, with its streamlined experience, is not subject to this kind of error.

6) Telemetry data: RDE is able to supplement collected attitudinal data with a rich array of telemetry data, or paradata. Example? As we know, people who answer surveys are fundamentally different from people who do not. As the progressive analytics shop CIVIS has argued recently, a battery of nearly 30 additional demographic, attitudinal, and lifestyle questions that get at notions of social trust and cosmopolitanism is necessary to be able to weight and correct for all the ways in which survey respondents are unusual. As we have recently argued in this academic paper in collaboration with Stephanie Eckman (RTI International), telemetry data is a much more cost-effective (and unobtrusive) way to collect these variables. Home and work location, commuting or mobility patterns, and the political makeup of one’s neighborhood or social network, all derived from satellite-based (read: extremely accurate) longitudinal location-coordinate data, predict demographic variables such as race and income well. And the applications on the device can more accurately describe political traits prone to erroneous self-report, such as frequency of political discussion, political engagement or knowledge.

7) RDE will get stronger in the future: Penetration of devices will further increase, extending the reach of RDE in the US and making RDE the only viable alternative in less developed markets. Take Africa: the smartphone penetration rate is projected to grow at 52.9% year-on-year. Currently, there are 293.8 million smartphone users across the continent, meaning that, taking into account current growth rates, there will be 929.9 million smartphones in Africa by the year 2021. But the rosy future for RDE is not just about penetration. Advances in bridging ad IDs with other known identifiers in the American market, such as voter file IDs, Experian Gold IDs, etc., mean that individual targeting based on financial history or credit card spending patterns will be possible. And RDE will be able to adopt “list-based polling”, in which political survey firms poll directly from the voter file, the large-scale administrative data detailing the turnout and registration history of ~250,000,000 Americans.

8) River sampling is different, as devices are unknown: River sampling can mean either banner-ad based polling or engagement with respondents via legacy websites or similar places RDE recruits from. In contrast to RDE, devices are unknown to river samplers: river sampling usually does not have access to the ad ID, which introduces two huge disadvantages. First, river samples have no way to address SUMA: it is possible for fraudsters to engage with the same poll more than once to increase their chances of winning the prize for participation, especially if it comes in the form of a financial incentive. Second, any degree of demographic or geographic (not to mention individual) targeting is virtually impossible. In addition, banner ads themselves, similar to social-media ads, suffer from disastrous response rates. Good RDE polling is done with the cooperation of the publisher, providing a native experience, while banner ads are pushed through the ad network. This degraded user experience depresses response rates and can introduce serious measurement error. Moreover, ad networks optimize their delivery in a way that fights against the random sample. Users are chosen because they are more likely to respond, based on variables that are unobserved (at least by the survey researcher) and correlated with how they will respond. As this underlying data is never shared, the survey researcher cannot correct for it.

This method has some disadvantages:

Just like every other modern sample method (RDD, assisted crowdsourcing, online panels), RDE is non-probability. There is no sample method (anymore) that has perfect coverage and known selection probabilities for every respondent. This is one of the reasons we have developed analytics to overcome known biases. RDE has bias that we understand and can overcome, and it offers additional data points that add to the power of correcting bias, such as telemetry data that is not available to RDD. RDD has shifting and shrinking coverage, online panels suffer from panel fatigue and panel conditioning, and assisted crowdsourcing carries bias from targeting algorithms that are efficient but opaque to the polling firm, and that bias cannot be addressed. RDE is therefore our method of choice, and the future, in the ever-changing market of polling.
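As a rough illustration of what correcting a known bias can look like, the sketch below applies simple post-stratification weighting: each respondent is weighted by the ratio of their group's population share to its sample share. This is a standard textbook technique chosen for the example, not necessarily the analytics referred to above, and the age groups, population shares and answers are made-up numbers.

```python
from collections import Counter


def poststratification_weights(sample_cells, population_shares):
    """Weight each respondent by population share / sample share of their cell.

    Raises KeyError if a sampled cell has no known population share.
    """
    counts = Counter(sample_cells)
    n = len(sample_cells)
    return [population_shares[cell] / (counts[cell] / n) for cell in sample_cells]


def weighted_mean(values, weights):
    """Weighted average of the values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample that over-represents younger respondents.
cells = ["18-34"] * 6 + ["35+"] * 4
supports = [1] * 6 + [0] * 4                     # 1 = supports the measure
population_shares = {"18-34": 0.3, "35+": 0.7}   # assumed known benchmarks

weights = poststratification_weights(cells, population_shares)
raw_estimate = sum(supports) / len(supports)
weighted_estimate = weighted_mean(supports, weights)
```

In this toy sample the unweighted support is 60%, while the weighted estimate falls to 30% once the over-represented younger group is scaled down. Telemetry-derived variables could serve as additional weighting cells in the same way, provided population benchmarks exist for them.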