
Probabilistic vs Deterministic Matching: Our Viewpoint on Identity Graph Methodologies

  • LiveRamp
  • 4 min read

The average person today owns multiple connected devices. Smartphones, tablets, laptops, desktops, smart TVs, video game consoles, streaming sticks—consumers may use any combination of these and other devices every day.

This fragmented, device-based customer data footprint creates a tremendous challenge for marketers seeking to deliver personalized experiences across channels. But it’s certainly not insurmountable. LiveRamp identity resolution enables marketers to tie disparate data sets together and match consumers to their proxies in a privacy-conscious manner.

Probabilistic vs Deterministic Matching: What’s The Difference?

There are two primary methodologies used to resolve devices to consumers: probabilistic and deterministic.

  • Deterministic Identity Methodologies create device relationships by joining devices using directly identifiable personal data, such as email, name, and phone number. Devices are only linked when they are directly observed using the directly identifiable personal data tied to a consumer, prioritizing accuracy and limiting false positives.
  • Probabilistic Identity Methodologies create device relationships by using a knowledge base of linkage data and predictive algorithms as the foundation for an identity graph. Devices are also grouped together implicitly—via device fingerprinting, IP matching, screen resolution, operating system, location, Wi-Fi network, and behavioral and browsing data—using statistical modeling at a given confidence level. These groups can be linked to identities based on predictive algorithms.
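To make the deterministic side concrete, here is a minimal sketch of linking devices through a directly observed identifier. It assumes each observation is a (device ID, email) pair captured at login and that emails are normalized and hashed before being used as a join key. The function names and sample data are hypothetical illustrations, not LiveRamp's implementation.

```python
import hashlib
from collections import defaultdict


def normalize_email(email: str) -> str:
    # Trim whitespace and lowercase so trivially different spellings match.
    return email.strip().lower()


def hash_identifier(value: str) -> str:
    # Hash the identifier so the raw email is not used as the join key.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def deterministic_links(observations):
    """Group device IDs by the hashed email they were directly observed with."""
    devices_by_person = defaultdict(set)
    for device_id, email in observations:
        key = hash_identifier(normalize_email(email))
        devices_by_person[key].add(device_id)
    return dict(devices_by_person)


# Hypothetical login observations: (device_id, email)
observations = [
    ("phone-1", "Ana@example.com"),
    ("laptop-7", "ana@example.com "),
    ("tablet-3", "ben@example.com"),
]
links = deterministic_links(observations)
```

Two devices end up linked only because the same identifier was seen on both. No inference is involved, which is why this approach limits false positives.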

For example, assume a phone and desktop linked to a household are observed logging onto Wi-Fi at all times of the day throughout the week. Meanwhile, another device that belongs to a friend only logs onto Wi-Fi on the weekends. An algorithm can use this data point in combination with others to infer that the friend’s device does not belong to the same household.
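The Wi-Fi example can be sketched as a single toy signal. The snippet below assumes we have the dates on which each device logged onto the household network, and it scores a device by how many distinct days of the week it appears on. The 0.7 threshold and the function names are illustrative assumptions. A real probabilistic model would combine many such signals and attach a confidence level rather than apply one cutoff.

```python
from datetime import date


def weekday_coverage(logon_dates):
    """Fraction of the seven days of the week on which the device was seen."""
    return len({d.weekday() for d in logon_dates}) / 7


def likely_household_device(logon_dates, threshold=0.7):
    # Assumed rule of thumb: devices seen on most days of the week are
    # treated as likely household members for this one signal.
    return weekday_coverage(logon_dates) >= threshold


# One week of hypothetical observations (2024-01-01 is a Monday).
resident_phone = [date(2024, 1, day) for day in range(1, 8)]  # Monday to Sunday
friend_phone = [date(2024, 1, 6), date(2024, 1, 7)]           # Saturday, Sunday

resident_is_household = likely_household_device(resident_phone)
friend_is_household = likely_household_device(friend_phone)
```

Under these assumptions the resident's phone passes the check and the friend's weekend-only phone does not, which mirrors the inference described above.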

Through our people-based identity graph—reaching over 200 million unique users on the web as well as more than 600 million matched mobile devices—we have planted our flag firmly in the fully deterministically sourced matching camp. We affirm this approach is imperative to executing people-based marketing, and our customers agree.

How Probabilistic and Deterministic Matching Complement Each Other

But that does not mean that we believe probabilistic methodologies have no value. Indeed, framing the debate as probabilistic vs. deterministic matching neglects to take into account that these methodologies complement each other. Specifically, probabilistic methodologies can add value and scale when applied within an identity solution that has a core deterministic foundation.

Choosing between Probabilistic and Deterministic Matching Approaches

Choosing the right methodology depends entirely on your marketing objectives:

  • When to opt for deterministic – If your goal is to target only actual buyers of a specific product, then a deterministic data set should be your option of choice. For instance, a carrier extending upgrade offers would only want to reach customers who own the previous phone model.
  • When to opt for probabilistic – If your goal is to target people who might buy or be interested in a specific product, then probabilistic data will give you greater reach. For example, holiday car promotions can attract various potential buyers who are at different stages in the hunt for a new set of wheels. Reach becomes more important than precision.

The Need For a Deterministic Foundation

Continuous, updated, and curated deterministic matches are table stakes for a people-based graph. Our deterministic linkages continuously move and change, while the people-based ID they are anchored to stays persistent.

We believe a solution based on probabilistic matches, even when using a knowledge base of directly identifiable personal data linkages for machine learning, cannot achieve the same level of accuracy and recency of identity as a truly deterministic identity graph. (Stay tuned for a future post on the key differentiators of the best identity solutions.)

Where Can Probabilistic Add Value?

Probabilistic methodologies can complement a deterministic identity solution in two major ways: expanded reach (finding people who have been matched deterministically across more devices) and linkage curation (confirming device linkages and resolving identity conflicts).

Amplify Total Reach

LiveRamp uses device groupings from probabilistic links to expand our deterministic matches. We work with preferred partners, such as Drawbridge and Tapad, to layer these probabilistic capabilities on top of our graph.

Because probabilistic matching may introduce more false positives, marketers should reserve it for specific use cases, such as when they need extra reach and are comfortable with sacrificing a small amount of deterministic accuracy. For example, a luxury brand that wants to target a new premium product at high-net-worth consumers could opt for this approach as there is no harm in also reaching that audience’s network since they likely share a similar wealth profile.
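The reach-versus-accuracy trade-off can be sketched as a confidence threshold applied to probabilistic device groups. The example assumes the deterministic graph is a mapping from a person ID to a set of devices, and that a partner supplies device groups with a confidence score. The data shapes, names, and the 0.9 default are hypothetical and are not taken from any vendor's API.

```python
def expand_reach(deterministic, device_groups, min_confidence=0.9):
    """Add probabilistically grouped devices to people already matched deterministically.

    deterministic: dict mapping person_id -> set of device IDs
    device_groups: list of (set of device IDs, confidence) pairs
    """
    expanded = {person: set(devices) for person, devices in deterministic.items()}
    for group, confidence in device_groups:
        if confidence < min_confidence:
            continue  # Skip low-confidence groups to limit false positives.
        for person, devices in deterministic.items():
            # Only extend a person if the group overlaps a deterministic device.
            if devices & group:
                expanded[person] |= group
    return expanded


deterministic = {"person-A": {"phone-1", "laptop-7"}}
device_groups = [
    ({"phone-1", "smart-tv-2"}, 0.95),
    ({"laptop-7", "console-9"}, 0.60),
]
expanded = expand_reach(deterministic, device_groups)
```

Lowering `min_confidence` would pull in `console-9` as well, which adds reach at the cost of a higher chance of a wrong link. The deterministic devices are never removed, so the core graph stays intact.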

Linkage Curation and Estimated Validity

When using probabilistic methodologies, a deterministic vendor can and should filter incoming linkage data sources. Deterministic linkages can sometimes be misleading if misinterpreted. For example, if a father’s Spotify account is accessed by his son and his son’s college roommate, one account is now attached to multiple devices, resulting in identity conflict.

Probabilistic device groupings can provide critical insight into the likelihood that incoming linkages are valid. Probabilistic evidence can also be used to corroborate incoming deterministic links to improve accuracy.
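One way to picture this corroboration step is the sketch below. It assumes each person already has anchored devices, that a probabilistic model has assigned devices to groups, and that an incoming deterministic link is accepted only when its device shares a group with an anchored device. The rule, the names, and the sample data (loosely modeled on the shared-account example above) are illustrative assumptions.

```python
def curate_links(incoming, anchored, group_of):
    """Split incoming (person_id, device_id) links into corroborated and flagged.

    incoming: list of (person_id, device_id) deterministic links to evaluate
    anchored: dict mapping person_id -> set of already trusted device IDs
    group_of: dict mapping device_id -> probabilistic group label
    """
    corroborated, flagged = [], []
    for person_id, device_id in incoming:
        # Groups that the person's trusted devices belong to (ignore unknowns).
        known_groups = {group_of.get(d) for d in anchored.get(person_id, set())} - {None}
        if group_of.get(device_id) in known_groups:
            corroborated.append((person_id, device_id))
        else:
            flagged.append((person_id, device_id))
    return corroborated, flagged


anchored = {"father": {"father-phone"}}
group_of = {
    "father-phone": "home",
    "father-tablet": "home",
    "son-laptop": "dorm",
    "roommate-phone": "dorm",
}
incoming = [
    ("father", "father-tablet"),
    ("father", "son-laptop"),
    ("father", "roommate-phone"),
]
corroborated, flagged = curate_links(incoming, anchored, group_of)
```

Flagged links are not necessarily wrong. In this sketch they are simply set aside for further review instead of being merged into the father's identity automatically.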

Deterministic Matching is Key to People-Based Marketing

At LiveRamp, our position is clear: we believe deterministic matching should be the backbone of marketing. Best-in-class identity solutions should be based primarily on a people-based, deterministic foundation.

But relying exclusively on deterministic methodologies limits the use cases available to marketers. The strongest deterministic identity solutions draw on probabilistic graphs to expand reach and use probabilistic evidence to corroborate incoming deterministic data.
