No Substitute for First-Party Data

For the past several years, in addition to working professionally with commercial enterprise data, I have been the self-designated genealogist in my family. In both roles, I have learned that there is simply no substitute for first-party data.

In my research, I found that, while official archives and public records are indispensable, they can also be quite misleading without reliable first-party data. Allow me to explain. Recently, I connected with two fourth cousins who are also engaged in mapping their family trees. Each of us descends from one of three sisters who are our second great-grandmothers. I was fortunate to have what I will call first-party data about my second great-grandmother, Mary, through conversations with her granddaughter—my grandmother—who also left a handwritten memoir (including a photograph of Mary) when she passed away.

Using this information, enriched by the public record, I reconstructed Mary’s life fairly accurately, including her age, place of birth, date of immigration to the U.S., and date of death. However, when I looked at my fourth cousins’ family trees, the data they both had about Mary was incorrect. Tellingly, their incorrect information was identical. My cousins had relied entirely on the public record, and so both had discovered the same data about the wrong “Mary.” I was able to identify the correct Mary in the public record by leveraging my grandmother’s direct, one-to-one relationship with her.

This kind of first-party data is just as critical in the world of commercial enterprise. For any group of consumers, the data a business is able to secure from its partners (second-party data) or from consumer data suppliers (third-party data) will never provide a complete and accurate view. Moreover, just as in my genealogy example, second-party and third-party data can be misleading without a solid foundation of first-party data. This is especially true for businesses seeking to serve U.S. Hispanics.

In 2016, the research team at Univision conducted an internal audit of the “Hispanic” cookies furnished by one of the leading commercial data management platforms (DMPs). The patterns discovered in the data were surprising: Nearly one-fifth of the Hispanic cookies in the DMP were qualified as Hispanic based on purchases of consumer packaged goods. Another significant portion were qualified as such based on travel purchases. That is to say, in the absence of other data, the purchase of salsa or travel to a city in Latin America may be used to qualify someone as “Hispanic” for advertisers or agencies.

There are also significant gaps in the commercially available third-party data about U.S. Hispanics. According to the fall 2015 “Simmons National Hispanic Consumer Study,” more than half of U.S. Hispanics do not own their homes. Consumers who own their homes move less often, so a stable address becomes a critical unique identifier for threading data records together, permitting data aggregators to create a more complete picture of each household. For the more than half of U.S. Hispanics who do not own their homes, however, a home address will often prove unhelpful in stitching together complete and accurate profiles. Additionally, the survey found that 26% of U.S. Hispanics were not online, and 24% used only cash or checks when making purchases. This means that data about roughly one-quarter of U.S. Hispanics will not be available in most commercial DMPs or in most commercially available consumer purchase datasets.
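To make the address problem concrete, here is a minimal sketch of address-based record stitching. It is an illustration only, not how any particular data aggregator works: the records, names, addresses, and the `stitch_by_address` helper are all hypothetical, and real systems normalize addresses and use many more signals.

```python
from collections import defaultdict

# Hypothetical records from two unrelated sources. Every name, address,
# and attribute here is invented for illustration.
records = [
    {"source": "retailer", "name": "A. Lopez", "address": "12 Oak St",
     "attr": {"buys": "groceries"}},
    {"source": "utility", "name": "Ana Lopez", "address": "12 Oak St",
     "attr": {"tenure_years": 9}},
    # The same renter, seen before and after a move.
    {"source": "retailer", "name": "R. Diaz", "address": "88 Elm Ave Apt 3",
     "attr": {"buys": "electronics"}},
    {"source": "utility", "name": "Rosa Diaz", "address": "5 Pine Rd Apt 7",
     "attr": {"tenure_years": 1}},
]


def stitch_by_address(records):
    """Merge the attributes of all records that share an exact address."""
    profiles = defaultdict(dict)
    for rec in records:
        profiles[rec["address"]].update(rec["attr"])
    return dict(profiles)


profiles = stitch_by_address(records)
for address, profile in profiles.items():
    print(address, profile)
```

In this toy data, the homeowner's two records merge into a single household profile, while the renter who moved ends up as two separate, thinner profiles that nothing ties together.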

Just as in the case of my second great-grandmother, Mary, a direct, one-to-one relationship (that is to say, first-party data) is vital to forming an accurate view of consumers, especially Hispanic consumers. Without this relationship, the available data about a consumer is, in the best case, likely to be limited, incomplete, or unstable. In the worst case, the data may be inaccurate and thus deceptive or misleading. There is still no substitute for first-party data.
