It doesn’t take much for researchers to figure out who you are and what you bought if they have access to your “anonymous” credit card records.
According to a new study published in Science magazine, researchers with access to “anonymized” credit card data (transactions that have been stripped of personally identifying information) can piece together who bought what simply by using a few publicly available markers they’ve collected online, such as geo-tagged tweets and time-stamped Facebook status updates.
“A data set’s lack of names, home addresses, phone numbers or other obvious identifiers does not make it anonymous,” wrote study authors Yves-Alexandre de Montjoye, Laura Radaelli, Vivek Kumar Singh and Alex Pentland in the Jan. 30 report.
A retailer or credit card company might feel comfortable releasing anonymous transaction data to third parties because they’re confident that when they remove the names, credit card numbers and other identifying markers from the data sets, they are protecting their customers’ privacy. But the reality is that it can be “surprisingly easy” to connect the dots, say the study authors, if a researcher cross-references the anonymized data with secondary pieces of information they collect from other sources.
“We are showing that the privacy we are told that we have isn’t real,” said study co-author Alex Pentland in an email to the Associated Press.
For example, if you purchase shoes at one store and tweet while you’re in the checkout line, then use Foursquare to check in at a nearby taco stand the next day, a researcher could cross-reference that information and pinpoint who you are based on the probability that no one else visited both of those locations during the same period.
The technique is so effective that the study authors discovered that they could re-identify 90 percent of the individuals in a credit card transaction data set made up of 1.1 million people just by looking at the dates and locations of four individual purchases. If the researchers also looked at the price of a particular purchase, they had an even easier time re-identifying people.
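The matching idea can be illustrated with a toy sketch in Python. This is not the study authors' code or data: the user ids, dates, shop names and the matching_users helper are all hypothetical, made up only to show how each additional known purchase narrows the list of candidates.

```python
from datetime import date

# Toy "anonymized" transaction log: (pseudonymous id, purchase date, shop).
# Every id, date and shop name here is invented for illustration.
TRANSACTIONS = [
    ("u1", date(2015, 1, 5), "shoe store"),
    ("u1", date(2015, 1, 6), "taco stand"),
    ("u1", date(2015, 1, 9), "bakery"),
    ("u2", date(2015, 1, 5), "shoe store"),
    ("u2", date(2015, 1, 6), "bookshop"),
    ("u3", date(2015, 1, 6), "taco stand"),
    ("u3", date(2015, 1, 9), "bakery"),
]


def matching_users(transactions, known_points):
    """Return the pseudonymous ids whose records contain every known (date, shop) point."""
    visits = {}
    for user, day, shop in transactions:
        visits.setdefault(user, set()).add((day, shop))
    wanted = set(known_points)
    # A user is a candidate only if all known points appear in their history.
    return {user for user, points in visits.items() if wanted <= points}


# Outside knowledge, e.g. from a geo-tagged tweet and a public check-in.
one_point = [(date(2015, 1, 5), "shoe store")]
two_points = one_point + [(date(2015, 1, 6), "taco stand")]

print(sorted(matching_users(TRANSACTIONS, one_point)))
print(sorted(matching_users(TRANSACTIONS, two_points)))
```

In this toy log, the shoe store purchase alone leaves two candidates, u1 and u2. Adding the taco stand visit the next day leaves only u1, at which point that person's entire purchase history is exposed even though no name ever appeared in the data.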
Coarsening the data, or making it less specific, didn’t do much to protect cardholders’ privacy either. For example, when researchers looked only at the general location of a transaction rather than the specific store or restaurant, they had a harder time figuring out who made the purchase. But by cross-referencing the purchases with additional data points, they were often able to make up for the data’s lack of specificity.
“We have these unique patterns that identify us,” said Pentland in a separate interview with the Wall Street Journal. “They show up in any data where there is diverse behavior that changes over time. Your pattern will be different and I could identify you.”
The study’s findings are significant, say the authors, because many organizations wrongly assume that it’s safe to collect and sell anonymized data. For example, some credit card companies sell anonymous transaction data to marketers so that they can analyze people’s shopping habits. Meanwhile, a growing number of organizations, ranging from the CFPB to healthcare providers, are using people’s anonymous financial data for research and other purposes.
You can try to make it harder for people to connect you to your credit card purchases by deliberately leaving less information online. So don’t check in at a store or restaurant on Foursquare, for example, and disable geolocation tracking on Twitter and Facebook. But that may not be enough to escape the prying eyes of data brokers and other businesses that collect data from a much wider variety of sources, including ones you can’t opt out of, such as public records. As long as your credit card transaction data is shared “anonymously” with others, your personal information could still be at risk.