The ongoing metadata/content debate finally seems to be winding down. In a blog post last week, Stanford researchers Jonathan Mayer and Patrick Mutchler published the latest results of their ongoing MetaPhone research project. Their primary findings provide empirical support for what computer scientists have been saying all along: cell phone metadata can tell us quite a lot about an individual without the need to hear a single word of the content of his or her calls. But Mayer and Mutchler’s work also raises a number of secondary considerations that show just how complex the interactions between cell phones and privacy can actually be.
Before diving in, it’s helpful to review the mechanics behind the MetaPhone research. The core idea is elegantly simple: users voluntarily install an app written for the Android mobile operating system, and that app records the metadata generated as part of the cell phone’s regular usage. This data is then sent to Mayer and Mutchler for analysis. By combining this data with publicly available information, the researchers are able to draw sometimes startlingly accurate inferences both about the identity of a particular user and about the types of activities in which a user might be engaged. For example, Mayer and Mutchler’s data revealed the following narrative:
Participant E had a long, early morning call with her sister. Two days later, she placed a series of calls to the local Planned Parenthood location. She placed brief additional calls two weeks later, and made a final call a month after.
If the goal of the MetaPhone project is to show how deeply personal information can be extracted from contentless metadata, then it seems safe to say, “Mission accomplished.” But like any good research project, MetaPhone reveals more questions than it answers. These questions are relatively subtle when compared to the binary is-it-or-isn’t-it discussion surrounding metadata and content, but they are no less important.
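To make the mechanics concrete, here is a minimal sketch of the kind of join described above: call records containing no content at all are matched against a public directory of business numbers. Everything in it is an assumption made for illustration. The record layout, the phone numbers, the directory entries, and the function name `categorize_calls` are all invented, and none of it is drawn from the MetaPhone app or its dataset.

```python
from collections import Counter

# Hypothetical call metadata: (caller, callee, day, duration_minutes).
# These records are invented for illustration; they are not MetaPhone data.
CALLS = [
    ("555-0100", "555-0111", 1, 42),
    ("555-0100", "555-0199", 3, 6),
    ("555-0100", "555-0199", 3, 2),
    ("555-0100", "555-0150", 4, 1),
    ("555-0100", "555-0199", 17, 3),
]

# Hypothetical public directory mapping a number to a business category.
DIRECTORY = {
    "555-0199": "health clinic",
    "555-0150": "pizza delivery",
}


def categorize_calls(calls, directory):
    """Count calls per directory category for each caller.

    Callees missing from the directory (for example, private
    individuals) are skipped.
    """
    counts = {}
    for caller, callee, _day, _minutes in calls:
        category = directory.get(callee)
        if category is None:
            continue
        counts.setdefault(caller, Counter())[category] += 1
    return counts


summary = categorize_calls(CALLS, DIRECTORY)
print(summary["555-0100"].most_common())
```

Nothing here touches what was said on any call. The inference comes entirely from who was called, how often, and when, which is the point the research makes.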
This is my cell phone. There are many like it, but this one is mine.
According to data from the Census Bureau, about 1/3 of Americans rely on cell phones as their only phone, and that number doubles if we only look at households led by individuals between the ages of 15 and 29. As a rule, modern Americans do not share their cell phones. We each have our own phone number. This means that in the vast majority of cases, a cell phone number doubles as a unique identifier. This is why Mayer and Mutchler can say that a participant called a certain number, rather than restricting themselves to the more limited (though technically more accurate) statement that the call was placed from the phone.
When combined with a significant corpus of metadata, this allows us to directly connect a phone’s activity to an individual. It also means that we can draw far more detailed maps of that individual’s network of connections, and that we can more accurately analyze his or her calls for patterns. Members of illicit networks, such as drug dealers or terrorists, are more than aware of this capability, which is why they often swap SIM cards, share phones, and use other forms of tradecraft to insulate their networks from outside analysis.
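The network mapping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the caller/callee pairs are invented, the labels "A" through "D" stand in for phone numbers, and the helper names `build_contact_graph` and `rank_by_contacts` are hypothetical rather than anything used by the MetaPhone researchers.

```python
from collections import defaultdict

# Hypothetical (caller, callee) pairs, invented for illustration.
CALL_PAIRS = [
    ("A", "B"), ("A", "B"), ("B", "C"), ("A", "C"), ("D", "A"),
]


def build_contact_graph(pairs):
    """Return an undirected graph: number -> {contact: call count}."""
    graph = defaultdict(lambda: defaultdict(int))
    for caller, callee in pairs:
        graph[caller][callee] += 1
        graph[callee][caller] += 1
    return graph


def rank_by_contacts(graph):
    """Order numbers by distinct contacts, most connected first.

    Ties are broken alphabetically so the result is deterministic.
    """
    return sorted(graph, key=lambda number: (-len(graph[number]), number))


graph = build_contact_graph(CALL_PAIRS)
ranking = rank_by_contacts(graph)
print(ranking)
```

The sketch quietly assumes that each number corresponds to exactly one person. Swapping SIM cards or sharing phones breaks that assumption, which is why such tradecraft makes the resulting graph much harder to interpret.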
Ignoring the tickle on the back of your neck.
One of the most striking aspects of the MetaPhone data is just how normal it looks. Remember, MetaPhone users opted in to the program, so (barring the intercession of a nefarious actor) each of them knew that their call metadata was being logged and transmitted to researchers. Despite this fact, Mayer and Mutchler recorded calls to “head shops,” Planned Parenthood clinics, pharmaceutical hotlines, specialty gun shops, and, for 2% of MetaPhone participants, even calls to adult establishments.
One of the most prevalent arguments against mass surveillance is that mere knowledge that the ongoing surveillance exists is enough to distort behavior. And perhaps it did: without access to a control dataset, it is impossible to tell how the MetaPhone data stacks up against typical cell phone use. This distortion may be compounded by self-selection among MetaPhone users. As Mayer and Mutchler note, “[the] participant population is small and participants only provide a few months of phone activity on average.” Furthermore, we can make certain inferences about the types of individuals who participate in the MetaPhone study. At minimum, they must be the kind of people who a) would donate their calling data to academic research, and b) are sufficiently technically savvy to download and install an application on a cell phone.
On the other hand, the whole point of the MetaPhone experiment would be short-circuited if participants in the study altered their habits, and it is reasonable to believe that anyone who voluntarily joins a study like MetaPhone does not actively want the study to fail. Convenience likely also plays a role. There is both a cognitive and a temporal cost to working around the MetaPhone software, and the participants may have viewed circumventing it for “sensitive” phone calls as more trouble than it was worth.
It is worth noting that it may very well be impossible to acquire a reasonably complete dataset to use as a control in this sort of research. This kind of information is only readily available to phone companies and, presumably, organizations like the NSA. For the set to work as a control, the individuals whose information it contains must not be aware that the information is being collected. Even acquiring such a dataset would leave researchers deep in the legal and ethical weeds, and it might be that this particular hedge maze is one without a clear solution.
What we talk about when we talk about metadata.
Finally, and to me most significantly, the MetaPhone project highlights the chasm between scientific understanding and political and legal rhetoric when it comes to matters of national security. The idea that metadata can be powerful should not be up for debate. It should be intuitively obvious that knowing a person’s calling patterns can tell us quite a bit about the person. The MetaPhone research may have demonstrated this empirically, but it is hard to say that the result is all that surprising. Participant E and her phone calls to Planned Parenthood illustrate that point all too clearly.
To some degree, the real difficulty here is in bridging the gap between the intellectual tools of science and the rhetorical techniques of law. At its heart, science is about studying the objective world and trying to make statements that are true by some external, fixed measure. Law operates in a fuzzier regime, where often the currency is not “what is true?” but “of what can I convince you?” The MetaPhone project is an attempt to use the tools of science to shift the policy debate, and this goal is both noble and necessary.
The next step? Teaching the political world how to listen.
A version of this post originally appeared at the UC Hastings Institute for Innovation Law Blog.