Data Retention and the Fourth Amendment

Attorney General Alberto Gonzales is interested in restarting talks with Congress about data retention legislation. (See Anne Broache, “Feds: Details of ISP Snooping haven’t been decided”.) I’m worried, but not as much as many commentators seem to be.

A thousand years ago (1994), I wrote an article for the Harvard Journal of Law and Technology about electronic communications and the Fourth Amendment. At that time, email was only a blip in terms of data traffic, but there was still significant data traffic in the form of fax communications, and many corporations had already moved to privatize their telephone and data networks using PBX equipment. Under the Supreme Court’s Fourth Amendment cases, I wondered how, if at all, the “plain view” exception to the warrant requirement would operate when a wiretap, of necessity, required sifting through a great deal of electronic communications in order to find data related to the warrant.

The legality of the new “warrantless wiretaps” (which seem to deal mostly with voice rather than data) to one side, my conclusion was that both federal wiretap law (the minimization requirement) and the Constitution (prohibitions against general warrants embedded in the Fourth Amendment) would limit the use of the “plain view” exception greatly, meaning that if investigators went on “fishing expeditions” that exceeded the warrants they had, unrelated evidence would have to be suppressed under the exclusionary rule.

The potential problem has gotten more serious, of course, as the volume of electronic communications has exploded since the mid-1990s. And now the potential evidence is being processed and stored not only by PBXs but also by ISPs, including small private companies and large infrastructure providers, as well as intermediary storers and mirroring sites. On the other hand, data retention is expensive and, for most operators, a low priority, and it’s likely that in general most data traffic isn’t being retained, or at least not systematically.

There hasn’t been a criminal case since the mid-1990s that exposes the misfit between the federal wiretap law and the Fourth Amendment I described, and it may be that all the secret wiretaps and sealed proceedings in prosecutions of accused terrorists are masking what was once law enforcement’s lack of sophistication in making use of electronic data. Or perhaps the threat to privacy is still only potential and unlitigated. But the constitutional argument is still there, and still, I think, a powerful one.

A somewhat related point has to do with search. There seems to be a general misconception among the public that if Google and Yahoo! and other search engines retain search data, it can potentially be subpoenaed and used in criminal or even civil matters, exposing the searcher’s identity. Assuming the search engines keep all the information about a search they can, however, it still doesn’t identify the searcher, which is to say, Google doesn’t know what “my” searches are.

At best they know the IP address where the search originated, and for users with private and fixed IP addresses that could lead to identification. But what about users with dynamically assigned IP addresses or users (e.g., corporate users) who share an IP address, perhaps among many people? Actually, let me ask that as a question to those who understand the technical underpinnings of IP addressing better than I do: under what circumstances can an IP address be used to identify the person who entered a search through a search engine? I posed this question to Tim Wu, co-author of Who Controls the Internet? Tim hedged.


You might be surprised how much Google knows about you. IP addresses are one thing. But consider how many people are registered users of Yahoo and Google - where they have volunteered personal information to these companies. Then couple that with the cookies your browser downloads to help them know when you're the one doing the searching. Ever wonder how Google is able to "personalize" your search results? There's much more information available to the search engines than your IP address. In fact, the IP address is of least value. They can't go to your DSL provider to ask which subscriber uses a particular address. But when you volunteer that information (by registering, filling out forms, etc.) then you've handed it to them.

The legality of data retention for law enforcement purposes is being debated worldwide. In fact, the US seems to be behind in terms of passing formal legislation on the topic. In Europe, the EU Data Retention Directive passed in 2006, requiring 6 to 24 months of data retention for phone calls and eventually ISP data. Cases are now being brought to challenge the Directive, although it appears the prevailing attitude is to accept the law if it indeed helps track terrorists and child pornographers.

An ISP records the association of an IP address with the MAC address of the user's machine as part of its DHCP (Dynamic Host Configuration Protocol) service. Depending on how long this information is archived, the ISP can look up which MAC address held a dynamic IP address at a specific time, and then identify which dynamic IP address is currently associated with that MAC address, thereby identifying the user. A lot depends on how long the information is archived, whether the user has changed their MAC address (e.g., by buying new equipment), and whether they used a public computer to access Google.
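
To make the lookup described above concrete, here is a minimal Python sketch of how an archived lease log might be queried. The record layout (`Lease`), the function name (`mac_for`), and the sample addresses are all made up for illustration; real ISP logging formats and retention periods vary, and nothing here reflects any particular provider's system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Lease:
    """One archived DHCP allocation (hypothetical record layout)."""
    ip: str
    mac: str
    start: datetime  # lease granted
    end: datetime    # lease expired or released


def mac_for(leases: List[Lease], ip: str, when: datetime) -> Optional[str]:
    """Return the MAC address that held `ip` at time `when`.

    Returns None when no archived lease covers that moment, for example
    because the log was never kept or has already been purged.
    """
    for lease in leases:
        if lease.ip == ip and lease.start <= when < lease.end:
            return lease.mac
    return None


# Made-up data: the same dynamic IP is handed to two machines in turn.
LEASES = [
    Lease("203.0.113.7", "00:11:22:33:44:55",
          datetime(2006, 9, 1, 8, 0), datetime(2006, 9, 1, 20, 0)),
    Lease("203.0.113.7", "66:77:88:99:aa:bb",
          datetime(2006, 9, 1, 20, 0), datetime(2006, 9, 2, 8, 0)),
]

print(mac_for(LEASES, "203.0.113.7", datetime(2006, 9, 1, 9, 30)))
print(mac_for(LEASES, "203.0.113.7", datetime(2006, 9, 3, 9, 30)))
```

The second lookup falls outside every archived lease, which is the retention-length problem in miniature: without a record covering the moment of the search, the IP address alone leads nowhere. Even a successful lookup only identifies a network device, not the person sitting at it.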

A better question is whether Google tags the search data with the user's Web browser cookie, which could be discovered through a warrant for the user's computer. If the cookie hasn't expired or been replaced with a new one, it may be possible to link query data to the user by matching the cookie on the machine with the cookie Google assigned to that data (if Google does associate data with the browser cookie).
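
As a rough illustration of the cookie-matching idea, the Python sketch below assumes a search log in which each query is stored alongside a cookie identifier. That storage scheme, the names `QUERY_LOG` and `queries_for_machine`, and the sample data are all hypothetical; whether any search engine actually keeps its logs this way is exactly the open question.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical retained log: (cookie_id, query) pairs.
QUERY_LOG: List[Tuple[str, str]] = [
    ("cookie-A", "fourth amendment plain view"),
    ("cookie-B", "weather boston"),
    ("cookie-A", "data retention directive"),
]


def queries_by_cookie(log: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group logged queries under the cookie identifier stored with them."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for cookie_id, query in log:
        grouped[cookie_id].append(query)
    return dict(grouped)


def queries_for_machine(log: List[Tuple[str, str]], cookie_on_disk: str) -> List[str]:
    """Return the queries logged under the cookie found on a given computer.

    An empty list means the cookie matches nothing in the log, for example
    because it expired and was replaced after the queries were made.
    """
    return queries_by_cookie(log).get(cookie_on_disk, [])


print(queries_for_machine(QUERY_LOG, "cookie-A"))
print(queries_for_machine(QUERY_LOG, "cookie-Z"))
```

The match ties queries to a browser on a particular machine rather than to a named person, so a shared or public computer weakens the link in the same way a shared IP address does.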
