
Earlier this week, the indefatigable Thomas Brewster at Forbes, a journalist who’s been covering the digital surveillance beat for years, reported on a search warrant to OpenAI seeking to unmask a particular ChatGPT user. Brewster says it’s “the first known federal search warrant asking OpenAI for user data.” (Note the “known”: similar warrants might still be under seal like this one was until recently.) The warrant’s supporting affidavit describes a long-running child exploitation investigation into multiple dark web sites hosting child sex abuse material (CSAM), whose administrator the government sought to identify. (It has now arrested a suspect.)
So what’s the link between various dark web sites and OpenAI? Well, like 800 million other people, the sites’ administrator is allegedly a frequent ChatGPT user. As the affidavit describes, an agent from Homeland Security Investigations (HSI) had been chatting undercover with the admin, who described to the agent his ChatGPT usage, including specific prompts and partial or full responses. Based on that information, the government sought and obtained a warrant to OpenAI for (per Brewster) “various kinds of information on the person who entered the prompts, including details of other conversations they’d had with ChatGPT, names and addresses associated with the relevant accounts, as well as any payment data.” The affidavit includes two “unique, specific” prompts and the “unique responses” that ChatGPT generated. OpenAI apparently complied with the warrant in the form of an Excel spreadsheet of data. (We don’t know what’s in the spreadsheet.)
This warrant may be a first, but it’s not a surprise. User data is “if you keep it, they will come”: If a tech company stores data about its users, in a form the company can legibly access (i.e., not end-to-end encrypted, or E2EE), it will receive a government request for some user’s data eventually… and then for more users after that. And the more data the company stores, the richer the trove for investigators to tap into. (The day after his ChatGPT story, Brewster reported on how HSI used WhatsApp user data in an immigration-related investigation.) That’s why, as digital rights attorney Jen Lynch commented in the ChatGPT article, “it’s more important than ever for OpenAI and other AI companies to think about how to limit the amount of data they collect on their users.”
This warrant may be the first, but it won’t be the last. OpenAI’s transparency reports (each of which is a whopping one page long) say it got a total of 71 government requests for user data in the second half of 2024 (the latest reporting period available), after receiving only half that many in the first half of that year. That number is still small, but it will keep increasing, and this news is surely going to add to the load. Once word gets out that a company can provide various information about its users and will comply with valid legal process, other investigative agencies will follow suit, and the volume of requests will go way up.
That’s what happened with so-called “reverse warrants” to Google for keywords users entered into Google Search and for users’ location history in Google Maps. It got so bad that Google stopped storing Android users’ location history. The company also pushes back on at least some reverse keyword warrants.
Reverse warrants pose major constitutional problems because they’re inherently overbroad: they seek information on everybody who was in a particular place during a particular timeframe, or everybody who searched for a particular query. That flies in the face of the Fourth Amendment, as former federal magistrate judge Brian Owsley explains in a recent article in the Stanford Technology Law Review. As his opening says: “Traditional law enforcement warrants begin with a suspect and, supported by a finding of probable cause, seek additional evidence about that person. Keyword search warrants reverse this process.” The ChatGPT warrant news immediately raised concerns that a new chapter in the tawdry saga of reverse warrants has just begun, with OpenAI replacing Google as the main character.
True, as Lynch noted, this particular warrant looks properly scoped to seek information about one specific ChatGPT user, for whom the affidavit provided ample information to support probable cause linking that user to the CSAM sites. The warrant was not casting a dragnet for every unknown user who entered a query without any reason to suspect each of them individually, the way reverse keyword warrants do. This warrant also makes false positives highly unlikely by limiting the request to whichever account entered those two specific prompts and received the unique, lengthy responses ChatGPT generated. That’s different from, say, a reverse keyword warrant for information on every person who googled a particular street address, which could cover a suspected arsonist but also anyone from party guests to a taxi driver to a pizza delivery guy.
Nevertheless, we should take this warrant to OpenAI as a harbinger of more to come. That this warrant application was properly scoped doesn’t mean the next one will be. And even if they’re all properly scoped, there are going to be more of them, and it’ll take some amount of work for OpenAI to deal with them. Therefore, here are some questions we should be asking about user data requests to OpenAI (and its ilk) going forward:
- Judges are the gatekeepers who are supposed to reject unconstitutional or otherwise faulty warrant applications. Will judges prove to be as willing to authorize reverse warrants over ChatGPT prompts – let’s call them “reverse prompt warrants” – as so many of them have been for reverse keyword warrants and reverse location warrants?
- Given the novelty of reverse prompt warrants, will prosecutors judge-shop until they get a friendly judge who’ll sign off on one that (unlike this warrant) isn’t narrowed down to one specific user? Which courts are most susceptible to judge-shopping?
- OpenAI says warrant responses include “content data, such as text or files input or output in connection with the use of OpenAI services.” That sounds like OpenAI can produce both users’ prompts and ChatGPT’s responses – but what if the warrant doesn’t specify an account? What capabilities does OpenAI have to search for a specific prompt and/or response in a way that surfaces results tied to specific accounts’ chat histories? How far back can it do that, and how technically burdensome is it? Does the answer vary between free and paid plans?
- OpenAI’s Law Enforcement User Data Request Policy currently states that a request must “include sufficient information to unambiguously identify the user account(s) at issue.” How will OpenAI apply this policy to reverse prompt warrants that correspond to multiple users’ accounts (rather than just one user, as with this warrant)?
- At a time when the federal government and various state governments are cracking down on people and activities they disfavor, will OpenAI move to quash a warrant for a user’s account if the warrant specifies prompts relating to First Amendment-protected activity (such as protesting)? How about “crimes” like abortion, gender-affirming care, or lack of legal immigration status?
- When responding to legal process from the federal government, do political considerations influence OpenAI’s response, given CEO Sam Altman’s close ties to the Trump administration (which now involve the phrase “weapons-grade plutonium”)?
- Will OpenAI move to quash legal process if there’s a gag order that keeps the target user(s) in the dark so they can’t move to quash on their own? Has it ever fought to get a gag order lifted?
- What technical changes could OpenAI make so that it would be unable to comply with a reverse prompt warrant, as Google did with location history?
If OpenAI’s numbers are to be believed, one out of every ten people on Earth already uses ChatGPT at least once a week. In that sense, it’s remarkable that the volume of user data requests the company receives is as low as it is – and that it took this long for the first known reverse prompt warrant to arrive. As noted, this one seems OK, but OpenAI needs to be ready for future warrants that try to push the envelope. Navigating an increasing volume of requests from governments is a normal part of a startup tech company’s growth. But the larger its user base, the more important it is for the company to stand up for its users’ rights – and to do so vigorously straight out of the gate. Frankly, AI as a replacement for traditional web search has been disappointing enough already. We don’t need the AI version of superabundant unconstitutional reverse keyword search warrants too.