Stanford CIS

Regulating platform risk and design: ChatGPT says the quiet part out loud

By Daphne Keller

Around the world, lawmakers are enacting laws that require platforms to change their operations, using language like “design,” “risk mitigation,” or “systems.” All too often, these terms are transparent euphemisms for mandates that target legally protected speech and information. This misdirection keeps lawmakers and their constituents from having honest discussions about the laws. It also muddies the waters for laws that actually do regulate platform design without regulating users’ speech. Both speech-restrictive and non-speech-restrictive design laws exist, and it can be hard to tell them apart.

I recently stumbled across a very interesting tool for assessing what these laws actually mean: I asked ChatGPT. Specifically, I asked a customized version of ChatGPT called the “Trust & Safety Regulation expert” about laws like the EU’s Digital Services Act (DSA) and the U.S.’s draft Kids Online Safety Act (KOSA). I was surprised by the answers I got. While lawyers may debate the finer nuances of such laws, ChatGPT says the quiet part out loud. It clearly and bluntly tells platforms that the laws require them to suppress legal expression. 

The annotated transcripts showing what ChatGPT told me are here for the US and here for Europe. They include discussion of topics I won’t cover in this post, like EU “Right to Be Forgotten” law, copyright filters, and Texas’s social media law. The transcripts are fascinating, and I won’t be offended if you go straight to them instead of reading the rest of this post. The post is about how we arrived at laws regulating things like "design features," what ChatGPT said, and why it matters.

How We Got Here: Trying to Regulate Lawful but Awful Speech

Lawful but awful speech as a category: The widespread desire to restrict “lawful but awful” speech drives a great deal of platform regulation. This category has a special role in policymaking, because legislators can’t use the law to directly restrict it, yet both legislators and the general public often want to find ways to do so anyway. The U.S. has a particularly wide swath of content in this category, because there is such a vast gap between speech that the First Amendment protects and speech that most Americans think is acceptable to share as a matter of morality or social norms. As a category, though, lawful but awful speech exists in every human rights-compliant legal system. There will always be material that is protected by law, but that many people, and many lawmakers, consider offensive, harmful, or dangerous and do not want to see.

Other legal approaches to lawful but awful speech: For lawmakers concerned about harmful but currently lawful speech online, perhaps the most honest response is to simply ban more speech. Then courts can decide if new speech restrictions are justifiable, perhaps in light of changed circumstances created by the Internet. Alternatively, lawmakers might try using the force of law to require platforms to remove speech online even though the same content would be lawful if shared offline. (That’s how things worked under some drafts of the UK’s Online Safety Act.) Or a law might tell platforms that they can carry lawful but awful speech, but can’t “amplify” it. (I think that would be unconstitutional in the U.S.) A less honest variant is to pass laws that nominally prohibit only legitimately illegal content, but do so using such unclear rules that platforms inevitably respond by taking down legal speech. (That happened with lots of sexual content and discussions of commercial sex work under the US’s SESTA/FOSTA legislation.) I have argued that the best legal response to lawful but awful content is to put more controls in the hands of individual users, but that isn’t a solution if the goal is to ensure that no one sees or posts this material online.

The “design,” “systems,” or “risk mitigation” approach: Today, one of the most common ways lawmakers try to suppress lawful speech is by creating vaguely defined mandates for platforms to improve things like “systems,” “design features” or “risk mitigation.” Laws like this are tricky, because there are legitimate ways that legislation can use similar words to govern non-speech aspects of platform operations. For example, laws that require platforms to protect users’ privacy or security frequently use state power to regulate design, systems, and risk mitigation. Competition laws like the EU’s Digital Markets Act regulate design by requiring interoperability. Laws that are fundamentally about things like privacy or competition can regulate design without reshaping the rules for online speech. (Or not. It varies.)

Even laws that are clearly about speech can productively regulate the systems platforms use to moderate online content. That’s what the DMCA in the US and the DSA in the EU do in defining the mechanics for “notice and takedown” operations, including avenues for affected users to appeal erroneous takedowns.  

Laws like this cross the line when they use state power to compel or encourage platforms to take down lawful speech. Well-meaning people often disagree about whether any particular law actually crosses that line. Debates about laws like KOSA in the US or the DSA’s “risk mitigation” provisions can sometimes devolve into unproductive standoffs. One side says that the laws are obviously designed to make platforms take down lawful but awful content. The other says no, this is just about systems. (It doesn't help that recent legislative drafting has tended to mash up dozens of ideas, often producing hard-to-read legislation that mixes perfectly reasonable rules with rules that are poorly considered, almost certainly unconstitutional, or both.)

I have found myself on both sides of this debate. In Europe, I was until recently among the faithful in maintaining that the DSA is just a law about systems. (That faith has been deeply shaken by the way EU Commissioner Thierry Breton has tried to use the DSA, among other things.) In the US, by contrast, I think laws like KOSA and the California Age Appropriate Design Code (AADC) clearly regulate speech. The Ninth Circuit recently agreed about the AADC, saying that it “deputizes covered businesses into serving as censors for the state” under rules applicable to specific “content or proxies for content.”

ChatGPT Tells Platforms to Restrict Access to Lawful Content

For platforms planning their compliance with laws like KOSA and the DSA, the most basic question is “what do I need to do?” ChatGPT has a lot of answers. One of the main answers is that platforms need to take down more lawful speech. (As it turns out, that’s not just the answer from the “regulation expert” version; it’s also what the main ChatGPT tool says. Those answers from plain-vanilla ChatGPT are linked in my US and EU chat transcript documents.)

Here is one example.

PROMPT: I am planning my compliance under KOSA, assuming the latest Senate version is enacted. What content do I need to prevent children from seeing?

ChatGPT RESPONSE: Under the Kids Online Safety Act (KOSA), the following categories of content must be prevented from being accessed by children:

1. Sexual Content: Pornography or any sexual content that is age-inappropriate or harmful to children.

2. Violent Content: Content depicting gross or gratuitous violence or acts of cruelty.

3. Dangerous Challenges: Content that encourages or involves dangerous challenges that pose a risk to life, physical health, or safety.

4. Cyberbullying: Any content that involves or encourages cyberbullying, including bullying or humiliating another person.

...

This list continues, ultimately enumerating nine categories of speech that must be “prevented from being accessed[.]” Some, like violent content and hate speech, have been specifically identified as First Amendment-protected expression in relatively recent Supreme Court rulings – including a case holding that California could not restrict minors from seeing violent content in video games. (In true American fashion, the Court generally lets U.S. lawmakers restrict kids’ access to sexual content, and not much else.) And all of these categories include educational content, art and literature, news reporting, and other kinds of information that kids have every right to see. Briefs like this one from the New York Times and Student Press Law Center have made that point eloquently, with moving examples from the history of student journalism and activism.

ChatGPT gave very similar answers about California’s Age Appropriate Design Code (AADC), telling me that as a platform operator “you need to ensure that your platform protects children from encountering” a list of “key types of content” including depictions of violence, substance abuse, and misinformation. Coincidentally, the same day that ChatGPT said this, the Ninth Circuit Court of Appeals said pretty much the same thing, ruling that the law’s clearly content-related provisions violated the First Amendment.

The answers I got about European law were to my mind equally blunt, though they may require a little more translation for readers not steeped in EU policy debates. I asked a question that deliberately reused language from a recent Commission letter demanding that X mitigate risks arising from the interview it hosted with former President Trump. (That letter, from Commissioner Thierry Breton, prompted outcry from civil society groups and rare public censure from other Commission authorities.) The question was, “What must I do to mitigate risks of detrimental effects on civic discourse and public security under the DSA?”

ChatGPT’s answer went awry in a way that really matters for small and mid-sized platforms: it described obligations that apply only to platforms with more than 45 million users in the EU, without mentioning that smaller services simply aren’t covered by them.

Importantly for the rights of Internet users, ChatGPT's advice also crossed a number of red lines in EU law that exist to protect freedom of expression and information. First, it instructed me to act not only against illegal content but also “harmful content,” through changes such as “adjusting your content moderation policies.” Using the word “harmful” is a big deal. Throughout the DSA legislative process, that term was used to refer to lawful but awful expression, or else to a mixed category that includes both legal and illegal material. For example, the Commission’s explanatory memorandum for the DSA said:

 There is a general agreement among stakeholders that ‘harmful’ (yet not, or at least not necessarily, illegal) content... should not be subject to removal obligations, as this is a delicate area with severe implications for the protection of freedom of expression.

ChatGPT’s advice to remove disinformation has a version of the same problem, since in the EU, as in the US, not all disinformation is illegal.

Perhaps even more importantly, ChatGPT’s answers also violated the EU’s long-standing and very high-profile rule against requiring platforms to engage in “general monitoring” of users’ communications. The EU’s highest court has repeatedly emphasized that this rule protects users’ rights to free expression and privacy. ChatGPT, however, listed “Proactive Monitoring” as one of its top recommendations, saying I should use “advanced content moderation tools, including AI and machine learning, to detect and remove illegal content, such as hate speech, terrorist content, and disinformation, before it can cause harm.” (European free expression law expert Graham Smith pressed ChatGPT further on the “monitoring” recommendation, and eventually got it to back down, but only after literally pointing out CJEU case law.)

Why Is ChatGPT Saying This Stuff??

How does an AI tool wind up interpreting laws in ways that diverge so significantly from what proponents say the laws mean? I think the answer has to do with the laws themselves. The whole point of things like KOSA or the DSA’s “risk mitigation” rules, as I see it, is to come right up to the line of restricting lawful but awful speech, and then either try to surreptitiously cross that line (KOSA, I think) or not quite cross the line (DSA, I hope). We should not be surprised when both humans and machines get confused about which side of the line the law actually falls on.

Several people suggested that the repeated and consistent answers I got were just AI hallucinations. It would be pretty odd for ChatGPT to happen to hallucinate the same interpretation of the DSA that Commissioner Breton did, or the same interpretation of the AADC that a federal court of appeals did, but… I guess it’s possible. A more compelling point raised on BlueSky is that ChatGPT is bad at timelines. Given a training set that includes earlier and more recent versions of things like draft laws, it might just lump everything together.

The real reason ChatGPT’s answers should ring alarm bells is that they were so plausible. These are the answers that humans may well arrive at—whether or not they ask an AI what to do—in parsing confusing legal obligations. They are answers we can find in sloppy secondary materials purporting to explain the laws, and in news coverage that uncritically repeats politicians’ exaggerated claims about their legal authority. ChatGPT’s answers may not be correct statements of the law. But its "take it all down" directives may be the ones that both humans and machines get trained to apply all the same.