Around the world, new laws are requiring improved transparency from major Internet platforms about their content moderation. This is generally a very positive development. But it also raises important questions about what kinds of disclosures we expect from platforms -- and what kinds of enforcement we expect from governments. This post uses five concrete examples to illustrate the complexity of these disclosures. The examples also illustrate what I think is a very real risk: that state enforcers may abuse transparency laws, using them to reshape platforms’ actual policies. That is a threat not only to platforms’ editorial and speech rights, but to the rights of all their users. I think it should be possible to mitigate this risk. But we can only do so if we recognize it.
I have written before about the complexity of counting content moderation actions for aggregate transparency reports. This post addresses the seemingly simpler task of describing speech policies or enforcement decisions. This is also complicated, in ways that should come as no surprise to lawyers, parents of young children, or anyone else who has tried to explain and apply rules to disputatious parties. The degree of detail that platforms could include in their explanations is vast -- perhaps not quite as limitless as the diversity of human misbehavior on the Internet, but the two are certainly correlated. Facebook’s rules, for example, run to over a hundred pages, but are widely criticized as insufficiently clear or detailed.
Platforms’ speech rules are also deeply entangled with political questions and culture war flashpoints. Disputes about whether a platform has “really” disclosed its policies can easily devolve into substantive debates about gender, race, and speech -- as the examples below illustrate. State actors who are licensed to examine the adequacy of platform transparency will have unique leverage to influence the substantive speech policies applied to users. In the U.S., these enforcers will themselves be political office-holders: state Attorneys General (AGs), who are often fiercely partisan. Their use of new transparency laws may be presaged by Texas Attorney General Ken Paxton’s current, punitive investigation of Twitter in response to its ouster of former President Trump.
Professor Eric Goldman, who first identified this problem, sees it as a dealbreaker, rendering transparency mandates unconstitutional. I hope that this is not the case. That is in part because, purely as a policy matter, I think transparency from powerful platforms is incredibly important. But it is also because, as a legal matter, I believe that better “tailoring” of transparency laws should be possible. Calling out the risk of state abuse should be the first step in a conversation about how to mitigate it. I hope these examples can help spur that discussion.
1. Unpublished context cues in assessing violent extremist content: Under the platform’s internal guidance for enforcing its rules against violent extremist organizations, a person wearing a Hawaiian shirt and holding a torch at a civil rights rally is presumed to be making a prohibited signal of support for the Proud Boys. A person doing the same things at a luau is not. The platform does not specify these points in its posted policy. A user whose photos from the white power rally in Charlottesville were removed hears a rumor about this internal platform practice, and says he was deceived by the omission. He calls the state AG, who agrees and initiates an investigation, seeking discovery of any other unstated rules that might be similarly deceptive. Should this be considered a violation of transparency laws? What internal platform documents should the AG get access to in her search for other violations? How will content moderators’ internal discussions be affected if they know every email, document, or Slack message may be reviewed by prosecutors?
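For readers who want to picture how such an unwritten cue might live inside a moderation system, here is a minimal, entirely hypothetical sketch. The signal names, context labels, and lookup table are my own inventions for illustration; nothing here describes any real platform’s implementation.

```python
# Hypothetical sketch of a context-dependent enforcement cue. The signal/context
# table below is invented; it only illustrates that the same visual signal can be
# treated differently depending on unpublished context rules.

# (signal, context) pairs that the internal guidance treats as extremist support.
PROHIBITED_SIGNALS_IN_CONTEXT = {
    ("hawaiian_shirt_and_torch", "civil_rights_rally"),  # presumed Proud Boys signal
}

def violates_extremism_policy(signal: str, context: str) -> bool:
    """Return True if this signal, in this context, is presumed extremist support."""
    return (signal, context) in PROHIBITED_SIGNALS_IN_CONTEXT

print(violates_extremism_policy("hawaiian_shirt_and_torch", "civil_rights_rally"))  # True
print(violates_extremism_policy("hawaiian_shirt_and_torch", "luau"))                # False
```

The posted policy in the hypothetical says nothing about the context table, and that gap between the published rule and the internal lookup is exactly what the AG’s investigation targets.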
2. Assumptions about the meaning of gender: The platform has a posted rule against discrimination on the basis of gender. It does not spell out a point that the platform employee who drafted the policy considered obvious: Posts discriminating against transgender people violate the policy. A conservative AG believes that the platform’s posted policy is misleading, since to her it obviously refers only to males and females. She announces a press conference to discuss the platform’s secret woke agenda. But she tells the platform she will call it off if it simply applies what, to her, is the plain meaning of its policy, and stops removing transphobic posts. (Meanwhile, perhaps a liberal AG thinks the policy is misleading for the opposite reason: because the platform has not, so far, removed content discriminating on the basis of other evolving and disputed gender identities.) Is one of the two AGs right? Would the conservative AG’s actions be OK if she just didn’t suggest the quid pro quo -- canceling the press conference in exchange for the rule change? What if the platform suggested the change to the AG, instead? Or just made the change unilaterally once it learned about her investigation?
3. Machine learning thresholds: A platform’s posted policies say “we do our best to remove material that violates our policies.” In enforcing its anti-porn rules, the platform relies on a machine learning model that removes images if it has 90% confidence that they are pornographic. Removing based on a lower level of confidence -- say, 75% -- would catch more porn. But it would also generate more false positives -- mistakenly removing pictures of babies and onions. An AG thinks choosing the 90% threshold means that the platform clearly is not doing its best. He wants to see all images removed by the ML model, require the platform to lower the threshold, and conduct a similar review of the models used for other policies. When should AGs be empowered to investigate the adequacy of tools used to enforce platforms’ published speech rules? Are the inevitable errors of automated tools relevant in deciding if a platform’s posted policies are misleading? What error rates are acceptable, and who decides?
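To make the threshold trade-off concrete, here is a minimal sketch of the kind of decision rule described above. It is purely illustrative: the function, the scores, and the image IDs are assumptions of mine, not any platform’s actual system.

```python
# Hypothetical illustration of a confidence-threshold decision rule for an
# image classifier. Every score below is invented; a real system would take
# scores from a trained model and tune the threshold against measured error rates.

REMOVE_THRESHOLD = 0.90  # the platform's chosen operating point in the example above


def moderate(porn_confidence: float, threshold: float = REMOVE_THRESHOLD) -> str:
    """Return the enforcement action for one image, given a model confidence score."""
    return "remove" if porn_confidence >= threshold else "keep"


# Invented scores for four images, evaluated at two candidate thresholds.
scores = {"img_001": 0.97, "img_002": 0.82, "img_003": 0.76, "img_004": 0.40}

for image_id, score in scores.items():
    print(image_id, "at 0.90 ->", moderate(score), "| at 0.75 ->", moderate(score, 0.75))

# Lowering the threshold from 0.90 to 0.75 also removes img_002 and img_003:
# more violating images caught, but more false positives swept in as well
# (the babies-and-onions problem described above).
```

The point of the sketch is that the operating point is a policy choice, not a technical inevitability: every threshold trades missed violations against mistaken removals, and the posted phrase “we do our best” says nothing about where that line is drawn or who gets to second-guess it.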
4. Racist terminology: A platform’s posted policies prohibit discrimination based on race. Whenever moderators learn of a new racist slang term, they add it to an internal list and document (1) how that word will be treated under the platform’s policy, (2) whether to use it in prioritizing reported posts in content moderation queues, and (3) whether to retrain machine learning models using a data set that reflects any policy adjustments. All three of these determinations may be fine-tuned later as the platform encounters the same terms being parodied, reclaimed by marginalized groups, or used in counterspeech. The platform does not list these terms in its posted policy, or in notices to users whose posts were removed for using these terms (who might then use this knowledge to evade detection of their next racist post). Several AGs are concerned that this lack of detail in itself violates transparency laws. They demand to see the list. Upon investigation, they learn that it contains far more terms disparaging to Black people than terms disparaging to white people. A conservative AG now thinks the posted policy is misleading for a second reason: It hides anti-white bias. A liberal AG thinks the policy is misleading because the list contains no terms used in disparaging other groups, including Asian-Americans and Native Americans, and because the platform does not prohibit a number of terms from the ADL’s database of coded racist language. Is one of these AGs right? If their overt or implicit pressure might influence platforms to change the list, is that OK? What rules, if any, should constrain AGs’ influence over platform rules or enforcement mechanisms like these?
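A concrete (and entirely invented) sketch of what one entry in such an internal list might look like appears below. The field names, the example entry, and the note are assumptions meant only to show how the three determinations described above could be recorded and later fine-tuned.

```python
# Hypothetical sketch of the kind of internal record described above: one entry
# per term, capturing (1) how the term is treated under the policy, (2) how it
# affects queue prioritization, and (3) whether models should be retrained.
# Field names and the example entry are invented for illustration only.

from dataclasses import dataclass, field

@dataclass
class TermPolicyEntry:
    term: str
    treatment: str              # e.g. "remove", "remove unless counterspeech", "allow"
    queue_priority_boost: bool  # bump reported posts containing the term in review queues
    retrain_models: bool        # include the term in the next model-training data set
    notes: list[str] = field(default_factory=list)  # parody, reclamation, counterspeech caveats

# An invented entry, showing how later fine-tuning might be recorded.
entry = TermPolicyEntry(
    term="<redacted slur>",
    treatment="remove unless counterspeech",
    queue_priority_boost=True,
    retrain_models=True,
    notes=["2023-04: reclaimed in-group uses routed to human review"],
)
print(entry)
```

A record in this form is precise enough to audit, which is part of what makes the AGs’ demands in the hypothetical so pointed: disclosing it satisfies their reading of the transparency mandate, while publishing it verbatim hands would-be evaders a roadmap.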
5. Influence, experts, and advocacy organizations: For its rule against posts supporting violent extremism, the platform internally relies on a list of known extremist organizations. Over the years, the list has been shaped by expert input from dozens of academics and NGOs. These include the Southern Poverty Law Center (SPLC), which maintains a public list of extremist organizations. A conservative AG has heard that this list exists, and seeks discovery about the secret influence of liberal organizations on platform policies. She believes that if the platform does not publicly list all groups consulted, and explain which rule changes stemmed from which discussions, it will violate multiple transparency obligations including one, like Texas’s, to “publicly disclose accurate information regarding its content management, data management, and business practices … sufficient to enable users to make an informed choice regarding the purchase of or use of access to or services from the platform[.]” Once she sees the list of violent extremist organizations, she also concludes that the selection shows anti-conservative and anti-white bias, giving the lie to the platform’s public claims to be anti-racist and apolitical. Is the AG right about these claims? If she’s not, how could the quoted language from Texas’s law be amended to let her know not to bring such claims, or to let courts know not to accept them?
These examples are intended to provoke discussion about the legitimate scope of government enforcement for new transparency mandates, and about the limits on state authority. If some of the behaviors described above are improper exercises of state power over speech, how might we craft laws to avoid them? One very lawyerly toolkit for this is the set of presumptions, burdens of proof, and other procedural rules that courts will apply once an AG seeks to compel discovery or to sue platforms for alleged violations of transparency laws. AGs’ leverage will be reduced if both parties know the AG will have an uphill battle in bringing particular claims. A more radical tool, but one that appeals to me, would be to add much more transparency to the enforcement process itself. What if both AGs and platforms had to disclose their communications on these topics to the public? There are surely other responses to this problem that I have not thought of. Better ideas about these issues can help us devise better laws.