The Santa Clara Principles on Transparency and Accountability in Content Moderation

Chapeau

In 2018, alongside the Content Moderation at Scale conferences in the United States, a group of human rights organizations, advocates, and academic experts developed and launched a set of three principles for how best to obtain meaningful transparency and accountability around Internet platforms’ increasingly aggressive moderation of user-generated content. These principles, named after the group’s initial meeting place in Santa Clara, CA, represent recommendations for initial steps that companies engaged in content moderation should take to provide meaningful due process to impacted speakers and better ensure that the enforcement of their content guidelines is fair, unbiased, proportional, and respectful of users’ rights. This was the first iteration of the Santa Clara Principles.

Since 2018, twelve major companies, including Apple, Facebook (Meta), Google, Reddit, Twitter, and GitHub, have endorsed the Santa Clara Principles. Over the same period, more companies have begun providing transparency and procedural safeguards, and many of the largest companies have increased the level of transparency and safeguards they provide.

At the same time, the importance of the role these companies play in society continues to increase, resulting in an ever greater responsibility to provide sufficient levels of transparency around the decisions they make, in order to enable accountability.

For these reasons, a broad coalition of organizations, advocates and academic experts worked together in 2020 and 2021 to develop this second iteration of the Santa Clara Principles. They were developed following a broad consultation exercise involving more than 50 organizations and individuals, and a thorough process of drafting and review. By drawing on experience and expertise from all parts of the world, this second iteration of the Santa Clara Principles better reflects the expectations and needs of the global community.

This second iteration of the Santa Clara Principles is divided into Foundational and Operational Principles. Foundational Principles are overarching and cross-cutting principles that should be taken into account by all companies, of whatever business model, age, and size, when engaging in content moderation. They set out each principle and guidance as to how to implement that principle. The Operational Principles set out more granular expectations for the largest or most mature companies with respect to specific stages and aspects of the content moderation process. Smaller, newer, and less resourced companies may also wish to use the Operational Principles for guidance and to inform future compliance. In contrast to the minimum standards set out in the first iteration, this second iteration provides greater specificity regarding precisely what information is needed to ensure meaningful transparency and accountability.

This second iteration of the Santa Clara Principles expands the scope of where transparency is required with respect to what is considered “content” and “action” taken by a company. The term “content” refers to all user-generated content, paid or unpaid, on a service, including advertising. The terms “action” and “actioned” refer to any form of enforcement action taken by a company with respect to a user’s content or account due to non-compliance with their rules and policies, including (but not limited to) the removal of content, algorithmic downranking of content, and the suspension (whether temporary or permanent) of accounts.

This second iteration of the Santa Clara Principles has been developed to support companies to comply with their responsibilities to respect human rights and enhance their accountability, and to assist human rights advocates in their work. It is not designed to provide a template for regulation.

Authors:

Access Now
ACLU Foundation of Northern California
ACLU Foundation of Southern California
ARTICLE 19
Brennan Center for Justice
Center for Democracy & Technology
Electronic Frontier Foundation
Global Partners Digital
InternetLab
National Coalition Against Censorship
New America’s Open Technology Institute
Ranking Digital Rights
Red en Defensa de los Derechos Digitales
WITNESS

Foundational Principles

1. Human Rights and Due Process

Principle: Companies should ensure that human rights and due process considerations are integrated at all stages of the content moderation process, and should publish information outlining how this integration is made. Companies should only use automated processes to identify or remove content or suspend accounts, whether supplemented by human review or not, when there is sufficiently high confidence in the quality and accuracy of those processes. Companies should also provide users with clear and accessible methods of obtaining support in the event of content and account actions.

Implementation: Users should be assured that human rights and due process considerations have been integrated at all stages of the content moderation process, including by being informed of:

How the company has considered human rights—particularly the rights to freedom of expression and non-discrimination—in the development of its rules and policies;
How the company has considered the importance of due process when enforcing its rules and policies, and in particular how the process has integrity and is administered fairly; and
The extent to which the company uses automated processes in content moderation and how the company has considered human rights in such use.

2. Understandable Rules and Policies

Principle: Companies should publish clear and precise rules and policies relating to when action will be taken with respect to users’ content or accounts, in an easily accessible and central location.

Implementation: Users should be able to readily understand the following:

What types of content are prohibited by the company and will be removed, with detailed guidance and examples of permissible and impermissible content;
What types of content the company will take action against other than removal, such as algorithmic downranking, with detailed guidance and examples on each type of content and action; and
The circumstances under which the company will suspend a user’s account, whether permanently or temporarily.

3. Cultural Competence

Principle: Cultural competence requires, among other things, that those making moderation and appeal decisions understand the language, culture, and political and social context of the posts they are moderating. Companies should ensure that their rules and policies, and their enforcement, take into consideration the diversity of cultures and contexts in which their platforms and services are available and used, and should publish information as to how these considerations have been integrated in relation to all operational principles. Companies should ensure that reports, notices, and appeals processes are available in the language in which the user interacts with the service, and that users are not disadvantaged during content moderation processes on the basis of language, country, or region.

Implementation: Users should have access to rules and policies and notice, appeal, and reporting mechanisms that are in the language or dialect with which they engage. Users should also have confidence that:

Moderation decisions are made by those familiar with the relevant language or dialect;
Moderation decisions are made with sufficient awareness of any relevant regional or cultural context; and
Companies will report data that demonstrates their language, regional, and cultural competence for the users they serve, such as numbers that demonstrate the language and geographical distribution of their content moderators.

4. State Involvement in Content Moderation

Principle: Companies should recognise the particular risks to users’ rights that result from state involvement in content moderation processes. This includes a state’s involvement in the development and enforcement of the company’s rules and policies, either to comply with local law or serve other state interests. Special concerns are raised by demands and requests from state actors (including government bodies, regulatory authorities, law enforcement agencies and courts) for the removal of content or the suspension of accounts.

Implementation: Users should know when a state actor has requested or participated in any actioning on their content or account. Users should also know if the company believes that the actioning was required by relevant law. While some companies now report state demands for content restriction under law as part of their transparency reporting, other state involvement is not reported either publicly or to the actioned users. But companies should clearly report to users when there is any state involvement in the enforcement of the company’s rules and policies.

Specifically, users should be able to access:

Details of any rules or policies, whether applying globally or in certain jurisdictions, which seek to reflect requirements of local laws.
Details of any formal or informal working relationships and/or agreements the company has with state actors when it comes to flagging content or accounts or any other action taken by the company.
Details of the process by which content or accounts flagged by state actors are assessed, whether on the basis of the company’s rules or policies or local laws.
Details of state requests to action posts and accounts.

5. Integrity and Explainability

Principle: Companies should ensure that their content moderation systems, including both automated and non-automated components, work reliably and effectively. This includes pursuing accuracy and nondiscrimination in detection methods, submitting to regular assessments, and equitably providing notice and appeal mechanisms. Companies should actively monitor the quality of their decision-making to assure high confidence levels, and are encouraged to publicly share data about the accuracy of their systems and to open their process and algorithmic systems to periodic external auditing. Companies should work to ensure that actioning requests are authentic and not the result of bots or coordinated attacks.

There are many specific concerns for automated systems, and companies should employ them only when they have confidence in them, and in a transparent and accountable manner.

Implementation: Users should have confidence that decisions about their content are made with great care and with respect to human rights. Users should know when content moderation decisions have been made or assisted by automated tools, and have a high level understanding of the decision-making logic employed in content-related automated processes. Companies should also clearly outline what controls users have access to which enable them to manage how their content is curated using algorithmic systems, and what impact these controls have over a user’s online experience.

Operational Principles

1. Numbers

The Numbers Principle reflects the importance of transparency in content moderation, both to users seeking to understand decisions about their own speech and to society at large. Companies should report information that reflects the whole suite of actions the company may take against user content and accounts due to violations of company rules and policies, so that users and researchers understand and trust the systems in place.

Companies should publish information about pieces of content and accounts actioned, broken down by country or region, if available, and category of rule violated, along each of these dimensions:

Total number of pieces of content actioned and accounts suspended.
Number of appeals of decisions to action content or suspend accounts.
Number (or percentage) of successful appeals that resulted in pieces of content or accounts being reinstated, and the number (or percentage) of unsuccessful appeals.
Number (or percentage) of successful or unsuccessful appeals of content initially flagged by automated detection.
Number of posts or accounts reinstated by the company proactively, without any appeal, after recognition that they had been erroneously actioned or suspended.
Numbers reflecting enforcement of hate speech policies, by targeted group or characteristic, where apparent, though companies should not collect data on targeted groups for this purpose.
Numbers related to content removals and restrictions made during crisis periods, such as during the COVID-19 pandemic and periods of violent conflict.

Special reporting requirements apply to decisions made with the involvement of state actors, which should be broken down by country:

The number of demands or requests made by state actors for content or accounts to be actioned.
The identity of the state actor for each request.
Whether the content was flagged by a court order/judge or other type of state actor.
The number of demands or requests made by state actors that were actioned and the number of demands or requests that did not result in actioning.
Whether the basis of each flag was an alleged breach of the company’s rules and policies (and, if so, which rules or policies) or of local law (and, if so, which provisions of local law), or both.
Whether the actions taken against content were on the basis of a violation of the company’s rules and policies or a violation of local law.

Because there are special concerns that flagging processes will be abused, companies should consider reporting data that will allow users and researchers to assess the frequency of such abuse and the measures a company takes to prevent it. Specific metrics and/or qualitative reporting could be devised to help identify abuse-related trends in particular regional contexts. Companies should consider collecting and reporting the following, broken down by country or region if available:

The total number of flags received over a given period of time.
The total number of flags traced to bots.
The number of posts and accounts flagged, in total, and broken down by:
- Alleged violation of rules and policies
- Source of the flag (state actors, trusted flaggers, users, automation, etc.)

Due to the increasing role that automated processes play in content moderation, a comprehensive understanding of companies’ processes and systems requires transparency around the use of automated decision-making tools. In addition to the numbers about the use of automation called for above, companies should publish information relating to:

When and how automated processes are used (whether alone or with human oversight) when actioning content;
The categories and types of content where automated processes are used;
The key criteria used by automated processes for making decisions;
The confidence/accuracy/success rates of automated processes, including changes over time and differences between languages and content categories;
The extent to which there is human oversight over any automated processes, including the ability of users to seek human review of any automated content moderation decisions;
The number (or percentage) of successful and unsuccessful appeals when the content or account was first flagged by automated detection, broken down by content format and category of violation;
Participation in cross-industry hash-sharing databases or other initiatives and how the company responds to content flagged through such initiatives.

All data should be provided in a regular report, ideally quarterly, in an openly licensed, machine-readable format.
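As an illustration only, the following minimal sketch shows one way a company might aggregate a few of the numbers listed above into a machine-readable report. The `ActionRecord` schema, the field names, and the `build_report` function are hypothetical assumptions made for this example; the Principles do not prescribe any particular data format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRecord:
    """One enforcement action (hypothetical schema, not defined by the Principles)."""
    country: str
    rule_category: str
    automated_flag: bool  # first flagged by automated detection
    appealed: bool
    reinstated: bool      # appeal succeeded and the content or account was restored


def build_report(records, period):
    """Aggregate actions by country and rule category into a JSON-ready dict."""
    rows = {}
    for r in records:
        row = rows.setdefault(
            (r.country, r.rule_category),
            {
                "country": r.country,
                "rule_category": r.rule_category,
                "actioned": 0,
                "appeals": 0,
                "successful_appeals": 0,
                "appeals_of_automated_flags": 0,
            },
        )
        row["actioned"] += 1
        if r.appealed:
            row["appeals"] += 1
            row["successful_appeals"] += int(r.reinstated)
            row["appeals_of_automated_flags"] += int(r.automated_flag)
    # Sort by (country, rule_category) so successive reports are easy to compare.
    return {"period": period, "rows": [rows[key] for key in sorted(rows)]}


# Invented sample data, for illustration only.
sample = [
    ActionRecord("BR", "hate_speech", True, True, True),
    ActionRecord("BR", "hate_speech", False, False, False),
    ActionRecord("DE", "spam", True, True, False),
]
report = build_report(sample, "2021-Q4")
print(json.dumps(report, indent=2))
```

A real report would cover the remaining metrics (proactive reinstatements, state requests, flag sources, and so on) and would be published under an open license alongside documentation of each field.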

2. Notice

Companies must provide notice to each user whose content is removed, whose account is suspended, or when some other action is taken due to non-compliance with the service’s rules and policies, about the reason for the removal, suspension or action. Any exceptions to this rule, for example when the content amounts to spam, phishing or malware, should be clearly set out in the company’s rules and policies.

When providing a user with notice about why their post has been actioned, companies should ensure that notice includes:

URL, content excerpt, and/or other information sufficient to allow identification of the content actioned.
The specific clause of the guidelines that the content was found to violate.
How the content was detected and removed (flagged by other users, trusted flaggers, automated detection, or external legal or other complaints).
Specific information about the involvement of a state actor in flagging or ordering actioning. Content flagged by state actors should be identified as such, and the specific state actor identified, unless prohibited by law. Where the content is alleged to be in violation of local law, as opposed to the company’s rules or policies, the users should be informed of the relevant provision of local law.
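For companies implementing these requirements in software, the elements above could be checked before a notice is sent. The sketch below is a hypothetical illustration: the `Notice` fields, the `DETECTION_SOURCES` identifiers, and the `missing_elements` function are assumptions for this example and are not part of the Principles.

```python
from dataclasses import dataclass
from typing import List, Optional

# Detection sources named in the list above; the identifiers are illustrative.
DETECTION_SOURCES = {
    "user_flag",
    "trusted_flagger",
    "automated_detection",
    "external_complaint",
}


@dataclass
class Notice:
    content_ref: str                           # URL or excerpt identifying the content
    detection_source: str
    violated_clause: Optional[str] = None      # clause of the company's guidelines
    local_law_provision: Optional[str] = None  # used when local law is the basis
    state_involved: bool = False
    state_actor: Optional[str] = None
    disclosure_prohibited: bool = False        # law forbids naming the state actor


def missing_elements(n: Notice) -> List[str]:
    """Return the required notice elements that are absent from n."""
    missing = []
    if not n.content_ref.strip():
        missing.append("content identification")
    if not (n.violated_clause or n.local_law_provision):
        missing.append("violated clause or local law provision")
    if n.detection_source not in DETECTION_SOURCES:
        missing.append("detection source")
    if n.state_involved and not n.state_actor and not n.disclosure_prohibited:
        missing.append("state actor identity")
    return missing
```

An empty list means the notice carries every element checked here; timeliness, language, and durability of the notice would need separate handling.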

Other standards for adequate notice include:

Notices should be timely and should include an explanation of the process through which the user can appeal the decision, including any time limits or relevant procedural requirements.
Notices should be available in a durable form that is accessible even if a user’s account is suspended or terminated.
Users who flag content should be presented with a log of content they have reported and the outcomes of moderation processes.
Notices should be in the language of the original post or in the user interface language selected by the user.
Notices should provide users with information about available user support channels and how to access them.
Where appropriate, notice should also be provided to other relevant individuals, including group administrators and flaggers. This should include a notice posted at the original location of the content that has been removed.

3. Appeal

The Appeal principle covers the companies’ obligations to make explanation, review, and appeal processes available to users. Users should be able to sufficiently access support channels that provide information about the actioning decision and available appeals processes once the initial actioning decision is made. Companies should provide a meaningful opportunity for timely appeal of decisions to remove content, keep content up which had been flagged, suspend an account, or take any other type of action affecting users’ human rights, including the right to freedom of expression. According to the principle of proportionality, companies should prioritize providing appeal for the most severe restrictions, such as content removal and account suspension.

Companies should ensure that the appeal includes:

A process that is clear and easily accessible to users, with details of the timeline provided to those using it, and the ability to track their progress.
Human review by a person or panel of persons who were not involved in the initial decision.
The person or panel of persons participating in the review being familiar with the language and cultural context of content relevant to the appeal.
An opportunity for users to present additional information in support of their appeal that will be considered in the review.
Notification of the results of the review, and a statement of the reasoning sufficient to allow the user to understand the decision.

In the long term, independent review processes may also be an important component for users to be able to seek redress. Where such processes exist, companies should provide information to users about access to them. Companies should ensure that, to the extent that they exercise control or influence over independent review processes, those processes also embrace the Santa Clara Principles and provide regular transparency reporting, clear information to users about the status of their appeal, and the rationale for any decision.

Companies should consider whether, in certain circumstances, appeal processes should be expedited, for example where the affected user may be the target of an abusive takedown scheme or where the affected content is time-sensitive, such as political content during an election period. Where appeal processes are expedited, companies should provide clear rules and policies as to when this takes place and whether users can request an expedited appeal.

Principles for Governments and Other State Actors

Governments of course have an obligation under various international legal instruments, for example, Article 19 of the Universal Declaration of Human Rights, to respect the freedom of expression of all persons. As a result, state actors must not exploit or manipulate companies’ content moderation systems to censor dissenters, political opponents, social movements, or any person.

Transparency by companies is a critical element of ensuring trust and confidence in content moderation processes. States, for their part, must recognize and minimize their roles in obstructing that transparency, and must also provide transparency about their own demands for content removal or restriction.

1. Removing Barriers to Company Transparency

Governments and other state actors should remove the barriers to transparency (and refrain from introducing such barriers) that prevent companies from fully complying with the above principles.

Governments and other state actors should ensure that companies are not prohibited from publishing information detailing requests or demands for content or account removal or enforcement which come from state actors, save where such a prohibition has a clear legal basis, and is a necessary and proportionate means of achieving a legitimate aim.

2. Promoting Government Transparency

Governments and other state actors should themselves report their involvement in content moderation decisions, including data on demands or requests for content to be actioned or an account suspended, broken down by the legal basis for the request. Reporting should account for all state actors and, where applicable, include subnational bodies, preferably in a consolidated report.

Governments and other state actors should consider how they can encourage appropriate and meaningful transparency by companies, in line with the above principles, including through regulatory and non-regulatory measures.

Acknowledgements

Thank you to all of the organizations and individuals who submitted comments, participated in the group consultations, and reviewed and commented on preliminary work. Organizations submitting comments include the following: 7amleh, Association for Progressive Communications, Centre for Internet & Society, Facebook/Meta, Fundación Acceso, GitHub, Institute for Research on Internet and Society, InternetLab, Laboratório de Políticas Públicas e Internet (LAPIN), Lawyers Hub, Montreal AI Ethics Institute, PEN America, Point of View, Public Knowledge, Taiwan Association for Human Rights, The Dialogue, Usuarios Digitales. The list of individuals and groups who coordinated and hosted consultations, and otherwise contributed to the process, includes, but is not limited to: ALT Advisory, Centro de Estudios en Libertad de Expresión y Acceso a la Información (CELE), UNESCO, Irina Raicu, Eduardo Celeste, Derechos Digitales, Robert Gorwa, Ivar A.M. Hartmann, Amélie Heldt, Tomiwa Ilori, Julian Jaursch, Clara Iglesias Keller, Paddy Leerssen, Martin J. Riedl, Christian Strippel, and Daphne Keller.

The Santa Clara Principles 2.0 are additionally supported by the Swedish Postcode Foundation.

Finally, we would like to thank the authors and supporters of the original principles: ACLU Foundation of Northern California, Center for Democracy & Technology, Electronic Frontier Foundation, New America’s Open Technology Institute, Irina Raicu, Nicolas Suzor, Sarah Myers West, and Sarah T. Roberts; Santa Clara University’s High Tech Law Institute for organizing the Content Moderation & Removal at Scale conference, as well as Eric Goldman for supporting the convening of the workshop that resulted in this document. That workshop was also made possible thanks to support from the Internet Policy Observatory at the University of Pennsylvania. Suzor is the recipient of an Australian Research Council DECRA Fellowship (project number DE160101542).