
Vague by Design? The Oversight Board, Meta’s DOI Policy, and the Kolovrat Symbol Decision

The Oversight Board – an independent body established by Meta to review and advise on difficult content moderation decisions – recently issued a decision in the Kolovrat symbol case that illuminates critical tensions in content moderation: balancing hate symbolism, cultural identity, and interpretive ambiguity. Acting under its Dangerous Organizations and Individuals (DOI) policy, Meta removed two Instagram posts and left a third up as non-violating. The two posts removed for hate glorification included one featuring the Kolovrat symbol with a “Slavic Army” caption urging Slavic people to “wake up,” and another using #DefendEurope and M8 rifle symbols. While the board ultimately upheld the removals – a defensible position given the specific context – several critical gaps in its reasoning and scope remain unaddressed.

First, I outline the core of the board’s decision – with which I agree – identifying a fundamental failure of transparency and foreseeability in Meta’s enforcement of its DOI policy. Second, while the board rightly focused on this failure, its analysis stopped short of addressing the structural incoherence and internal contradictions within Meta’s publicly available DOI policy – issues that generate user uncertainty regardless of internal guidelines. Third, despite acknowledging in its call for comments that extremists routinely evade moderation through coded tactics, the board failed to meaningfully engage with these strategies or press Meta to adopt stronger, forward-looking due diligence measures.

Rather than delving into the merits of the case, this piece concentrates on the underlying structural flaws highlighted above.

What the Board Got Right

The Oversight Board scrutinized Meta’s application of the DOI policy in the Kolovrat symbol case, specifically challenging the company’s reliance on the term “reference” to justify removal. Surprisingly, Meta did not assert that the posts explicitly glorified white nationalism (a hateful ideology). Instead, it removed the content by classifying it as a “reference” to that ideology under its internal enforcement rules.

To aid clarity, I have reproduced below all relevant excerpts from Meta’s DOI policy that address how content related to the glorification of hateful ideologies is to be treated. For ease of reference, I have classified these excerpts as Rule 1 through Rule 4.

Rule 1 (p. 2): “We also remove content that Glorifies, Supports or Represents ideologies that promote hate, such as Nazism and white supremacy. We remove unclear references to these designated events or ideologies.”

Rule 2 (p. 5): “For Tier 1 and designated events, we may also remove unclear or contextless references if the user’s intent was not clearly indicated. This includes unclear humor, captionless or positive references that do not glorify the designated entity’s violence or hate.”

Rule 3 (p. 7): “We also do not allow Glorification, Support or Representation of designated hateful ideologies, as well as unclear references to them.”

Rule 4 (p. 8): “In these cases, we designate the ideology itself and remove content that supports this ideology from our platform. These ideologies include:
• Nazism
• White supremacy
• White nationalism
• White separatism
We remove explicit Glorification, Support and Representation of these ideologies, and remove individuals and organisations that ascribe to one or more of these hateful ideologies.”

[Note: The page numbers referenced correspond to the pagination of the PDF print version of Meta’s DOI policy (as of 07-28-2025), given the online version is not paginated.]

Meta cited its public DOI policy (Rule 2) – which permits removal of “unclear or contextless references if the user’s intent was not clearly indicated” and explicitly includes “unclear humor, captionless or positive references that do not glorify” – as justification. The board’s review, however, revealed a critical flaw: Meta’s internal definition of “reference” (disclosed to the board) was broader than the examples given in the public DOI policy. Internally, “reference” encompassed five types of standalone violations: 1) Positive references (even without ambiguity); 2) Incidental depictions (e.g., accidental symbol appearances); 3) Captionless images; 4) Unclear satire/humor; and 5) Symbols.

This hidden framework created two major problems. First, it effectively rewrote the scope of enforcement. While the public-facing policy (Rule 2) frames removal as discretionary and limited to ambiguous content, Meta’s internal definitions allowed removal even when meaning was clear and non-violative. Second, it left users without fair notice. Subcategories such as “incidental depictions” were never disclosed, meaning that users posting cultural or historical content had no way to anticipate enforcement. A user sharing a historical photo with a background Kolovrat symbol, for instance, could find their post removed without any indication of violation in the publicly accessible policy.

The board upheld the removal and rightly concluded that this undisclosed expansion violated Article 19 of the International Covenant on Civil and Political Rights (ICCPR), which protects the right to freedom of expression. Under the “legality” requirement of Article 19(3), any restriction on expression must be provided by law and be sufficiently clear and foreseeable to those subject to it. The board found that Meta’s undisclosed and overly broad internal definition of “reference” failed this standard. It ordered Meta to publicly define its internal use of the term and to clarify subcategories such as “positive references” and “incidental depictions.” Although these transparency measures were necessary, they were not sufficient for the reasons detailed below.

A Policy at War with Itself

A close reading of the DOI policy reveals a structural incoherence in how it addresses content linked to hateful ideologies. The four primary enforcement rules – Rules 1 through 4 – each speak to the types of content that may or must be removed. However, when considered together, they exhibit a fragmented logic that compromises the policy’s internal consistency and the predictability of enforcement.

Rule 1 asserts that Meta “removes unclear references” to designated events or ideologies. This is a categorical formulation that imposes a mandatory obligation to take down ambiguous content, regardless of whether it glorifies or supports hateful views.

Rule 2, by contrast, introduces a discretionary standard: Meta “may remove unclear or contextless references” where the user’s intent is not clearly indicated, including references that do not glorify violence or hate. This discretionary formulation fundamentally contradicts Rule 1’s mandatory tone.

If all unclear references must be removed (Rule 1), it is logically incoherent for Meta to also retain the discretion to remove only some (Rule 2), particularly where the content is explicitly non-glorifying. The rules pull in opposite directions: Rule 1 demands removal of ambiguity, while Rule 2 permits it selectively. Moreover, the inclusion in Rule 2 of examples like “unclear humor” or “positive references that do not glorify” widens the enforcement net even further, capturing content that is legally and morally distinguishable from hate promotion.

Rule 3 reiterates the language of prohibition: Meta “does not allow glorification, support or representation” of designated hateful ideologies, as well as “unclear references to them.” It mirrors Rule 1’s broad sweep by including both explicit and ambiguous content, again in categorical terms.

But Rule 4 marks a shift. It limits removal to only “explicit glorification, support and representation,” making no mention of unclear or ambiguous references. In doing so, it appears to elevate the threshold for enforcement, suggesting that unless a user unambiguously promotes or endorses a hateful ideology, their content is not removable under Rule 4.

Taken together, these rules generate a field of contradiction. Some provisions treat unclear references as requiring removal (Rules 1 and 3), another treats them as optionally removable (Rule 2), and the final rule does not include them at all (Rule 4).

Theoretically, one might infer a baseline principle: that any reference to a designated hateful ideology – whether explicit glorification or an unclear allusion – is prohibited. However, in practice, the policy fragments this principle across four separate excerpts, each articulating different thresholds and enforcement triggers. This patchwork of obligations and permissions creates a regime where the same content may be interpreted as either mandatory for removal, permissible for retention, or outside of scope entirely – depending solely on which rule is applied.

For users, this multiplicity of standards could generate uncertainty: they are left unable to determine with any reliability what the actual enforcement criteria are. The net effect is a policy framework that fails the foreseeability requirement under the principle of legality, and that undermines user trust by obscuring where the lines of acceptable expression truly lie.

A Failure to Address Coded Extremism

The Oversight Board asked a key question in its “call for comments”: how do neo-Nazi and other extremist actors disguise their content to slip past moderation on social media? The board acknowledged that understanding how extremists disguise content was central to tackling online hate. In its final decision, it found that the Kolovrat post contained elements of white nationalism, citing references like “Slavic Army” and calls for people to “wake up” as glorifying this ideology. Yet despite recognizing the presence of coded hate, the ruling did not seriously examine the broader evasion techniques that enable white nationalist hate content to persist. Its focus remained limited to the specific posts and Meta’s enforcement framework, while the wider tactics of concealment were addressed only in passing.

The board’s ruling noted how extremists combine seemingly neutral symbols with subtle signals – like using the Odal rune (an ancient symbol appropriated by neo-Nazis), hashtags such as #DefendEurope (commonly used by anti-immigration and far-right groups), or styling posts in Fraktur font (Gothic script associated with Nazi-era propaganda). These nods showed the board understood that extremist content rarely announces itself outright. Instead, it hides in plain sight, coded in ways that resonate with in-groups but pass under the radar of moderation tools. As the Centre for Advanced Studies in Cyber Law and Artificial Intelligence, India emphasized in its submission to the board, unless platforms actively unpack the mechanics of such disguise – how extremists shape ambiguity to avoid detection – moderation will inevitably fall behind.

One of the most pervasive tactics in this arsenal is coded language, deliberately crafted to slip past moderation systems. An Al Jazeera investigation in 2020 revealed that as many as 120 Facebook pages espousing white supremacist ideology had collectively amassed over 800,000 likes, with some operating openly for more than a decade. One such page belonged to M8l8th, a Ukrainian black metal act. The use of “88” in its name – commonly understood in neo-Nazi circles as code for “Heil Hitler,” with H being the eighth letter of the alphabet – is emblematic of how extremist content conceals itself through numerical symbolism.

Another common evasion method is the use of homoglyphs – visually similar characters substituted for ordinary letters to bypass keyword filters. For example, if a platform were to block the term “far-right,” bad actors may re-spell it as “f4r-r!ght” to evade detection. These tactics are neither trivial nor accidental. They represent a calculated effort to craft content that appears innocuous to algorithms and casual users but carries unmistakable ideological signals for those in the know.
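To make the mechanics concrete, the sketch below shows – in deliberately simplified form – how a naive keyword filter misses such substitutions and how a basic normalization step can catch some of them. It is an illustration only, not a description of Meta’s systems: the substitution map, the blocked-term list, and the function names are hypothetical, and production moderation relies on far larger Unicode confusable tables and machine-learning classifiers.

```python
# Minimal sketch: normalize common character substitutions before keyword matching.
# The map, blocklist, and function names below are illustrative assumptions only;
# they do not reflect any platform's actual moderation pipeline.

HOMOGLYPH_MAP = {
    "4": "a", "@": "a", "3": "e", "1": "i", "!": "i",
    "0": "o", "$": "s", "5": "s", "7": "t",
}

BLOCKED_TERMS = {"far-right"}  # hypothetical blocklist for illustration


def normalize(text: str) -> str:
    """Map common character substitutions back to plain letters."""
    return "".join(HOMOGLYPH_MAP.get(ch, ch) for ch in text.lower())


def contains_blocked_term(text: str) -> bool:
    """Check the normalized text against the blocklist."""
    normalized = normalize(text)
    return any(term in normalized for term in BLOCKED_TERMS)


if __name__ == "__main__":
    print(contains_blocked_term("f4r-r!ght content"))  # True once normalized
    print(contains_blocked_term("harmless post"))      # False
```

Even this toy example exposes the arms-race dynamic: each new substitution scheme requires the filter to be updated after the fact, which is why, as the submission cited above argued, moderation that does not unpack the mechanics of disguise will inevitably fall behind.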

This is precisely where the Oversight Board’s analytical shortfall collides with its human rights obligations under the U.N. Guiding Principles on Business and Human Rights. Principle 17 requires companies like Meta to carry out due diligence that not only responds to harm but anticipates it, especially when that harm arises from the systemic misuse of platforms to spread hate through evasive tactics.

By failing to interrogate these structural vulnerabilities, the board missed a vital chance to demand stronger, future-proof enforcement standards. If the board continues to overlook how extremists exploit policy gaps, it risks becoming not a safeguard against harm, but a pillar of the very system that allows such harm to endure.

Conclusion

The Kolovrat decision reveals more than a dispute over a symbol; it lays bare the deep fault lines in Meta’s content moderation architecture. While the Oversight Board exposed Meta’s opaque enforcement, it left too much untouched: a contradictory policy and a platform still vulnerable to hate disguised in plain sight. In the end, it feels as though the board held up a mirror to Meta’s practices but failed to turn on the light. Until structural incoherence is resolved and evasive hate addressed with foresight, platform governance will remain a house on shifting sand – clear only to those who know how to read between the rules.
