Meta struggles with moderation in Hebrew, according to ex-employee and internal documents

https://www.theguardian.com/technology/article/2024/aug/15/meta-content-moderation-hebrew

Meta has a system for evaluating the effectiveness of its own moderation of Arabic-language content but not Hebrew

Meta is struggling with moderating content related to the Israel-Palestine war, particularly in Hebrew, despite recent changes to internal policies, new documents have revealed.

Internal policy guidelines shared with the Guardian by a former Meta employee who worked on content moderation outline a multilayered process for moderating content related to the conflict. But the documents indicate that Meta, which owns the platforms Facebook, Instagram and WhatsApp, does not have the same processes in place to gauge the accuracy of moderation for Hebrew-language content as it does for Arabic.

The employee, whom the Guardian is not naming because of credible fears of professional reprisal, says Meta’s policies governing hate speech as it pertains to Palestine are inequitable, an assessment echoed by Palestinian advocates.

They also said some workers on the frontlines of the ongoing information battle surrounding the conflict feel wary of raising concerns for fear of retaliation, allegations echoed in a recent letter signed by more than 200 Meta workers. Those conditions, the former employee said, give the impression that the company’s priorities are “not about actually making sure content is safe for the community”.

The documents, current as of this spring, come as Meta and other social platforms have faced criticism for their approach to the divisive conflict, where language and moderation choices during fast-moving news events can have dire consequences. In June, a coalition of 49 civil society organizations and a number of prominent Palestinians sent Meta a letter accusing the company of “aiding and abetting governments in genocide” through its content moderation policies.

“When Palestinian voices are silenced on Meta platforms, it has a very direct consequence on Palestinian lives,” said Cat Knarr of the US Campaign for Palestinian Rights, which organized the letter. “People don’t hear about what’s happening in Palestine, but they do hear propaganda that dehumanizes Palestinians. The consequences are very dangerous and very real.”

Disparity in content moderation across languages is a longstanding criticism of Meta. The Facebook whistleblower Frances Haugen told a US Senate committee that, while only 9% of the social networking giant’s users were English speakers, 87% of its misinformation spending was dedicated to English-language content.

The content guidance documents, which were issued after Hamas’s 7 October attack on Israel and the start of the war in Gaza, shed light on a variety of moderation policy decisions at Meta.

Among them are the company’s policies on hate speech and the boycott movement. The policies require the removal of the statements “boycott Jewish shops” and “boycott Muslim shops” but allow the phrase “boycott Arab stores,” according to internal documents.

Tracy Clayton, a spokesperson for Meta, said “in the context of and for the duration of this crisis”, Meta’s policy is to remove calls for a boycott based solely on religion, but to allow calls for boycotts of businesses “based on protected characteristics like nationality”, as they are usually “associated with political speech, or intended as a form of protest against a particular government”.

As such, the spokesperson said, the phrases “boycott Israeli shops” or “boycott Arab shops” are allowed. The policy as outlined in internal documents is more specific, stating a phrase like “no Israeli goods should be allowed here until they stop committing war crimes” is allowed, as is “boycott Arab stores”.

This policy underscores that Meta “does not have a nuanced or accurate understanding of the region”, argued Nadim Nashif, founder and director of 7amleh, a non-profit organization that advocates for Palestinian digital rights.

“The Arab identity is made up of people throughout the region from many different countries, whereas the Israeli identity is made up of individuals from one nation state that is occupying and oppressing Palestinians, as a subset of the Arab population,” he said. “Meta’s framing inherently shows a bias towards Israel for this reason.”

Gauging the effectiveness of Hebrew hate speech moderation

The recent documents also offer a new window into Meta’s ability to measure the quality of its own content moderation in Arabic and in Hebrew.

Meta has a system in place to track the “policy precision” of content enforcement in many languages. Under that system, human experts review the work of frontline moderators and automated systems, grading how well their decisions align with Meta’s policies on what is and is not allowed on Facebook and Instagram. The quality measurement program then generates an accuracy score tracking the performance of content moderation across the platforms, according to the documents and the ex-employee.
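The documents do not spell out how that score is computed. A minimal sketch of how a per-language policy-precision metric could work – all names, fields and the sampling design here are hypothetical illustrations, not Meta’s actual system – might look like this:

```python
from dataclasses import dataclass

@dataclass
class ModerationDecision:
    content_id: str
    language: str          # e.g. "en", "es", "ar", "th" – languages onboarded to review
    frontline_action: str  # what the frontline moderator or automated system did
    expert_action: str     # the "golden" decision from the quality reviewer

def policy_precision(decisions: list[ModerationDecision]) -> dict[str, float]:
    """Share of frontline decisions that match expert review, per language."""
    totals: dict[str, int] = {}
    matches: dict[str, int] = {}
    for d in decisions:
        totals[d.language] = totals.get(d.language, 0) + 1
        if d.frontline_action == d.expert_action:
            matches[d.language] = matches.get(d.language, 0) + 1
    return {lang: matches.get(lang, 0) / totals[lang] for lang in totals}

sample = [
    ModerationDecision("p1", "ar", "remove", "remove"),
    ModerationDecision("p2", "ar", "remove", "allow"),  # an over-enforcement error
    ModerationDecision("p3", "en", "allow", "allow"),
]
print(policy_precision(sample))  # {'ar': 0.5, 'en': 1.0}
```

On a design like this, a language that is never sampled into the review pipeline simply produces no score at all – consistent with the former employee’s account of Hebrew being reviewed only on an ad hoc basis.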

The review system is in place for languages such as English, Spanish, Arabic and Thai. But for a portion of Hebrew-language content decisions, such scoring was declared “unfeasible” due to “absence of translation”, the documents show. The former employee said this was due to a lack of human reviewers with Hebrew language expertise.

Meta says that it has “multiple systems in place” to measure enforcement accuracy for Hebrew-language content moderation, including evaluation by Hebrew-speaking reviewers and auditors.

However, the documents show there is no “policy precision” measure for Hebrew-language enforcement, and the former employee said that, because Hebrew is not onboarded to the system, this type of enforcement review in the Hebrew market is done on an “ad hoc” basis, unlike in the Arabic market.

That discrepancy means that the company reviews content in the official language of Israel less systematically than that of Palestine, the former employee said. The difference implies “a bias on how they are enforcing content”, potentially leading to over-enforcement of Arabic-language content, they added.

Hebrew, spoken by approximately 10 million people, makes up a much smaller fraction of posts on Meta’s social networks than Arabic, spoken by about 400 million people. Critics say the ongoing war demands more attention to Hebrew-language content, and Meta has faced questions about its moderation of posts in Hebrew before. A 2022 independent analysis commissioned by the tech giant concluded that its moderation system had penalized Arabic speakers more often than Hebrew speakers amid the heightened tensions of the 2021 conflict between Israel and Palestine – even when accounting for the disparity in the number of speakers. Meta’s systems automatically flagged Arabic-language content at a “significantly higher” rate than Hebrew-language content, according to the analysis. The company’s policies “may have resulted in unintentional bias via greater over-enforcement of Arabic content compared to Hebrew content”, the report reads.

The disparity was partially attributed to the fact that Meta, at the time, had put in place an Arabic “hostile speech classifier”, allowing the automatic detection of violating content – like hate speech and calls for violence – but had not done the same for Hebrew-language content. Such classifiers allow offending language to be removed automatically. The lack of a classifier in Hebrew meant content in Arabic was likely to be algorithmically removed more frequently, the analysis said. In response to the report, Meta launched a Hebrew machine-learning classifier that detects “hostile speech”. A more recent study from Human Rights Watch, published in December 2023, alleged “systemic censorship of Palestine content on Instagram and Facebook”.
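Neither the report nor the documents describe the classifier’s internals. As a rough illustration of the basic pattern – a per-language text classifier whose score gates automatic removal – here is a toy sketch using scikit-learn, with invented training data and an invented threshold:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for labelled moderation data (1 = hostile, 0 = benign).
texts = [
    "we should hurt them", "destroy that group",
    "lovely weather today", "great match last night",
]
labels = [1, 1, 0, 0]

hostile_speech_classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
hostile_speech_classifier.fit(texts, labels)

REMOVE_THRESHOLD = 0.8  # invented value; real thresholds are tuned per language

def auto_action(post: str) -> str:
    """Automatically remove a post when its hostile-speech score is high."""
    score = hostile_speech_classifier.predict_proba([post])[0][1]
    return "remove" if score >= REMOVE_THRESHOLD else "keep"
```

The asymmetry the 2022 analysis described falls out of this structure: where no classifier exists for a language, posts in that language never receive an automated score, so automated removals concentrate in the languages that do have one.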

Meta watchdogs say the new documents reviewed by the Guardian show that, even with the new Hebrew-language classifier, there is less effort to make sure those more recent measures are effective, allowing such disparities in enforcement to persist.

“This reporting shows that Meta is not taking its content moderation responsibilities seriously,” said Nashif of 7amleh.

Classifying hate speech algorithmically

In addition to hostile speech classifiers and other indicators, Meta uses collections of images, phrases and videos that allow its machine learning tools to automatically flag and remove posts that violate its policies. The algorithmic moderators match newly posted content against these banks of material, which has previously been judged to break Meta’s rules.
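The documents describe these banks only at a high level. A minimal sketch of the general pattern, assuming a fingerprint-matching design and entirely hypothetical function names, might be:

```python
import hashlib

# Hypothetical bank: fingerprints of media previously judged to violate policy.
bank: set[str] = set()

def fingerprint(media: bytes) -> str:
    """Stand-in for a perceptual hash; production systems use similarity
    hashing so that near-duplicates of banked media also match."""
    return hashlib.sha256(media).hexdigest()

def add_to_bank(media: bytes) -> None:
    bank.add(fingerprint(media))

def should_auto_remove(media: bytes) -> bool:
    """New uploads are matched against the bank and removed on a hit."""
    return fingerprint(media) in bank

def remove_from_bank(media: bytes) -> None:
    """The deletion path that, per the account below, was missing: without it,
    erroneously banked media keeps triggering automatic removals."""
    bank.discard(fingerprint(media))
```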

The former employee said that their colleagues had noted that a number of images in one bank, created after the 7 October attack, had been added erroneously, but that there was “no process” to remove images from that bank, again leading to potential over-enforcement of content related to both sides of the Israel-Palestine conflict.

Meta disputed this characterization, stating that it was “relatively easy to remove an item from a bank if added in error”. The documents, however, state that at the time there was “no current process to remove non-violating clusters after policy calls are made that render previously-banked content benign”, leading to “over-enforcement” of content.

The former employee said that Meta has been responsive to complaints from content moderation workers in the past. Despite that history, they said, many employees have expressed “a fear of retaliation”, or of being seen as antisemitic, if they were to make a complaint about the over-enforcement of pro-Palestine content in Arabic.

“If I raised this directly, I feel my job would be on the line – it is very obvious where the company stands on this issue,” the worker said.

In an open letter, more than 200 Meta employees raised similar concerns, writing that Meta “censored, rebuffed and/or penalized” workers who spoke out against the company’s Palestine-related policies on internal forums. The workers said “any mention of Palestine is taken down” from these forums.

Clayton, the Meta spokesperson, said the company offered “many established channels” for employees to raise concerns, and stated that Meta had worked to ensure content on its platforms is safe for users – including investing $20bn in “safety and security” since 2016 and employing 40,000 people in that area. Meta’s posted full-year revenue for 2023 was $134.9bn. In its most recent earnings report, it posted total revenue of $39.07bn for the second quarter of 2024.

In addition to the recent letter delivered by Meta employees, the company received a separate petition in June from the US Campaign for Palestinian Rights, a pro-Palestine advocacy organization, and 49 other civil society organizations. The groups called for an end to what they allege is censorship of pro-Palestinian content, lack of transparency around moderation policies and a permissiveness toward anti-Palestine hate speech on Meta’s social networks.

In late 2023, the Massachusetts senator Elizabeth Warren sent a letter to Meta requesting more information on the company’s policies around the conflict and how they are enforced – specifically asking about whether and when the company has automatically flagged content related to certain topics over time. The letter sent in June to Meta from nearly 50 civil society organizations takes aim at the social media company over similar problems.

In response to the recent employee letter, Clayton said Meta “take[s] allegations of having content enforcement policies that disproportionately impact pro-Palestinian voices very seriously”.

“Our goal has been – and remains – to give everyone a voice, and to help ensure that our platforms remain safe spaces for people who use them,” he said.