The Ethics of Content Moderation: Who Protects the Protectors?

Many professions are known to cause psychological trauma. One can imagine the horrors faced by soldiers, disaster relief personnel, ER staff, law enforcement officers, and war journalists (among others). However, the rise of digital lifestyles has created a relatively new profession that can cause long-term psychological damage: content moderation. Content moderation is the process of assessing user-generated web content for suitability and deciding whether to allow it on a platform. Due to the vast amount of content continuously posted on websites and social media platforms, content moderation has become a critical function of web engagement. Content moderators are the individual workers who review questionable text, images, and videos in order to remove content that is inappropriate for general consumption or violates the platforms’ policies. Although recent investigations and inquiries (notably into Facebook and Google) have brought to light some of the psychological harms that content moderators experience, there is no clear path to making their work less traumatic. This has become not only an ethical issue but also a financial one. Facebook recently paid a $52 million settlement to its vendor-based content moderators in a class-action lawsuit over psychological damage. TikTok is facing similar suits. What follows is a discussion of some of the ethical issues in this field and possible ways to address them.

Types of Unwanted Content

Content moderators’ work varies based on the type of platform and unwanted content reviewed. Moderating content on an online marketplace or consumer goods site, for example, would look very different from moderating content on a social platform. In addition, there are many levels of “unsuitable” content. This includes spam, copyright infringement, false accounts, and repetitive posts, which hamper a site’s ability to conduct business or detract from the user experience. It also includes toxic content that could cause harm to users, which could be anything from misinformation, privacy violations, bullying, hate speech, and nudity to graphic photos and videos of homicide, suicide, terrorist acts, rape, and torture. In many cases, the victims are young children and animals. Some content moderators must view many hundreds of such posts per day, week after week, causing psychological damage that can persist well after they discontinue this work.

Effects of Toxic Content on Mental Health

The effects of toxicity on moderators’ mental health are directly related to their degree of exposure to toxic content. This is true regardless of whether content moderators work for the platforms themselves or for outsourcing companies. In-house moderators typically enjoy larger compensation packages, a more pleasant work environment, a more flexible schedule, and comprehensive mental health benefits like private psychotherapy and psychiatric care. In contrast, according to recent investigations, content moderators working for contractors may face unpleasant working conditions, demanding volume and accuracy targets, restrictive schedules, rigid rules, and unflagging pressure to perform (or be fired). However, given enough exposure to toxic content, in-house and outsourced content moderators experience similar mental health consequences. These include:

Post-Traumatic Stress Disorder (PTSD) – symptoms include mood disturbances, reduced productivity, nightmares/flashbacks, sleeplessness, fatigue, avoidance of certain situations, anger, fear/paranoia, and sadness.
Panic attacks – for example, some content moderators report panic attacks in the presence of children and animals because they fear that serious harm will come to those children or animals.
Anxiety – this can be severe enough to disrupt daily life, as fears and sensitivities can cause normal activities and relationships to become untenable.
Depression – prolonged exposure to disturbing content can lead moderators to withdraw from loved ones and feel overwhelmed by sadness, apathy, and suicidal thoughts.
Self-destructive habits – these include abusing alcohol and drugs and engaging in indiscriminate sexual contact. Such behaviors have been reported in the workplace, presumably as an emotional escape from toxic content.
Inappropriate (dark and disturbing) humor and language – for example, jokes about cruelty, graphic violence, or sexual assault.
Adoption of fringe views – these may include conspiracy theories and beliefs such as the flat-Earth theory. Repeated exposure to such material without alternate viewpoints can make it persuasive.

Role of AI in Content Moderation

Artificial intelligence (AI) is seen as a promising avenue for reducing the emotional load on human content moderators. On large platforms like Facebook, AI systems already detect over 90% of toxic content. According to Facebook CEO Mark Zuckerberg, AI removed more than 95% of hate speech and 98-99% of terrorist content last year. But the amount of harmful content that requires human moderation is still overwhelming. And while automated content moderation is consistent, scalable, and multilingual, machine learning systems lack the ability to understand context or detect nuance. This is especially problematic in memes, where the text, images, and context must be analyzed as a whole to extract meaning and intent.

Here are some areas in which AI can help with harmful content:

Automated moderation – any disturbing content that AI systems can detect and remove prior to human moderation reduces the emotional load on moderators.
Modifying the content – AI-powered, moderator-controlled blurring, grayscale, and muting of audio have been found to significantly reduce emotional response and psychological impact on moderators reviewing disturbing images and videos.
Providing key information before viewing – in some cases, moderators could classify content based solely on associated text produced by AI, eliminating the need to view the content directly. In addition, AI-based Visual Question Answering (VQA) systems can answer questions about content, which may allow a moderator to make a decision without consuming the content itself.
Prioritizing/Triage – AI can provide urgency and toxicity ratings for each piece of content based on previous data and strategically mix the types of content presented in moderators’ queues. Moderators can also manually sequence or group the content to give themselves mental breaks.
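
The prioritizing/triage idea above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any platform's actual system. The `QueueItem` fields, the `build_queue` function, the 0.7 toxicity threshold, and the cap of two consecutive severe items are all assumptions made for the example, and the scores would come from whatever classifier a platform already uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueItem:
    item_id: str
    urgency: float   # hypothetical model score in [0, 1]; higher = review sooner
    toxicity: float  # hypothetical model score in [0, 1]; higher = more disturbing


def build_queue(items, toxic_threshold=0.7, max_run=2):
    """Order items by urgency, but avoid showing more than `max_run`
    high-toxicity items in a row while a milder item is still available."""
    by_urgency = sorted(items, key=lambda it: it.urgency, reverse=True)
    severe = [it for it in by_urgency if it.toxicity >= toxic_threshold]
    mild = [it for it in by_urgency if it.toxicity < toxic_threshold]

    queue, run = [], 0
    while severe or mild:
        # Draw a milder item when none of the severe ones remain, when the
        # current run of severe items hit the cap, or when it is more urgent.
        take_mild = bool(mild) and (
            not severe
            or run >= max_run
            or mild[0].urgency > severe[0].urgency
        )
        if take_mild:
            queue.append(mild.pop(0))
            run = 0
        else:
            queue.append(severe.pop(0))
            run += 1
    return queue


items = [
    QueueItem("a", urgency=0.9, toxicity=0.95),
    QueueItem("b", urgency=0.8, toxicity=0.90),
    QueueItem("c", urgency=0.7, toxicity=0.85),
    QueueItem("d", urgency=0.2, toxicity=0.10),
    QueueItem("e", urgency=0.1, toxicity=0.20),
]
print([it.item_id for it in build_queue(items)])  # ['a', 'b', 'd', 'c', 'e']
```

In this sketch, the milder item "d" is pulled forward to break up three severe items in a row. A real queue would also have to decide how much delay is acceptable for urgent content, which the example does not model.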

Best Practices to Protect Moderators’ Mental Health

In addition to AI solutions, employers can protect content moderators’ mental health through a variety of practices and services starting from the recruiting stage and continuing into the post-employment period:

Honest and accurate job descriptions – many moderators report being given deliberately vague job descriptions during the recruitment phase, in which the amount and types of disturbing content were either grossly underrepresented or not mentioned at all.
Mental health pre-screening – a simple pre-screening can rule out candidates with mental health histories or tendencies that would make them more vulnerable to psychological harm.
Resiliency training – moderators can be trained in psychological strategies to improve their resiliency and deal with disturbing content in healthy ways.
Exposure limits – exposure to disturbing content should be limited in quantity (for example, 1-2 hours per week) and overall duration (for example, six months). These numbers require additional research but will need to be far lower than current exposure rates.
Scheduled wellness time – allowing moderators sufficient time and space to decompress after stressful moderation experiences can help mitigate the negative effects of toxicity.
Access to comprehensive mental health services at work – many companies offer these in some form, but they are largely inadequate and ineffective.
Post-employment mental health services – former moderators should be able to continue their treatment post-employment, if required.
Competitive pay and a pleasant work environment – pleasant working conditions, mutual respect and support from employers, and fair compensation can alleviate some of the stress that content moderators face on a daily basis.
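
As a rough illustration of how the exposure limits described in the list above might be enforced in a moderation tool, the following Python sketch tracks each moderator's minutes of exposure per calendar week. The `ExposureTracker` class, its method names, and the 120-minute default (taken from the "1-2 hours per week" example, which the article itself notes requires further research) are hypothetical and not drawn from any real system.

```python
from collections import defaultdict
from datetime import date


class ExposureTracker:
    """Tracks minutes of disturbing-content review per moderator per ISO week."""

    def __init__(self, weekly_limit_minutes=120):
        # Assumed cap, based on the article's illustrative 1-2 hours per week.
        self.weekly_limit_minutes = weekly_limit_minutes
        self._minutes = defaultdict(int)  # (moderator, ISO year, ISO week) -> minutes

    @staticmethod
    def _week_key(moderator_id, day):
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        return (moderator_id, iso[0], iso[1])

    def remaining(self, moderator_id, day):
        """Minutes of exposure still allowed in the week containing `day`."""
        used = self._minutes[self._week_key(moderator_id, day)]
        return max(0, self.weekly_limit_minutes - used)

    def log_session(self, moderator_id, day, minutes):
        """Record a session if it fits within the weekly cap.

        Returns True if the session was recorded, False if it would exceed
        the cap (in which case nothing is recorded)."""
        if minutes > self.remaining(moderator_id, day):
            return False
        self._minutes[self._week_key(moderator_id, day)] += minutes
        return True


tracker = ExposureTracker(weekly_limit_minutes=120)
tracker.log_session("moderator-1", date(2024, 1, 1), 90)   # accepted
tracker.log_session("moderator-1", date(2024, 1, 3), 45)   # rejected: over the cap
print(tracker.remaining("moderator-1", date(2024, 1, 3)))  # 30
```

A production version would also need to cover the overall-duration limit (for example, six months in the role), which this sketch leaves out.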

Conclusion

Web platforms are ostensibly a place where users can share their knowledge and views in an open, safe, low-stress environment. In this largely anonymous setting, however, honest opinions and comments quickly give way to offensive content. Given that intense posts generate higher user engagement than more measured posts, content moderation policies must weigh potential business opportunities against potential harm. Further, since the web provides an easy medium to proliferate extreme content and find like-minded viewers, such users are emboldened and motivated to share evidence of harmful acts. The burden of consuming, reviewing, and assessing this ever-increasing barrage of toxic content falls to content moderators. This causes untold damage to moderators’ mental health. For content moderation to be sustainable in the face of intensifying demand, web platforms must find safe ways for moderators to perform this essential work. Sustainable, scalable content moderation entails strict limits on exposure, judicious use of automation, sufficient mental health support, and psychological screenings and tools to prevent harm and build resilience.

Innodata, a large-scale content moderation provider, has developed a comprehensive program of training, mentorship, and support for its content moderators, enabling them to do their work accurately and safely. Companies and platforms must not force content moderators to sacrifice their mental health to protect ours.
