Why Automating Content Moderation is Not Enough

As social media platforms, forums, and other digital communities have become modern-day public squares, online content moderation has emerged as the unsung custodian of digital integrity.

Tasked with sifting through an ever-expanding universe of user-generated content, the role of trust and safety teams has grown in both volume and complexity. Balancing growth goals and free speech with the need to curtail scams, abuse, and other forms of online toxicity is an ever-evolving challenge that often feels like a game of digital whack-a-mole. As quickly as one issue is addressed, another pops up, each one accompanied by its own set of moral, legal, and social dilemmas that make the task increasingly daunting.

Manual content moderation has proven inadequate for stemming the overwhelming tide of harmful user-generated content (UGC) online. While increased automation is clearly needed, a myopic focus on detecting individual instances of malicious content often misses the forest for the trees.

The uncomfortable truth is that over 90% of online scams, harassment, and other policy-violating behaviors come from inauthentic user accounts. As long as the ecosystem of fake identities persists, bad actors will continue devising new ways to exploit online platforms. 

To truly scale safety efforts and future-proof against emerging threats, platform providers must make identifying inauthentic accounts a top priority. By targeting the root cause rather than continuously playing whack-a-mole with each new incarnation of bad content, we can eliminate the bulk of harm with a scalable and affordable solution. The time has come to address the inauthentic identity epidemic undermining trust and safety.

Charting the Evolutionary Path of Content Moderation

Our early internet interactions were quite different from what we experience today. Platforms like early chat rooms and forums had a quaint, neighborhood feel. It was like a community garden, where everyone took turns tending and watching over it. But as we journeyed deeper into the digital age, these platforms mushroomed into sprawling digital cities, and monitoring user-generated content became comparable to managing the hustle and bustle of New York City during rush hour.

The Reign of Manual Moderation

In the early days of Facebook, the platform relied largely on its users to flag inappropriate content for human moderators to review. When the user base was small, this approach leveraged the nuanced judgment of people to identify policy violations. As the network ballooned into a massive global platform, this manual approach quickly hit its limits. 

The volume of user-generated content soon far exceeded what a team of humans could properly oversee. The solution was obvious: automation and AI would need to shoulder the bulk of moderation for it to scale. The challenge lay in determining what exactly to automate amidst an evolving ecosystem of harmful behaviors.

Automation's Allure: Speed Without Traction

As digital platforms like YouTube exploded with billions of hours of user uploads, companies realized they needed algorithms to automatically filter and categorize content. While these algorithms excelled at identifying explicit content based on predetermined parameters, they stumbled when confronted with the sophistication of rapidly adapting malicious actors. Occasionally they would flag benign content as harmful. It wasn’t (and still isn’t) a perfect science.

However, this approach, while accelerating review, focused primarily on identifying harmful actions rather than the actors behind them, leaving platforms in a perpetual game of catch-up. Each new wave of malicious strategies demands a corresponding update in detection algorithms, a reactive cycle that struggles to get ahead of those bent on sowing discord or spreading harm.

Targeting the Actor, Not Just the Actions

A deeper analysis of moderated content revealed a recurring theme: the use of inauthentic or fake identities to commit crimes, spread misinformation, and instigate other harms. Imagine a play where a single actor plays multiple roles, using various disguises. If you focus only on each individual role (or action), you're constantly on the defensive. But by identifying and addressing the actor behind these roles, we can stop a multitude of deceptive acts.

Trusted identities, in contrast, rarely engage in malicious behavior. They are built over time and made up of multiple traditional identifiers sourced from around the world. They also demonstrate connections with other people, organizations, and places. And they can't be faked by scammers and bad actors.
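To make that idea concrete, here is a minimal sketch of how an identity trust score might be composed from several of those signals. The signal names, weights, and threshold are assumptions made purely for illustration, not a description of any particular vendor's model.

```python
from dataclasses import dataclass

# Illustrative only: the signals, weights, and threshold below are
# assumptions for this sketch, not any vendor's actual scoring model.

@dataclass
class IdentitySignals:
    verified_identifiers: int = 0   # e.g. confirmed email, phone, payment method
    account_age_days: int = 0       # history built over time
    distinct_connections: int = 0   # links to other people, organizations, places
    prior_violations: int = 0       # confirmed policy violations on record

def trust_score(signals: IdentitySignals) -> float:
    """Combine signals into a 0..1 trust score (hypothetical weighting)."""
    score = 0.0
    score += min(signals.verified_identifiers, 3) * 0.15       # caps at 0.45
    score += min(signals.account_age_days / 365, 1.0) * 0.25   # caps at 0.25
    score += min(signals.distinct_connections / 50, 1.0) * 0.30
    score -= signals.prior_violations * 0.20                   # penalize bad history
    return max(0.0, min(1.0, score))

TRUSTED_THRESHOLD = 0.7  # assumed cutoff for treating an identity as trusted

if __name__ == "__main__":
    newcomer = IdentitySignals(verified_identifiers=1, account_age_days=2)
    veteran = IdentitySignals(verified_identifiers=3, account_age_days=900,
                              distinct_connections=120)
    print(trust_score(newcomer), trust_score(veteran))
```

The point of the sketch is simply that trust accrues from many independent signals over time, which is exactly what makes it hard for a throwaway fake account to counterfeit.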

Refocusing Automation Using the Power of Identities

Targeting the root cause rather than the symptom offers a more sustainable and scalable solution. By channeling automation toward separating trusted from counterfeit identities, we can preemptively thwart a vast share of harmful content. This not only streamlines content moderation but also makes it more resilient against future threats.

With an identity trust solution, organizations can verify the identity behind user-generated content and enhance their moderation efforts, because content from a trusted identity is far less likely to violate policy and can be reviewed with a lighter touch.

On the other side of the user spectrum, a proven trust solution can help moderators quickly, even automatically, assess when an individual poses a risk to the platform, from the moment they sign up or at any point along their journey.
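As a rough illustration of what that could look like in practice, the sketch below gates decisions on an assumed 0-to-1 trust score like the one sketched earlier: low-trust signups are blocked or challenged, and posts from untrusted accounts are held for full review while trusted accounts get a lighter touch. The thresholds and actions are hypothetical, not a prescribed policy.

```python
# Hypothetical routing logic: thresholds and actions are illustrative
# assumptions, not a recommended or real-world policy.

def decide_at_signup(score: float) -> str:
    """Gate new accounts by identity trust before they can post."""
    if score < 0.3:
        return "block"        # likely inauthentic; reject or route to manual review
    if score < 0.7:
        return "challenge"    # ask for additional verification before granting access
    return "allow"

def route_content(score: float, flagged_by_classifier: bool) -> str:
    """Decide how much scrutiny a post needs, given the author's trust score."""
    if score >= 0.7:
        # Trusted identities rarely offend; escalate only if the content classifier flags it.
        return "human_review" if flagged_by_classifier else "publish"
    # Untrusted identities get full scrutiny regardless of the classifier result.
    return "hold_for_review"

if __name__ == "__main__":
    print(decide_at_signup(0.2))       # block
    print(route_content(0.9, False))   # publish
    print(route_content(0.4, False))   # hold_for_review
```

The design choice worth noting is that the identity check runs first, so the expensive content-level analysis and human review are concentrated on the small slice of accounts that actually warrant them.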

This approach is forward-thinking, as it narrows down the battlefield. By pinning down and acting against these fake profiles, platforms can effectively eliminate the primary culprits behind malicious content dissemination. It's an approach that is both proactive and future-proof, ensuring that as new malicious tactics emerge, they're met with a robust defense that stops them at their origin.

Bolster the Process with an Identity Trust Solution

Integrating an identity trust solution into the content moderation process empowers trust and safety teams to be even better at what they do best. The speed and accuracy of moderation efforts improve, and non-trusted identities get the scrutiny they deserve.

The ever-evolving landscape of content moderation emphasizes the necessity of proactive and preventative approaches. By shifting our focus from merely responding to individual actions to targeting the actors orchestrating them, we not only address today's challenges but also fortify our defenses against the threats of tomorrow. 

The end result is an improved experience for good users, a safer environment for online interaction, and a trust and safety team that can feel better and perform better.