Moonbounce Secures $12 Million to Revolutionize AI-Era Content Moderation
In 2019, Brett Levenson transitioned from Apple to Facebook, aiming to enhance the platform’s business integrity amidst the Cambridge Analytica controversy. He initially believed that technological advancements could resolve Facebook’s content moderation challenges. However, he soon discovered that the issues were more profound than mere technological shortcomings.
Human moderators were tasked with memorizing extensive 40-page policy documents, often poorly translated into various languages. These moderators had approximately 30 seconds to assess flagged content, determine its compliance with policies, and decide on appropriate actions such as blocking content, banning users, or limiting content distribution. This rapid decision-making process resulted in accuracy rates barely exceeding 50%, akin to a coin toss.
Levenson remarked, "It was kind of like flipping a coin, whether the human reviewers could actually address policies correctly, and this was many days after the harm had already occurred anyway."
The traditional, reactive approach to content moderation proved unsustainable, especially with the emergence of sophisticated adversarial actors. The proliferation of AI chatbots further exacerbated the problem, leading to incidents where chatbots provided harmful advice to teenagers or AI-generated images bypassed safety filters.
Motivated by these challenges, Levenson conceptualized "policy as code," transforming static policy documents into dynamic, executable logic closely integrated with enforcement mechanisms. This vision materialized into the founding of Moonbounce, which recently announced a successful $12 million funding round co-led by Amplify Partners and StepStone Group.
Moonbounce collaborates with companies to implement an additional safety layer wherever content is generated, whether by users or AI systems. The company has developed a proprietary large language model capable of analyzing a client’s policy documents, evaluating content in real-time, and delivering responses within 300 milliseconds. Based on client preferences, the system can either delay content distribution pending human review or immediately block high-risk content.
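The "policy as code" idea described above can be sketched in miniature: policy rules become executable checks applied to content at runtime, each yielding one of the enforcement actions the article mentions (allow, hold for human review, or block). All names below are illustrative assumptions; Moonbounce's actual API is proprietary and not public, and the keyword heuristic stands in for its real language model.

```python
# Hypothetical sketch of policy-as-code content evaluation.
# Rule names, thresholds, and the classify function are all invented
# for illustration; the real system uses a proprietary LLM.
from dataclasses import dataclass
from enum import Enum

class Action(Enum):
    ALLOW = "allow"
    HOLD_FOR_REVIEW = "hold_for_review"  # delay distribution pending human review
    BLOCK = "block"                      # immediately block high-risk content

@dataclass
class Verdict:
    action: Action
    rule_id: str
    score: float  # estimated risk, 0.0-1.0

def evaluate(content: str, rules: list) -> Verdict:
    """Apply executable policy rules in order; first rule that trips wins."""
    for rule in rules:
        score = rule["classify"](content)  # stand-in for the moderation model call
        if score >= rule["block_at"]:
            return Verdict(Action.BLOCK, rule["id"], score)
        if score >= rule["review_at"]:
            return Verdict(Action.HOLD_FOR_REVIEW, rule["id"], score)
    return Verdict(Action.ALLOW, "default", 0.0)

# Toy rule: a keyword check standing in for a real classifier.
rules = [{
    "id": "self-harm-v1",
    "classify": lambda text: 0.9 if "hurt myself" in text.lower() else 0.0,
    "block_at": 0.8,
    "review_at": 0.5,
}]

print(evaluate("how do I hurt myself", rules).action)  # Action.BLOCK
print(evaluate("hello there", rules).action)           # Action.ALLOW
```

The first-match-wins loop mirrors the described behavior of either blocking high-risk content outright or routing borderline content to human review, with everything else passing through.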
Currently, Moonbounce focuses on three primary sectors:
1. User-Generated Content Platforms: Including dating applications.
2. AI Character and Companion Developers: Companies creating AI-driven interactive entities.
3. AI Image Generators: Platforms producing AI-generated visual content.
Levenson emphasized, "Safety can actually be a product benefit. It just never has been because it's always a thing that happens later, not a thing you can actually build into your product. And we see our customers are finding really interesting and innovative ways to use our technology to make safety a differentiator, and part of their product story."
The importance of integrating safety measures into products is underscored by recent developments in the industry. For instance, Tinder’s head of trust and safety highlighted how the dating platform utilized large language model-powered services to achieve a tenfold improvement in detection accuracy.
Lenny Pruss, general partner at Amplify Partners, stated, "Content moderation has always been a problem that plagued large online platforms, but now with LLMs at the heart of every application, this challenge is even more daunting. We invested in Moonbounce because we envision a world where objective, real-time guardrails become the enabling backbone of every AI-mediated application."
AI companies are increasingly facing legal and reputational challenges due to content moderation failures. Incidents involving chatbots providing harmful advice to vulnerable users and AI-generated images being misused have highlighted the inadequacies of existing safety measures. As a result, AI companies are seeking external expertise to bolster their safety infrastructures.
Levenson explained, "We're a third party sitting between the user and the chatbot, so our system isn't inundated with context the way the chat itself is. The chatbot itself has to remember, potentially, tens of thousands of tokens that have come before…We're solely worried about enforcing rules at runtime."
Alongside Levenson, Moonbounce is co-led by Ash Bhardwaj, a former Apple colleague who previously developed large-scale cloud and AI infrastructure for Apple's core services. The duo is now focusing on a feature called "iterative steering," designed to address situations like the 2024 tragedy involving a 14-year-old Florida boy who became fixated on a Character.AI chatbot. Instead of outright rejecting harmful topics, the system aims to intercept and redirect conversations, modifying prompts in real time to guide chatbots toward more supportive responses.
Levenson elaborated, "We hope to be able to add to our actions toolkit the ability to steer the chatbot in a better direction to, essentially, take the user's prompt and modify it to force the chatbot to be not just an empathetic listener, but a helpful listener in those situations."
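The steering mechanism Levenson describes can be sketched as a simple interception layer: if a user's message trips a risk check, the prompt is rewritten before it reaches the chatbot so the model is nudged toward a supportive response rather than refusing outright. Everything here is an assumption for illustration; the function name, the keyword heuristic, and the wrapper text are invented, not Moonbounce's actual implementation.

```python
# Illustrative sketch of "iterative steering": intercept a prompt and,
# if it appears risky, rewrap it with instructions that steer the
# downstream chatbot toward a helpful, supportive reply.
# The risk check and wrapper text are hypothetical stand-ins.

RISK_TERMS = ("end it all", "no reason to live")  # toy risk heuristic

def steer_prompt(user_prompt: str) -> str:
    """Return the prompt the chatbot should actually receive."""
    if any(term in user_prompt.lower() for term in RISK_TERMS):
        # Wrap the original message so the model responds as a
        # helpful listener, not merely an empathetic one.
        return (
            "The user may be in distress. Respond with empathy, "
            "encourage them to seek support, and avoid harmful detail.\n"
            f"User message: {user_prompt}"
        )
    return user_prompt  # low risk: pass through unchanged

print(steer_prompt("I feel like there's no reason to live"))
```

In a production system the keyword check would be the runtime moderation model, and the rewrite would be applied on every turn of the conversation, which is what makes the steering "iterative."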
When questioned about potential acquisition plans, particularly by companies like Meta, Levenson acknowledged the strategic fit but expressed concerns about restricting the technology's accessibility. He stated, "My investors would kill me for saying this, but I would hate to see someone buy us and then restrict the technology. Like, 'Okay, this is ours now, and nobody else can benefit from it.'"