The nonprofit artificial intelligence research outfit OpenAI Inc. today announced a major update to its free content moderation tool, which is available now to all developers.
The updated Moderation endpoint gives developers access, through an application programming interface, to OpenAI’s Generative Pre-trained Transformer-based classifiers, AI models that have been trained to detect undesirable content in applications, the company said in a blog post.
When it receives a text input, the Moderation endpoint analyzes it for anything that ought to be filtered out, such as sexual content, hateful or violent speech, or messages that promote self-harm. It flags content prohibited by OpenAI’s content policy so that developers can block it.
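As a rough illustration of that request-and-response flow, the following Python sketch builds a call to the endpoint and reads back the categories it flags. It is not OpenAI’s own client code. The URL and the field names (`input`, `results`, `flagged`, `categories`) follow OpenAI’s public API reference as understood here and should be checked against the current documentation; the helper names `build_request`, `flagged_categories` and `moderate` are hypothetical.

```python
import json
import urllib.request

# Endpoint URL as listed in OpenAI's API reference (assumption: unchanged).
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_request(text, api_key):
    """Build (but do not send) a POST request carrying the text to classify."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def flagged_categories(response):
    """Return the sorted names of categories marked true in the first result."""
    result = response["results"][0]
    return sorted(name for name, hit in result["categories"].items() if hit)


def moderate(text, api_key):
    """Send the text to the endpoint and return (flagged, category names)."""
    with urllib.request.urlopen(build_request(text, api_key)) as resp:
        payload = json.load(resp)
    return payload["results"][0]["flagged"], flagged_categories(payload)
```

An application would typically call `moderate()` on each message and decide for itself what to do with flagged text, for example by hiding it or routing it to human review.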
The improved version of the Moderation endpoint has been designed to be quick and accurate, and to perform robustly across multiple types of applications, including AI-powered chatbots, messaging systems and social media sites. Importantly, OpenAI said it significantly reduces the chances of an AI model “saying” the wrong thing. That means AI can be used in more sensitive settings, such as educational applications, where people may have previously had reservations about deploying the technology.
The Moderation endpoint is free when used with content generated by the OpenAI API. For instance, Theai Inc., the company behind Inworld AI, uses OpenAI’s tools to enable developers to create AI-powered virtual characters for the metaverse, virtual worlds and virtual reality games. Inworld relies on the Moderation endpoint to ensure those characters stay “on-script” and don’t start talking about anything untoward. That allows it to focus more on creating memorable characters, rather than worrying about what those characters are saying.
As well as moderating bots, the Moderation endpoint can block harmful content that is generated not by OpenAI’s APIs but by humans. The anonymous messaging platform NGL, which gives young people a place to share their feelings and opinions, uses OpenAI’s tool to detect hateful language and bullying. NGL said the Moderation endpoint is uniquely able to generalize around the latest slang, allowing it to keep up with the evolving language used by teenagers, for example.
Use of the Moderation endpoint on non-API traffic is subject to a fee.
OpenAI said developers can get started with the Moderation endpoint by checking out its documentation. It has also published a paper detailing the tool’s training process and performance, together with an evaluation dataset that it hopes will inspire further research in the area of AI-powered moderation.