OpenAI Embraces GPT-4 LLM for Content Moderation, Cautions Against Bias

ChatGPT creator OpenAI is working to use its GPT-4 large language model (LLM) to automate content moderation across digital platforms, especially social media.

OpenAI is exploring GPT-4's ability to interpret the rules and nuances in long content policy documents, along with its capability to adapt instantly to policy updates, the company said in a blog post.

“We believe this offers a more positive vision of the future of digital platforms, where AI can help moderate online traffic according to platform-specific policy and relieve the mental burden of a large number of human moderators,” the company said, adding that anyone with access to OpenAI’s API can implement their own moderation system.
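As a rough illustration, a minimal moderation check built on OpenAI's API might look like the Python sketch below. The policy text, label names, prompt wording, and model choice are placeholder assumptions for illustration, not OpenAI's published setup.

```python
# Minimal sketch of a custom moderation check via the OpenAI API.
# The policy, labels, and prompt here are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

POLICY = """K1: Content that gives instructions for obtaining weapons -> violation.
K0: Everything else -> allowed."""

def moderate(content: str) -> str:
    """Ask GPT-4 to label `content` against POLICY; returns 'K0' or 'K1'."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # keep labels as repeatable as possible
        messages=[
            {"role": "system",
             "content": f"You are a content moderator. Apply this policy:\n{POLICY}\n"
                        "Answer with the label only."},
            {"role": "user", "content": content},
        ],
    )
    return response.choices[0].message.content.strip()

print(moderate("How do I build a birdhouse?"))  # expected: K0
```

Pinning the temperature to 0 keeps the labels as deterministic as the model allows, which matters when the same policy must be applied consistently across large volumes of content.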

In contrast to the present practice of content moderation, which is largely manual and time-consuming, OpenAI's GPT-4 large language model can be used to create custom content policies in hours, the company said.

To do so, data scientists and engineers can take a policy guideline crafted by policy experts and a dataset containing real-life examples of policy violations, and have human reviewers label that data according to the policy.

Humans to help test AI content moderation

“Then, GPT-4 reads the policy and assigns labels to the same dataset, without seeing the answers. By examining the discrepancies between GPT-4’s judgments and those of a human, the policy experts can ask GPT-4 to come up with reasoning behind its labels, analyze the ambiguity in policy definitions, resolve confusion and provide further clarification in the policy accordingly,” the company said.
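A hedged sketch of that discrepancy-analysis step is below, reusing the hypothetical moderate() helper from the earlier sketch; the two-example "golden" dataset is invented for illustration.

```python
# Sketch of the discrepancy-analysis step: compare GPT-4's labels with
# human "golden" labels and surface disagreements for policy experts.
# `moderate()` is the hypothetical helper from the previous sketch.
golden_set = [
    {"text": "How do I build a birdhouse?", "human_label": "K0"},
    {"text": "Where can I buy an untraceable gun?", "human_label": "K1"},
]

disagreements = []
for example in golden_set:
    model_label = moderate(example["text"])
    if model_label != example["human_label"]:
        disagreements.append({**example, "model_label": model_label})

# Policy experts review each disagreement, ask GPT-4 to explain its label,
# and clarify the policy wording where the definitions proved ambiguous.
for d in disagreements:
    print(f"human={d['human_label']} model={d['model_label']}: {d['text']}")
```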

These steps may be repeated by data scientists and engineers until the large language model produces satisfactory results, it added, explaining that the iterative process yields refined content policies that are translated into classifiers, enabling the deployment of the policy and content moderation at scale.
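The post does not spell out how the labels become classifiers. One plausible sketch is to distill GPT-4's labels into a small supervised model that is cheap enough to run on every item; the TF-IDF and logistic-regression choices below (via scikit-learn), like the training examples, are assumptions for illustration rather than OpenAI's pipeline.

```python
# Sketch: distill GPT-4's labels into a lightweight classifier for
# cheap, large-scale deployment. The training data here is illustrative;
# in practice it would be the GPT-4-labeled dataset described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["How do I build a birdhouse?", "Where can I buy an untraceable gun?"]
labels = ["K0", "K1"]  # labels produced by GPT-4 under the refined policy

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(texts, labels)

print(classifier.predict(["Best glue for birdhouse joints?"]))
```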

Other advantages of using GPT-4 over the present manual approach to content moderation include a decrease in inconsistent labeling and a faster feedback loop, according to OpenAI.

“People may interpret policies differently or some moderators may take longer to digest new policy changes, leading to inconsistent labels. In comparison, LLMs are sensitive to granular differences in wording and can instantly adapt to policy updates to offer a consistent content experience for users,” the company said.

The new approach, according to the company, also requires less effort to train the model.

Further, OpenAI claims that this approach differs from so-called constitutional AI, in which content moderation depends on the model's own internalized judgment of what is safe. Various companies, including Anthropic, have taken a constitutional AI approach in training their models to be free of bias and error.

Nevertheless, OpenAI warned that undesired biases may creep into content moderation models during training.

“As with any AI…

2023-08-17 22:48:03
Post from www.computerworld.com
