OpenAI on Monday said that The New York Times (NYT) is not telling the full story about the lawsuit it filed against the Sam Altman-led company and Microsoft on December 27.
“Interestingly, the regurgitations The New York Times induced appear to be from years-old articles that have proliferated on multiple third-party websites. It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate,” OpenAI wrote in a blog post.
As part of the lawsuit, the NYT submitted approximately 100 examples of alleged copyright violations in which ChatGPT or its underlying models returned text nearly identical to paragraphs published in NYT articles or other editorial content.
However, OpenAI has claimed that even when “manipulated” prompts are used, its models “don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts.”
OpenAI said the examples put forth by the NYT reflect neither typical usage nor permitted user activity, and that the generated text is not a substitute for the newspaper.
OpenAI working on solving the regurgitation issue
The Sam Altman-led company said it has identified and is working to solve ChatGPT’s “regurgitation” issue, which it terms “memorization” and describes as a failure of the model training process.
Memorization, according to the company, tends to occur when particular content appears more than once in the training data, as with NYT articles that have also been republished on other websites.
“So we have measures in place to limit inadvertent memorization and prevent regurgitation in model outputs. We also expect our users to act responsibly; intentionally manipulating our models to regurgitate is not an appropriate use of our technology and is against our terms of use,” the company wrote in the blog post.
Experts argue over copyright claims
While there has been plenty of commentary on the NYT’s lawsuit against OpenAI, several technology leaders appear sympathetic to OpenAI’s position.
“After reading the @nytimes lawsuit against @OpenAI and @Microsoft, I find my sympathies more with OpenAI and Microsoft than with the NYT,” Andrew Ng, one of the leading scientists in the field of AI, wrote on X, formerly Twitter.
Ng claimed that just as humans are allowed to read documents on the open internet, learn from them, and synthesize brand-new ideas, AI should be allowed to do so too.
“I would like to see training on the public internet covered under fair use — society will be better off this way — though whether it actually is will ultimately be up to legislators and the courts,” the AI scientist explained in DeepLearning.AI’s weekly newsletter.
Somewhat supporting OpenAI’s claims, Ng further said that the examples of violations put…