The New York Times sued ChatGPT-maker OpenAI and Microsoft in a US court on Wednesday, alleging that the companies’ powerful AI models used millions of articles for training without permission.
Through their AI chatbots, the companies “seek to free-ride on The Times’ massive investment in its journalism by using it to build substitutive products without permission or payment,” the lawsuit said.
Copyright is becoming a major battleground for the much-hyped generative AI sector, with publishers, musicians and artists increasingly lawyering up to get paid for technology that is being built with their content.
With the suit, The New York Times also chose the more confrontational response to the sudden rise of AI chatbots, in contrast to other media groups such as Germany’s Axel Springer or the Associated Press that have entered content deals with OpenAI.
“If The Times and other news organizations cannot produce and protect their independent journalism, there will be a vacuum that no computer or artificial intelligence can fill,” said the Times’ complaint.
“Less journalism will be produced, and the cost to society will be enormous.”
The Times, one of the most respected news organizations in the United States, is seeking damages, as well as an order that the companies stop using its content for the training of AI models — and destroy data already harvested.
While no sum is specifically requested, the Times alleges that the infringement could have cost “billions of dollars in statutory and actual damages.”
OpenAI and Microsoft did not reply to requests for comment.
Microsoft, the world’s second biggest company by market capitalization, is a major investor in OpenAI, and swiftly implemented the powers of AI in its own products after the release of ChatGPT last year.
The AI models that power ChatGPT and Microsoft’s Copilot (formerly Bing) were trained for years on content available on the internet, under the assumption that it was fair to be used without compensation.
But the lawsuit, filed in a federal court in New York, argued that the use of the Times’ work was unlawful notably because the products being built created a potential rival and threatened the news company’s ability to provide quality journalism.
– Not ‘transformative’ –
The Times said it attempted to seal a content agreement with OpenAI and Microsoft, but that the companies maintained that their technology was “transformative” and therefore did not require a commercial arrangement.
“There is nothing ‘transformative’ about using The Times’ content without payment to create products that substitute for The Times and steal audiences away from it,” the lawsuit alleged.
The lawsuit also said that content generated by ChatGPT and Copilot closely mimicked New York Times style and that its output was given a privileged status in perfecting the chatbot technology.
It also said that bot-generated content that was false was sourced incorrectly to The New York Times.
“In the ocean of repetitive junk data that models are trained over, decades of news archives are very useful,” wrote competition expert Cristina Caffarra on X.
“This is about division of rents, expect a lot more of it,” she added.
The emerging AI giants are facing a wave of lawsuits over their use of internet content to build their AI systems that create content on simple prompts.
Last year, “Game of Thrones” author George RR Martin and other best-selling fiction writers filed a class-action lawsuit against OpenAI, accusing the startup of violating their copyrights to fuel ChatGPT.
Universal and other music publishers have sued artificial intelligence company Anthropic in a US court for using copyrighted lyrics to train its AI systems and in generating answers to user queries.
US photo distributor Getty Images has accused Stability AI of profiting from its pictures and those of its partners in order to make visual AI that creates original images on simple demand.
With lawsuits piling up, Microsoft and AI player Google have announced they would cover the legal fees of corporate customers sued for copyright infringement over content generated by their AI.