Comedian and novelists say ChatGPT and Meta copied books for AI tools
US comedian Sarah Silverman and other authors accuse Meta and OpenAI of copying material to train AI software | Separate case sees authors Mona Awad and Paul Tremblay level a similar claim against OpenAI.
US comedian, actress and writer Sarah Silverman and two other authors are suing Meta and the owner of ChatGPT for allegedly copying content from their books to train their artificial intelligence (AI) software.
Silverman, author of the memoir The Bedwetter, was joined by novelist Richard Kadrey, who wrote the supernatural noir series Sandman Slim, and horror writer Christopher Golden, whose books include Ararat, in filing the two separate lawsuits on the same day, Friday, July 7, at a California court.
The trio accuse the two tech giants of using their copyrighted material without permission, credit or compensation in large language models (LLMs) used in AI chatbots.
As described by the complaint, “AI software is designed to algorithmically simulate human reasoning or inference, often using statistical methods”. In other words, AI software is used to generate text in response to human prompts.
They claim that Meta has copied their work for use in the training dataset for LLaMA (Large Language Model Meta AI). Meta, which launched LLaMA in February, describes it as “a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI”.
The authors, however, claim that much of the material in Meta’s training dataset is derived from copyrighted works—including books written by the plaintiffs.
The authors claim that LLaMA, whose functioning they say enabled the alleged infringement, was leaked in March to a public internet site. The complaint adds that “Meta has not disclosed what role it had, if any, in the leak.”
Later that month, Meta issued a DMCA takedown notice to a programmer on GitHub who had released a tool that helped users download the leaked LLaMA language models.
‘Very accurate summaries’
Meanwhile, the authors claim that OpenAI, owner of generative AI tool ChatGPT, has used their books for generating summaries of the texts, alleging that: “When ChatGPT was prompted to summarise books written by each of the plaintiffs, it generated very accurate summaries.”
ChatGPT is powered by two LLMs—GPT-3.5 and GPT-4.
As explained in the complaint, ‘GPT’ stands for ‘generative pre-trained transformer’; ‘pre-trained’ refers to the use of textual material for training; ‘generative’ refers to the model’s ability to emit text; and ‘transformer’ refers to the underlying training algorithm.
The plaintiffs argue that: “The summaries get some details wrong. This is expected, since a large language model mixes together expressive material derived from many sources. Still, the rest of the summaries are accurate, which means that ChatGPT retains knowledge of particular works in the training dataset and is able to output similar textual content.”
The complaints were submitted at the US District Court, Northern District of California, San Francisco Division.
More author complaints
In a similar case filed on June 28 in the same court, two other US-based authors are also suing OpenAI for allegedly copying their works in ChatGPT.
Horror and sci-fi writer Paul Tremblay’s books include The Cabin at the End of the World, while Mona Awad’s ‘darkly comic’ books include 13 Ways of Looking at a Fat Girl and Bunny.
As in the case brought by Kadrey, Silverman and Golden, Tremblay and Awad allege that ChatGPT used material from their books to generate “very accurate” summaries of their titles.
All three complaints are class actions and demand a jury trial, damages and injunctive relief. The plaintiffs are represented by the same attorneys: Joseph Saveri and other counsel at Joseph Saveri Law Firm, and Matthew Butterick of Butterick Law.
WIPR has contacted counsel for the plaintiffs, as well as OpenAI and Meta, without immediate response.