10 July 2023 | Copyright | Sarah Speight

Comedian and novelists say ChatGPT and Meta copied books for AI tools

US comedian Sarah Silverman and other authors accuse Meta and OpenAI of copying material to train AI software | Separate case sees authors Mona Awad and Paul Tremblay level a similar claim against OpenAI.

US comedian, actress and writer Sarah Silverman and two other authors are suing Meta and OpenAI, the owner of ChatGPT, for allegedly copying content from their books to train the companies’ artificial intelligence (AI) software.

Silverman, author of the memoir The Bedwetter, was joined by novelist Richard Kadrey, who wrote the supernatural noir series Sandman Slim, and horror writer Christopher Golden, whose books include Ararat, in filing two separate lawsuits on the same day, Friday, July 7, at a California court.

The trio accuse the two tech giants of using their copyrighted material without permission, credit or compensation, in large language models (LLMs) used in AI chatbots.

As described by the complaint, “AI software is designed to algorithmically simulate human reasoning or inference, often using statistical methods”. In other words, AI software is used to generate text in response to human prompts.

They claim that Meta copied their work for use in the training dataset for LLaMA (Large Language Model Meta AI). Meta describes LLaMA, which launched in February, as “a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI”.

The authors, however, claim that much of the material in Meta’s training dataset is derived from copyrighted works—including books written by the plaintiffs.

The authors claim that the way Meta’s LLaMA functions, which they say enables the alleged infringement, was leaked in March to a public internet site. They add that “Meta has not disclosed what role it had, if any, in the leak.”

Later that month, Meta issued a DMCA takedown notice to a programmer on GitHub who had released a tool that helped users download the leaked LLaMA language models.

‘Very accurate summaries’

Meanwhile, the authors claim that OpenAI, owner of generative AI tool ChatGPT, has used their books for generating summaries of the texts, alleging that: “When ChatGPT was prompted to summarise books written by each of the plaintiffs, it generated very accurate summaries.”

ChatGPT is powered by two LLMs—GPT-3.5 and GPT-4.

As explained in the complaint, ‘GPT’ stands for ‘generative pre-trained transformer’; ‘pre-trained’ refers to the use of textual material for training; ‘generative’ refers to the model’s ability to emit text; and ‘transformer’ refers to the underlying training algorithm.

The plaintiffs argue that: “The summaries get some details wrong. This is expected, since a large language model mixes together expressive material derived from many sources. Still, the rest of the summaries are accurate, which means that ChatGPT retains knowledge of particular works in the training dataset and is able to output similar textual content.”

The complaints were submitted at the US District Court, Northern District of California, San Francisco Division.

More author complaints

In a similar case filed on June 28 in the same court, two other US-based authors are also suing OpenAI for allegedly copying their works in ChatGPT.

Horror and sci-fi writer Paul Tremblay’s books include The Cabin at the End of the World, and Mona Awad’s ‘darkly comic’ books include 13 Ways of Looking at a Fat Girl and Bunny.

As in the case brought by Kadrey, Silverman and Golden, Tremblay and Awad allege that ChatGPT used material from their books to generate “very accurate” summaries of their titles.

All three complaints are class actions and demand a jury trial, damages and injunctive relief. The plaintiffs are represented by the same attorneys: Joseph Saveri and other counsel at Joseph Saveri Law Firm, and Matthew Butterick of Butterick Law.

WIPR has contacted counsel for the plaintiffs, as well as OpenAI and Meta, without immediate response.

