11 September 2023 | Copyright | Sarah Speight

Authors sue OpenAI for alleged theft of works in ChatGPT

Group of writers accuses OpenAI of using their works to train its generative AI tool without permission | Among authors are Pulitzer Prize winner Michael Chabon and Tony Award winner David Henry Hwang.

In the latest lawsuit against AI platforms, a group of novelists, playwrights and screenwriters has sued OpenAI for allegedly using their works to train its generative AI tool ChatGPT.

The plaintiffs comprise Michael Chabon and his wife Ayelet Waldman, David Henry Hwang, Matthew Klam, Rachel Louise Snyder and a “class of authors holding copyrights in their published works”.

In a complaint filed in California on Friday, September 8, the group accused OpenAI of incorporating their copyrighted works in datasets used to train its GPT models, which power ChatGPT.

Bestselling novelist and screenwriter Chabon won the Pulitzer Prize in 2001 for The Amazing Adventures of Kavalier & Clay, with The Mysteries of Pittsburgh, Wonder Boys and The Yiddish Policemen's Union among his other works.

Waldman, an author and screenwriter married to Chabon, wrote Love and Other Impossible Pursuits, Red Hook Road and Daughter’s Keeper, among other titles.

Tony Award winner Hwang is a playwright, screenwriter, television writer, librettist, and professor whose work includes M. Butterfly, Chinglish, Yellow Face and Flower Drum Song.

Training model

ChatGPT is a type of large language model (LLM), a deep-learning algorithm which is programmed through ‘training datasets’. These datasets “consist of massive amounts of text data copied from the internet by OpenAI,” the plaintiffs claim.

As explained in the complaint, ChatGPT relies on OpenAI’s other GPT (Generative Pre-trained Transformer) products to function.

‘Generative’ represents the model’s ability to respond to text prompts; ‘Pre-trained’ refers to the model’s use of training datasets to programme its responses; and ‘Transformer’ concerns the model’s underlying algorithm.

OpenAI’s various GPT models extract information from their training datasets in order to learn the statistical relationships between words, phrases, and sentences, which in turn allow them to generate coherent and contextually relevant responses to user prompts or queries.
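To make the idea of "statistical relationships between words" concrete, the following is a deliberately simplified sketch: a toy model that counts which word tends to follow which in a short piece of text and then predicts the most likely next word. It is not how OpenAI's GPT models are built (those use far larger datasets and a neural Transformer architecture), and the function names and sample text here are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def most_likely_next(counts, word):
    """Return the word most often seen after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, standing in for a "training dataset".
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)

print(most_likely_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

Even in this toy form, the model's output depends entirely on the text it was given, which is the general point the plaintiffs rely on when they argue that detailed output about their books implies the books were in the training data.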

The authors claim that “...when ChatGPT is prompted, it generates not only summaries, but in-depth analyses of the themes present in plaintiffs’ copyrighted works, which is only possible if the underlying GPT model was trained using plaintiffs’ works.”

They add that OpenAI uses ChatGPT to “benefit commercially and profit handsomely from their unauthorised and illegal use” of the copyrighted works.

Lawsuits against gen AI platforms

The case is one of several that have been levelled recently against generative AI platforms.

In July, US comedian, actress and writer Sarah Silverman and two other authors sued OpenAI and Meta for allegedly copying content from their books to train their artificial intelligence (AI) software.

And in April, AI image platforms Stability AI, Midjourney, and DeviantArt retaliated against a trio of artists who filed a class action against them over the alleged infringement of their work by the developers’ generative AI systems.

But generative AI developers are beginning to take preemptive action.

Last week, Microsoft announced that it will take responsibility for any potential legal risks and damages arising from copyright infringement claims incurred by users of its AI tool Copilot.

This comes nearly three months after Adobe pledged to tackle transparency and IP protection in the use of generative AI, with the promise of financial indemnification for IP claims against its “IP-safe” tool Firefly.

But there are also moves by regulators around the world to tighten up on potential infringement, or clarify the blurred line between artificial and human creators.

In May, the European Parliament voted for the EU AI Act, the first law of its kind on AI proposed by a major regulator, after last-minute copyright protections covering the use of generative AI were added to the proposal.

On the flip side, in February, the US Copyright Office (USCO) partially rescinded the copyright for a graphic novel which used an AI tool to generate the book’s images.

The USCO has since clarified when applicants need to disclose AI contributions to their work, holding a webinar in June to provide additional guidance to applicants who have created works with the assistance of AI.

The US-based authors in this latest case against OpenAI filed their suit at the US District Court, Northern District of California, San Francisco Division.

Representing the plaintiffs are Daniel Muller at Ventura Hersey & Muller, and Bryan Clobes at Cafferty Clobes Meriwether & Sprengel.

Counsel for the defendants have not yet appeared.
