Bloomberg hits back at authors over claims it stole works to train genAI tool
The news agency argues fair use since BloombergGPT is a non-profit research model | Plaintiffs include former Arkansas governor and presidential candidate Mike Huckabee and Christian author Lysa TerKeurst | Bloomberg calls the complaint’s ‘Books3’ training data allegation “vague and conclusory”.
News agency Bloomberg has filed a motion to dismiss a lawsuit from several authors who claim it stole their works to train a generative AI tool “for the purpose of making a profit”.
Former governor of Arkansas and former Republican presidential candidate Mike Huckabee—along with fellow authors Lysa TerKeurst, David Kinnaman, Tsh Oxenreider and John Blase—filed a class action complaint against Bloomberg, Meta, Microsoft and the EleutherAI Institute in October 2023.
In their complaint, the authors accuse the tech firms of using the controversial ‘Books3’ dataset—which they allege contains almost 200,000 pirated ebooks, including their own works—to teach their large language models (LLMs).
One of the accused models is BloombergGPT, described as a large-scale, “50-billion-parameter” generative AI model intended for financial analysis.
The authors claim that Bloomberg confirmed via email that Books3 was used to train the initial model of BloombergGPT. They added that the company said it “will not include the Books3 dataset among the data sources used to train future versions of BloombergGPT.”
But in its motion to dismiss the claims against it, filed on March 22 in a New York court, Bloomberg argues fair use since the tool is a non-profit research model to explore “potential uses” of generative AI in the financial industry.
Bloomberg: Complaint is ‘vague and conclusory’
The company says it released a research paper (on March 30, 2023) detailing the development, performance and potential of BloombergGPT.
It argues that the authors do not specify which of their works the tool was allegedly trained on, or when.
“Plaintiffs’ entire copyright infringement claim, therefore, rests on the vague, conclusory allegation that Bloomberg, at some unspecified time and in some unspecified manner, used the Books3 database,” Bloomberg argues.
“Seizing on nothing more than Bloomberg’s press release and news articles regarding that paper, plaintiffs allege that Bloomberg violated their copyrights.”
Their “vague and conclusory statements are insufficient bases on which to state a viable claim. And what plaintiffs fail to plead speaks volumes.”
Bloomberg concluded: “Simply put, a limited and private use of copyrighted works by a news reporting enterprise to teach a not-for-commercial-use AI model as part of an internal research project into the capabilities of generative AI, is not copyright infringement.”
The case is Huckabee v Meta Platforms, US District Court for the Southern District of New York.
Bloomberg is represented by Nicole Jantzi, Paul Schoenhard and Amir Ghavi at Fried, Frank, Harris, Shriver & Jacobson.
Counsel for Mike Huckabee et al is Seth Haines, Tim Hutchinson and Lisa Geary of RMP Law; Scott Poynter of Poynter Law Group; and Greg Gutzler, Adam Levitt, Amy Keller and James Ulwick of DiCello Levitt.