21 April 2023 | Features | Copyright | Muireann Bolger

AI strikes back: the ongoing wrangle between developers and creators

As the potential of generative artificial intelligence (AI) continues to astound—and in many cases, alarm—it seems that the legal path ahead of this revolutionary technology remains as unpredictable as ever.

This week saw AI developers Stability AI, Midjourney, and DeviantArt hit back at a trio of artists who filed a class action against them over the alleged infringement of their work by the developers’ generative AI systems.

The companies filed a counter motion at the US District Court for the Northern District of California on April 18, asking for the dismissal of the lawsuit, arguing that the tool at the centre of the dispute—Stable Diffusion—merely enables creativity, and does not infringe.

In February, artists Kelly McKernan, Sarah Andersen and Karla Ortiz brought a complaint against Stability AI, arguing that Stable Diffusion had used their work for training purposes without their consent or compensation.

The resulting artwork, they argued, could ultimately compete in the marketplace with the originals, as potential buyers will be able to access the artists’ works to generate new works without compensating them.

A ‘flawed argument’

At the time, a spokesperson for Stability AI told WIPR that the plaintiffs had a deeply flawed perception of the purpose of generative AI.

“The allegations in this suit represent a misunderstanding of how generative AI technology works and the law surrounding copyright. We intend to defend ourselves and the vast potential generative AI has to expand the creative power of humanity,” they said.

This counter motion comes amid heated debate prompted by the fast-paced development of generative AI, and whether its actions contravene IP rights.

Back in January, Getty Images also sued Stability AI in London for using its images for training purposes without permission, while Microsoft, its subsidiary GitHub, OpenAI and Midjourney have faced infringement lawsuits over the alleged use of open source code to train their machine learning (ML) systems.

More recently, multiple streaming services were forced to remove a viral track, Heart on My Sleeve, which used AI-generated versions of the vocals of musicians Drake and The Weeknd, after a complaint was lodged by Universal Music Group.

Meanwhile, Berlin-based artist Boris Eldagsen pulled a high-profile publicity stunt by refusing first prize at this year's Sony World Photography Awards, after revealing that his winning image had been generated by AI.

And earlier this month, deepfake app company Reface faced a proposed class action lawsuit in California, over claims that its popular app violates the state’s rights of publicity statute.

Training models ‘do not infringe’

In its US motion this week, Stability AI issued a forceful rejection of the claims against its business model.

According to the filing, the artists’ suit should be dismissed with prejudice because they had presented “scattershot claims” unsupported by facts, including any act of direct infringement or any output that is substantially similar to their artwork.

The developer further insisted that the court should dismiss the suit because the plaintiffs did not register their copyrights before filing suit.

Stability AI went on to insist that Stable Diffusion was trained on billions of images that were publicly available on the internet, and that training a model “does not mean copying or memorising images for later distribution”.

“Indeed, Stable Diffusion does not ‘store’ any images. Rather, training involves development and refinement of millions of parameters that collectively define—in a learned sense—what things look like. Lines, colours, shades, and other attributes associated with innumerable subjects and concepts.”

The purpose of doing so, it added, is not to enable the models to reproduce copies of training images.

“If someone wanted to engage in wholesale copying of images from the internet, there are far easier methods to do so, including cutting and pasting an image and perhaps using tools like Photoshop to edit them.”

A ‘hot topic’

According to Matt Hervey, head of AI at Gowling WLG, generative AI will remain a hot, and somewhat thorny, topic for lawyers and courts for quite some time to come.

“The outputs of the generative AIs are remarkable: they are good enough to win awards, to be used for magazine covers and to be accepted into stock image libraries,” he reflects.

But as he points out, the crux of the issue is that the power of current generative AI largely depends on vast training sets—in some cases the whole ‘crawlable’ internet.

For example, GPT-3 was trained using multiple databases of text including Wikipedia, books hosted on the internet and Common Crawl, a database of text extracted monthly from billions of web pages.

The big question for courts to grapple with is whether such actions are infringing or fair use.

The current cases, explains Hervey, should ideally test the boundaries of US and UK exceptions for text and data mining.

“They may also delve into jurisdictional differences, the technical aspects of training data and models, and what damages are appropriate when an AI is trained on billions of works,” he says.

“Many companies will be interested to see the courts’ approaches to scenarios where data is mined from copyright works in one jurisdiction and then extracted data or trained models are used in another jurisdiction.”

No straightforward answer?

However, others argue that there is simply no straightforward answer. Mark Lemley, professor of law at Stanford Law School, has argued that, consequently, the courts face an unenviable task.

In a co-authored paper, ‘Fair Learning’, published in the Texas Law Review, he held that because ML training sets are likely to contain millions of different works with thousands of different owners, “there is no plausible way to simply licence all of the underlying photographs, videos, audio files, or texts for the new use”.

And in an interview with Forbes, David Holz, founder and CEO of Midjourney, staunchly defended AI developers.

“There isn’t really a way to get a hundred million images and know where they’re coming from,” he said. “It would be cool if images had metadata embedded in them about the copyright owner or something. But that's not a thing; there's not a registry.

“There’s no way to find a picture on the internet, and then automatically trace it to an owner and then have any way of doing anything to authenticate it.”

Time for reckoning

On the other side of the debate, Mark Milstein, a co-founder of vAIsual, a technology company developing algorithms and solutions to provide licensed content to generative AI developers, profoundly disagrees with this stance.

In fact, he insists that a reckoning is fast coming for generative AI developers who refuse to take the proper legal precautions ahead of unveiling new works.

“Most of the generative AI platforms have argued that ‘fair use’ protected them against any litigation concerning their rapacious vacuuming of copyrighted visual content for training,” Milstein told WIPR.

This, he argues, simply doesn’t bear scrutiny.

“They have opened a hornet’s nest by approaching their training sets in completely the wrong way.”

There is a need, he explains, for more pioneers of clean data—and AI developers need to support, not thwart, this desirable aim.

Further expanding on this point in a LinkedIn post, Milstein wrote: “Without legally clean training data, any generative algorithm (Stability AI, Midjourney, etc) that's bolted on to a DAM [digital asset management system] will produce nothing more than a stream of copyright infringing, legally toxic content that results in your clients being targets for all kinds of possible legal action.”

With such divergent opinions, this particular path looks set to twist and turn—with no clear end destination in sight.


