In generative AI legal Wild West, the courtroom battles are just getting started

Published Mon, Apr 3 2023, 10:56 AM EDT. Updated Mon, Apr 3 2023, 11:29 AM EDT.

Key Points

As companies including Microsoft, Google and OpenAI launch generative AI to the general public, lawsuits are piling up from creative industries about copyrighted work co-opted or used by AI.
Getty Images, the photo licensing company, filed a lawsuit against Stability AI, the creator of Stable Diffusion, which can create photorealistic images from text, alleging that the company copied 12 million images without permission.
Shutterstock has taken the approach of paying creators for AI-generated images that use their work, but legal experts say the lawsuits are just getting started.

This photo taken on Jan. 31, 2023, shows an artificial intelligence manga artist, who goes by the name Rootport, wearing gloves to protect his identity, demonstrating how he produces AI manga during an interview with AFP in Tokyo.

Richard A. Brooks | AFP | Getty Images

When Gordon Graham, a writer also known as That Whitepaper Guy, asked ChatGPT to define a whitepaper, he was struck by the similarities between the answer he got and the one he wrote on his website and contributed to Wikipedia. While the three-paragraph definition wasn't verbatim, it was similar enough that he suspects the AI scraped it from one of those sources.

Graham isn't alone in his concerns. As companies including Microsoft, Alphabet and OpenAI launch generative AI to the public, many in creative industries such as photography, art, writing and music are alarmed at how copyrighted work can be co-opted or used by AI. Text-to-image tools like OpenAI's DALL-E, Midjourney, Stable Diffusion, and DreamUp can render images in various styles in seconds with a few words of direction. ChatGPT can similarly create letters, articles, papers or stories based on a few prompts. Supporting the new technology are massive amounts of data — photos, artwork, articles, music, voices — pretty much anything online. While tech companies maintain that the data is used for training and the output is original, media companies and others disagree.

"Absolutely, I have concerns about the outrageous plagiarism going on with AI. My website clearly says (c) copyright on every page. Why even bother, when big tech companies can scrape anything they find on the web without even a please or thank you?" Graham said.

Getty Images leads a list of pending lawsuits

Several lawsuits are already in the works and experts say more are sure to come.

Getty Images, the photo licensing company, filed a lawsuit against Stability AI, the creator of Stable Diffusion, which can create photorealistic images from text, alleging that the company copied 12 million images without permission or compensation "to benefit Stability AI's commercial interests and to the detriment of the content creators."

Stability AI, DeviantArt and Midjourney are also involved in a lawsuit that alleges that the companies' use of AI violates the rights of millions of artists. "These images were all taken without consent, without compensation. There's no attribution or credit given to the artists," said Matthew Butterick, a lawyer and computer programmer involved in the lawsuit. Butterick is also involved in another class-action lawsuit against Microsoft, its subsidiary GitHub, and OpenAI, in which Microsoft is a major investor. That suit alleges that GitHub's Copilot system, which suggests lines of code for programmers, does not comply with licensing terms.

Prisma Labs, the company behind the viral Lensa app, which creates avatars, is facing a lawsuit alleging the company illegally took users' biometric data. TikTok recently settled a lawsuit with voice actress Bev Standing, who said the company used her voice without her permission for its text-to-speech feature.

While tech companies are highlighting the benefits of generative AI and are quickly integrating the technology into their products, media companies and creators see the downsides of their copyrighted work being co-opted. "Until now, when a purchaser seeks a new image 'in the style' of a given artist, they must pay a commission or license an original image from that artist. Now those purchasers can use the artist's work without compensating the artist at all," the class-action court filing against Stability AI, DeviantArt and Midjourney states. The similarities can sometimes be obvious: there are even instances where Stable Diffusion recreated the Getty Images watermark in its output, which Getty details in its legal filing.

Stability AI did not provide a comment by press time.

Large language models and metadata

"We are still in the Wild West days of this type of AI. Content creators are only now starting to realize that their website contents might have been surreptitiously scanned during the devising of the rapidly being released generative AI applications," said Lance Eliot, a fellow at Stanford University's CodeX Center for Legal Informatics and CEO and founder of Techbrium, an innovation consultant to startups and enterprises.

Legal experts say the lawsuits are just getting started, and that in the rush to get AI-enabled products and services out to the public, tech companies have shown some ignorance of, or disregard for, data protection laws.

"I think the issue we're seeing now is that there's ignorance about there being legal implications," said Barry Scannell, a lawyer specializing in artificial intelligence, copyright, IP, technology law and data protection at William Fry LLP, an Irish law firm. "If you're using large language models, if you're using text-to-image generators, there are data protection implications," he said.

Text-to-image generators draw on metadata, the descriptive information used to organize data. That can include names (of the person depicted, the photographer or creator) or other information, which is considered personal data and protected under various laws.

Shutterstock, OpenAI and artist compensation

Technology has changed many jobs, but creative industries have largely stayed out of the fray. Until now. Companies are selling AI-generated prints, and Stable Diffusion can learn to copy an artist's style within hours. Voice actors are being asked to sign away the rights to their voices so artificial intelligence can create synthetic versions, presumably to replace them. Writers worry about their work or style being co-opted without permission or compensation.

"Web-scraping and machine learning from huge digital databases requires massive amounts of human-created work as the sample. We want to make sure that work is legally and respectfully accessed, and that we are paid for its use," said John Degen, chief executive officer of The Writers' Union of Canada and chair of the International Authors Forum.

At least one company is looking to compensate human creators. Shutterstock, a provider of stock photography, footage and music, has worked with OpenAI since 2021, and OpenAI CEO Sam Altman has said it was "critical to the training" of its generative AI image and art platform DALL-E. Shutterstock has set up a contributor fund that compensates content creators if their IP is used in the development of generative AI models. Moreover, creators will receive royalties if new content produced by Shutterstock's AI generator includes their work. Contributors can opt out and exclude their content from any future datasets.

Meanwhile, both Microsoft and Google continue to launch additional AI-enabled capabilities across products. Microsoft announced last month it will embed OpenAI's ChatGPT into Microsoft 365 apps, while Google said it will bring generative AI to Gmail and Google Docs. A Microsoft spokesperson said AI-generated content will be clearly labeled, encouraging users to review, fact-check and adjust. The company will also make citations and sources easily accessible by linking to the email or document the content is based on, or by showing a citation when a user hovers over it. A Google spokesperson said Bard, the company's competitor to ChatGPT, "is intended to generate original content and not replicate existing content at length." The spokesperson also said Bard is meant to be a "complementary experience" to Google search, and includes a "Google It" button so people can move from Bard to explore information on the web.

Given how new generative AI is, it's not surprising the legal system has yet to catch up. In the meantime, companies and individuals will be duking it out over their rights in court.