Adobe’s aggressive push into artificial intelligence has landed it in legal trouble. A proposed class-action lawsuit alleges that the software giant trained one of its AI models using pirated books, including works written by the plaintiff.
The lawsuit was filed on behalf of Elizabeth Lyon, an author based in Oregon, and claims that Adobe used unauthorised copies of numerous books to train its SlimLM language model. According to the complaint, Lyon’s own writing was included in the training data without her consent.
Adobe describes SlimLM as a family of small language models designed for document assistance tasks, particularly on mobile devices. The company has said SlimLM was pre-trained using SlimPajama-627B, an open-source dataset released by AI chipmaker Cerebras in June 2023. That dataset is described as a large, deduplicated collection drawn from multiple sources.
Lyon’s lawsuit, first reported by Reuters, challenges that explanation. It argues that SlimPajama is itself derived from another dataset, RedPajama, which in turn incorporates the controversial Books3 collection. Books3 is a massive archive of roughly 191,000 books that has been widely used to train generative AI models and has become a flashpoint in copyright disputes.
According to the filing, SlimPajama was created by copying and modifying RedPajama, including material from Books3. Because of that lineage, the lawsuit claims SlimPajama contains copyrighted works belonging to Lyon and other authors, making Adobe’s use of the dataset unlawful.
Books3 and RedPajama have already appeared in several high-profile cases. In September, authors sued Apple, alleging that the company used copyrighted material to train its Apple Intelligence models without consent, credit, or compensation. A month later, Salesforce was hit with a similar lawsuit that also referenced RedPajama as a training source.
These cases reflect a broader legal reckoning for the AI industry. Large language models rely on enormous datasets, and questions about how those datasets were assembled are increasingly ending up in court. In one of the most significant cases to date, Anthropic agreed in September to pay $1.5 billion to settle claims from authors who accused the company of using pirated versions of their books to train its Claude chatbot.
That settlement was widely seen as a potential turning point, signalling that courts and companies may no longer treat copyright concerns around AI training as theoretical. If the case against Adobe proceeds, it could add further pressure on tech firms to clearly account for where their training data comes from, and whether they have the rights to use it.