AI can now train on copyrighted books without author permission, U.S. judge rules in favor of Anthropic

In a major legal win for the AI industry, a U.S. federal judge has ruled that Anthropic may train its artificial intelligence models on copyrighted books without permission from the authors.
The decision, issued late Monday by U.S. District Judge William Alsup, marks the first time a court has explicitly said it’s fair use for AI firms to study and learn from copyrighted texts when building large language models (LLMs).
The ruling came just a few weeks after Reddit filed a lawsuit against Anthropic for allegedly scraping user data, without permission, to train its Claude AI model.
Anthropic, the Amazon-backed startup behind the Claude AI system, was sued last year by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson. They accused the company of building a multibillion-dollar business by “stealing hundreds of thousands of copyrighted books,” some allegedly downloaded from pirated sources. The court didn’t ignore the piracy issue, but it separated it from the broader question of whether copyrighted works can legally be used to train an AI.
On that front, the court sided firmly with Anthropic.
Judge Rules Anthropic Did Not Violate Authors’ Copyrights With AI Book Training
Judge Alsup ruled that Anthropic can train its AI models on published books without the authors' permission.
He wrote that using books to train LLMs is “quintessentially transformative,” likening it to how humans read, learn, and eventually write their own material. He added:
“To make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable.”
In other words, paying to access a book is different from paying to remember or learn from it.
Alsup went on to say that the models didn’t regurgitate an author’s unique style or recreate their work for the public, so there was no copyright infringement in the training process itself.
In a statement, an Anthropic spokesperson said the company was "pleased" with the ruling, calling the decision "consistent with copyright's purpose in enabling creativity and fostering scientific progress."
According to a report from CNBC, the lawsuit also partly focuses on a cache of around 7 million pirated books that Anthropic kept in a so-called "central library." The company later chose not to use those materials to train its language models.
Still, the case isn’t over. While Anthropic steered clear of using pirated texts for training its LLMs, the court is moving ahead with a trial to determine whether the company’s use of those pirated books to create its “central library” merits any damages.
As Judge Alsup pointed out: “That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft, but it may affect the extent of statutory damages.”
This case could shape the legal ground rules for AI training going forward. For now, the message is clear: AI companies can train their models on legally obtained books without asking authors first.