Judge William Alsup issued the first decision on fair use and infringement in a generative AI case, Bartz v. Anthropic. It was a split decision: (1) copies used to train Anthropic’s AI model were “exceedingly transformative” and fair uses to develop a technology “among the most transformative many of us will see in our lifetimes,” but (2) copies of pirated books Anthropic downloaded and then later stored in a permanent, central library were infringing. (*Because Bartz didn’t file a motion for summary judgment, technically Judge Alsup didn’t make a ruling on infringement. But his opinion all but did so, characterizing Anthropic’s building a library with pirated books as “stealing” and “theft” without any fair use justification, and stating that the case will go to trial on this issue, including “resulting damages” and potential “willfulness.”)
This Solomonic judgment gives Anthropic and other AI companies a fair use precedent supporting AI training. But it also gives book authors an infringement precedent for use of pirated book copies, at least when they are stored in a central library indefinitely. Judge Alsup also suggested the pirated book copies might be infringing even if they were temporary or intermediate copies used to train AI models, although he declined to rule on that issue. (This part of the decision is likely to be relevant to the Kadrey v. Meta case, which also involved use of pirated books, although it’s not clear if Meta stored a permanent library. The OpenAI MDL litigation also likely involves alleged use of pirated books. It could be that most, if not all, of the AI companies used pirated books.)
The most surprising part of the opinion is that Judge Alsup framed or narrowed the issue of pirated books to the acts of acquiring plus storing in a central library. That’s a distinction I can’t find in the heavily redacted briefs of the parties, and I don’t think it ever came up during oral argument, except for a fleeting reference by Bartz’s attorney about the Texaco case involving “building a research library” (at p. 35). Bartz didn’t file a motion for summary judgment, as the Judge noted in the decision. Yet, because he rejected fair use and said “[w]e will have a trial on the pirated copies used to create Anthropic’s central library and the resulting damages, actual or statutory (including for willfulness),” that’s effectively a finding of infringement.
Accordingly, this Solomonic judgment is, on balance, more favorable to the Bartz book authors because statutory damages are calculated per work, not based on the number of copies. So, the Bartz plaintiffs can potentially recover the same amount of statutory damages regardless of how the training copies are treated. And, if this case is certified as a class action as I expect, the amount of statutory damages might be enormous.
HOW MUCH STATUTORY DAMAGES ARE IN PLAY?
It all depends on the size of the class, which has yet to be determined. But the outer limit could be 7 million works (*assuming the Judge’s reference to 7 million copies of books in his order means each copy is a different book, an interpretation that seems reasonable in context). Because the Judge said willfulness might be in play at trial, the relevant ceiling is the maximum amount of statutory damages for willful infringement, which is $150,000 per work under the Copyright Act.
7 million pirated books (minus unregistered books) x $150,000 = $1,050,000,000,000.
Yes, that’s 1 trillion 50 billion dollars as the outer limit! (*We still must subtract the unregistered books.)
In a separate order, Judge Alsup had earlier proposed a definition of the class that included all registered copyrighted books in (1) the pirated books datasets Books3, PiLiMi, and LibGen, and (2) the dataset Anthropic prepared by scanning physical books. But the Judge ruled that the latter was a fair use. That would leave only the class defined by (1), which still encompasses the 7 million figure Judge Alsup cited in the order (at p. 3: “Anthropic thereby pirated over seven million copies of books, including copies of at least two works at issue for each Author.”), minus the books that were not registered.
So the outer limit has to be below 1 trillion 50 billion dollars. But Anthropic is valued at a mere $61.5 billion.
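To make the arithmetic above concrete, here is a minimal sketch of the ceiling calculation. The 7 million figure, the $150,000 willful maximum, and the $61.5 billion valuation come from the discussion above. The `registered_share` parameter is purely hypothetical, since the number of unregistered books is not known, and the function name is just an illustrative label.

```python
# Figures from the discussion above. The registered share is a hypothetical
# placeholder: the actual number of registered works is not known.
PIRATED_COPIES = 7_000_000              # "over seven million copies" cited in the order
MAX_WILLFUL_PER_WORK = 150_000          # maximum statutory damages per work (willful)
ANTHROPIC_VALUATION = 61_500_000_000    # valuation quoted above


def statutory_ceiling(works: int,
                      registered_share: float = 1.0,
                      per_work: int = MAX_WILLFUL_PER_WORK) -> int:
    """Upper bound on statutory damages: eligible works times the per-work maximum.

    Assumes each copy is a different work, as the post does.
    """
    if not 0.0 <= registered_share <= 1.0:
        raise ValueError("registered_share must be between 0 and 1")
    eligible_works = int(works * registered_share)
    return eligible_works * per_work


# Outer limit if every one of the 7 million works were registered.
outer_limit = statutory_ceiling(PIRATED_COPIES)
print(f"Outer limit: ${outer_limit:,}")

# Illustration only: how the ceiling shrinks if half the works were unregistered.
half_registered = statutory_ceiling(PIRATED_COPIES, registered_share=0.5)
print(f"Ceiling at a hypothetical 50% registered share: ${half_registered:,}")

# How the outer limit compares with the quoted valuation.
print(f"Outer limit as a multiple of valuation: {outer_limit / ANTHROPIC_VALUATION:.1f}x")
```

Because statutory damages are per work, the ceiling scales only with the number of eligible works, not with how many copies of each work were made. This is a theoretical maximum, not a prediction of any actual award.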
BUT IS LIBRARY BUILDING UNFAIR?
We can expect that the issue of library building will now be hotly contested in other cases, and also on appeal in this case. Notably, in MGM Studios v. Grokster, the Supreme Court did state that the library building in recording broadcast TV shows mentioned in Sony was not “necessarily infringing.” Presumably, that implies that it could be justified by a legitimate fair use purpose. The Bartz case is likely to squarely raise that issue on appeal.
And, of course, in the long term, going beyond this lawsuit, if Judge Alsup’s decision that AI training is fair use stands, AI companies stand to benefit greatly.