
US Copyright Office issues pre-publication version of 3rd Report on AI Training and Fair Use. AI training is transformative, but degree depends on how AI functions. Supports new market dilution theory of harm under Factor 4.

Late on Friday, the U.S. Copyright Office issued its long-awaited report on AI training and fair use. It’s labeled “Pre-publication Version,” though it says no substantive changes will be made. UPDATE on May 8: President Trump reportedly fired Shira Perlmutter, the Register of Copyrights, which casts a cloud of uncertainty over whether this report will ever become official. It might not. It could be DOA.

Wow, this report will create a firestorm in the AI copyright lawsuits. The Copyright Office has taken sides on all four factors of fair use. On balance, the Office’s view seems to favor the copyright holders in the AI litigation, notwithstanding the Office’s agreement that AI training often serves a transformative purpose under Factor 1 of fair use (especially for large AI models that require large and diverse datasets for training).

Perhaps the biggest surprise and most controversial aspect: the Copyright Office breaks new ground in admittedly “uncharted territory” and agrees with a new theory of market dilution advanced by copyright stakeholders under Factor 4, the effect of the use upon the potential market for or value of the copyrighted work. The theory is new (and untested) in the sense that no court has yet recognized it in a copyright case. As discussed below, the Office’s example shows just how expansive the theory is: flooding the market for an entire genre, such as romance novels, with AI-generated works would count as a cognizable market harm under Factor 4, even without proof of any infringing output or of any AI-generated book that actually infringes a plaintiff’s romance book. It’s open to question whether the Copyright Office should be opining on a new theory of market harm that no court has recognized yet.

Granted, the theory sounds similar to the concern of market “obliteration” that Judge Chhabria voiced at the hearing in Kadrey v. Meta, although he also questioned whether the evidence in the summary judgment record supported it. (At the hearing, Judge Chhabria also said he was inclined to view AI training as transformative in purpose, even highly transformative. Plus, he said Factor 4 is the most important factor in the AI lawsuits–and he may be right.) But the court and the Copyright Office are in different positions. Judge Chhabria has actual evidence submitted by the parties, the briefs of both sides, and the law governing the burden of proof in a court of law. The Office, by contrast, has ventured into “uncharted territory” with no rules of evidence, relying merely on a comment, or hearsay example, presented by UMG Recordings–the plaintiff in two pending lawsuits against Suno and Udio–about how streaming of AI-generated music has distorted streaming royalties. That comment points to one criminal case involving the alleged fraudulent streaming of AI-generated music by Michael Smith.

Yet this criminal indictment itself shows that existing criminal laws can be, and already are being, used to address manipulated streaming. The example seems too tenuous a basis for the Copyright Office to endorse an expansive theory of market harm under fair use that no court has adopted yet. The mismatch between Factor 4 of fair use and the Office’s new theory of market dilution caused by AI-generated music is plain: a court decision rejecting an AI company’s fair use defense is unlikely to stop fraudsters from trying to manipulate streaming royalties, a problem that started before AI. In any event, we should expect many of the plaintiffs in the AI litigation to advance this theory of market harm through dilution, provided their complaints sufficiently allege it.

Highlights of important points from the U.S. Copyright Office Report (in the order they appear in the Report):

Memorization in weights after training

Different uses during AI development v. AI deployment require separate consideration, but fair use “must also be evaluated in the context of the overall use” (emphasis added)

Factor 1: Training AI models will often be transformative in purpose, but how the models function or can be used, once they are deployed, will affect the degree of transformativeness.

Copyright Office rejects theories that AI engages in nonexpressive use and fair learning.

Using “pirated” datasets should weigh against fair use without being determinative (aka “unlawful access”)

Factor 2: Using more expressive works in AI training weighs against fair use under Factor 2

Factor 3: Downloading entire works may weigh against fair use, but for some large AI models the amount copied may be reasonably necessary

Factor 4: Copyright Office analyzes potential market harm in (1) lost sales, (2) market dilution, and (3) lost licensing opportunities

Public benefits

Weighing the four factors

Conclusion

