I am happy to share my latest law review article, “Fair Use and the Origin of AI Training,” forthcoming in Houston Law Review in the fall. Download a preprint copy of my paper at SSRN by following this link.
Main points in my article, “Fair Use and the Origin of AI Training”
The origin of AI training: My article traces the origin and history of AI training with large datasets. The practice started with university researchers who discovered that using larger and more diverse datasets–known as scaling or scaling up–achieved better results in developing and improving AI models. I recommend that all U.S. courts consider this important history when analyzing fair use in the AI copyright litigation. Why? It sheds light on the “further purpose and different character” of using copyrighted works in AI training through scaling the datasets, whether at universities or AI companies.
The Copyright Office’s Pre-Publication Report on AI Training is wrong in key respects: My article also explains where the Copyright Office errs in its Pre-Publication Report on AI Training, released on May 9. I recommend reading my entire draft, but let me preview three points where I think the Report’s analysis is wrong.
(1) Generative capability and its transformativeness: The generative capability of AI models to create new, noninfringing works serves the goal of the Progress Clause and the First Amendment in fostering the creation of new expression. Contrary to the Copyright Office’s crabbed view, the degree of this transformativeness in developing AI is increased, not diminished, by deploying AI models to enable people in the United States to generate new, noninfringing works. This generative capability makes creative production accessible to a much wider range of creators in the United States, including people with disabilities. It is fundamental in this country that “more speech, not less, is the governing rule.”
(2) The Copyright Office’s “uncharted territory” of market dilution: The Copyright Office’s endorsement of a new theory of market dilution impermissibly expands the scope of copyright to protect copyright holders from general competition posed by new, noninfringing works. This is legal error. It turns copyright from a limited monopoly to a general monopoly against competition posed by noninfringing works. Copyright was never intended as a shield from the general competition from noninfringing works. As the Ninth Circuit recognized in Sega Enterprises v. Accolade, there is a public benefit in competition presented by others creating new, noninfringing works. Unless the copyright holder can prove a work in question is an infringing copy or derivative work, the competing work should be not only allowed, but encouraged.
Courts must not forget something the Copyright Office apparently did: “‘[T]he ultimate aim is, by this incentive, to stimulate artistic creativity for the general public good.’” Sony Corp., 464 U.S. at 432, 104 S. Ct. at 783 (quoting Twentieth Century Music Corp. v. Aiken, 422 U.S. 151, 156, 95 S. Ct. 2040, 2044, 45 L. Ed. 2d 84 (1975)). When technological change has rendered an aspect or application of the Copyright Act ambiguous, “‘the Copyright Act must be construed in light of this basic purpose.’”
(3) Protecting copyright holders from competition from noninfringing works in entire genres (as the Copyright Office did with romance novels and music) turns the Copyright Clause on its head. Instead of promoting progress, the goal becomes protecting copyrights and, perversely, reducing the creation of new, noninfringing works. Such an unbounded view of copyright violates the Progress Clause, which limits the scope of authors’ “exclusive right[s]” to “their writings.” It also likely violates the First Amendment rights of people who use AI generators to create new, noninfringing expression. As long as the new expression does not infringe the copyright in the original works, the new expression is protected speech under the First Amendment. Shrinking the number of noninfringing works produced in the United States–out of speculative fears of market dilution or, even worse, viewpoint discrimination that some works are “cheap” or “inferior” to others–is contrary to both the Progress Clause and the First Amendment. Copyright is intended to be the engine of free expression, not its enemy.
