
Are AI researchers at U.S. universities engaged in infringement by unlicensed acquisition and use of copyrighted works, or use of AI models trained on unlicensed works? We shall likely find out.

It was bound to happen.

With over 100 copyright lawsuits against nearly every AI company in the United States, the very broad infringement claims would eventually sweep in the practices of AI researchers generally, including those at U.S. universities.

In fact, the use of copyrighted works in AI training and development began in academic research by university employees, not at AI companies. Indeed, some of the major breakthroughs in deep learning relied on large quantities of unlicensed copyrighted works. So, if developing AI with unlicensed works was the “original sin” of AI development, university researchers were the culprits. Not AI companies.

So, it’s good to see that the original practices of AI researchers at universities will likely stand trial.

If the acquisition and use of copyrighted works to research, develop, and improve AI models is categorically not fair use because all of it must be licensed, as some plaintiffs argue, then no AI training or research and development of AI models on unlicensed works should be allowed at universities.

Apple now joins other defendants in raising this broader research issue in Hendrix v. Apple. Apple acquired copyrighted works from datasets online and then used the unlicensed works to train OpenELM “exclusively for research and scholarship purposes.”

Apple says it released the OpenELM model to allow others to test it “because replicability is a key component to scientific research,” and also published a research paper for the AI research community. “Apple’s goal with the release of OpenELM and the publication of the OpenELM research paper was to empower and strengthen the open research community in order to pave the way for future open research endeavors.”

Cerebras, a defendant in James v. Cerebras Systems, has raised the same research defense as Apple:

Even in the lawsuits in which AI companies deployed their AI models commercially after training, a related question can be asked: Is researching, developing, and improving AI models on unlicensed works to advance the state of the art in the United States a transformative fair use purpose, or not?

(Non)commercial character of use is related to, but an “other consideration” than, purpose of use

Some people wrongly conflate two different but related factors in the fair use analysis: (1) whether the defendant had a “further” or “transformative” purpose in using copyrighted works and (2) whether such use was noncommercial or commercial.

As the Supreme Court in Andy Warhol Foundation v. Goldsmith explained, commerciality is the “other consideration[]” analyzed in conjunction with the defendant’s purpose, which is a “matter of degree.” But purpose and commerciality are not the same thing.

This important distinction could not be clearer in Hachette Book Group, Inc. v. Internet Archive. Even though the defendant Internet Archive's redistribution of books to the public was a noncommercial use, it did not serve a transformative purpose.

The Second Circuit nicely explained the difference between purpose and commerciality:

The Practices of AI Researchers Who Use Copyrighted Works in AI Research at Universities Will Be Tested Too

The upshot is that courts will be asked to examine the nature of the use of copyrighted works to research, develop, and improve AI models generally, whether at universities or at AI companies.

The question that Apple and Cerebras raise is the correct one.

For, if using unlicensed copyrighted works to research and develop AI models is generally illegal and not a fair use purpose, as some plaintiffs are arguing, it should not be allowed at academic institutions either.

And, if every work used in AI research must be licensed, that research will stall.
