- In 3 months, book author Darius H. James has filed 3 copyright suits related to the use or creation of shadow libraries.
- 1 lawsuit was filed in Montana (Snowflake), while the other 2 were filed in San Francisco (Together Computer and Cerebras Systems).
- All 3 lawsuits propose class actions.
Book author Darius H. James has filed three proposed class action copyright suits since last November:
(1) one against Snowflake Inc. for allegedly training its LLMs on the RedPajama and Books3 datasets;
(2) one against Together Computer for alleged infringement when it “orchestrated and assembled the RedPajama dataset, a dataset comprised of a mixture of publicly available works as well as copyrighted works”; and
(3) one against Cerebras Systems for admitting, in a research article, that it “pre-train[s] models on the Pile dataset, which consists of data from 22 data sources, including Common Crawl, PubMed Central, Books3, OpenWebText2, Github, and arXiv (Gao et al., 2020).”
By our count, these lawsuits raise the total number of copyright suits against AI companies in the United States to 74.
DOWNLOAD JAMES’S COMPLAINTS BELOW:
Cerebras Systems:
Together Computer:
