, , , ,

Magistrate Judge Wang rules v. OpenAI again, holding waived attorney-client privilege re: deletion of Books1, Books2 datasets. Cold comfort to OpenAI: Judge finds no crime-fraud exception applies.

Add another major discovery loss to OpenAI before Magistrate Judge Wang. This one of the biggest.

Judge Wang just issued an opinion finding that OpenAI waived its attorney-client privilege as to communications related to OpenAI’s deletion of the Books1 and Books2 datasets. OpenAI’s attorneys get to be deposed regarding their communications. Oh boy.

Cold comfort: Magistrate Judge also ruled that the plaintiffs had failed to establish the crime-fraud exception applied. This ground seems unnecessary given the Judge’s earlier holding that OAI waived its attorney-client privilege above.

I expect OpenAI will appeal this decision to Judge Stein.

What OpenAI must produce

What about Judge Alsup’s decision faulting Anthopric for permanently storing datasets in a library?

Very little information has been publicly disclosed about the reason(s) for OpenAI’s deletion of Books1 and Books2. Apparently, as Judge Wang recounts, OpenAI’s reasons have shifted from non-use of the datasets (presumably after they were used to train earlier models) to a retraction of the non-use reason in 2025 based on OpenAI’s assertion of attorney-client privilege.

Putting aside the precise reason for the deletion, it’s worth noting that deletion of datasets after using them to train an AI model would be consistent with the fair use approach Judge Alsup took in Bartz v. Anthropic. Judge Alsup faulted Anthropic for not deleting but instead permanently storing datasets in a general library. That ruling, coupled with the ruling on use of shadow libraries, is what prompted a settlement by Anthropic.

By contrast, Judge Wang ruled that OpenAI’s deletion of these 2 datasets — instead of storing them in a permanent library — could be evidence of OpenAI’s willful infringement.

In sum, Judge Alsup in Bartz v. Alsup expected, for fair use in training, the deletion of datasets after training was over. Otherwise, a permanent library derived from shadow libraries could be a separate use and infringing.

By contrast, here, the reasons for deleting Books1 and Books2 after training apparently can be evidence of willful infringement.

Under Bartz v. Anthropic and Judge Wang’s ruling on deletion of datasets, the rule appears to be: Damned if you do, damned if you don’t.

Plaintiffs’ description of timeline of OpenAI’s alleged shifting reasons

DOWNLOAD MAGISTRATE JUDGE WANG’S ORDER

Related Stories

Leave a Reply


Discover more from Chat GPT Is Eating the World

Subscribe now to keep reading and get access to the full archive.

Continue reading