Magistrate Judge Wang rules against OpenAI again, holding it waived attorney-client privilege re: deletion of Books1, Books2 datasets. Cold comfort to OpenAI: Judge finds no crime-fraud exception applies.

Add another major discovery loss for OpenAI before Magistrate Judge Wang. This is one of the biggest.

Judge Wang just issued an opinion finding that OpenAI waived its attorney-client privilege as to communications related to OpenAI’s deletion of the Books1 and Books2 datasets. OpenAI’s attorneys get to be deposed regarding their communications. Oh boy.

“OpenAI has expressly waived privilege over attorney-client communications regarding ‘non-use’ as a reason for the deletion of Books1 and Book2.” (17)
“OpenAI has also waived privilege over all communications regarding the ‘reasons’ for the deletion of Books1 and Books2 by putting its good faith and state of mind at issue.” (17)
“Thus, given the circumstances of this case, and as a matter of fairness to prevent the selective and potentially misleading presentation of evidence, unless OpenAI explicitly and unambiguously states its intention to forego asserting its good faith defense and any factual arguments in opposition to Class Plaintiffs’ allegations of willfulness, OpenAI has waived attorney-client privilege.” (23)

Cold comfort: the Magistrate Judge also ruled that the plaintiffs had failed to establish that the crime-fraud exception applied. Reaching this ground seems unnecessary given the Judge’s earlier holding that OpenAI waived its attorney-client privilege.

I expect OpenAI will appeal this decision to Judge Stein.

What OpenAI must produce

What about Judge Alsup’s decision faulting Anthropic for permanently storing datasets in a library?

Very little information has been publicly disclosed about the reason(s) for OpenAI’s deletion of Books1 and Books2. Apparently, as Judge Wang recounts, OpenAI’s reasons have shifted from non-use of the datasets (presumably after they were used to train earlier models) to a retraction of the non-use reason in 2025 based on OpenAI’s assertion of attorney-client privilege.

Putting aside the precise reason for the deletion, it’s worth noting that deletion of datasets after using them to train an AI model would be consistent with the fair use approach Judge Alsup took in Bartz v. Anthropic. Judge Alsup faulted Anthropic for not deleting but instead permanently storing datasets in a general library. That ruling, coupled with the ruling on use of shadow libraries, is what prompted a settlement by Anthropic.

Judge Alsup’s Solomonic judgment on fair use in AI training & acquiring pirated books: is it the blueprint for the future of AI training? Part I: Pirated copies

By contrast, Judge Wang ruled that OpenAI’s deletion of these 2 datasets — instead of storing them in a permanent library — could be evidence of OpenAI’s willful infringement.

In sum, Judge Alsup in Bartz v. Anthropic expected, for fair use in training, that datasets would be deleted once training was over. Otherwise, a permanent library derived from shadow libraries could be a separate, infringing use.

By contrast, here, the reasons for deleting Books1 and Books2 after training apparently can be evidence of willful infringement.

Under Bartz v. Anthropic and Judge Wang’s ruling on deletion of datasets, the rule appears to be: Damned if you do, damned if you don’t.

Has Magistrate Judge Wang overread the Bartz v. Anthropic fair use decision in her recent decision on OpenAI’s waiver of attorney-client privilege?

Plaintiffs’ description of timeline of OpenAI’s alleged shifting reasons



Related Stories

Magistrate Judge Wang issues ominous order asking for more briefing on whether OpenAI waived attorney-client privilege re: its destruction of Books1 and Books2 datasets.

OpenAI loses yet another discovery motion as Magistrate Judge Wang orders OpenAI to produce “Slack messages between Adam Nace, Alex Paino, Bob McGrew, Matt Knight, Nick Ryder, Lilian Weng, Jeff Wu, Jakub Pachocki, Jason Kwon, and Che Chang about deleting Library Genesis”

Magistrate Judge Wang decides another discovery loss for OpenAI

Magistrate Wang denies OpenAI’s motion to reconsider order to preserve output log data

Magistrate Judge Wang allows discovery related to OpenAI’s alleged use of Library Genesis, shadow library

Magistrate Judge Wang sticks to her order to OpenAI to produce 20 million de-identified consumer ChatGPT logs. But Judge is entertaining reconsideration.

Magistrate Judge Wang denies OpenAI’s discovery motion seeking New York Times’ use of AI

Magistrate Judge Wang gives Book Authors, Newspapers discovery victory. California Labor Code § 980 does not bar discovery of social media messages on X.com

Magistrate Judge Wang denies OpenAI’s discovery motion in book authors cases re: fair use on same grounds as denial in New York Times lawsuit

Chat GPT Is Eating the World