
Research: Did Apple researchers overstate "The Illusion of Thinking" in reasoning models? Opus and Lawsen think so.

In "The Illusion of the Illusion of Thinking," Alex Lawsen of Open Philanthropy and "C. Opus" of Anthropic (which may refer to Claude Opus, Anthropic's model) attempt to debunk the paper recently posted by Apple researchers, titled "The Illusion of Thinking."

The Apple paper concluded: "We show that state-of-the-art LRMs (e.g., o3-mini, DeepSeek-R1, Claude-3.7-Sonnet-Thinking) still fail to develop generalizable problem-solving capabilities, with accuracy ultimately collapsing to zero beyond certain complexities across different environments."

In this response, C. Opus and Lawsen argue: “Shojaee et al.’s results demonstrate that models cannot output more tokens than their context limits allow, that programmatic evaluation can miss both model capabilities and puzzle impossibilities, and that solution length poorly predicts problem difficulty. These are valuable engineering insights, but they do not support claims about fundamental reasoning limitations.”
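The rebuttal's token-budget point can be made concrete with a back-of-the-envelope sketch. The Apple paper's environments included Tower of Hanoi, whose minimum solution length doubles with each added disk; the tokens-per-move and output-budget figures below are hypothetical illustrations, not numbers from either paper:

```python
def hanoi_moves(n_disks: int) -> int:
    """Minimum number of moves to solve Tower of Hanoi with n disks: 2^n - 1."""
    return 2 ** n_disks - 1

# Hypothetical figures, chosen only to illustrate the scaling argument.
TOKENS_PER_MOVE = 10      # rough cost of writing out one move
OUTPUT_BUDGET = 64_000    # rough output-token budget for a model

for n in (10, 12, 15, 20):
    moves = hanoi_moves(n)
    tokens = moves * TOKENS_PER_MOVE
    verdict = "fits" if tokens <= OUTPUT_BUDGET else "exceeds budget"
    print(f"{n} disks: {moves} moves, ~{tokens} tokens ({verdict})")
```

Under these assumed numbers, a full written-out solution stops fitting in the output budget somewhere between 12 and 15 disks, so an accuracy "collapse" at that point could reflect a token limit rather than a reasoning limit, which is the crux of Lawsen and Opus's objection.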
