-
Sergey Brin admits Google fumbled developing AI after seminal transformer paper (Attention Is All You Need) that spawned ChatGPT
Read more: Sergey Brin recently admitted that Google underinvested in AI after Google researchers posted their seminal 2017 paper on the transformer architecture, “Attention Is All You Need.” Because Google shared the discovery with everyone, OpenAI ran with it, which led to ChatGPT. Sam Altman also recently said that, had Google taken OpenAI…
-
DeepSeek new paper: model uses images of text for vision tokens instead of text tokens. Optical compression might cut costs, computation needed
Read more: DeepSeek, based in China, is innovating again. In a fascinating paper posted on GitHub, “DeepSeek-OCR: Contexts Optical Compression,” DeepSeek uses a new method of “compressing long contexts via optical 2D mapping.” Instead of text tokens, the model uses image, or vision, tokens of the text. Basically, it is like taking a screenshot of a page of…
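To see why encoding a “screenshot” of text could cut token counts, here is a rough back-of-the-envelope sketch. All of the constants (characters per text token, vision tokens per page) are illustrative assumptions, not figures from the DeepSeek paper:

```python
# Back-of-the-envelope sketch of optical context compression.
# All constants here are illustrative assumptions, not values from
# the DeepSeek-OCR paper.

def text_token_count(n_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough text-token count, assuming ~4 characters per token."""
    return round(n_chars / chars_per_token)

def compression_ratio(n_chars: int,
                      vision_tokens_per_page: int = 256,
                      chars_per_token: float = 4.0) -> float:
    """Text tokens replaced per vision token if a whole page of text
    is encoded as a fixed budget of vision tokens (a 'screenshot')."""
    return text_token_count(n_chars, chars_per_token) / vision_tokens_per_page

# A dense page of ~10,000 characters: ~2,500 text tokens,
# versus an assumed budget of 256 vision tokens for its image.
print(compression_ratio(10_000))
```

Under these assumed numbers the ratio comes out to roughly 10x fewer tokens per page; the real savings would depend entirely on how aggressively the model’s vision encoder compresses each page image.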
-
Geoffrey Hinton: AI can develop “maternal instincts,” avoid doomsday scenarios
Read more: Geoffrey Hinton recently articulated a vision for AI’s development that is less of a doomsday scenario posing an existential threat to humanity. The solution: AI can develop empathy for humans, what Hinton calls a “maternal instinct.” Take a listen. I think there is a way that we can co-exist with things that are smarter and…
-
AI training in the shadow of Geoffrey Hinton: was it fair use or not?
Read more: AI is one of the most disruptive technologies of the 21st century. It may go down as the most disruptive when history books are written. AI’s disruption has triggered a wide spectrum of reactions among people, including hostility and outright vitriol. There’s probably no more triggering aspect of AI than the edifice on which it…
-
Meta Superintelligence Lab paper on Agents Research Environments
Read more: Interesting paper showing the plateauing of scaling up models, and how to respond to that. You can download the paper from Meta.
-
Google DeepMind research paper on Generative Data Refinement to cleanse undesirable content
Read more: Google DeepMind posted a preprint paper on “Generative Data Refinement: Just Ask for Better Data.” Abstract: For a fixed parameter size, the capabilities of large models are primarily determined by the quality and quantity of its training data. Consequently, training datasets now grow faster than the rate at which new data is indexed on the…
-
Paper on The Illusion of Diminishing Returns [of scaling]
Read more: Fascinating research paper refuting the notion that scaling of datasets has diminishing returns in the performance of large language models. While that might be true for simple “single-step” tasks, it is not for more complex “long horizon” tasks, according to the researchers (Akshit Sinha, Arvindh Arun, Shashwat Goel, Steffen Staab, Jonas Geiping), who posted…
-
Research: Did Apple researchers overstate “The Illusion of Thinking” in reasoning models? Opus, Lawsen think so.
Read more: In “The Illusion of the Illusion of Thinking,” Alex Lawsen of Open Philanthropy and “C. Opus” of Anthropic (which may refer to Claude Opus, Anthropic’s model) try to debunk the paper recently posted by Apple researchers titled “The Illusion of Thinking.” The Apple paper concluded: “We show that state-of-the-art LRMs (e.g., o3-mini, DeepSeek-R1, Claude-3.7-Sonnet-Thinking) still…
