-
Sergey Brin admits Google fumbled developing AI after seminal transformer paper (Attention Is All You Need) that spawned ChatGPT
Read more: Sergey Brin recently admitted that Google underinvested in AI after Google researchers posted their seminal 2017 paper on the transformer architecture, “Attention Is All You Need.” Because Google shared the discovery with everyone, OpenAI ran with it, which led to ChatGPT. Sam Altman also recently said that, had Google taken OpenAI…
-
DeepSeek new paper: model uses images of text for vision tokens instead of text tokens. Optical compression might cut costs, computation needed
Read more: DeepSeek, based in China, is innovating again. In a fascinating paper posted on GitHub, “DeepSeek-OCR: Contexts Optical Compression,” DeepSeek uses a new method of “compressing long contexts via optical 2D mapping.” Instead of text tokens, the model uses image, or vision, tokens of the text. Basically, it is like taking a screenshot of a page of…
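To see why encoding a “screenshot” of text could cut token counts, here is a rough back-of-the-envelope sketch. All of the constants (characters per text token, vision tokens per page) are illustrative assumptions, not figures from the DeepSeek paper:

```python
# Back-of-the-envelope sketch of optical context compression.
# All constants here are illustrative assumptions, not values from
# the DeepSeek-OCR paper.

def text_token_count(n_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough text-token count, assuming ~4 characters per token."""
    return round(n_chars / chars_per_token)

def compression_ratio(n_chars: int,
                      vision_tokens_per_page: int = 256,
                      chars_per_token: float = 4.0) -> float:
    """Text tokens replaced per vision token if a whole page of text
    is encoded as a fixed budget of vision tokens (a 'screenshot')."""
    return text_token_count(n_chars, chars_per_token) / vision_tokens_per_page

# A dense page of ~10,000 characters: ~2,500 text tokens,
# versus an assumed budget of 256 vision tokens for its image.
print(compression_ratio(10_000))
```

Under these assumed numbers the ratio comes out to roughly 10x fewer tokens per page; the real savings would depend entirely on how aggressively the model’s vision encoder compresses each page image.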
-
Geoffrey Hinton: AI can develop “maternal instincts,” avoid doomsday scenarios
Read more: Geoffrey Hinton recently articulated a vision for AI’s development that is less of a doomsday scenario posing an existential threat to humanity. The solution: AI can develop empathy for humans, what Hinton calls a “maternal instinct.” Take a listen. I think there is a way that we can co-exist with things that are smarter and…
-
AI training in the shadow of Geoffrey Hinton: was it fair use or not?
Read more: AI is one of the most disruptive technologies of the 21st century. It may go down as the most disruptive when history books are written. AI’s disruption has triggered a wide spectrum of reactions among people, including hostility and outright vitriol. There’s probably no more triggering aspect of AI than the edifice on which it…
-
Meta Superintelligence Lab paper on Agents Research Environments
Read more: Interesting paper showing the plateauing of scaling up models, and how to respond to that. You can download the paper from Meta.
-
Google DeepMind research paper on Generative Data Refinement to cleanse undesirable content
Read more: Google DeepMind posted a preprint paper on “Generative Data Refinement: Just Ask for Better Data.” Abstract: For a fixed parameter size, the capabilities of large models are primarily determined by the quality and quantity of its training data. Consequently, training datasets now grow faster than the rate at which new data is indexed on the…
-
Paper on The Illusion of Diminishing Returns [of scaling]
Read more: Fascinating research paper refuting the notion that scaling of datasets has diminishing returns in the performance of large language models. While that might be true for simple “single-step” tasks, it is not for more complex “long horizon” tasks, according to the researchers (Akshit Sinha, Arvindh Arun, Shashwat Goel, Steffen Staab, Jonas Geiping), who posted…
-
Research: Did Apple researchers overstate “The Illusion of Thinking” in reasoning models? Opus, Lawsen think so.
Read more: In “The Illusion of the Illusion of Thinking,” Alex Lawsen of Open Philanthropy and “C. Opus” of Anthropic (which may refer to Claude Opus, Anthropic’s model) try to debunk the paper recently posted by Apple researchers titled “The Illusion of Thinking.” The Apple paper concluded: “We show that state-of-the-art LRMs (e.g., o3-mini, DeepSeek-R1, Claude-3.7-Sonnet-Thinking) still…
