Should AI companies pause frontier AI development to assess risk of AI’s ability to improve itself without human involvement?

The Anthropic Institute’s Marina Favaro and Jack Clark have suggested that the leading AI companies could agree to pause development of state-of-the-art AI models that can improve themselves without human intervention, a capability they call “recursive self-improvement.”

The idea of an AI industry pause isn’t new. The Future of Life Institute helped to organize an open letter from leading scientists and experts, dated March 22, 2023, calling for a six-month pause in AI development. (I joined that letter.)

The Anthropic Institute’s blog post identifies the challenge of coordinating an industry-wide AI pause around the world: how to get a verifiable commitment from each of the AI companies. A unilateral decision by one company won’t solve the problem and will hurt only that company, which could fall behind the state of the art being advanced by others.

Here’s the Anthropic Institute’s analysis of the conundrum of striking a pause:

What should we do?
If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing. But if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe. Without a global coordination mechanism, companies and governments will have to make difficult decisions about safety while under competitive and geopolitical pressures. [my emphasis added in bold]

We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology. The Anthropic Institute will conduct research—in collaboration with many others—and take actions to help build the systems that a credible slowdown or pause would require. These systems would enable frontier AI developers to verify that others globally have actually stopped or slowed, and that a bad actor could not use the auspices of a coordinated slowdown to jump ahead in secret. If such systems existed, we expect that we would slow down or temporarily pause, if other developers at or near the frontier also did so in a verifiable manner.

A meaningful slowdown or pause would require multiple well-resourced labs at or near the frontier, in multiple countries, agreeing to stop under the same conditions. It would also require that each can verify that the others have actually stopped. [my emphasis added in bold] Due to the unique characteristics of AI systems, the detectability (a lower standard than verifiability) element of this arms control problem is much more challenging than with other technologies.

Training runs are far easier to conceal than missile silos, their inputs are general-purpose, and the incentive to defect quietly is enormous, because whoever continues while others pause could inherit the lead. A credible pause also has to specify what triggers it, what lifts it, and who adjudicates.

None of this is necessarily impossible in principle—the world has built verification regimes for other complex technologies (e.g., the Intermediate-Range Nuclear Forces Treaty)—but those regimes took decades to build both the infrastructure and the trust. We don’t have that long. A unilateral pause by one lab, by contrast, is achievable immediately, but accomplishes much less: it would change who the front-runner is, but it would not create the wider deliberative process that is currently missing. [my emphasis added in bold]

In the coming months, we will organize conversations where policymakers, researchers, civil society, and other AI companies can help answer some of the questions this piece raises, especially around full recursive self-improvement and how to create better options for coordination and deliberation. We’ll publish what comes out of it. The window to investigate the questions together is here, and people outside AI companies should be involved in this deliberation.
Marina Favaro and Jack Clark of the Anthropic Institute

Substack

Chat GPT Is Eating the World
