Ross Andersen wrote a fascinating article on OpenAI’s Sam Altman. Buried in the article is what appears to be one of the most shocking revelations: the key breakthrough technology for ChatGPT, the transformer, came from a research paper written and shared online by Google Brain scientists in 2017.
The paper, titled “Attention Is All You Need,” was written by Google Brain scientists Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. The paper, which has now been cited 89,824 times, described a simple design, embodied in the transformer, for use in large language models: “In this work we propose the Transformer, a model architecture eschewing recurrence and instead relying entirely on an attention mechanism to draw global dependencies between input and output. The Transformer allows for significantly more parallelization and can reach a new state of the art in translation quality after being trained for as little as twelve hours on eight P100 GPUs.”
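The attention mechanism the abstract refers to is, at its core, the paper's scaled dot-product attention: each position in a sequence computes a weighted sum over all other positions, which is what lets the model "draw global dependencies" without recurrence. A minimal NumPy sketch of that single equation (the matrix shapes and random inputs here are toy values, not anything from the paper's actual experiments):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V,
    the core operation from "Attention Is All You Need"."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # pairwise query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                       # each output row: weighted sum of values

# Toy example: a "sequence" of 3 tokens, each a 4-dimensional vector
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # one 4-dimensional output per token
```

Because every token attends to every other token in one matrix multiply, the whole computation parallelizes across the sequence, which is the property the abstract highlights over recurrent models.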
In the Atlantic article, Ilya Sutskever, OpenAI’s chief scientist, who reportedly used to work at Google Brain, describes the Google Brain article as providing the key breakthrough for ChatGPT:
“The next day, when the paper came out, we were like, ‘That is the thing,’ ” Sutskever told me. “ ‘It gives us everything we want.’ ”
Ross Andersen, DOES SAM ALTMAN KNOW WHAT HE’S CREATING?, The Atlantic
The next year, OpenAI released its first GPT (generative pre-trained transformer). By the GPT-2 version, the AI had developed the emergent ability to perform language translation! And the rest is, well, history. Now Google is playing catch-up to keep pace with OpenAI and its business partner, rival Microsoft.
So, did Google really hand over the key breakthrough technology for ChatGPT to OpenAI in 2017? That’s what it sounds like from Sutskever’s account in the Atlantic.
And it appears to be known in the AI community, and perhaps in business circles, that Google was responsible for the transformer. But, as far as we can tell, the media have not widely reported it, much less recognized its significance for ChatGPT’s success. (See, for example, the Wall Street Journal article, How Google Became Cautious of AI and Gave Microsoft an Opening, which tersely states: “Mr. Shazeer [then at Google Brain] had helped develop the Transformer, a widely heralded new type of AI model that made it easier to build increasingly powerful programs like the ones behind ChatGPT.”)