The Obligatory GPT-3 Post

Published: June 12, 2020, 1:30 p.m.

https://slatestarcodex.com/2020/06/10/the-obligatory-gpt-3-post/

I.

I would be failing my brand if I didn't write something about GPT-3, but I'm not an expert and discussion is still in its early stages. Consider this a summary of some of the interesting questions I've heard posed elsewhere, especially comments by gwern and nostalgebraist. Both of them are smart people who I broadly trust on AI issues, and both have done great work with GPT-2. Gwern has gotten it to write poetry, compose music, and even sort of play some chess; nostalgebraist has created nostalgebraist-autoresponder (a Tumblr written by GPT-2 trained on nostalgebraist's own Tumblr output). Both of them disagree pretty strongly on the implications of GPT-3. I don't know enough to resolve that disagreement, so this will be a kind of incoherent post; hopefully it will stimulate some more productive comments. So:

OpenAI has released a new paper, Language Models Are Few-Shot Learners, introducing GPT-3, the successor to the wildly-successful language-processing AI GPT-2.

GPT-3 doesn't have any revolutionary new advances over its predecessor. It's just much bigger. GPT-2 had 1.5 billion parameters. GPT-3 has 175 billion. The researchers involved are very open about how it's the same thing but bigger. Their research goal was to test how GPT-like neural networks scale.
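(If you want a rough sense of where those parameter counts come from, here's a quick back-of-the-envelope sketch. It uses the layer counts and hidden sizes reported in the GPT-2 and GPT-3 papers and the standard ~12 × n_layer × d_model² approximation for transformer weights, which ignores embeddings, biases, and layer norms; it's my own illustration, not anything from the paper.)

```python
# Back-of-the-envelope parameter counts for GPT-2 vs. GPT-3.
# The 12 * n_layer * d_model^2 rule of thumb counts only the transformer
# blocks (attention + MLP weight matrices), so treat these as estimates.

def approx_params(n_layer: int, d_model: int) -> float:
    """Approximate transformer parameter count: 12 * n_layer * d_model^2."""
    return 12 * n_layer * d_model ** 2

models = {
    "GPT-2 (1.5B)": dict(n_layer=48, d_model=1600),
    "GPT-3 (175B)": dict(n_layer=96, d_model=12288),
}

for name, cfg in models.items():
    print(f"{name}: ~{approx_params(**cfg) / 1e9:.1f}B parameters")

# Output:
# GPT-2 (1.5B): ~1.5B parameters
# GPT-3 (175B): ~173.9B parameters
```

Same architecture, same formula; the only thing that changes is the size of the numbers you plug in.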

Before we get into the weeds, let's get a quick gestalt impression of how GPT-3 does compared to GPT-2.

Here's a sample of GPT-2 trying to write an article:
