Perhaps It Is A Bad Thing That The World's Leading AI Companies Cannot Control Their AIs

Published: Dec. 14, 2022, 12:51 p.m.

https://astralcodexten.substack.com/p/perhaps-it-is-a-bad-thing-that-the

I. The Game Is Afoot

Last month I wrote about Redwood Research’s fanfiction AI project. They tried to train a story-writing AI not to include violent scenes, no matter how suggestive the prompt. Although their training made the AI reluctant to include violence, they never reached a point where clever prompt engineers couldn’t get around their restrictions.

Now that same experiment is playing out on the world stage. OpenAI released a question-answering AI, ChatGPT. If you haven’t played with it yet, I recommend it. It’s very impressive!

Every corporate chatbot release is followed by the same cat-and-mouse game with journalists. The corporation tries to program the chatbot to never say offensive things. Then the journalists try to trick the chatbot into saying “I love racism”. When they inevitably succeed, they publish an article titled “AI LOVES RACISM!” Then the corporation either recalls its chatbot or pledges to do better next time, and the game moves on to the next company in line.