96. Jan Leike - AI alignment at OpenAI

Published: Sept. 29, 2021, 2:17 p.m.

The more powerful our AIs become, the more we'll have to ensure that they're doing exactly what we want. If we don't, we risk building AIs that find dangerously creative solutions with side effects that are undesirable, or downright dangerous. Even a slight misalignment between the motives of a sufficiently advanced AI and human values could be hazardous.

That's why leading AI labs like OpenAI are already investing significant resources in AI alignment research. Understanding that research is important if you want to understand where advanced AI systems might be headed, and what challenges we might encounter as AI capabilities continue to grow. That's what this episode of the podcast is all about. My guest today is Jan Leike, head of AI alignment at OpenAI, and an alumnus of DeepMind and the Future of Humanity Institute. As someone who works directly with some of the world's largest AI systems (including OpenAI's GPT-3), Jan has a unique and interesting perspective to offer on both the current challenges facing alignment researchers and the most promising future directions the field might take.

--- 

Intro music:

➞ Artist: Ron Gelinas

➞ Track Title: Daybreak Chill Blend (original mix)

➞ Link to Track: https://youtu.be/d8Y2sKIgFWc

--- 

Chapters:  

0:00 Intro

1:35 Jan's background

7:10 Timing of scalable solutions

16:30 Recursive reward modeling

24:30 Amplification of misalignment

31:00 Community focus

32:55 Wireheading

41:30 Arguments against the democratization of AIs

49:30 Differences between capabilities and alignment

51:15 Research to focus on

1:01:45 Formalizing an understanding of personal experience

1:04:04 OpenAI hiring

1:05:02 Wrap-up