747: Technical Intro to Transformers and LLMs, with Kirill Eremenko

Published: Jan. 9, 2024, noon

Attention and transformers in LLMs, the five stages of data processing, and a brand-new Large Language Models A-Z course: Kirill Eremenko joins host Jon Krohn to explore what goes into well-crafted LLMs, what makes Transformers so powerful, and how to succeed as a data scientist in this new age of generative AI.

This episode is brought to you by Intel and HPE Ezmeral Software Solutions, and by Prophets of AI, the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

In this episode you will learn:
• Supply and demand in AI recruitment [08:30]
• Kirill and Hadelin's new course on LLMs, “Large Language Models (LLMs), Transformers & GPT A-Z” [15:37]
• The learning difficulty in understanding LLMs [19:46]
• The basics of LLMs [22:00]
• The five building blocks of transformer architecture [36:29] (sketched in code below)
  - 1: Input embedding [44:10]
  - 2: Positional encoding [50:46]
  - 3: Attention mechanism [54:04]
  - 4: Feedforward neural network [1:16:17]
  - 5: Linear transformation and softmax [1:19:16]
• Inference vs training time [1:29:12]
• Why transformers are so powerful [1:49:22]

Additional materials: www.superdatascience.com/747
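
To make the five transformer building blocks listed above concrete, here is a minimal NumPy sketch of a single forward pass through them. All names and dimensions (vocab_size, seq_len, d_model, d_ff) are illustrative assumptions, not taken from the episode or the course, and residual connections and layer normalization are omitted for brevity.

```python
# A minimal sketch of the five transformer building blocks covered in the
# episode. Dimensions and weight initializations are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, seq_len, d_model, d_ff = 100, 8, 16, 64

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# 1: Input embedding -- map token ids to dense vectors via a lookup table.
embedding = rng.normal(size=(vocab_size, d_model))
tokens = rng.integers(0, vocab_size, size=seq_len)
x = embedding[tokens]                      # (seq_len, d_model)

# 2: Positional encoding -- sinusoidal position signals added to embeddings.
pos = np.arange(seq_len)[:, None]
dim = np.arange(d_model)[None, :]
angle = pos / 10000 ** (2 * (dim // 2) / d_model)
x = x + np.where(dim % 2 == 0, np.sin(angle), np.cos(angle))

# 3: Attention mechanism -- single-head scaled dot-product self-attention.
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d_model)        # (seq_len, seq_len)
x = softmax(scores) @ V

# 4: Feedforward neural network -- two linear layers with a ReLU between.
W1 = rng.normal(size=(d_model, d_ff))
W2 = rng.normal(size=(d_ff, d_model))
x = np.maximum(0, x @ W1) @ W2

# 5: Linear transformation and softmax -- project to vocabulary logits,
# then turn them into next-token probabilities.
W_out = rng.normal(size=(d_model, vocab_size))
probs = softmax(x @ W_out)                 # (seq_len, vocab_size)
print(probs[-1].argmax())                  # most likely next token id
```

Each numbered comment corresponds to one of the five stages in the list above, tracing a sequence of token ids through to next-token probabilities.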