Encoders, cross attention and masking for LLMs: SuperDataScience Founder Kirill Eremenko returns to the SuperDataScience podcast, where he speaks with Jon Krohn about transformer architectures and why they are a new frontier for generative AI. If you're interested in applying LLMs to your business portfolio, you'll want to pay close attention to this episode!

This episode is brought to you by Ready Tensor, where innovation meets reproducibility, by Oracle NetSuite business software, and by Intel and HPE Ezmeral Software Solutions. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

In this episode you will learn:
• How decoder-only transformers work [15:51]
• How cross-attention works in transformers [41:05]
• How encoders and decoders work together (an example) [52:46]
• How encoder-only architectures excel at understanding natural language [1:20:34]
• The importance of masking during self-attention [1:27:08] (see the short code sketch below)

Additional materials: www.superdatascience.com/759
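For readers who want a concrete picture of two of the topics above before listening, here is a minimal sketch (not code from the episode) of scaled dot-product attention, showing both the causal mask used in decoder self-attention and the unmasked form used for cross-attention. All function names, tensor names, and sizes are illustrative assumptions.

```python
# Minimal illustrative sketch of attention with and without a causal mask.
# Names and shapes are assumptions for this example, not from the episode.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the chosen axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values, causal=False):
    # queries: (q_len, d); keys, values: (kv_len, d) for one sequence/head.
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)  # (q_len, kv_len) similarity scores
    if causal:
        # Look-ahead mask: position i may only attend to positions <= i,
        # so the decoder cannot peek at future tokens during training.
        q_len, kv_len = scores.shape
        mask = np.triu(np.ones((q_len, kv_len), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    weights = softmax(scores, axis=-1)      # masked positions get zero weight
    return weights @ values

rng = np.random.default_rng(0)
dec = rng.normal(size=(4, 8))   # 4 decoder-side token embeddings (toy data)
enc = rng.normal(size=(6, 8))   # 6 encoder-side token embeddings (toy data)

self_out = attention(dec, dec, dec, causal=True)   # masked self-attention
cross_out = attention(dec, enc, enc)               # cross-attention: Q from decoder, K/V from encoder
print(self_out.shape, cross_out.shape)             # (4, 8) (4, 8)
```

In the cross-attention call, the queries come from the decoder while the keys and values come from the encoder output, which is the mechanism the episode walks through when explaining how encoders and decoders work together.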