Lecture 4: What are transformers?

In this lecture, we will understand the basics of the LLM secret sauce: transformers.

The key reference book, which this video series follows very closely, is Build a Large Language Model from Scratch, published by Manning Publications. All schematics and their descriptions are borrowed from this incredible book! It serves as a comprehensive guide to understanding and building large language models, covering key concepts, techniques, and implementations. Affiliate links for purchasing the book will be added soon. Stay tuned for updates!

0:00 Introduction
2:08 Transformer basics
5:33 Simplified transformer architecture
20:57 A note on attention
25:40 BERT and GPT
31:21 Difference between transformers and LLMs

=================================================
✉️ Join our FREE Newsletter: https://vizuara.ai/our-newsletter/
=================================================

Vizuara philosophy: As we learn AI/ML/DL material, we will share thoughts on what is actually useful in industry and what has become irrelevant. We will also share a lot of information on which subjects contain open areas of research. Interested students can start their research journey there.

If you are confused or stuck in your ML journey, perhaps courses and offline videos are not inspiring enough. What might inspire you is seeing someone else learn and implement machine learning from scratch.

No cost. No hidden charges. Pure old-school teaching and learning.

=================================================
🌟 Meet Our Team: 🌟

🎓 Dr. Raj Dandekar (MIT PhD, IIT Madras department topper)
🔗 LinkedIn: / raj-abhijit-dandekar-67a33118a

🎓 Dr. Rajat Dandekar (Purdue PhD, IIT Madras department gold medalist)
🔗 LinkedIn: / rajat-dandekar-901324b1

🎓 Dr. Sreedath Panat (MIT PhD, IIT Madras department gold medalist)
🔗 LinkedIn: / sreedath-panat-8a03b69a

🎓 Sahil Pocker (Machine Learning Engineer at Vizuara)
🔗 LinkedIn: / sahil-p-a7a30a8b

🎓 Abhijeet Singh (Software Developer at Vizuara, GSOC 24, SOB 23)
🔗 LinkedIn: / abhijeet-singh-9a1881192

🎓 Sourav Jana (Software Developer at Vizuara)
🔗 LinkedIn: / souravjana131
