Vector Embedding : Build LLM from scratch

Welcome back to the series on building Large Language Models (LLMs) from scratch! In this lecture, we move beyond basic tokenization to explore vector embeddings and see how models represent the semantic meaning of words.

In this session, we cover:

The Limits of Tokenization: Why simply converting words to numbers (e.g., assigning "cat" the ID 23) isn't enough. An arbitrary ID carries no information about the word's meaning or context.
Introduction to Vector Embeddings: How models capture the "nearness" or similarity of words by assigning weighted characteristics across multiple dimensions (e.g., "is it an animal?", "is it a pet?", "is it edible?").
Embeddings in GPT Models: A historical look at how dimensionality has scaled, from GPT-1 using 768 dimensions with a vocabulary of roughly 40,000 tokens, up to GPT-3 using 12,288 dimensions per token.
The Optimization Process: How training starts with random weights, makes predictions, measures the error with a loss function, and adjusts the embedding weights along with the rest of the model during training.
Exploring Google's Word2Vec: A hands-on demonstration using Google's pre-trained Word2Vec vectors, which cover a vocabulary of 3 million words and phrases in 300 dimensions.
Vector Math & Word Similarity: Watch the model calculate semantic relationships, like finding that "Yen" minus "Japan" plus "India" lands closest to "Rupee," or seeing that the similarity between "King" and "Queen" is 65% compared to just 22% for "King" and "man".
Custom vs. Pre-Trained Embeddings: Why you might encounter out-of-vocabulary errors (like with the name "Bill Gates") and why training your own vector embeddings matters when your training data is specialized.

Stay tuned for our next steps in the LLM-building journey!
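The word-similarity and vector-math ideas in the description can be tried without downloading any model. The sketch below is only an illustration: the vectors are hand-picked toy values, not learned embeddings and not the Word2Vec vectors used in the lecture, and the feature dimensions and the helper names `cosine_similarity` and `analogy` are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy, hand-picked vectors (assumption, not learned values).
# Dimensions: (is_animal, is_pet, is_edible)
features = {
    "cat": [0.9, 0.9, 0.1],
    "dog": [0.9, 0.8, 0.1],
    "apple": [0.0, 0.0, 0.9],
}

# Toy vectors for the analogy (assumption, not learned values).
# Dimensions: (is_currency, relates_to_japan, relates_to_india)
analogy_vectors = {
    "yen": [1.0, 1.0, 0.0],
    "japan": [0.0, 1.0, 0.0],
    "india": [0.0, 0.0, 1.0],
    "rupee": [1.0, 0.0, 1.0],
    "dollar": [1.0, 0.0, 0.0],
}


def analogy(vectors, a, b, c):
    """Return the word whose vector is closest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the query words themselves, as embedding libraries usually do.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(vectors[w], target))


cat_dog = cosine_similarity(features["cat"], features["dog"])
cat_apple = cosine_similarity(features["cat"], features["apple"])
print(f"cat vs dog:   {cat_dog:.2f}")
print(f"cat vs apple: {cat_apple:.2f}")

# "yen" - "japan" + "india" should land nearest to "rupee" in this toy space.
print(analogy(analogy_vectors, "yen", "japan", "india"))
```

In a real embedding the dimensions are not hand-labelled like this; they are learned during training and usually have no single human-readable meaning. The arithmetic is the same, though: similarity is a cosine between vectors, and an analogy is a nearest-neighbour search around a sum and difference of vectors.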
