(GPT-2) Language Models are Unsupervised Multitask Learners | Paper Explained

Here’s another video from my GPT series, in which I analyze the GPT-2 paper, "Language Models are Unsupervised Multitask Learners". I took a closer look at the data-gathering process, the results, and the safety concerns that prevented the initial public release of the model.
Paper: https://d4mucfpksywv.cloudfront.net/b...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Links:
https://huggingface.co/datasets
https://openai.com/blog/better-langua...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Connect with me on:
LinkedIn - / maciej-balawejder-rt8015
GitHub - https://github.com/maciejbalawejder
Medium - / maciejbalawejder
Buy Me a Coffee - https://www.buymeacoffee.com/mbalawejder
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Timestamps:
0:00 Introduction
0:30 GPT-1 Recap
1:18 Abstract
2:20 Dataset
3:30 Byte Pair Encoding
6:30 Architecture
7:25 Results
7:50 LAMBADA
8:40 CBT
10:25 Winograd Schema Challenge
11:14 CoQA
11:35 Summarization
12:30 Translation
13:13 Question Answering
13:38 Conclusions
15:00 Safety Concerns
