BERTopic Explained

90% of the world's data is unstructured. It is built by humans, for humans. That's great for human consumption, but it is very hard to organize when we begin dealing with the massive amounts of data abundant in today's information age. Organization is complicated because unstructured text data is not intended to be understood by machines, and having humans process this abundance of data is wildly expensive and *very slow*.

Fortunately, there is light at the end of the tunnel. More and more of this unstructured text is becoming accessible to, and understood by, machines. We can now search text based on *meaning*, identify the sentiment of text, extract entities, and much more.

Transformers are behind much of this. These transformers are (unfortunately) not Michael Bay's Autobots and Decepticons and (fortunately) not buzzing electrical boxes. Our NLP transformers lie somewhere in the middle: they're not sentient Autobots (yet), but they can understand language in a way that existed only in sci-fi until just a few years ago.

Machines with a human-like comprehension of language are pretty helpful for organizing masses of unstructured text data. In machine learning, we refer to this task as *topic modeling*, the automatic clustering of data into particular topics.

BERTopic takes advantage of the superior language capabilities of these (not yet sentient) transformer models and uses some other ML magic like UMAP and HDBSCAN (more on these later) to produce one of the most advanced techniques in language topic modeling today.

🌲 Pinecone article: https://www.pinecone.io/learn/bertopic
🔗 Code notebooks: https://github.com/pinecone-io/exampl...
🤖 70% Discount on the NLP With Transformers in Python course: https://bit.ly/3DFvvY5
🎉 Subscribe for Article and Video Updates! / subscribe / membership
👾 Discord: / discord

00:00 Intro
01:40 In this video
02:58 BERTopic Getting Started
08:48 BERTopic Components
15:21 Transformer Embedding
18:33 Dimensionality Reduction
25:07 UMAP
31:48 Clustering
37:22 c-TF-IDF
40:49 Custom BERTopic
44:04 Final Thoughts
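
For readers who want to see how the components named in the chapters (transformer embedding, UMAP, HDBSCAN, c-TF-IDF) fit together, here is a minimal sketch of a custom BERTopic pipeline. It is not the code from the video or the linked notebooks. The embedding model name, the parameter values, and the helper names build_topic_model and fit_topics are assumptions chosen for illustration, and the sketch assumes the bertopic, umap-learn, hdbscan, and sentence-transformers packages are installed.

```python
from typing import List


def build_topic_model(min_cluster_size: int = 15, n_components: int = 5):
    """Assemble a BERTopic model from explicitly chosen components.

    Hypothetical helper for illustration; all parameter values are assumptions.
    """
    # Imported lazily so the heavy dependencies load only when a model is built.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from sentence_transformers import SentenceTransformer
    from umap import UMAP

    # 1. Transformer embedding: maps each document to a dense vector.
    #    "all-MiniLM-L6-v2" is an assumed model choice.
    embedding_model = SentenceTransformer("all-MiniLM-L6-v2")

    # 2. Dimensionality reduction: compress the embeddings before clustering.
    umap_model = UMAP(
        n_neighbors=15,
        n_components=n_components,
        min_dist=0.0,
        metric="cosine",
        random_state=42,  # fixed seed so repeated runs are comparable
    )

    # 3. Clustering: density-based, so the number of topics is not fixed upfront.
    hdbscan_model = HDBSCAN(
        min_cluster_size=min_cluster_size,
        metric="euclidean",
        cluster_selection_method="eom",
        prediction_data=True,
    )

    # 4. BERTopic applies c-TF-IDF on top of the clusters to pick topic words.
    return BERTopic(
        embedding_model=embedding_model,
        umap_model=umap_model,
        hdbscan_model=hdbscan_model,
    )


def fit_topics(docs: List[str]):
    """Fit the pipeline on raw text and return the model with per-document topics."""
    topic_model = build_topic_model()
    topics, _probs = topic_model.fit_transform(docs)
    return topic_model, topics
```

Calling fit_topics(docs) on a list of strings returns the fitted model and one topic id per document, and topic_model.get_topic_info() then summarizes the topics found. Documents that HDBSCAN treats as outliers are assigned topic -1. Passing the components explicitly, rather than relying on the defaults, makes each stage easy to swap out or tune on its own.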
