Lev Konstantinovskiy - Text similarity with the next generation of word embeddings in Gensim
Description
What is the closest word to "king"? Is it "Canute" or is it "crowned"? There are many ways to define "similar words" and "similar texts", and the word embedding you choose should depend on your definition. A new generation of word embeddings that use morphological information and learning-to-rank has been added to the Gensim open-source NLP package: Facebook's FastText, VarEmbed and WordRank.

Abstract
There are many ways to find similar words and documents with Gensim, an open-source natural language processing library that I maintain. I will give an overview of modern word embeddings such as Google's Word2vec, Facebook's FastText, GloVe, WordRank and VarEmbed, and discuss which business tasks each fits best. What is the most similar word to "king"? It depends on what you mean by similar. "King" can be interchanged with "Canute", but its attribute is "crown". We will discuss how to achieve these two kinds of similarity from word embeddings.

www.pydata.org

PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
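The "closest word to king" question is commonly answered by ranking every other word by the cosine similarity of its vector to the query vector. Below is a minimal sketch of that ranking in plain Python. It does not use Gensim's API, and the three-dimensional vectors are hand-picked toy values chosen only for illustration; they are not learned from any corpus and say nothing about what a real embedding would return.

```python
import math

# Hypothetical toy vectors, hand-picked for illustration only.
# Real embeddings are learned from text and have far more dimensions.
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "Canute": [0.85, 0.75, 0.2],
    "crown": [0.6, 0.1, 0.9],
    "banana": [-0.5, 0.2, -0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=2):
    """Return the topn (word, similarity) pairs closest to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word  # skip the query word itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


ranking = most_similar("king", TOY_VECTORS)
for neighbour, score in ranking:
    print(f"{neighbour}: {score:.3f}")
```

Which neighbour ranks first depends entirely on the vectors, which is the point of the talk: an embedding trained to capture interchangeability and one trained to capture attributes would order the same vocabulary differently under the same ranking function.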


