Nicolas Makaroff - AI Scientist - Hands-On with Tabular Foundation Models | PyData London 26
This hands-on tutorial takes participants from zero to confident use of tabular foundation models (TFMs). Using real datasets, we will run TabICL-style models, benchmark them rigorously against XGBoost and Random Forest, diagnose their behavior, and build intuition for when they help and when they don't.

Tabular foundation models are generating excitement, but most practitioners haven't used them yet. This 90-minute hands-on tutorial bridges that gap. Participants will work through four progressive notebooks on real-world datasets of varying difficulty. By the end, they won't just know about tabular FMs: they'll have run them, broken them, and compared them against familiar baselines.

Who is this for?

Data scientists and ML engineers who:
- Use sklearn / XGBoost / LightGBM regularly
- Are curious about tabular FMs but haven't tried them
- Want to build informed opinions grounded in hands-on experience

What we'll use

Models: Any TFM (TabICL, TabPFN, or Neuralk's proprietary model with free credits), XGBoost, Random Forest

Datasets: 3 curated real-world datasets chosen to expose different behaviors:
- A small medical dataset (~500 rows, 12 features), where TFMs tend to shine
- A medium e-commerce dataset (~5K rows, 40+ features with mixed types), a realistic "grey zone"
- A large, noisy dataset (~50K rows), where trees typically dominate

Stack: Python 3.9+, sklearn, tabicl, xgboost, matplotlib, pandas

www.pydata.org

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
