Open Source Model Performance Optimization With SGLang - Yineng Zhang, Together AI
SGLang is an open-source fast inference framework in the PyTorch ecosystem built for performant, flexible, extensible model serving. SGLang's growing popularity is in large part thanks to its community ethos and the participation of developers from around the world. Join this BoF session hosted by SGLang core maintainer Yineng Zhang to discuss the future of SGLang and learn how to get involved in the project.

