Ray Agent Engine: Deploying AI Agents with Ray Serve | Ray Summit 2025

At Ray Summit 2025, Juncheng Ma, Deepak Chandramouli, and Bhumik Thakkar from Apple share how Ray Serve can be used as a powerful, framework-agnostic foundation for deploying the next generation of AI agents at scale.

They begin by outlining the growing complexity of modern AI agents (multi-step reasoning, tool use, memory, planning, and dynamic interaction patterns) and the corresponding challenges in deploying these systems reliably in production. Traditional serving frameworks often assume static inference graphs, making them ill-suited for agentic workloads that require flexible orchestration and adaptive execution.

The speakers then walk through how Apple leverages Ray as an Agent Engine, combining Ray Serve's distributed execution model with its built-in autoscaling, request routing, and traffic management capabilities. This approach enables:

- Framework-agnostic agent deployment, allowing teams to integrate agents built using any architecture or library
- Dynamic, scalable execution for agents that require multi-step reasoning and complex control flows
- High robustness and resiliency, even under unpredictable load patterns
- Simplified operational workflows, reducing the burden of managing complex agent pipelines

They share practical lessons learned from deploying agentic systems internally at Apple, highlighting patterns that generalize to a wide variety of agent frameworks and application domains.

Attendees will leave with actionable insights on how to build scalable, resilient, and production-ready agent applications using Ray Serve, regardless of the underlying agent design or development workflow.
