Secure & Scalable AI on Ray + Kubernetes: Google’s Decoupled Agent Pattern | Ray Summit 2025
At Ray Summit 2025, Alex Bulankou and Brandon Royal from Google share how to bring agentic AI systems out of the lab and into production through the Decoupled Agent Pattern, a scalable, resilient, and secure architecture built on Ray and Kubernetes.

They begin by outlining the core production challenge of agentic systems: integrating LLMs, tools, and long-lived stateful agents while ensuring security, elasticity, and high-throughput execution. Traditional architectures struggle to balance these constraints. The Decoupled Agent Pattern addresses this by cleanly separating the stateful agent logic from the stateless, scalable tools it invokes.

At the heart of this pattern:
The agent's core logic runs as a durable Ray Actor, with lifecycle and placement managed by Ray's Global Control Service (GCS) for high availability.
Tools are executed as thousands of stateless Ray Tasks, enabling massive parallelism and elasticity.
Untrusted or dynamically generated code runs in gVisor sandboxes, providing kernel-level isolation without compromising throughput, made possible through Kubernetes' secure runtime capabilities.

Alex and Brandon demonstrate the architecture with a series of live scenarios, including a financial analysis agent running on a Ray cluster on Google Kubernetes Engine (GKE).

They then show how the architecture uses deep Kubernetes-native integrations. KubeRay's topology-aware placement allows Ray to understand node-level characteristics, enabling optimal scheduling. This unlocks intelligent capacity management with tools like Kueue for cost-efficient batch scheduling. It also provides a clear pathway to mission-critical resilience, supporting zero-downtime upgrades and fault-tolerant agent execution.

Attendees will leave with a practical blueprint for deploying agentic AI systems in production, combining Ray's distributed computing strengths with Kubernetes' security and orchestration capabilities to build scalable, resilient, and secure agentic runtimes.
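The split between stateful agent logic and stateless tools described above can be sketched in a few lines. This is a minimal illustration and not the speakers' implementation: it uses only the Python standard library, with a thread pool standing in for Ray Tasks and a plain object standing in for a Ray Actor. The names `fetch_price`, `Agent`, and `demo`, and the price data, are hypothetical and chosen only to echo the financial analysis scenario.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


# Stateless tool: a pure function of its input, so many copies can run in
# parallel. In a Ray deployment this is the role a remote task would play.
def fetch_price(ticker: str) -> float:
    # Hypothetical stand-in data; a real tool would call an external service.
    prices = {"AAA": 10.0, "BBB": 20.0, "CCC": 30.0}
    return prices[ticker]


@dataclass
class Agent:
    """Stateful agent: the only component that owns history across calls.

    In a Ray deployment this is the role a long-lived actor would play.
    """

    history: list = field(default_factory=list)

    def run(self, tickers, pool):
        # Fan out the stateless tool calls, then fold the results into state.
        results = list(pool.map(fetch_price, tickers))
        total = sum(results)
        self.history.append({"tickers": list(tickers), "total": total})
        return total


def demo():
    agent = Agent()
    # The pool is an assumption standing in for an elastic set of workers.
    with ThreadPoolExecutor(max_workers=4) as pool:
        first = agent.run(["AAA", "BBB"], pool)
        second = agent.run(["CCC"], pool)
    return first, second, len(agent.history)
```

Because the tools hold no state, the worker pool can grow or shrink without affecting the agent, and only the agent's history would need to be made durable.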


