Safer unsafe with Codex and miri (2026) - Predrag Gruevski

Can Unsafe Rust save millions in data center costs? Learn how OpenAI uses Unsafe Rust and Codex to solve massive scaling challenges. In this Rust NYC talk, we explore the high-stakes world of GPU model training, where unchecked tail latency (P99999) can cost millions. Discover how "hedging" storage requests mitigates this, but requires walking a dangerous tightrope of Unsafe Rust, dodging aliasing violations and Undefined Behavior (UB). See how the OpenAI team leveraged Codex to build Miri-approved test harnesses, separate high-value engineering from tedious plumbing, and ship flawless infrastructure faster.

Join the community on Discord via https://rusteastcoast.com

⏱️ Chapters
00:00 Intro and Background
01:06 Unsafe Rust Stakes
02:10 Codex TLDR and Outline
03:21 Training and Checkpoints
05:26 Outage Costs at Scale
08:04 Tail Latency Reality
09:29 Hedging to Beat Tails
12:04 Rust UB and Async Reads
13:45 Tokio Poll Read Tightrope
15:45 Codex Workflow and Testing
22:07 Results and Takeaways
22:50 Video editing sponsor

🔗 Links & Resources
*Rust NYC Meetup:* https://www.meetup.com/rust-nyc/
*OpenAI Codex:* https://openai.com/blog/openai-codex
*Tokio `AsyncRead` Docs:* https://docs.rs/tokio/latest/tokio/io...
*Miri (Rust UB Checker):* https://github.com/rust-lang/miri