Software 3.0: Building the Neural Computer

An engineering deep dive into the architecture of "Neural Computers" and Software 3.0 (as defined by Andrej Karpathy). Flipbook is a browser where every page is generated pixels, streamed live from a model. The interactive world models that make that possible (Genie 3, Oasis, MirageLSD) only cleared 24 fps at 720p in 2025.

We break down the latest research on closing the 100x latency gap to real-time generation. Consistency models and distribution matching distillation collapse 50 denoising steps to just one or two. Autoregressive factorization with KV-cached attention allows frames to be streamed much as an LLM streams tokens. To fix the autoregressive drift that used to break simulations after thirty seconds, the field is relying on Diffusion Forcing, Self Forcing, and Attention Sinks. For user input, we examine Latent Action Models: a network learns its own discrete control vocabulary from unlabeled gameplay, and standard keystrokes or clicks are mapped directly into the inference slot.

Many open problems remain. Following Sora's high compute burn and subsequent March 2026 shutdown, the financial viability of the "March of Nines" is in question. Furthermore, persistent state has no native place in a stateless pixel stream. We look at NeuralOS and system prompt learning as early bets for handling drafts, open tabs, and scroll positions. The neural substrate exists in research preview today, but building a fully generative operating system is a decade-long challenge.
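To make the real-time budget concrete, here is a minimal back-of-the-envelope sketch of the arithmetic behind the "16ms wall" and the move from 50 denoising steps to one or two. The function names are hypothetical, and the calculation assumes denoising steps run sequentially and that decoding, input handling, and network transfer cost nothing, so the figures are upper bounds rather than measurements of any real system.

```python
def frame_budget_ms(fps: float) -> float:
    """Wall-clock time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def per_step_budget_ms(fps: float, denoising_steps: int) -> float:
    """Upper bound on time per denoising step.

    Assumption: steps run one after another and no other stage
    (VAE decode, input handling, streaming) takes any time.
    """
    return frame_budget_ms(fps) / denoising_steps


if __name__ == "__main__":
    # 24 fps is the rate quoted in the description; 60 fps gives the ~16 ms wall.
    for fps in (24, 60):
        for steps in (50, 2, 1):
            budget = per_step_budget_ms(fps, steps)
            print(f"{fps} fps, {steps:>2} steps: {budget:.2f} ms per step")
```

At 60 fps the whole frame has about 16.7 ms, so a 50-step sampler would leave roughly a third of a millisecond per step, while a two-step distilled sampler leaves about 8.3 ms per step under the same assumptions.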
Chapters:
------------------------------
00:00 Flipbook: Every Pixel Streamed from a Model
00:57 Andrej Karpathy's Software 3.0 and the Intelligent Canvas
02:10 World Model Lineage: GameNGen to Genie 3
03:23 World Models as the AGI Substrate
04:19 The 16ms Wall: Diffusion's Real-Time Budget
05:02 Diffusion Distillation: From 50 Steps to Two
05:56 Streaming Frames with KV-Cached Autoregression
06:33 Autoregressive Drift: Diffusion Forcing, Self Forcing, Attention Sinks
07:42 Latent Action Models: Genie's Control Vocabulary from Video
08:58 Inside Flipbook: LTX Studio and VLM Click Routing
09:46 Persistent State: NeuralOS and System Prompt Learning
11:15 Sora's Compute Bill and the March of Nines
12:06 Hybrid Roadmap: From Artifacts to Neural Substrate
13:27 Substrate in Preview: A Decade of Work Ahead

References:
--------------------------
• Shah, Jiao, O'Carr (2026), Flipbook announcement: https://x.com/zan2434/status/20469823...
• Andrej Karpathy (2025), Software Is Changing (Again), YC AI Startup School: Andrej Karpathy: Software Is Changing (Again)
• Karpathy (2024), "GUI hasn't been invented yet" tweet: https://x.com/karpathy/status/1917920...
• Valevski, Leviathan, Arar, Fruchter (2024), Diffusion Models Are Real-Time Game Engines (GameNGen): https://arxiv.org/abs/2408.14837
• Decart, Etched (2024), Oasis: A Universe in a Transformer: https://oasis-model.github.io/
• Microsoft Research (2025), Introducing Muse (WHAM, Nature): https://www.microsoft.com/en-us/resea...
• Google DeepMind (2025), Genie 3: A new frontier for world models: https://deepmind.google/blog/genie-3-...
• Decart (2025), MirageLSD: The First Live-Stream Diffusion AI Video Model: https://decart.ai/publications/mirage
• World Labs (2025), Marble: A Multimodal World Model: https://www.worldlabs.ai/blog/marble-...
• Bruce et al. (2024), Genie: Generative Interactive Environments (ICML): https://arxiv.org/abs/2402.15391
• Huang, Li, He, Zhou, Shechtman (2025), Self Forcing (NeurIPS spotlight): https://arxiv.org/abs/2506.08009
• Chen, Monsó, Du, Simchowitz, Tedrake, Sitzmann (2024), Diffusion Forcing (NeurIPS): https://arxiv.org/abs/2407.01392
• Xiao, Tian, Chen, Han, Lewis (2024), Efficient Streaming Language Models with Attention Sinks (ICLR): https://arxiv.org/abs/2309.17453
• Rivard, Sun, Guo, Chen, Deng (2025), NeuralOS: https://arxiv.org/abs/2507.08800
• Karpathy on Dwarkesh Patel (2025), AGI is still a decade away: https://www.dwarkesh.com/p/andrej-kar...
• Anthropic (2024), Artifacts are now generally available: https://www.anthropic.com/news/artifacts
• OpenAI (2024), Introducing canvas: https://openai.com/index/introducing-...
• Lightricks, LTX-Video (Flipbook backbone): https://github.com/Lightricks/LTX-Video

#generativeui #worldmodels #neuralcomputers #aivideo #genie3 #flipbook #karpathy #diffusion #neuralrendering #llm #google #deepmind #openai #anthropic
