Streaming, Fast and Slow: Cognitive Load-Aware Streaming for Efficient LLM Serving
Chang Xiao, Brenda Z. Yang
UIST 2025: The 38th Annual ACM Symposium on User Interface Software and Technology
Session: 4. Addressing Cognitive Load

Generative conversational interfaces powered by large language models (LLMs) typically stream output token-by-token at a rate determined by computational budget, often neglecting actual human reading speeds and the cognitive load associated with the content. This mismatch frequently leads to inefficient use of computational resources. For example, in cloud-based services, streaming content faster than users can read appears unnecessary, resulting in wasted computational resources and potential delays for other users, particularly during peak usage periods. To address this issue, we propose an adaptive streaming method that dynamically adjusts the pacing of LLM streaming output in real time based on inferred cognitive load. Our approach estimates the cognitive load associated with streaming content and strategically slows down the stream during complex or information-rich segments, thereby freeing computational resources for other users. We conducted a statistical analysis and simulation based on a statistical model derived from data collected in a crowdsourced user study across various types of LLM-generated content. Our results show that this adaptive method can effectively reduce computational consumption while largely maintaining streaming speed above users' normal reading speed.

DOI: doi.org/10.1145/3746059.3747721
Web: https://programs.sigchi.org/uist/2025...

Video presentations for UIST 2025 papers
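
The pacing idea in the abstract (slow the stream on complex or information-rich segments, stream faster elsewhere) can be illustrated with a small sketch. This is not the authors' method: the abstract does not say how cognitive load is estimated, so `estimate_load` below is a stand-in heuristic based on word length and digits, and the rates `FAST_WPS` and `SLOW_WPS` are hypothetical values chosen only for illustration.

```python
import re
from typing import Iterable, Iterator, Tuple

# Hypothetical pacing bounds in words per second; not values from the paper.
FAST_WPS = 12.0  # assumed rate for easy content
SLOW_WPS = 4.0   # assumed rate for demanding content


def estimate_load(segment: str) -> float:
    """Toy load score in [0, 1]: longer words and more numbers score higher.

    A stand-in heuristic; the paper derives its estimate differently.
    """
    words = re.findall(r"\S+", segment)
    if not words:
        return 0.0
    avg_len = sum(len(w) for w in words) / len(words)
    digit_ratio = sum(any(c.isdigit() for c in w) for w in words) / len(words)
    # Map an average word length of 3..9 characters onto 0..1.
    length_score = min(max((avg_len - 3.0) / 6.0, 0.0), 1.0)
    return min(1.0, 0.7 * length_score + 0.3 * digit_ratio)


def words_per_second(load: float) -> float:
    """Interpolate linearly from FAST_WPS (load 0) to SLOW_WPS (load 1)."""
    return FAST_WPS - load * (FAST_WPS - SLOW_WPS)


def schedule(segments: Iterable[str]) -> Iterator[Tuple[str, float]]:
    """Yield (segment, seconds to spend streaming that segment)."""
    for seg in segments:
        n_words = len(seg.split())
        yield seg, n_words / words_per_second(estimate_load(seg))
```

A serving loop could spread each segment's tokens over the returned duration and hand the saved compute to other requests; how that reallocation happens is outside this sketch.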


