OSDI '24 - Managing Memory Tiers with CXL in Virtualized Environments

Yuhong Zhong, Columbia University, Microsoft Azure; Daniel S. Berger, Microsoft Azure, University of Washington; Carl Waldspurger, Carl Waldspurger Consulting; Ryan Wee, Columbia University; Ishwar Agarwal, Rajat Agarwal, Frank Hady, and Karthik Kumar, Intel; Mark D. Hill, University of Wisconsin–Madison; Mosharaf Chowdhury, University of Michigan; Asaf Cidon, Columbia University

Cloud providers seek to deploy CXL-based memory to increase aggregate memory capacity, reduce costs, and lower carbon emissions. However, CXL accesses incur higher latency than local DRAM. Existing systems use software to manage data placement across memory tiers at page granularity. Cloud providers are reluctant to deploy software-based tiering due to high overheads in virtualized environments. Hardware-based memory tiering could place data at cacheline granularity, mitigating these drawbacks. However, hardware is oblivious to application-level performance. We propose combining hardware-managed tiering with software-managed performance isolation to overcome the pitfalls of either approach.

We introduce Intel® Flat Memory Mode, the first hardware-managed tiering system for CXL. Our evaluation on a full-system prototype demonstrates that it provides performance close to regular DRAM, with no more than 5% degradation for more than 82% of workloads. Despite such small slowdowns, we identify two challenges that can still degrade performance by up to 34% for "outlier" workloads: (1) memory contention across tenants, and (2) intra-tenant contention due to conflicting access patterns.

To address these challenges, we introduce Memstrata, a lightweight multi-tenant memory allocator. Memstrata employs page coloring to eliminate inter-VM contention. It improves performance for VMs with access patterns that are sensitive to hardware tiering by allocating them more local DRAM using an online slowdown estimator. In multi-VM experiments on prototype hardware, Memstrata is able to identify performance outliers and reduce their degradation from above 30% to below 6%, providing consistent performance across a wide range of workloads.

View the full OSDI '24 program at https://www.usenix.org/conference/osd...
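
The abstract names page coloring as the way Memstrata removes inter-VM contention but gives no detail on the mechanism. As a rough illustration of the general idea only, and not Memstrata's actual implementation, the Python sketch below assumes a hypothetical mapping in which a page frame's "color" is its frame number modulo a fixed number of colors, and frames of the same color compete for the same fast-tier slots. The names `page_color`, `ColoredAllocator`, `add_vm`, and `alloc_page` are invented for this example.

```python
from collections import defaultdict


def page_color(pfn, num_colors):
    # Hypothetical mapping (an assumption, not taken from the paper):
    # frames congruent modulo num_colors contend for the same slots.
    return pfn % num_colors


class ColoredAllocator:
    """Toy allocator that gives each VM a disjoint set of page colors."""

    def __init__(self, num_frames, num_colors):
        self.num_colors = num_colors
        self.free = defaultdict(list)  # color -> free frame numbers
        for pfn in range(num_frames):
            self.free[page_color(pfn, num_colors)].append(pfn)
        self.vm_colors = {}            # VM name -> set of colors it owns
        self.next_color = 0            # first color not yet handed out

    def add_vm(self, vm, colors_wanted):
        # Hand out a contiguous, previously unused block of colors,
        # so no two VMs ever share a color.
        if self.next_color + colors_wanted > self.num_colors:
            raise ValueError("not enough unassigned colors")
        first = self.next_color
        self.vm_colors[vm] = set(range(first, first + colors_wanted))
        self.next_color += colors_wanted

    def alloc_page(self, vm):
        # Only frames whose color belongs to this VM are eligible.
        for color in sorted(self.vm_colors[vm]):
            if self.free[color]:
                return self.free[color].pop()
        raise MemoryError(f"{vm} has no free frames in its colors")


alloc = ColoredAllocator(num_frames=16, num_colors=4)
alloc.add_vm("vm_a", 2)
alloc.add_vm("vm_b", 2)
a_pages = [alloc.alloc_page("vm_a") for _ in range(4)]
b_pages = [alloc.alloc_page("vm_b") for _ in range(4)]
```

Because the two VMs own disjoint color sets, no frame handed to `vm_a` can share a color with a frame handed to `vm_b`, which is the isolation property page coloring is meant to provide. How colors relate to real hardware slots, and how many colors each VM should receive, are left open here.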
