Demystifying CXL Memory Computation - Yongil Jung, XCENA

Modern analytical engines demand massive memory capacity and bandwidth for large-scale scans, aggregations, and joins, yet conventional server architectures face hard limits in slot count, density, and cost. CXL offers a compelling alternative by enabling cache-coherent access to device-attached memory over PCIe. It helps in three ways: memory expansion allows more datasets to reside in memory; memory pooling shares capacity across multiple hosts to improve utilization and reduce inter-host data transfer overhead; and near-data processing pushes query operations closer to where data resides, reducing unnecessary movement. In this talk, we introduce the MX1, a CXL computational memory device that goes beyond expansion by offloading columnar query operations (decompression, filtering, aggregation, and string search) to run directly at the memory controller. We present microbenchmark results showing up to 5× throughput and 19× energy efficiency improvements over host CPU execution with CXL memory, and demonstrate how these kernels compose into TPC-H query plans with end-to-end performance gains. We then share our experience integrating with Velox, describing how we used its extensibility interfaces to offload query operators to XFLARE, our Rust-based OLAP query engine built to accelerate queries on the MX1. We discuss what worked, the extensibility challenges we encountered, and future directions, including a proposed contribution for CXL-aware memory allocation.
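To make the idea of kernels composing into a query plan more concrete, here is a minimal host-side sketch in Python. It is an illustration only and not the MX1 or XFLARE interface: the function names (`rle_decode`, `filter_mask`, `masked_sum_product`, `q6_like`), the run-length encoding, and the sample data are all assumptions made for this example. It chains a decompression kernel, a filter kernel, and an aggregation kernel into a plan loosely shaped like TPC-H Q6 (sum of price times discount over rows that pass range predicates).

```python
def rle_decode(runs):
    """Decompression kernel (hypothetical): expand (value, run_length) pairs."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


def filter_mask(column, lo, hi):
    """Filter kernel (hypothetical): mask of rows where lo <= x < hi."""
    return [lo <= x < hi for x in column]


def masked_sum_product(a, b, mask):
    """Aggregation kernel (hypothetical): sum of a[i] * b[i] over selected rows."""
    return sum(x * y for x, y, keep in zip(a, b, mask) if keep)


def q6_like(discount_runs, price, quantity, qty_limit):
    """Compose the three kernels into a Q6-shaped plan.

    Discounts are integer percentages stored run-length encoded; price and
    quantity are plain integer columns. All of this is assumed for the sketch.
    """
    discount = rle_decode(discount_runs)            # decompress
    by_discount = filter_mask(discount, 5, 8)       # 5 <= discount < 8
    by_quantity = [q < qty_limit for q in quantity]  # quantity predicate
    mask = [d and q for d, q in zip(by_discount, by_quantity)]
    return masked_sum_product(price, discount, mask)  # aggregate


if __name__ == "__main__":
    revenue = q6_like(
        discount_runs=[(6, 2), (9, 1), (5, 1)],  # decodes to [6, 6, 9, 5]
        price=[100, 200, 300, 400],
        quantity=[10, 30, 5, 20],
        qty_limit=24,
    )
    print(revenue)
```

In this sketch every intermediate column (the decoded discounts and the masks) is materialized on the host. The abstract's premise is that running such kernels at the memory controller avoids moving that data to the CPU; only the final aggregate would need to cross the link.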
