SELF-DIRECTED P̶h̶D̶ EXD in AI Ep. 4: Inference Benchmarking continued
Welcome back to the EXD. Last week we took our first look at LLM inference using vLLM. More specifically, we learned about the prefill and decode phases of an inference pass and how their performance characteristics differ. This week we’ll dig a little deeper and make our first attempt to tune vLLM for better performance. My name is Ram, and I work on internal AI ops at the Ethereum Foundation. This is an open learning log for what I call the EXD.

Episode 01: SELF-DIRECTED P̶h̶D̶ EXD in AI Ep. 1: What...
EXD: github.com/Ramshreyas/EXD
Llama-benchy: https://github.com/eugr/llama-benchy
