We Ran a Massive 400-Billion-Parameter AI on a MacBook... and It's Fast? Flash-MoE Analysis || Memory Will Only Become More Important

This video looks at the remarkable case of a massive 397B-class AI model running on a MacBook. On the surface it may look like a simple demo, but it is closer to a signal of how AI inference architecture is evolving. We systematically examine what made Flash-MoE possible and how Dense and MoE models differ. In particular, we explain in accessible terms how attention, FFN, and expert layers are separated within a single layer and how experts are selected. We also discuss why inference can run by reading only part of the model from an SSD instead of loading the whole model into memory, and what that implies: SSDs could evolve from simple storage devices into a tier that supports inference. Ultimately, what matters is not GPU performance alone but how the load is divided among HBM, DRAM, and SSDs, so future competition in AI semiconductors could expand beyond raw compute performance into competition over memory tiering. Finally, we cover why this structure cannot be applied to every model right away, along with its realistic limitations. The focus is on understanding the big picture of where AI infrastructure is headed, using Flash-MoE as the example.

Written by Error
Edited by Jin-i Lee [email protected]
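
To make the partial-read idea in the description concrete, here is a minimal Python sketch. It is not Flash-MoE's actual implementation. It assumes a toy layer with 8 experts, top-2 routing, a ReLU feed-forward block, and a NumPy memory-mapped file standing in for weights that live on the SSD. Every name and size in it (`create_expert_file`, `moe_forward`, `NUM_EXPERTS`, and so on) is hypothetical.

```python
import os
import tempfile

import numpy as np

# Toy sizes chosen for illustration only; they are not Flash-MoE's real values.
NUM_EXPERTS, TOP_K, D_MODEL, D_FF = 8, 2, 16, 32


def create_expert_file(path, rng):
    """Write every expert's weights to disk; the file stands in for the SSD copy."""
    shape = (NUM_EXPERTS, 2, D_MODEL, D_FF)  # [expert, up/down, d_model, d_ff]
    weights = np.memmap(path, dtype=np.float32, mode="w+", shape=shape)
    weights[:] = rng.standard_normal(shape).astype(np.float32) * 0.1
    weights.flush()
    del weights  # drop the writable mapping
    return shape


def moe_forward(x, router_w, experts, top_k=TOP_K):
    """Route one token to its top-k experts and touch only those experts' weights."""
    logits = x @ router_w                     # one score per expert
    chosen = np.argsort(logits)[-top_k:]      # indices of the k highest scores
    gate = np.exp(logits[chosen] - logits[chosen].max())
    gate /= gate.sum()                        # softmax over the chosen experts only
    out = np.zeros_like(x)
    for g, e in zip(gate, chosen):
        w_up = np.asarray(experts[e, 0])      # only this expert's slice is accessed
        w_down = np.asarray(experts[e, 1])
        hidden = np.maximum(x @ w_up, 0.0)    # ReLU feed-forward
        out += g * (hidden @ w_down.T)
    return out, sorted(int(e) for e in chosen)


rng = np.random.default_rng(0)
path = os.path.join(tempfile.mkdtemp(), "experts.bin")
shape = create_expert_file(path, rng)

# Re-open read-only: nothing is loaded until a slice is actually accessed.
experts = np.memmap(path, dtype=np.float32, mode="r", shape=shape)
router_w = rng.standard_normal((D_MODEL, NUM_EXPERTS)).astype(np.float32)
x = rng.standard_normal(D_MODEL).astype(np.float32)

y, used = moe_forward(x, router_w, experts)
print(f"experts read: {used} of {NUM_EXPERTS}; output shape: {y.shape}")
```

In this sketch only 2 of the 8 expert slices are accessed per token, so the operating system only needs to page in that part of the file. A Dense model has no such selection step and would need all of its feed-forward weights for every token, which is the contrast the video draws.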
