Aaron Zisk January 2, 2026

Fastest 1000000 tokens

Summary

The transcript covers a performance benchmark comparing GPUs and machines to determine which can generate 1 million tokens the fastest, using the Qwen3 4B language model across different hardware configurations. The speaker tested multiple systems, including an AMD Radeon, a DGX Spark, a Mac Studio M3 Ultra, and other GPUs with varying VRAM capacities, using consistent testing parameters to ensure a fair comparison. The investigation aims to show that performance does not depend solely on expensive or high-end hardware, and the speaker hints at surprising results that challenge conventional expectations.

