Aaron Zisk October 24, 2025

FP4 quants on Nvidia are different

Summary

The transcript discusses the performance gains from quantizing FP8 models down to FP4 on Nvidia hardware. Nvidia reports a 5x speedup, well above the 2x one would expect from halving the bit width. The difference is attributed to complementary formats and optimizations in the Hopper and Blackwell architectures. The most striking claim is a 25-50x reduction in power consumption, which would point to a major gain in computational efficiency for AI and machine learning workloads.
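To make the "expected 2x" baseline concrete, here is a minimal Python sketch of what going from 8 bits to 4 bits per value means. It assumes the common E2M1 layout for FP4 (1 sign bit, 2 exponent bits, 1 mantissa bit) and a single higher-precision scale per block. The function names `quantize_block_fp4` and `weight_bytes` are hypothetical, and this is an illustration of the idea, not Nvidia's implementation.

```python
import math

# Magnitudes representable in FP4 under the assumed E2M1 layout.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_block_fp4(values):
    """Snap one block of floats onto the FP4 grid using a shared scale.

    The scale maps the block's largest magnitude to 6.0, the largest
    E2M1 value. Returns (scale, dequantized values). Ties go to the
    smaller grid point, a simplification of hardware rounding.
    """
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 1.0, [0.0] * len(values)
    scale = amax / E2M1_GRID[-1]
    out = []
    for v in values:
        # Pick the grid magnitude closest to the scaled input.
        mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append(math.copysign(mag, v) * scale)
    return scale, out


def weight_bytes(num_params, bits):
    """Storage for num_params values at the given bit width (scales ignored)."""
    return num_params * bits // 8


if __name__ == "__main__":
    print(quantize_block_fp4([6.0, -3.0, 1.4, 0.0]))
    print(weight_bytes(1_000_000, 8), weight_bytes(1_000_000, 4))
```

Halving the bit width halves the bytes moved per value, which is where the naive 2x expectation comes from. The reported 5x would have to come from something beyond storage alone.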
