Aaron Zisk May 31, 2025

Windows Handles Local LLMs… Before Linux Destroys It

Summary

The transcript compares the performance of running Large Language Models (LLMs) locally on Windows and Linux, using LM Studio and an NVIDIA GeForce RTX 5080 GPU. The presenter benchmarks a Gemma 3 4B model by having it generate tokens and noting the generation speed, keeping the hardware configuration the same so that the results give a relative comparison between the two operating systems. The practical takeaway is that token generation speed may differ between OS environments on identical hardware, so measuring it in each environment can help developers optimize their LLM deployments and anticipate performance variations.
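A relative comparison like this comes down to two simple calculations: tokens per second for each run, and the ratio between the two runs. The sketch below is illustrative only. The function names are hypothetical, and the numbers in the usage section are placeholders, not results from the video.

```python
def tokens_per_second(tokens_generated: int, elapsed_seconds: float) -> float:
    """Return generation throughput for a single benchmark run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return tokens_generated / elapsed_seconds


def relative_speedup(baseline_tps: float, candidate_tps: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return candidate_tps / baseline_tps


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT measurements from the video.
    run_a_tps = tokens_per_second(tokens_generated=500, elapsed_seconds=10.0)
    run_b_tps = tokens_per_second(tokens_generated=500, elapsed_seconds=8.0)

    print(f"Run A: {run_a_tps:.1f} tok/s")
    print(f"Run B: {run_b_tps:.1f} tok/s")
    print(f"Run B relative to Run A: {relative_speedup(run_a_tps, run_b_tps):.2f}x")
```

For the ratio to mean anything, the same model, prompt, and generation settings should be used on both operating systems, since only the OS is meant to vary.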

