Cluster of Macs doing AI
Summary
The transcript covers performance testing of four Mac Studio M3 Ultra machines connected over RDMA and running Qwen3 4B Instruct, demonstrating GPU inference that scales across multiple nodes. Throughput starts at 166 tokens per second on one node and rises as the cluster grows to two and then four nodes. The practical takeaway is that RDMA on Apple Silicon lets distributed inference run faster as nodes are added.
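
One way to quantify the scaling described above is to compute speedup and per-node efficiency relative to the single-node run. The following is a minimal sketch, not something taken from the transcript: the function name `scaling_report` is hypothetical, and the two-node throughput is a placeholder chosen for illustration, since the summary only gives the one-node figure of 166 tokens per second.

```python
def scaling_report(baseline_tps, measured):
    """Return (nodes, speedup, efficiency) tuples, sorted by node count.

    baseline_tps: tokens per second on a single node.
    measured: dict mapping node count -> tokens per second.
    """
    report = []
    for nodes, tps in sorted(measured.items()):
        speedup = tps / baseline_tps   # how many times faster than one node
        efficiency = speedup / nodes   # 1.0 would be perfect linear scaling
        report.append((nodes, speedup, efficiency))
    return report


BASELINE_TPS = 166.0  # single-node figure quoted in the summary

# ASSUMPTION: 249.0 is a made-up placeholder, not a measured result.
# Replace it with the real two- and four-node numbers from the test run.
example = {1: BASELINE_TPS, 2: 249.0}

for nodes, speedup, efficiency in scaling_report(BASELINE_TPS, example):
    print(f"{nodes} node(s): {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

An efficiency below 100% means each added node contributes less than a full node's worth of throughput, which is the usual cost of coordinating work over the interconnect.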