AI Engineer July 20, 2025

Building Effective Voice Agents — Toki Sherbakov + Anoop Kotha, OpenAI

Summary

The talk covers how to build practical audio agents as AI moves from text-based systems into the emerging multimodal era of images and audio. It highlights advances in speech-to-speech capabilities that make audio agents faster, more expressive, and more accurate, to the point of being suitable for production applications. The practical takeaway is that audio models have reached a tipping point: high-quality, scalable applications can now be built with them.
