AI Engineer June 10, 2025

Break It 'Til You Make It: Building the Self-Improving Stack for AI Agents - Aparna Dhinakaran

Summary

The talk's main theme is how difficult, and how important, agent evaluation is when building AI applications, particularly during the iterative refinement of prompts and tool calls. It argues for systematic tracking, collaboration, and observability and tracing to locate bottlenecks, rather than subjective, Excel-based review. The practical takeaway is to evaluate at several levels, from individual tool calls to entire conversational trajectories, so that agent performance improves systematically.
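
To make the two evaluation levels concrete, here is a minimal Python sketch. It is not taken from the talk: the `ToolCall` type, the `eval_tool_call` and `eval_trajectory` helpers, the metric names, and the flight-booking example are all hypothetical, and it assumes a reference ("expected") sequence of tool calls is available for each test conversation.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    # Hypothetical record of one tool invocation made by an agent.
    name: str
    args: dict = field(default_factory=dict)


def eval_tool_call(actual: ToolCall, expected: ToolCall) -> bool:
    """Tool-call level: was the right tool chosen with the right arguments?"""
    return actual.name == expected.name and actual.args == expected.args


def eval_trajectory(actual: list[ToolCall], expected: list[ToolCall]) -> dict:
    """Trajectory level: score the whole sequence of calls in one conversation."""
    # Compare calls position by position; zip stops at the shorter list.
    matched = sum(eval_tool_call(a, e) for a, e in zip(actual, expected))
    return {
        "step_accuracy": matched / len(expected) if expected else 1.0,
        "extra_steps": max(0, len(actual) - len(expected)),
        "exact_match": len(actual) == len(expected) and matched == len(expected),
    }


# Assumed example: the agent repeats a search before booking.
expected = [
    ToolCall("search_flights", {"dest": "SFO"}),
    ToolCall("book_flight", {"id": 7}),
]
actual = [
    ToolCall("search_flights", {"dest": "SFO"}),
    ToolCall("search_flights", {"dest": "SFO"}),
    ToolCall("book_flight", {"id": 7}),
]
report = eval_trajectory(actual, expected)
print(report)
```

Here every individual call is valid on its own, yet the trajectory-level report flags the redundant step, which is the kind of issue a per-call check alone would miss. Logging such reports per run, rather than in an ad hoc spreadsheet, is one way to track changes across prompt iterations.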
