AI Engineer July 29, 2025

[Full Workshop] Building Metrics that actually work — David Karam, Pi Labs (fmr Google Search)

Summary

The transcript focuses on the challenges of running evaluations (evals) in machine learning and data science, particularly defining appropriate metrics and understanding performance. Participants discuss how labor-intensive it is to create meaningful evaluations, the importance of feedback mechanisms for training AI agents, and the growing recognition that validation and testing now consume significant development time. The key takeaway is that evaluations are critical for improving AI systems but remain difficult, requiring careful, context-specific approaches and continual experimentation.

