AI Engineer July 10, 2025

The Wild World of AI: 6 Months That Changed Everything

Summary

The talk's main theme is skepticism toward traditional AI benchmarks and leaderboards. The speaker illustrates it with a personal anecdote: evaluating text models by asking each one to generate an SVG of a pelican riding a bicycle. Other subjects include DeepSeek's model release and its impact on Nvidia's stock, how well text models generate code such as SVG, and the ethical reporting features of Claude 4. The practical takeaway is to develop personal, unconventional benchmarks that assess model performance in ways standard metrics do not capture.
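
A personal benchmark of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speaker's actual tooling: the exact prompt wording, the `run_benchmark` helper, and the idea of passing each model as a plain callable that maps a prompt to a text response are all hypothetical. The sketch only checks that a response contains well-formed SVG; judging whether the drawing looks like a pelican on a bicycle is still left to a human.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Optional

# Assumed prompt wording; the summary only names the task, not the exact text.
PROMPT = "Generate an SVG of a pelican riding a bicycle"

# Matches the first <svg>...</svg> span, even when wrapped in prose or a code fence.
SVG_RE = re.compile(r"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)


def extract_svg(response: str) -> Optional[str]:
    """Return the first <svg>...</svg> span in a model response, or None."""
    match = SVG_RE.search(response)
    return match.group(0) if match else None


def is_well_formed(svg: str) -> bool:
    """Check that the text parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(svg)
    except ET.ParseError:
        return False
    # The tag may carry a namespace, e.g. '{http://www.w3.org/2000/svg}svg'.
    return root.tag.split("}")[-1].lower() == "svg"


def run_benchmark(
    models: Dict[str, Callable[[str], str]], prompt: str = PROMPT
) -> Dict[str, dict]:
    """Send the same prompt to every model and record the extracted SVG.

    `models` maps a label to a hypothetical callable that takes a prompt
    and returns the model's raw text response.
    """
    results = {}
    for name, generate in models.items():
        svg = extract_svg(generate(prompt))
        results[name] = {
            "svg": svg,
            "well_formed": svg is not None and is_well_formed(svg),
        }
    return results
```

In practice each callable would wrap a real model client, and the extracted SVG for each model would be saved to a file and compared side by side by eye.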
