AI Engineer July 9, 2025

2025 in LLMs so far, illustrated by Pelicans on Bicycles — Simon Willison

Summary

The talk reviews how quickly LLMs have advanced over the past six months, a period that saw more than 30 significant model releases. Because traditional benchmarks are losing credibility, the presenter relies on a practical test of his own: asking each model to generate an SVG of a pelican riding a bicycle. The takeaway is that benchmarks provide numbers, but real-world, complex tasks reveal a model's capabilities and limitations more effectively.
