Discussion about this post

Sairam Sundaresan

One side says evals are broken, the other says they’re essential. I say: have we tried turning the benchmark off and on again? 😂

Devansh

I'm a bit surprised that the AI evals debate is getting this much attention. It seemed like mostly pointless intellectualizing + marketing to me.
