Hacker News

Gemini 3.0 Pro (or what is believed to be 3.0 Pro; you can get access to it via A/B testing on AI Studio) does a noticeably better job:

https://x.com/cannn064/status/1972349985405681686

https://x.com/whylifeis4/status/1974205929110311134

https://x.com/cannn064/status/1976157886175645875



It was Google that featured a bicycling pelican in a presentation a few months back:

https://simonwillison.net/2025/Jun/6/six-months-in-llms/#ai-...

So I think the benchmark can be considered dead as far as Gemini goes.


There’s obviously no improvement on this metric and hasn’t been in a while.


How do people trigger A/B testing?


As far as I can tell, they just keep hammering the same prompt in https://aistudio.google.com/ until they get lucky and the A/B test triggers for them on one of those prompts.


That 2nd one is wild.

Ugh. I hate this hype train. I'll be foaming at the mouth with excitement for the first couple of days until the shine is off.



