That's not the only task it fails at, though, just the one I found most interesting in terms of broader implications, given how many self-contradictions show up in the output.
Broadly speaking, I haven't seen a single complex example yet where the output was comparable to GPT-4. How close it is to GPT-3.5 is debatable - the overall feeling I get is that it's better on some tasks and worse on others, which might actually come down to fine-tuning.
They did, in fact, mostly avoid comparisons with GPT-4 in the report. It could of course also be that Bard isn't even running on the largest PaLM 2 model, Unicorn, though it seems like they would have mentioned that.
But PaLM 2 seems to be just an intermediate step anyway, since their big new model is "Gemini" (i.e. twins, an allusion to the DeepMind/Brain merger?), which is currently in training, according to Pichai. They also mentioned that Bard will switch to Gemini in the future.