If the current Bard is really running on PaLM 2, it still hallucinates worse tha...

EvgeniyZh · on May 10, 2023

I don't think the current Bard runs on PaLM 2; otherwise it's a complete failure.

int_19h · on May 10, 2023

In their official blog post today, Google says this:

"PaLM 2’s improved multilingual capabilities are allowing us to expand Bard to new languages, starting today. Plus, it’s powering our recently announced coding update."

and when I check the Updates tab in Bard UI, it has this entry for today:

"Expanding access to Bard in more countries and languages. You can now collaborate with Bard in Japanese and Korean, in addition to US English. We have also expanded access to Bard in all three languages to over 180 countries."

which seems to strongly imply that it is, indeed, PaLM 2. Just to be sure, I gave it the same puzzle in Korean, and got a similarly lackluster response.

DashAnimal · on May 10, 2023

In their presentation, they talked about multiple sizes for the PaLM 2 model, named Gecko, Otter, Bison and Unicorn, with Gecko being small enough to run offline on mobile devices. I can't seem to find any info on what size model is being used with Bard at the moment.

int_19h · on May 10, 2023

Indeed, it's likely that they're running a fairly small model. But this is in and of itself a strange choice, given how ChatGPT became the gateway drug for OpenAI. Why would Google set Bard up for failure like that? Surely they can afford to run a more competent model as a promo, if OpenAI can?

cubefox · on May 11, 2023

This is just one task it fails at, hardly enough to generalize from.

int_19h · on May 11, 2023

That's not the only task it fails at, though. Just the one that I found the most interesting when it comes to broader implications because of so many self-contradictions in the output.

Broadly speaking, I haven't seen a single complex example yet where the output was comparable to GPT-4. How close it is to GPT-3.5 is debatable - the overall feeling that I get is that it's better on some tasks and worse on others; this might actually be down to fine-tuning.

cubefox · on May 11, 2023

Makes sense. Others also point out it is not as good as GPT-4 in several benchmarks.

https://news.ycombinator.com/item?id=35895404

They did in fact mostly avoid comparison with GPT-4 in the report. It could of course also be that Bard isn't even running on the largest PaLM 2 model, Unicorn. It seems they would have mentioned that though.

But PaLM 2 seems to be just an intermediate step anyway, since their big new model is "Gemini" (i.e. twins, an allusion to the DeepMind/Brain merger?), which is currently in training, according to Pichai. They also mentioned Bard will switch to Gemini in the future.

sgt101 · on May 10, 2023

It claims to run on LaMDA at the moment.

int_19h · on May 10, 2023

If you mean asking it what it's running on, it just hallucinates. As others have noted in the comments here, you can get it to say that it runs on PaLM 3 quite easily.

splatzone · on May 11, 2023

In the chat history you can see which model generated each request - for me it’s always LaMDA.

int_19h · on May 11, 2023

It just says "Bard", even if I click on "Details". Are you, perhaps, using some kind of internal preview?

splatzone · on May 11, 2023

Strange, this is what I see: https://imgur.com/a/sgtVt2O

I'm based in the UK; I wonder if that makes any difference.

callmekit · on May 11, 2023

I don't see which model generated each request. Where exactly do you see this?

splatzone · on May 11, 2023

I see it on the 'My Activity' page: https://myactivity.google.com/u/1/product/bard?utm_source=ba...

Here's a screenshot: https://imgur.com/a/sgtVt2O

MacsHeadroom · on May 11, 2023

Mine says "Bard" where yours says LaMDA. https://i.imgur.com/p8wIPHj.png