It's very clear that the current Bard model is weaker than the largest PaLM 2 model. But for certain tasks, Bard seems worse than even the smallest model described. It's hard to say without someone running a comprehensive benchmark, and the artificially limited context size makes informal testing useless as data.
The model was surprisingly confident when I asked it about the relationship between language comprehension and parameter count. It was a little jarring to watch it coherently argue that a smaller model can match and surpass competitive model performance, especially when, in the answer just before, it claimed the large PaLM 2 model has 540 trillion parameters.
The largest PaLM 2 model is smaller than PaLM 1's 540 billion parameters (let alone 540 trillion!). From the PDF: "The largest model in the PaLM 2 family, PaLM 2-L, is significantly smaller than the largest PaLM model but uses more training compute."