It says to me that they're best viewed as highly capable sifting devices.
A simple (but novel) problem can't be solved easily simply because there's no prior art to copy from or pattern-match against.
It would be much more useful to talk about how specific LLMs perform at specific tasks. LLMs are complex and varied.
The comments are rife with examples of LLMs that, unlike Bard, pass this particular test.
LLMs are not capable of reasoned intelligence in the way many think they are.
It seems to me that the comments are actually full of variations on this particular failure.