If that's the point, shouldn't they ask the model to explain the principle for a...

johnecheck · 2025-06-15T09:00:26 1749978026

Because that would prove absolutely nothing. There are numerous examples of Tower of Hanoi explanations in the training set.

elbear · 2025-06-15T11:28:22 1749986902

How do you check that a human understood it and not simply memorised different approaches?

YeGoblynQueenne · 2025-06-15T16:30:49 1750005049

You ask them to solve several instances of the problem?

godelski · 2025-06-15T16:54:10 1750006450

It's hard. But usually we ask several variations and make them show their work.

But a human also isn't an LLM. It is much harder for them to just memorize a bunch of things, which makes evaluation easier. But they also get tired and hungry, which makes evaluation harder ¯\_(ツ)_/¯
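For Tower of Hanoi specifically, "several variations, show your work" can be checked mechanically. Here's a rough sketch, not anyone's actual eval harness: `hanoi_moves` and `is_valid_solution` are names I made up, and the peg labels A/B/C and the (from, to) move format are assumptions. The checker replays a submitted move list for a given number of disks, so you grade the work itself and not an explanation of it.

```python
def hanoi_moves(n, src="A", aux="B", dst="C"):
    """Reference recursive solution: returns a list of (from_peg, to_peg) moves."""
    if n == 0:
        return []
    return (
        hanoi_moves(n - 1, src, dst, aux)    # park the n-1 smaller disks on the spare peg
        + [(src, dst)]                       # move the largest disk to the target
        + hanoi_moves(n - 1, aux, src, dst)  # bring the n-1 smaller disks on top of it
    )


def is_valid_solution(n, moves):
    """Replay `moves` from the start state and check every rule along the way."""
    # Each peg is a stack; the last element is the top disk. Disk 1 is the smallest.
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    for src, dst in moves:
        if src not in pegs or dst not in pegs or not pegs[src]:
            return False  # unknown peg, or moving from an empty peg
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # larger disk placed on a smaller one
        pegs[dst].append(pegs[src].pop())
    # Solved only if every disk ended up on peg C in the right order.
    return pegs["C"] == list(range(n, 0, -1))


if __name__ == "__main__":
    # Several instances of the same problem, each one checked move by move.
    for n in range(1, 8):
        print(n, is_valid_solution(n, hanoi_moves(n)))
```

This only tells you whether the answers are right across instance sizes. It says nothing about whether the solver memorized the procedure or understood it, which is the hard part.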

elbear · 2025-06-15T17:20:18 1750008018

If we're talking about solving an equation, for example, it's not hard to memorize. Actually, that's how most students do it: they memorize the steps and what goes where[1].

But they don't really know why the algorithm works the way it does. That's what I meant by understanding.

[1] In learning psychology there is something called the interleaving effect. What it says is that if you solve several problems of the same kind, you start to do it automatically after the 2nd or 3rd problem, so you stop really learning. That's why you should interleave problems that are solved with different approaches/algorithms, so you don't do things on autopilot.

godelski · 2025-06-15T17:24:24 1750008264

Yes, tests fail with this method. But I think you can understand why the failure is larger when we're talking about a giant compression machine. It's not even a leap in logic. Maybe a small step.

elbear · 2025-06-16T06:37:24 1750055844

I'm not sure what you mean. Btw, I'm not in the field, I've just thought a lot about the topic.