
Interesting, but I don't agree that seeing the "token reasoning" chain somehow explains how the model got its final answer. What if we trained deceiver models that produce a sound-looking chain of explanation but then perform some kind of deception and output an incorrect answer? For me personally, explainability has to show how the answer arose from the model's mechanics, not from sequential model outputs.


> What if we trained deceiver models that produce a sound-looking chain of explanation but then perform some kind of deception and output an incorrect answer?

You're right on target! That's exactly what they do in the paper. They train three models: a verifier (which rates answers as sounding correct or sounding wrong), a "helpful prover" (which provides correct answers), and a "sneaky prover" (which provides incorrect answers that attempt to deceive the verifier into scoring them highly).

This adversarial relationship between the "helpful prover" and the "sneaky prover" is the cool part of the paper (IMO).
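For anyone curious how the pieces fit together, here's a rough sketch of one training round's reward structure. All names and interfaces here are hypothetical (this is not the paper's code), just the general shape of the adversarial game:

  import random

  def training_round(problem, correct_answer, verifier, helpful_prover, sneaky_prover):
      # Hypothetical model interfaces: .generate() returns an object with
      # a reasoning chain and a final .answer; .score() returns 0..1.
      helpful_solution = helpful_prover.generate(problem)
      sneaky_solution = sneaky_prover.generate(problem)

      # The verifier scores each solution on how convincing it sounds.
      helpful_score = verifier.score(problem, helpful_solution)
      sneaky_score = verifier.score(problem, sneaky_solution)

      # Verifier is rewarded for rating correct solutions above incorrect ones.
      verifier_reward = helpful_score - sneaky_score

      # Helpful prover is rewarded for being correct AND convincing.
      helpful_reward = helpful_score if helpful_solution.answer == correct_answer else 0.0

      # Sneaky prover is rewarded for being convincing while wrong; this is
      # the adversarial pressure that hardens the verifier over time.
      sneaky_reward = sneaky_score if sneaky_solution.answer != correct_answer else 0.0

      return verifier_reward, helpful_reward, sneaky_reward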


I mean, SSMs are in fact RNNs under the hood.
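
Concretely, a discretized linear SSM layer steps through the sequence exactly like an RNN with a linear state update. A toy numpy sketch (dimensions and matrices are made up for illustration):

  import numpy as np

  # Discretized linear SSM: h_t = A @ h_{t-1} + B @ x_t, y_t = C @ h_t.
  # Structurally this is just an RNN cell with a linear recurrence.
  d_state, d_in = 4, 2
  A = np.eye(d_state) * 0.9            # state transition (assumed discretized)
  B = np.random.randn(d_state, d_in)   # input projection
  C = np.random.randn(d_in, d_state)   # output projection

  h = np.zeros(d_state)                # recurrent hidden state
  for x_t in np.random.randn(10, d_in):  # sequence of length 10
      h = A @ h + B @ x_t              # same shape as an RNN cell update
      y_t = C @ h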


At the end of the day, either you carry around a hidden state, or you have a fixed window for autoregression.

You can call hidden states "RNN-like" and autoregressive windows "transformer-like", but apart from those two core paradigms I don't know of other ways to do sequence modelling.

Mamba/RWKV/Griffin are somewhere between those two extremes.
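
To make the dichotomy concrete, here's a toy sketch of the two kinds of state each paradigm carries (nothing model-specific, purely illustrative):

  import numpy as np

  d, k = 4, 3
  W_h = np.random.randn(d, d) * 0.1
  W_x = np.random.randn(d, d) * 0.1

  def recurrent_step(h, x):
      # RNN-like: all history is compressed into one fixed-size hidden state.
      return np.tanh(W_h @ h + W_x @ x)

  def windowed_context(xs, t):
      # Transformer-like: the model sees the last k inputs verbatim;
      # anything older than the window is simply gone.
      return xs[max(0, t - k + 1): t + 1]

  xs = list(np.random.randn(10, d))
  h = np.zeros(d)
  for t, x in enumerate(xs):
      h = recurrent_step(h, x)       # O(1) memory, lossy summary
      ctx = windowed_context(xs, t)  # O(k) memory, exact but truncated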

