Found my mistake. I wrongly assumed the algorithms shown in the paper were the original sorting algorithm and the one improved by AlphaDev.
Apparently the paper just shows some algorithm that was modified into a different one; neither is a sorting algorithm, and the original is still the better of the two.
The text praises AlphaDev by saying it makes "moves" that look like a mistake but are actually brilliant. The code shown after that passage does not corroborate the statement; it just illustrates that AlphaDev can make changes to code.
The instruction that was removed was `P=min(A,C)`, which means that `P=A` still holds at that point in AlphaDev's version.
The next instructions, with `S=min(A,C)` and `Q=B`, mean: if B is the smallest, then `P=B`. Otherwise P stays as before, meaning `P=A` for AlphaDev and `P=min(A,C)` for the original.
So, the end result for sorting A,B,C=3,2,1 would be 3,2,3 for AlphaDev's code.
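To make it easy to check my reasoning, here is a minimal Python sketch of the steps as I read them. The function name `sort3_fragment` and the `drop_mov` flag are made up for illustration. It also rests on two assumptions that go beyond what I spelled out above: that the third register R ends up holding `max(A,C)`, and that Q takes the value of S in the same case where P takes B.

```python
def sort3_fragment(a, b, c, drop_mov=False):
    """Model the register steps described above (hypothetical helper).

    drop_mov=False models the original code, drop_mov=True models the
    version with the `P=min(A,C)` instruction removed.
    """
    # Assumption: after the first compare, R = max(A, C) and S = min(A, C).
    r = max(a, c)
    s = min(a, c)
    q = b

    # The removed instruction is P = min(A, C); without it P is still A.
    p = a if drop_mov else s

    # If B is the smallest (S > Q), then P = B.
    # Assumption: Q takes S in that same case.
    if s > q:
        p, q = q, s

    return p, q, r


print(sort3_fragment(3, 2, 1))                 # original: (1, 2, 3)
print(sort3_fragment(3, 2, 1, drop_mov=True))  # instruction removed: (3, 2, 3)
```

Under those assumptions the sketch reproduces the 3,2,3 result for the modified code, while the original returns 1,2,3 for the same input.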
I can't believe I'm the first one to notice, so I'm probably wrong, but I cannot see where.