It depends on the data the model draws on to generate the answer. In the example, it seemed to prioritize logic over mathematics, so it looked for logical patterns to mimic. That is the ELI5 version.
The more complicated version is that it does not prioritize mathematical functions as much and instead relies on various deductions, which are built on a whole chain of logic that has not been properly vetted for reliability and applicability.