I think you’re describing something else. The defining characteristic of ugly code is that it’s hard to reason about and thus not easy to maintain. It can still be reliable, functional, and ugly.
I think you’re getting at a deeper issue. The need for bug-for-bug parity internally is itself a sign of technical debt.
At this point we can’t, for example, fix historical calendars, so all highly accurate date- and time-related code is going to be complicated and full of edge cases. Such code is in effect an interest payment on society’s existing technical debt. So giving in to the temptation to refactor such code generally means attacking the wrong side of the problem. The best approach may be to quarantine such code/systems and try to minimize the impact of such issues.
I think step zero for this kinda thing should be writing a test suite using the current code as the reference implementation. Then you spike it out with a first pass, and if your clean version can’t pass 90%+ of that suite without lots of edge case handling or abstraction breaking, then you keep what you have.
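A rough sketch of that step zero in Python, in case it helps. Everything here is made up for illustration: `capture`, `parity`, and the two `*_days_in_february` functions are hypothetical stand-ins, not anyone's real code. The idea is just to run the old and new implementations over the same inputs, treat the old one's output (including any exception it raises) as the expected answer, and report what fraction matches.

```python
from typing import Any, Callable, Iterable, List, Tuple

Outcome = Tuple[str, Any]


def capture(fn: Callable[[Any], Any], arg: Any) -> Outcome:
    """Run fn(arg) and record either its return value or the exception type."""
    try:
        return ("ok", fn(arg))
    except Exception as exc:  # a crash in the old code is behaviour to match too
        return ("raised", type(exc).__name__)


def parity(
    reference: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> Tuple[float, List[Tuple[Any, Outcome, Outcome]]]:
    """Compare candidate against reference; return (match rate, mismatches)."""
    mismatches: List[Tuple[Any, Outcome, Outcome]] = []
    total = 0
    for arg in inputs:
        total += 1
        expected = capture(reference, arg)  # current code is the oracle
        actual = capture(candidate, arg)
        if expected != actual:
            mismatches.append((arg, expected, actual))
    rate = 1.0 if total == 0 else (total - len(mismatches)) / total
    return rate, mismatches


# Hypothetical legacy code: uses the "every 4th year" rule only.
def legacy_days_in_february(year: int) -> int:
    return 29 if year % 4 == 0 else 28


# Hypothetical clean rewrite: full Gregorian leap-year rule.
def clean_days_in_february(year: int) -> int:
    is_leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    return 29 if is_leap else 28


rate, mismatches = parity(
    legacy_days_in_february, clean_days_in_february, range(1900, 2101)
)
print(f"parity: {rate:.1%}")
for year, expected, actual in mismatches:
    print(f"{year}: legacy={expected} clean={actual}")
```

With these two toy functions the only disagreements are 1900 and 2100, so the rewrite clears the 90% bar easily. The interesting part is the mismatch list: each entry is a place where you have to decide whether downstream systems depend on the old behaviour, and if reproducing them starts to bend the clean version out of shape, that's the signal to keep what you have.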