
Oh, I want to play...

So, an optimizing compiler would see that pretty much everything is dead code; it would assign 0 to the return register and be done.



I changed the return stmt to "return sum2" so the for-loop calculations aren't optimized away and fed the code to godbolt.org (compiler explorer).

gcc-trunk at -O3 for x64 vectorizes the loops, but there's no register pressure, so the register allocator isn't taxed much.

gcc has no niche optimization pass that converts the loop to Gauss's shortcut - https://physicsdb.com/sum-natural-numbers/


x86-64-icx-latest and x86-64-clang-trunk use the Gauss shortcut.
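
For anyone who hasn't seen the shortcut, here is a minimal sketch of the two forms being compared. The code from the original post isn't reproduced here, so `sum_loop` and `sum_gauss` are hypothetical names, and this only illustrates the arithmetic identity 1 + 2 + ... + n = n(n+1)/2, not what any compiler literally emits.

```c
#include <stdint.h>

/* Hypothetical example: add 1 + 2 + ... + n one term at a time.
   Assumes n is small enough that the sum fits in 64 bits. */
uint64_t sum_loop(uint64_t n) {
    uint64_t sum = 0;
    for (uint64_t i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}

/* Hypothetical example: the same sum via Gauss's closed form n(n+1)/2.
   Exactly one of n and n + 1 is even, so halve that factor before
   multiplying to keep the intermediate product smaller. */
uint64_t sum_gauss(uint64_t n) {
    return (n % 2 == 0) ? (n / 2) * (n + 1)
                        : n * ((n + 1) / 2);
}
```

The loop does O(n) additions while the closed form is a handful of instructions. That gap is roughly what the icx and clang output shows when they replace the loop.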

wow. wonder if there's much use for that optimization pattern.

edit: clang discussion https://stackoverflow.com/questions/74417624/how-does-clang-...



