Bit of both. I don't have to do it often, and when you really need to squeeze every bit of performance out of some code, it can be really maddening when there is no "obvious" reason for the slowness. Quite often it is just an accumulation of many very small slowdowns which together are significant, and then the measure, change, measure, change, measure ... loop can be very tedious. Sometimes there are interesting things, like changing the whole algorithm to something which performs better, or reading up on bit-fiddling tricks which make some code remarkably faster (Hacker's Delight is not for the faint of heart, but an interesting book nonetheless).