Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Bithacks.h - bit hack macros (catonmat.net)
57 points by pkrumins on Sept 3, 2009 | hide | past | favorite | 18 comments


Whenever I need to do bit manipulations, I always check the Linux kernel source. I figure that any bit manipulation I'd ever want to do is also done in the kernel, and I'm usually right:

http://miller.cs.wm.edu/lxr3.linux/http/source/include/linux...

http://miller.cs.wm.edu/lxr3.linux/http/source/include/linux...

http://miller.cs.wm.edu/lxr3.linux/http/source/lib/bitmap.c?...


These are great, I had forgotten about them!


The Linux kernel is a fantastic resource. Any time I need to do some systems programming that I haven't done before, I check to see how it's done in the kernel.


Seconded. That's probably one of the best places to find bit twiddling resouces ever. I remember I had one of those Coriolis kernel printout books back in the 90s, and I used to crib from that, some hashing function and the page allocator all the time.


Another great resource for tidbits like this is _Hacker's Delight_, by Henry S. Warren, Jr. It covers a number of other tricks outside of bit twiddling, as well. Worth checking out for anyone who finds this sort of stuff fascinating.

Link: http://www.amazon.com/Hackers-Delight-Henry-S-Warren/dp/0201...


Another related bit twiddling resource: http://graphics.stanford.edu/~seander/bithacks.html.

The comments accompanying the code are interesting too.


anybody interested should probably consider hakmem, too. http://inwap.com/pdp10/hbaker/hakmem/hakmem.html

and probably alan mycrofts more useful c-interpretation: http://www.cl.cam.ac.uk/~am21/hakmemc.html


For this sort of thing, it would seem helpful to release it under "any license viewed as free by the FSF" or similar. Is that sort of statement likely to cause problems? Or at least license it MIT/BSD/whatever. For such a small piece, it would be nice if projects can just import it without having to move from "all code is BSD" to "all code is BSD except bithacks.h which is MIT, but we also comply with that license".


MIT license is compatible with BSD and GPL licenses, so I don't see a big issue there. But I could put it in public domain, so that there were no license issues at all.


So this is nice, but is it super fast/efficient? It would seem to me that this sort of thing is where inline assembly, on a per processor basis, would really make a huge difference. Is this sort of thing well enough known that compilers already do it for us? Anyone know a lot about this sort of thing?


In general, this will probably get you close to maximum performance. Most CPUs implement shift, rotate, AND, OR, XOR, and NOT instructions, and usually not much else in the way of bit manipulation (besides maybe a few very specialized intrinsics for popular encryption algorithms/hash functions (see SSE)). DSPs and more domain-specific microcontrollers generally have more bit-banging instructions, and you might be able to make better use of them with assembly or intrinsics.

A smart compiler could theoretically transform your crappy implementation of "is the Nth bit set?" into something more intelligent, but it is rarely profitable enough to do so, as programmers who write code in which bit-banging is the bottleneck typically know how to do this themselves; in fact, most of the macros linked are what I consider to be the most parsimonious implementation.

One thing where an intrinsic will definitely beat a C-level implementation is population count (how many bits are set in this int/long?) Many recent x86 parts and many DSPs implement a POPCOUNT/BC instruction which will outperform even a smart LUT-based implementation (without the memory capacity/bandwidth requirement too). There's an interesting anecdote about that instruction here (http://www.moyogo.com/blog/2005/09/secret-opcodes.html ), no idea if it's true.


Thanks for the good reply!


Catonmat never fails in fascinating me, regardless of how stupid this may sound, the more unpopular the language he implements his tutorial/illustration in, the more I realize how unimportant the language actually is. As long as we can/are reuse/reusing what we write, we can rest in the idea that what we do/did now/before actually matters for what we can/are do/doing in the future/present.


As languages go C may be a lot of things but I don't think you can call it "unpopular".


Indeed. I'm always surprised at the number of people I meet (and, distressingly, the number of posters on this site) that honestly believe that "most" software is written by web geeks or IT droids in scripting languages or Java or .NET.

That, and the fact that this belief persists as they turn off their digital televisions, click their garage door remotes, text a friend on their phone, fire up the engine in their personal vehicle, and drive to work stopping carefully at all the carefully automated traffic lights, swipe their ID card and sit at their desk where they can finally get to see some software.


Quoth a friend of mine who is a computer engineer at GE, "Java doesn't keep the lights on."


Exactly, and ironically a lot of those higher level languages are written in C.


How about passé or gauche or some other fashion term, with perhaps, all the implied connotation?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: