Low Level Bit Hacks You Absolutely Must Know (catonmat.net)
342 points by wheresvic1 on May 10, 2018 | 171 comments



I was asked #1 "Check if the integer is even or odd" in an interview once. Not worded quite that way but that was the answer they were looking for.

12 years later not a single person I have hired could correctly answer that question[1] and most of them (almost all) have been great productive software developers. So "absolutely must know" is entirely dependent on the type of software you write.

[1] I don't ask it in the interview, I ask it after they are hired and have been with us a few weeks just for fun.


What did they usually say? I know the bitwise trick, but I would say "x % 2 == 0" if asked because it makes a lot more sense and will (almost) always compile to the exact same code.
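For what it's worth, a minimal sketch of the two forms (function names made up here; for unsigned, or non-negative signed, values an optimizing compiler will typically emit the same and-with-1 code for either):

    /* both report whether x is even */
    int is_even_mod(unsigned x) { return x % 2 == 0; }
    int is_even_and(unsigned x) { return (x & 1) == 0; }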


Exactly, readability by other engineers + compiler optimizations is usually better than some bithacking that may or may not help in terms of speed and definitely doesn't help with understanding. (In this case maybe simple enough, but still).


I am not trying to start an argument, but it is a bit sad that these things are called "hacks", in the article and by others.

Some would call it math.

But I guess that shows you the state of our art.


We live in an era where you can get a billion bytes for a dollar. We also live in an era where finding someone who can manipulate bits to improve performance is either 1) a micro-optimization, 2) extremely fringe work, or 3) prohibitively expensive to be worth it.


Don't get me wrong. I find most of these "bit hacks" to be quite simple individually. But they add a layer of necessary comprehension.

In most cases you won't see these alone. They will be surrounded by tens of thousands of other lines of code. And where there is one "hack" there are usually others. And at some point you end up with a bug somewhere because one of these smart little hacks is failing. You don't know which one.

This does create unnecessary problems for maintenance in the future. So use it, but use it only when absolutely necessary. Not because you think the code will run faster.

It's also important to note that many things people think make the code faster, don't. Like vector vs. linked list access. Intuitively the linked list seems much simpler for inserting an element. But in almost all cases the vector is faster, because of cache locality.

So what I'm saying is: do things that make the code probably faster, like using a vector, if they don't cloud understanding. If they do, leave them out until you have a reason to use them. The only reason to use micro-optimizations like these is if you measured execution time and know there is a performance bug in this place in the code which your "hack" could resolve.

No measurement -> no optimization. Because the performance problems are almost always in different places than the average developer assumes.

And then add a comment explaining what you did to a 5 year old.


> I know the bitwise trick

Only if you make certain assumptions about how the integer is represented, I reckon.


Indeed, a ones-complement negative number would not have this bit property.
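A small sketch of that pitfall. We can't actually run on one's complement hardware, so this just simulates the bit pattern of -2 with ~2:

    #include <stdio.h>

    int main(void)
    {
        /* On a one's complement machine -2 would be stored as ~2,
           i.e. ...11111101, whose low bit is 1, so (x & 1) would
           wrongly report this even number as odd. */
        unsigned ones_complement_minus_2 = ~2u;
        printf("low bit: %u\n", ones_complement_minus_2 & 1);  /* prints 1 */
        return 0;
    }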


In fact, newer gcc versions are unable to optimize (x & 1) == 0 at level -O1, and require -O2. I assume that the code emitted at -O2 and -O3 for both constructs is more efficient than the literal translation, even though it is the same number of instructions.


Apparently not true, at least in a simple function.

https://godbolt.org/g/hfpHz7

  int isEven(int number)
  {
      return number & 1;
  }
gcc 7.3, -O1, -O2 and -O3 (identical codegen for all optimization levels):

  isEven(int):
    mov eax, edi
    and eax, 1
    ret


that function checks if a number is odd, not even.

https://godbolt.org/g/7Khcg9


My bad. Didn't even check it at any point.

Anyway, you still prove my point. All three are pretty close.

The -O1 sete instruction is very slightly suboptimal, but even that costs just a fraction of a clock cycle.


I highly doubt it. That’s 1:1

  test eax, 1
  ; use result


gcc -O2, clang, and icc all produce the sequence:

    is_even(int):
      mov eax, edi
      not eax
      and eax, 1
      ret
I am far from an x86 expert, but I assume this version avoids contention on the flags register and increases pipelining.

edit: my mistake, icc and clang produce the equivalent sequence "not edi; and edi, 1; mov eax, edi; ret".


This is a question I ask people who have already been hired, out of curiosity, and I explicitly say not to use modulus. I ask a similar question in interviews but don't exclude modulus, and most answer using modulus.


Well, you need to use modulus if you want to be 100% portable.


always been fond of `even x = x == ((x >> 1) << 1)`


Come on, really? "Knowing the very basics of binary numbers" isn't some obscure super-technical skill when it comes to programming, it's fairly fundamental. If they don't know that, then how can they possibly know how to use bit-masks, for instance?

Like, I get it when people say "most of the stuff you learn getting a CS degree has very limited usefulness day to day". Sure, you're not going to be implementing red-black trees or skip lists in your day-to-day, but you can't possibly claim that having a basic idea of how the bitwise operators work (or the details on how numbers are represented in binary) is in the same category.


>how can they possibly know how to use bit-masks, for instance?

Given that most programming jobs are web development jobs (I think, I don't have numbers to back that up), what makes using bit-masks important? That is, why would someone working on a web application need to know how to use them?


So that they understand how a network mask works? I'm assuming if you're working on the web you ought to know something about networks, right?


Why would a regular web developer need to know anything about network masks? Those come up in administration, but generally not in development.


I might be asking a bit much here, but I generally expect someone who's developing something that's going to run on the Internet to have some understanding of how the Internet works. Like... you don't need to understand the intricacies of BGP, but some idea of how a request from your web browser leaves your home network and comes back is strongly appreciated.

My biggest reasoning for desiring that skill is debugging. If you build a system and it doesn't work right for whatever reason, it's good to understand why that might be. Also, having some appreciation for that stuff really helps you make good decisions (but why can't I make 200 HTTP requests while loading my page?!)


Most programmers nowadays just write frontend javascript and have no idea how anything actually works.


There are relatively few programming jobs that actually require bit manipulation at that level. I've been deep into bit manipulation twice in my career (well more than 10 years now): network programming for a real time system and game development.

For the machine learning and infrastructure work I've done the last 5 years or so I haven't had to do even a single bitmask, shift, etc. directly. Of course we use these things liberally (e.g. hash functions) but it's hardly necessary to write the code itself very often, even in very sophisticated applications.


Huh - I do ML and a lot of bit twiddling with trained feature detector outputs - intake goes into a stage, feature detectors light up, and feature combos get fed to subsequent input via lots of bit banging - can't afford for it to be slow in the middle of that pipeline!


That happens. Like with bit twiddling in general computing it's not necessary in many applications.


Programming used to be mostly bit twiddling. The universe has expanded considerably, but if you're still in that original corner it's really hard not to see it as a basic skill.

Javascript only has doubles so you can't even bit twiddle if you wanted to - C programmers can't even imagine.


Javascript stores integers in two's complement.


> JavaScript numbers are always stored as double precision floating point numbers, following the international IEEE 754 standard.

It doesn't have integers to store.

https://www.w3schools.com/js/js_numbers.asp


That quote describes a good mental model for someone learning JavaScript, but not necessarily the physical truth.

JavaScript's numbers can do everything doubles can; but if the compiler can prove that all values that a particular variable could possibly take on will fit in a smaller type, then it can use that. So a variable really can end up stored as an integer--or never stored anywhere for that matter, like if a loop gets unrolled and the counter is optimized out.

The bitwise operators behave as if integers are 32-bit two's complement. The typed arrays are also two's complement. A smart developer can write code that a smart compiler will compile to bitwise operations on actual physical 32-bit integers.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

https://v8project.blogspot.com/2017/09/elements-kinds-in-v8....


That's an extremely dangerous assumption because of the implicit rounding and precision loss inherent in the underlying double type. An interpreter "may" use an integer under the hood but it still has to behave as if it were an IEEE double, that includes all the nasty things you don't want in integer math.

Most C programmers would be too scared to perform bitwise operations on doubles and JS doesn't add anything that makes it safer.


Can you give an example of math that would be exact with 32-bit integers, but is inexact in JavaScript? Floating-point math on values that happen to be integers is exact, if an exact answer exists and fits within the mantissa.


I think we're approaching this from philosophically different directions. On your end, if the programmer controls everything correctly then it's safe, because doubles have a 53-bit significand.

The problem happens when you treat numbers one way and they are treated differently elsewhere. For example, -0 when undergoing a bitwise +/- check (val & (1U<<31)) will appear positive when it's negative by definition.

The cast to an integer prior to the bitwise operator loses information. You can use the operators safely if you know there is no information to be lost, but it is not type checked like with C. I will say at least JavaScript guarantees two's complement. You never know when your favourite compiler will decide to break your code in the name of a faster benchmark - correctness of existing programs be damned. Implementation defined used to mean do something sane, but I digress.


The trick is to let your "double" only take on values that are also valid 32-bit integers, like with lots of gratuitous (x|0) and stuff. That excludes your (indeed problematic) -0. JavaScript's semantics give you enough to implement 32-bit integer math this way pretty easily. The speed may vary dramatically with small changes to the compiler, but correctness won't.


I still stick by my original point that:

((int)dblval) & flag

would require serious justification in a C code review. In JS it's all you have. Under those restrictions I'd use it as rarely as possible. But yes, you can use it correctly if you try.


That's certainly true. The only reason to play these games is that JavaScript gives you no better option.


In my career I have never had to know any binary number. I am on the application side of things though and try to avoid working with infrastructure at all, so I don't know about them.


In the computer games industry, or at least at the companies I have worked at in the UK, these would be part of a test given either remotely or immediately on arrival, and failing to answer correctly would make it unlikely you'd proceed any further.

In the c/c++ game code bases I have worked on, it is very common to have statuses stored as a uint with each bit representing something and very common to set, unset, toggle, reverse bits etc.

I actually read the first part of the article thinking these aren't hacks, these are just things people use day in, day out. But the later section about setting the last 0 or 1 bit etc. suddenly became things I wouldn't typically see in the wild.
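A minimal C sketch of that pattern, for reference; the flag names here are made up for illustration:

    #include <stdint.h>

    enum {
        STATUS_VISIBLE = 1u << 0,
        STATUS_ON_FIRE = 1u << 1,
        STATUS_STUNNED = 1u << 2,
    };

    /* set, unset, toggle and test a flag in a status word */
    static inline void set_flag(uint32_t *s, uint32_t f)    { *s |= f;  }
    static inline void clear_flag(uint32_t *s, uint32_t f)  { *s &= ~f; }
    static inline void toggle_flag(uint32_t *s, uint32_t f) { *s ^= f;  }
    static inline int  has_flag(uint32_t s, uint32_t f)     { return (s & f) != 0; }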


At least in C++, wouldn't you just write these "hacks" only once, in inline methods of a BitField class? It's much more likely to mistype

  state = state & ~BIT7;
than to mistype

  state = state.without(BIT7);


As daemin says, no, I am used to working with people that find basic bit field manipulation easy to work with[1], so there isn't any confusion.

Plus there are performance concerns. I remember one graduate who introduced a C++ bit field class, tried to replace some core functionality with it, and killed performance. Developers prefer to debug non-optimized builds, and non-optimized builds can end up running slow when you replace simple one-liners with classes and inline methods that don't get optimized in debug builds.

[1] I hope this doesn't sound aggressive as it really isn't intended to be. It is just that everyone is so used to seeing bitwise operations that it is as straightforward as reading "set the third bit" or "toggle the last bit" when we see the C code.


Would you create a wrapper class and member function to make addition less likely to mistype?

num = num.plus(x);

might be less likely to mistype than

num = num + x;

but nobody would ever do that. Once you've done bitwise operators a few times they become just as comfortable and natural as addition or subtraction.


Nope, people actually just write the bit setting code. It's simpler and also I haven't found a good generic bit flags class that works as you intend.


What about C++ bitfields? Not a library but a language extension. A lot of code I used has migrated to that.


Normal C++ bitfields? Sometimes you get that inside classes in the form of "bool x : 1; bool y : 1;" etc. But that's much harder to use as far as function parameters go, so old school bitmasks are used for that.


I'd only do that if I could force every compiler to _always_ inline those methods. The cost of a method call for a simple bit operation is orders of magnitude greater than just doing the operation itself (tens to hundreds of ops versus 1-5).


Only a month or two ago I had to debug a data roundtripping issue with our Hadoop cluster. Decimals being sent in came back as apparently completely random numbers.

More experimentation revealed it was only negative numbers that were being messed up (the original tests passed with positive numbers).

Further bitwise investigation and manual twos-complement calculation, and I discovered the problem to be a mismatch in fixed point precision, where negative numbers weren't being sign extended correctly.

It's not every day you rely on your binary calculation skills, but it does come up.


How do you phrase the question? Because surely people you hired should have been able to answer it using modulus?


See my note. I don't ask the question in interviews. I think it is useless, they don't need it for their job. I do ask a variant of fizz buzz that requires using modulus. They don't get hired if they can't answer that one. So this subset of people I'm talking about are ones who do know modulus.


I ask it sometimes too - my favorite response opened with 'you want maintainability or speed?' So I picked speed. They immediately set the first branch at mod 3 with the obvious justification - hired on the spot :-)


How often are bloggers really aware of what we MUST know or do?


This was literally part of my first programming assignment in college. Can people really not do this? Isn't fizzbuzz considered too obvious these days? I feel like I must be misunderstanding...


You miss the point. I learned all these tricks in school as well. I loved them. I felt "smart". After working for so many years, I only remember the basic ones. If you ask me for the "smart" tricks, I can't get them right out of my head. I can sit down with a piece of paper and may get them after some trial and error. Guess what, the important thing is I know these "smart" tricks exist. If I need them in real work, I can google them easily. Why do I need to remember them? In my real work, what I need to get right out of my mind is how our systems work, how MySQL/Postgres/Hadoop/HBase/Kafka/Spark/Linux/AWS/... work, etc.


After working for so many years, I only remember the basic ones.

Several of the examples in the article are the basic ones, though: test, set, clear or toggle a bit. You might not remember the trick for finding the least significant 1 bit, but it's not as if testing whether a number is odd or even is some sort of arcane knowledge. Anyone who has to resort to Google or Stack Overflow for something so simple is going to struggle in large parts of the programming industry.


Agreed. I had to do this first semester freshman year and again in my embedded systems courses. But a very large portion of web developers never went to college or only have a 2-year degree (and, sorry, but most 2-year "Computer Science" degrees are glorified, really expensive bootcamps).


They couldn't offer an answer at all, or they offered an answer that didn't involve bitwise AND?


They couldn't offer an answer using bitwise operators. They knew mod 2.

Most web developers (front and backend) will never use bitwise and many don't have college degrees so they didn't use it in college. It is actually quite understandable that they won't.

I've used bitwise here and there in backend code in my 20+ years of web dev but never in a case where it couldn't be rewritten as slightly less efficient but more readable code.


I would think a "if (x % 2 == 1) //odd" would be within the realm of most CS applicants.


If x is a signed type, and negative, are you sure the modulus is positive 1?

Since ISO C99, C's division operator has been required to truncate toward zero. The % operator harmonizes with that and so it yields negative remainders south of zero.

In C90, it was implementation-defined, IIRC.

You really just want to test for zero/nonzero, rather than x % 2 == 1.
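A quick sketch of the pitfall and the robust zero/nonzero test:

    #include <stdio.h>

    int main(void)
    {
        int x = -3;
        /* Since C99, division truncates toward zero, so -3 % 2 == -1
           and the test (x % 2 == 1) wrongly reports -3 as not odd. */
        printf("%d %% 2 == %d\n", x, x % 2);   /* prints -3 % 2 == -1 */
        printf("odd? %d\n", x % 2 != 0);       /* robust: test nonzero */
        return 0;
    }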


Good point. Interestingly enough, Python seems to handle this appropriately:

    >>> -4%2
    0
    >>> -8%2
    0
    >>> -9%2
    1


Yeah, Python is one of the few languages which actually defines sane behavior for integer division and modulus rather than leaving it up to the hardware. I wish more languages would follow its example!


Prolog is also like this:

    ?- X is -4 mod 2.
    X = 0.
    ?- X is -8 mod 2.
    X = 0. 
    ?- X is -9 mod 2.
    X = 1.
Using declarative integer arithmetic, we can even express very generally that a number X is a multiple of 2:

    ?- X #= 2*_.
This is true iff there is any integer (indicated by an anonymous variable _) that, multiplied by 2, yields X. It works for positive and negative integers.

For example:

    ?- -4 #= 2*_.
    true.
    ?- -9 #= 2*_.
    false.
A very nice reference on the general topic of bitwise operations is Donald Knuth's Bitwise Tricks and Techniques in Volume 4 of The Art of Computer Programming.


ISO C doesn't leave it up to the hardware. Integer division is required to truncate toward zero; the remainder harmonizes with that. (If someone designs a new HLL which leaves some arithmetic to the hardware that C doesn't, that is indeed silly.)

The sane language here is Common Lisp: it does it in all the ways rather than picking one approach with some hand-waving justification about why it's the good choice.

It has truncate, round, ceiling and floor functions that all return a quotient and remainder.

There are also mod and rem functions that calculate just the remainder from a floor and truncate, respectively.


Guido has written a blog post explaining "Why Python's Integer Division Floors" [1].

Python also uses floor division, and the corresponding "sign-of-modulo matches sign-of-divisor", for floating-point numbers. The blog post claims that this can be problematic in certain cases (compared to C's "sign-of-modulo matches sign-of-dividend"), but does not give any concrete examples.

[1] http://python-history.blogspot.com.au/2010/08/why-pythons-in...


In C# this would be wrong

     (-123 % 2 == 1).Dump();
Prints False.

% in C# is the remainder operator, not canonical modulus.


Javascript as well:

    >> -2 % 2
    -0
    >> -3 % 2
    -1
Edit: well, at least in Firefox JS. I have no idea if that's actually standard-mandated behaviour or if it's an implementation quirk. The -0 is especially weird (I had written odd but...)


This is pretty interesting. I don't use js but I took a quick look and apparently (x % 2) == 0 as a test for even integers should still work properly there, since a comparison of -0 and 0 will consider them equivalent.


Well that's not confusing at all. Why would C# change the definition of a common operator character?


I think you would be surprised if you started looking around to see how % is handled.


> I would think a "if (x % 2 == 1) //odd" would be within the realm of most CS applicants.

Yeah, if they can pass a fizzbuzz question they can probably get that. At which point it behooves the interviewer to explain why (x % 2) == 0 to find an even number is a better test...


I've been finding more & more in new hires that what they know coming in sometimes isn't as important as their willingness to continue learning. If they admit they don't know how to answer a question but are willing to give it a try, that's usually a good sign.


If what you say is factual I'm astonished at the level of incompetence.


Using the word incompetence is ridiculous. An incompetent person can't do their job and I can tell you all these people were very competent at their job. They just didn't ever have the need to learn binary math. Bitwise math is not needed for a huge portion of software.

That's like saying someone is incompetent because their entire career they have only spoken English and someone writes "et al." to them and they don't know what it means. Sure, English is based on Latin. Do most English speakers know enough Latin to know what "et al." means? Probably. Does that make the person incompetent at their job where they only need to speak English? Definitely not.

A large portion of web developers don't have any formal education and the bootcamps and websites don't teach this stuff. As someone who has a computer science degree it seems absurd, one might even say it sounds like incompetence, but it is actually completely understandable.

Most of them can come up with the modulus solution but the bitwise operator solution is less common.

To the other comment in reply to you that talks about Amazon... yeah, that is Amazon... I work for a company with 1/100,000,000 the budget of Amazon (wild probably inaccurate guess). We don't get the people Amazon attracts. We get the people who all those other companies turned down. But you know what? With extremely few exceptions all of them have been great software developers despite not knowing how to tell if a number is even or odd using bitwise.


I interviewed many tens of candidates at Amazon and I never met one who didn't know the very basics of binary numbering.


This is useful to know for counting the bits set in a word.


Which isn't super helpful, because there is a GCC __builtin_popcount (https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html), a POPCNT intrinsic if you're developing for SSE4, and existing implementations (https://en.wikipedia.org/wiki/Hamming_weight#Efficient_imple...) you can look up at any time if you don't want to reinvent the wheel.


That's cool. I probably shouldn't have said useful and instead said interesting. My interest in these kind of operations is to learn lower level theory, not to produce production grade bit counting/twiddling code. In that sense, the operation is interesting (to me).


The “clear the bottom bit” hack is pretty cool. It’s sometimes faster to roll a loop that uses it than to invoke the GCC popcnt intrinsic. If you pass a -march parameter that lets GCC emit the intended Intel assembly instruction, then the intrinsic seems to always win (as expected!)
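A sketch of that loop (Kernighan's population count), which does one iteration per set bit and so can do well on sparse words; the function name is made up:

    static int popcount_loop(unsigned x)
    {
        int count = 0;
        while (x) {
            x &= x - 1;   /* clear the lowest set bit */
            ++count;
        }
        return count;
    }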


The following is my go-to for bit hacks

https://graphics.stanford.edu/~seander/bithacks.html


Yes! This and only this! The posted "bit hacks" are little more than bit level manipulations exactly as intended. The Stanford hacks are truly excellent. My favourite is: https://graphics.stanford.edu/~seander/bithacks.html#RoundUp...


That's really clever, but extremely inefficient. Something like 1 << (sizeof x - __builtin_clz(x)) should be way faster.

Edit: you obviously have to multiply by 8 to convert bytes to bits
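A hedged sketch of that idea, assuming GCC/Clang builtins and a 32-bit unsigned int; __builtin_clz(0) is undefined, so 0 and 1 are special-cased, and the x - 1 keeps exact powers of two unchanged:

    #include <stdint.h>

    static uint32_t next_pow2(uint32_t x)
    {
        if (x <= 1)
            return 1;
        /* note: overflows (shift by 32) if x > 1u << 31 */
        return 1u << (32 - __builtin_clz(x - 1));
    }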


By the unwritten conventions of bit hacks folks, 'builtin_clz' is not a primitive (and elsewhere in that same document you will find a brace of different ways of doing clz/ctz). Many of these documents are old and the techniques in them older still (dating back to HAKMEM).

You are definitely correct on most modern architectures which have a single operation count leading/trailing zeros that are typically no more expensive than an integer multiply. This has been latency = 3, throughput = 1 for most recent Intel arch.

I think a redo assuming modern architectures is long overdue; i.e. let's work off the basis that count leading/trailing zeros, PDEP/PEXT, etc are fast and cheap and start from there.

Bonus round: if you had a lot of these operations to do, then you still might be able to go faster on x86 on SSE2 or AVX2 by using this trick, as the penalty to pass a SIMD register full of integers to a floating point op or vice versa is just one cycle. This trick becomes obsolete yet again with AVX512, which introduces parallel count leading zeros.


And neither of the bit twiddling tricks is useful for ARM NEON, as bit operations in vector form are very limited... (plus there are pipeline stalls)

Also, multiply-adds can be fused, so if you're doing that it can be faster to just multiply by a different number instead of bit twiddling.


I'm not sure this is true. Many of the bit twiddling hacks can be used in NEON and they have a few unusual instructions I'm dying to play with.

I'm not sure which bit hack you're talking about that's done with a multiply or multiply-add. There's a nice use involving De Bruijn sequences for doing lg2 of a single bit that's very instructive - is that what you meant?


Of course, casting that int * to a float * is undefined behavior…


Of course.

When you first learn about 'cool hacks' like these, you'll want to use them everywhere.

Then you'll realize that they don't apply to certain data types in certain environments.

Then you'll come to realize that the rules governing which hacks to use are too complex to remember.

Then you'll want to write a program that figures it out for you.

Then you'll realize that those programs already exist, and they are called compilers.

So now you are back where you started. You have made no progress, but you have gained an important insight.


And from that, at some point you realize that the corollary to your #2 says that such hacks do apply to certain data types in certain environments. And then you've gained a very powerful tool to be used in specific circumstances.

There's a reason a lot of these potentially non-portable tricks show up in high-performance computing (particularly in videogames) and in the embedded space.


That reason being that sometimes spending 10+ hours on a tweak that gives a +0.1% performance improvement is worth it.

Otherwise, remember the first rule of code optimization for beginners: "Don't."

EDIT: And by "they don't apply", I mean they can crash your program or even silently corrupt your data.


Sometimes tweaks like these might only save 10-20 cycles, which in a vacuum doesn't sound like much, until you consider that before the tweak it started at 26 cycles and is now 6 cycles, and it's called 50 million times every second, which is a savings of a billion cycles per second.

For games, this can mean a higher frame rate. For data processing, it can mean the data gets processed significantly faster. It's especially important for things like video encoders, where even a 1% savings in CPU time can translate directly to 1% more potential revenue for the same operating cost at a company that relies heavily on video encoding.

Yeah, saving those cycles doesn't really mean anything to a beginner, but they can be a huge savings for high performance computation applications.


Yep the correct way is:

    union { int i; float f; } u;
    u.i = *i;

Now u.f contains the same bit pattern as the integer pointed to by "i" but interpreted as a float.

Compilers know how to generate code efficiently out of this.

IIRC this is implementation-defined behavior, e.g. things like endianness, alignment or padding, but it's not undefined behavior, whose evil effects can bubble up in unexpected parts of the program that apparently don't even touch the bad cast.


No, that's also incorrect. The correct way is:

    float f;
    memcpy(&f, i, sizeof f);
Don't worry about the implied overhead of the call to memcpy; the compiler will replace it with an intrinsic and emit the expected movss instruction.


Accessing through a union is legal in C, no? I thought that this was only disallowed in C++.


Maybe? There is wording in the C standard (since C99 TC3 I think) that allows reading from non-active members of the union, but the wording is unclear and implementations have taken a very narrow interpretation of it (i.e. the access path needs to be explicitly through a union type and the actual object needs to be a union). See the GCC documentation on unions and aliasing for example.


It's explicitly legal in both C99 and C11, but C99 forgot to remove it from its master (non-normative) list of undefined behavior. It's up for debate whether or not it's legal in C++ (ISTR a C++0x draft that copied the C99 changes to unions, but then they added unrestricted unions and completely changed the verbiage to have almost no relation to the C99 text).


It mostly isn't legal, though gcc explicitly promises to support it and Microsoft's compiler doesn't even do the optimizations that would cause trouble.


Wow, never realized the pointer cast version is undefined behaviour in C. Is C++ equivalent via reinterpret_cast undefined behaviour too?

Back in my C++ days, I would much prefer pointer-cast to union-struct, because the latter was subject to the alignment/padding issues you mention, while for the former, I know of no desktop architecture that could mess up an int* -> float* cast.

I'm also doubly surprised because being able to do that stuff is what I'd consider the selling point of C/C++ over higher-level languages.


Pretty much all pointer casts where you would use reinterpret_cast in C++ are undefined behavior, in both C and C++. The underlying rule for strict aliasing is essentially the same in both languages (you can only access an lvalue via a pointer of its type, with possibly differing signedness, const, volatile, or via a char pointer, or via a field of a struct that has the same initial layout if you're in that same initial layout portion).

Yeah, C and C++ aren't the language most people think they are, and the strict aliasing rules are the most clear manifestation of that.


Trouble is, C was that language. Lots of people came to depend on this. The world needs a language with this ability.

The replacement is often "gcc with inline assembly" or "gcc with lots of __attribute__ extensions such as __may_alias__" or "gcc with optimizations disabled".

The need doesn't go away, despite the wishes of a language committee now dominated by compiler suppliers. From day 1, long before the language was standardized and even before the original (not ANSI) book was published, C was being used with aliasing. It was, after all, developed to support the creation of Unics (before that became UNIX).


reinterpret_cast in C++ has essentially the same rules, with exceptions made for const qualifiers and such, as well as void*, char*, and std::byte*. Use a memcpy instead.


Nope, it's undefined:

> Accessing an object by means of any other lvalue expression (other than unsigned char) is undefined behavior

https://wiki.sei.cmu.edu/confluence/display/c/EXP39-C.+Do+no...


I don't understand your comment. The link you pasted explicitly mentions the union example as a "compliant solution".

That said, memcpy with a compiler that understands the memcpy intrinsic is probably safer, though I have a vague memory of hitting an embedded compiler where memcpy wasn't an intrinsic.


This is technically undefined in C, but most compilers define it to work as expected as an extension to the language.


While that is a clever hack if you're looking for performance (which is where I've used these before), the float to int conversion normally hoses your pipeline (depending on platform obviously) and the solution below is much faster on any system I've worked with.


The first one (check if an integer is odd or even) is a hack, sure. But I can't think of anywhere you'd use it in the real world over (n % 2), which almost everyone will understand much more quickly.

The rest aren't hacks, they're bit manipulations, which are crucial for those who need them, and just about useless to those who don't.


Strangely enough, I often find my self impulsively writing (n & 1) to check for even/odd numbers. It just feels natural to me -- maybe because the first C programs I wrote were for µCs, where this kind of bit-twiddling tends to be used everywhere, and, correspondingly, people know what (n & 1) does. Sure enough, even on microcontrollers (n % 2) has no overhead compared to (n & 1) when optimisations are turned on.

It only ever caught my attention that writing (n & 1) is weird when some TAs looking at my code didn't understand what I was doing during my first college year.


It feels natural to me...but I also like writing emulators, so I've probably got more experience in common with µC devs than most other software developers do.


It feels natural to anyone who understands binary numbers. Every odd number in binary will have the last bit set to 1. Just like in the decimal system, every number that is a multiple of 10 will have 0 as its last digit. If you want to see if a number is a multiple of ten, I guess you could do a mod 10 of the number and see if the remainder is 0. But if you understand the decimal system, you would just naturally check if the last digit is 0.


Strangely, clang (but not gcc 7.2) with -O0 generates different code for the two.

https://godbolt.org/g/N8mTL6


why is this strange? -O0 is no optimization at all, so it isn't strange that % yields an integer division, right?

On -O3 they compile to the same code, both signed and unsigned.


For some reason gcc is still optimizing it though!


Try using unsigned int. I believe they are equal only for positive numbers.


Perhaps in some strange numbering system but not with two's complement... (to be clear, the return type is cast to bool).


With many compilers (e.g. clang, msvc) this will emit an actual division without optimization (and if the number is signed it's quite a lot of code). If such code is in an inner loop it can hit performance, which is already not great without the compiler's optimization. And if one has ever looked for a bug that takes more than a few seconds to reproduce, then the debug build's performance cannot be overvalued.

Though if this is a hack then checking the last digit to see if a decimal number is even is also a hack :)


Which brings us to: knowing when to use bit hacks is as important as knowing how to implement them. And you usually want to know about compilers for that.


The and operator is faster than modulo in most processors. Although, the compiler will likely optimize it away.


> But I can't think of anywhere you'd use it in the real world over (n % 2)

When your processor doesn't have built-in integer division and the corresponding compiler doesn't optimize that and you need it to work fast.


“Bithacks”, as they’re called, are great for high-performance, low-level development. For C++ developers of my variety, it’s an indispensable tool for writing fast code. Outside of that, it’s primarily a curiosity, but I still think it’s a useful exercise that improves understanding of how a machine and its binary logic work.


Well, you could use it while competing in the IOCCC:

https://www.ioccc.org/


Wait, what? Checking (un-)evenness with (n & 1) is considered obfuscation now? I know no C programmer that would even bat an eye at that one.


This stuff is now second nature to me after a couple decades hacking C, but boy if it was hard to find information on this stuff as a teenager in the early 90's.

It is easy to forget how much the internet, and the WWW in particular, has changed programming.


> It is easy to forget how much the internet, and the WWW in particular, has changed programming.

It really hasn't. It's just that web programming gets all the attention. Those who write hardware controllers or do embedded use bit manipulation every day.

But, really, if web devs aren't familiar with this, why not? Do they not have settings to toggle in their apps?


I'm sure many will disagree, but bitwise operators and operations should be common knowledge for programmers. Why? Because it is linked to understanding how data is stored and manipulated by computers. Computer scientists and Engineers generally understand these operators, hackers and "coders" do not. With this opinion, the title does not match the content.


Is this not already common knowledge? It's taught pretty extensively in one of the required lower div CS classes at my university, and it's usually in other schools' curriculums from what I've seen.


Somewhere around half of the candidates I interview have a tough time with hexadecimal, and definitely can't do bit manipulation without help.

It's kind of depressing, but I've adjusted how I ask my problem since it's not actually needed to demonstrate the skills I want to see. It doesn't take long to learn bit manipulation when it's needed, even though it seems like a foundation of computing to me.


There are a lot of computer programmers who never went to school.


So is calculus, yet apparently most CS students can't really integrate shit by the time they get to graduation...


Learned to integrate in high school, never used it again for years. Used it in like 1 out of 5 math classes at the start of university. Haven't needed it for a good... 13 years I guess?

Indistinguishable from "can't really integrate shit" without some hours of relearning.


Check out Hacker's Delight (2ed) by Henry S. Warren for more of this kind of stuff.


A lot more, if the Amazon preview is any guide. It covers this ground with just the first couple of pages of chapter 2.


It slightly disappointed me in that much of it assumes 'big' processors (e.g. fast hardware multiply).


What is this now, junior programming lessons?

You absolutely must know these because they're dead obvious and if you're just starting out with learning programming these are some of the first things you will/should learn, right after how computers store data.


While yes, these are pretty simple statements, most of them only being 2-3 operations, I think you might be interested to know that not every programming curriculum even covers "how computers store data". I know a fair few programmers who came from backgrounds that either never touched the difference between an integer and a float, or if they did, it was only to the point of "this one stores fractions, this one doesn't".

For example, no Python course for scientists is going to go in-depth about the difference between big and little endian - it just doesn't matter when you're trying to make a simple graphical model, it's far too low level. Binary operators work and are useful in JavaScript, but how many 6-week frontend bootcamps are going to cover them?

What I'm getting to here is that I don't think you can just dismiss things like this because your coding background covered them - there's some pretty clear gaps in a lot of peoples' knowledge and educating them is likely only a good thing.


> "this one stores fractions, this one doesn't"

This sounds like a fast trip to the land of direct comparisons of floating point numbers.

Further info: https://randomascii.wordpress.com/2012/02/25/comparing-float...


Isn't it sad that most of today's junior programming curricula favour copy+pasting tens of main()-method-only "applications" over explaining data types and some basic bit twiddling?


Here I thought this was going to be something like finding the next power of two[1].

Like a few other posters have mentioned, this is all basic bit manipulation stuff that you'd see in any perf sensitive/memory constrained codebase.

[1] https://graphics.stanford.edu/~seander/bithacks.html#RoundUp...


Also useful to generate subsets, since the number of possible subsets of a set of n elements is 2^n.

   for (int i = 0; i < (1 << n); i++) {
      ...
   }
(won't work for a large n)

Also... dividing by 2 is (n >> 1) and multiplying by 2 is (n << 1).
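A small sketch of using each i as a membership bitmask (the item list here is just for illustration):

    #include <stdio.h>

    int main(void)
    {
        const char *items[] = { "a", "b", "c" };
        int n = 3;
        for (int i = 0; i < (1 << n); i++) {     /* 2^n subsets */
            printf("{");
            for (int j = 0; j < n; j++)
                if (i & (1 << j))                /* is element j in subset i? */
                    printf(" %s", items[j]);
            printf(" }\n");
        }
        return 0;
    }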


looping in order of subset size can be done easily too: https://en.wikipedia.org/wiki/Combinatorial_number_system#Ap...


This assumes negatives are represented as two's-complement, but that's implementation-defined. Testing whether a negative number is even by ANDing it with 1 won't work in a system that uses one's-complement.


Which systems don't use two's complement?


Unisys apparently ships some emulators of an old system that still use one's complement... much as I hesitate to ever link to any ESR material, there's a good article here:

http://esr.ibiblio.org/?p=7413



A handful of ancient legacy systems. Very few people today need to worry about their webapp being ported over to UNIVAC.


If you like this, you should check out https://graphics.stanford.edu/~seander/bithacks.html as mentioned elsewhere and also the book Hacker's Delight.

I would love to see these updated for SIMD and modern architectures (which change the equation quite drastically for bit hacking with 1 uop PEXT/PDEP/LZCNT/TZCNT), but that makes for a pretty huge area.


See also: The Aggregate Magic Algorithms page

http://aggregate.org/MAGIC/

For example, Population Count instructions exist, but may actually be microcoded to count one bit per cycle.

Vectorizing your tallies can yield log time speedups:

Faster Population Counts Using AVX2 Instructions

https://arxiv.org/pdf/1611.07612.pdf


I'd recommend anyone who's into bit-level hacking to have a copy of Henry Warren's Hacker's Delight [1]

Many of those nifty tricks are found in production systems today.

[1] https://www.amazon.ca/Hackers-Delight-2nd-Henry-Warren/dp/03...


Amateur k user.

  /k3
  bh01:{*-1#(2 _vs x)&1}
  bh02:{(|(2 _vs x))y}
  bh03:{n:|(2 _vs x);n[y]+:1;:[x<0;-8#|n;|n]}
  bh04:{n:|(2 _vs x);n[y]-:1;:[x<0;-8#|n;|n]}
  bh05:{n:|x;:[n[y]=1;n[y]-:1;n[y]+:1];|n}
  bh06:{n:-1#& x;x[n]-:1;x}
  bh07:{n:-1#& x;m:& 8;m[n]:+1;m}
  bh08:{n:(*|& x)+1;m:n _!#x;x[m]:1;x}
  bh09:{n:|x;m:*&0 = n;n:& 8;n[m]:1;|n}
  bh10:{n:|x;m:*&0 = n;n[m]:1;|n}

  /examples:
  bh01 -98
  bh02[-133;5] 
  bh03[120;2] / 120 set 2nd bit
  bh03[-120;6] / -120 set 6th bit
  bh04[127;4] / 127 unset 4th bit
  bh05[0 1 1 1 0 1 0 1;5]
  bh06[0 1 0 1 0 1 1 1]
  bh07[1 0 1 1 1 1 0 0]
  bh08[0 1 0 1 0 0 0 0]
  bh09[1 0 1 1 1 1 0 0]
  bh10[1 0 1 1 1 1 0 0]
Appreciated the help last time e.g. use of dyad # (take), indexing into list via juxtaposition (bh02) and use of dyad ? (find).


bh01:{*|2 _vs x=1}


  > if (x & (1<<n)) {
  >   n-th bit is set
Yeah, don't get a habit to do that. Instead:

  if ((x >> n) & 1)
If x is a wider type than int and n is large enough, the result of (1<<n) is UB, and in practice the generated code will be buggy.
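A sketch of the failure mode and the safe forms, assuming a 32-bit int and a 64-bit x; the helper name is made up:

    #include <stdint.h>

    static int bit_is_set(uint64_t x, unsigned n)
    {
        /* (x & (1 << n)) would be UB for n >= 31 because 1 is an int;
           shifting the wide operand (or writing 1ULL << n) is safe. */
        return (x >> n) & 1;
    }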


- A little off-tangent maybe, but are there other professions where this kind of once important but today less important (as some claim) knowledge exists? Is this kind of "knowledge degradation" common in other industries?

- On a different note: how much memory bloat can be attributed to this low-level neglect? Which of course is partly caused by modern languages / programming paradigms abstracting away these low-level details for the sake of higher-level abstractions (which only seldom break, right ;)

- honorable exception: erlang bit-level pattern matching and all the power that comes with it. Btw. did any other language/lib even try to imitate that feature?


i dunno. only like 2 of these are “hacks”. the rest are sooo basic. and he doesn’t even have the classic swap 2 bytes/words (3 xors).
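For completeness, a sketch of that classic XOR swap (a plain temporary is usually at least as fast, and the trick breaks if both pointers refer to the same object unless you guard for it):

    void xor_swap(unsigned *a, unsigned *b)
    {
        if (a != b) {         /* XOR-swapping a value with itself zeroes it */
            *a ^= *b;
            *b ^= *a;
            *a ^= *b;
        }
    }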


Surprised I didn’t see this one on the lists.

Shifting left by n bits on a signed or unsigned binary number has the effect of multiplying it by 2^n. Shifting right by n bits on a two's complement signed binary number has the effect of dividing it by 2^n, but it always rounds down (towards negative infinity).
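A tiny sketch of that rounding difference (right-shifting a negative value is strictly implementation-defined in C, but arithmetic shift is the norm):

    #include <stdio.h>

    int main(void)
    {
        printf("%d\n", -7 / 2);    /* -3: division truncates toward zero  */
        printf("%d\n", -7 >> 1);   /* -4: shift rounds toward -infinity   */
        return 0;
    }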


Years ago I recall reading a SIGPLAN paper where someone was trying to create a programming language where they represented bitwise operations and bitfields without primitive types.

For the purposes of doing bit packing and other manipulations, they treated the data as an array of booleans.


Hardware languages such as Verilog work this way natively.


Maybe I'm just off in my own little world but this seems a bit table stakes. Am I out of touch?


No.


While I am with you and the GP, other responses in this thread make it sound like we are in fact out of touch. Or grumpy old greybeards now... Kids these days...


I suspect part of this is that hardly anybody that gets into programming professionally has any experience with digital logic, whereas the two used to go hand-in-hand.

But the level of this post is so low as to be next to useless, I clicked fully expecting to be amazed. The title could definitely do with an upgrade.


This is the most clickbait programming article title I've ever seen.


Agreed. I've flagged it.


As a Python developer who recently wrote a pure Python ECDSA/bitcoin library, these are very helpful. However performance is not always better. For example I had this function:

    def point_mul(p, d):
        n = p
        q = None
    
        for i in range(256):
            if d & (1 << i):
                if q is None:
                    q = n
                else:
                    q = point_add(q, n)
    
            n = point_add(n, n)
        return q
but when I changed the loop to this:

    for i in reversed(format(d, 'b')):
        if i == '1':
I found out it performed much better.


I 100% expect you're correct in your assertion, but here's the big question: do you know why the latter is faster than the former?


Python uses arbitrary-precision integer representation, so you don't work with bits directly.


>y = x & (x-1)

Couldn't you also do: y = x & (~1) ? I'd think that would be slightly more efficient.


These aren't equivalent, though. The example from the linked article turns off the least significant 1 bit in x. Your version turns off the least significant bit in x, regardless of whether it is 1 or 0. They'll give the same result if x ends in a 1 or if x=0, but not in general.


No, that unsets bit #0, not the lowest set bit in x.

Aka. that brings 01101₂→01100₂ correctly, but 01100₂→01100₂ instead of 01100₂→01000₂.


Ah, I completely misunderstood what the author was doing there.


I can't imagine any use for #6 on.


It's useful when you want to find the next free slot in some contiguous store. Normally you'd use ctz, but sometimes you only want 1<<ctz. An example is n-queens, eg.

https://codereview.stackexchange.com/questions/75517/n-queen...


How often is this doing work that the compiler is going to do for you anyway?


I'm not sure I understand the question. The first one about detecting if x is odd or even is usually what the compiler will optimize the more common (x%2) == 0 to.

Other than that, these are things that you have to do yourself. You can't call compiler.setMySeventhBit();

If you need to be able to flip and check bits, you have to do it yourself.


How are these hacks? This is standard bit masking/manipulation. Every CS student learns it and The C Programming Language has examples of it.


Nice, but hard to read, and the compiler does it anyway.



