A discussion of systems programming optimization that doesn't start with single-writer ring buffers starts on the wrong foot.
Those other tricks are excellent, and I use all of them where they apply. But, e.g., seeking a way not to need a lock at all should come before discovering a low-contention locking protocol.
Readers should note that packing spare bits into the bottom bits of suitably aligned pointers is more portable than using high bits. Any page-aligned pointer has at least 12 bits free at the bottom, and any malloc'd pointer has at least 2, more often 4.
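[A minimal sketch of low-bit tagging, assuming a pointee type aligned to at least 4 bytes; the function names are illustrative:]

```c++
#include <cassert>
#include <cstdint>

// Stash a 2-bit tag in the low bits of a pointer whose type guarantees
// at least 4-byte alignment, so those bits are always zero to begin with.
template <typename T>
uintptr_t tag_ptr(T* p, unsigned tag) {
    static_assert(alignof(T) >= 4, "need two spare low bits");
    assert(tag < 4);
    return reinterpret_cast<uintptr_t>(p) | tag;
}

template <typename T>
T* untag_ptr(uintptr_t v) {
    return reinterpret_cast<T*>(v & ~uintptr_t{3});  // clear the tag bits
}

unsigned tag_of(uintptr_t v) { return unsigned(v & 3); }
```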
Counters into a power-of-2 sized ring buffer can be incremented without bound, enabling ordinary arithmetic on them, with the high bits masked off cheaply on each use. [But use 64 bits!]
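[A sketch of such a ring, single-writer and single-reader, with 64-bit counters that only ever increase; the names are mine:]

```c++
#include <atomic>
#include <cstddef>
#include <cstdint>

// Head and tail count forever; 64 bits will not wrap in practice.
// Because N is a power of 2, indexing is a single cheap mask, and
// full/empty tests are ordinary subtraction on the raw counters.
template <typename T, std::size_t N>
struct SpscRing {
    static_assert((N & (N - 1)) == 0, "N must be a power of 2");
    T slots[N];
    std::atomic<uint64_t> head{0};  // next slot to write (writer owns)
    std::atomic<uint64_t> tail{0};  // next slot to read (reader owns)

    bool push(const T& v) {  // called only by the writer thread
        uint64_t h = head.load(std::memory_order_relaxed);
        if (h - tail.load(std::memory_order_acquire) == N) return false;  // full
        slots[h & (N - 1)] = v;          // mask high bits only on use
        head.store(h + 1, std::memory_order_release);
        return true;
    }
    bool pop(T& out) {  // called only by the reader thread
        uint64_t t = tail.load(std::memory_order_relaxed);
        if (head.load(std::memory_order_acquire) == t) return false;  // empty
        out = slots[t & (N - 1)];
        tail.store(t + 1, std::memory_order_release);
        return true;
    }
};
```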
Probably the most neglected primitive data structure is the bitmapped set. A `uint32_t` gives you a universe of 32 elements; a byte is enough for the days of the week. The native popcount instruction is very valuable here, usually spelled `__builtin_popcount` in source code. C++98's `std::bitset` provides Standard, portable access to it via `count()`, and C++20 offers `std::popcount` directly.
[I add here that storing things in the high bits of addresses is very likely to fail on systems with ASLR, and that I have learned that MSVC's `std::bitset` has a very slow popcount.]
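[To make the days-of-the-week example concrete, a tiny sketch using C++20's `std::popcount`; the bit assignments are arbitrary:]

```c++
#include <bit>      // std::popcount (C++20)
#include <cstdint>

enum : uint8_t { Mon = 1 << 0, Tue = 1 << 1, Wed = 1 << 2, Thu = 1 << 3,
                 Fri = 1 << 4, Sat = 1 << 5, Sun = 1 << 6 };

uint8_t open_days = Mon | Wed | Fri;             // build the set
bool open_wed    = (open_days & Wed) != 0;       // membership test
int  days_open   = std::popcount(open_days);     // cardinality: 3
```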
std::bitset is one. But you can just use an unsigned int, or an array of them. In C++ or C, set intersection is &, union is |, complement is ~. Cardinality is __builtin_popcount, or std::bitset<>::count(). The membership test for m is (s & (1u << m)). For a larger set, represented as an array of unsigned, it's (s[m>>5] & (1u << (m & 0x1f))). std::bitset encapsulates all this, optimally. There are similarly efficient implementations of many other useful operations: consult HAKMEM.
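[A sketch of the array-of-words representation just described, for a universe of 256 elements; `Set256` and its members are illustrative names, and `__builtin_popcount` assumes GCC or Clang:]

```c++
#include <cstdint>

struct Set256 {
    uint32_t w[8] = {};  // 8 x 32 bits; element m lives at word m>>5, bit m&0x1f

    void insert(unsigned m)         { w[m >> 5] |= 1u << (m & 0x1f); }
    bool contains(unsigned m) const { return w[m >> 5] & (1u << (m & 0x1f)); }

    Set256 operator&(const Set256& o) const {  // intersection, word by word
        Set256 r;
        for (int i = 0; i < 8; ++i) r.w[i] = w[i] & o.w[i];
        return r;
    }
    int count() const {  // cardinality
        int n = 0;
        for (uint32_t x : w) n += __builtin_popcount(x);
        return n;
    }
};
// std::bitset<256> packages the same operations, portably.
```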
Bitsets are useful in more places than you might guess, because they can filter out, typically with a single instruction, the majority of uninteresting cases in a search, leaving fewer cases to be examined with more precise and expensive operations. You record interesting properties of each element of a collection, at insertion time, as bits in a word stored alongside it (or, better, in an index), and then check those bits first during any search. It is easy to speed up an amazing variety of programs by an order of magnitude, sometimes much more, with a tiny change.
For example, you can store an int value for each of a (possibly very large) collection of lines of text, representing the set of less-common letters that appear in the line. When searching for lines that contain a given string, you first check each line's set against the string's set: any line that doesn't have all of the string's letters certainly doesn't contain the string.
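[A sketch of that filter, assuming for simplicity a bit per lowercase letter rather than only the less-common ones; all names are illustrative:]

```c++
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// One bit per letter 'a'..'z', computed once per line at insertion time.
uint32_t letter_set(std::string_view s) {
    uint32_t bits = 0;
    for (char c : s)
        if (c >= 'a' && c <= 'z') bits |= 1u << (c - 'a');
    return bits;
}

struct Line { std::string text; uint32_t bits; };  // bits = letter_set(text)

std::vector<const Line*> find_lines(const std::vector<Line>& lines,
                                    std::string_view needle) {
    std::vector<const Line*> hits;
    const uint32_t need = letter_set(needle);
    for (const Line& ln : lines)
        if ((ln.bits & need) == need &&                  // cheap filter first
            ln.text.find(needle) != std::string::npos)   // precise, costly check
            hits.push_back(&ln);
    return hits;
}
```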
Leibniz (inventor of calculus) used this method very heavily in his own work. Before computers--and even into the 1960s--it was the most important way of automating data processing. Back then, you used cards that had a row of either holes or notches along one edge, and selected matching cards from a stack by inserting a rod through the stack at a selected spot, and lifting.
>Leibniz (inventor of calculus) used this method very heavily in his own work. Before computers--and even into the 1960s--it was the most important way of automating data processing. Back then, you used cards that had a row of either holes or notches along one edge, and selected matching cards from a stack by inserting a rod through the stack at a selected spot, and lifting.
Hey, do you have a reference for that? I've been doing some research into Leibniz's calculators, and I've been finding few sources.
There is quite a lot about Leibniz's calculating machine designs on Wikipedia.
I think I found out about Leibniz's bitwise activities in Neal Stephenson's Quicksilver, but he invented the modern notions of both sets and digital logic, according to Wikipedia. He would have used the cards in cataloguing libraries.
Yup, I used bit sets to write a super duper fast Sudoku solver. For example, you can detect naked singles extremely fast by just and-ing the bitsets for a given row, column, and house, taking the complement, and checking whether exactly one bit is set.
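[A sketch of that test, assuming the masks track digits already placed, so a cell's candidates are the complement of their union (by De Morgan, the same as ANDing the complements):]

```c++
#include <cstdint>

constexpr uint32_t ALL = 0x1FF;  // bits 0..8 stand for digits 1..9

// Candidates for a cell, given the used-digit masks of its row, column, and box.
uint32_t candidates(uint32_t row_used, uint32_t col_used, uint32_t box_used) {
    return ~(row_used | col_used | box_used) & ALL;
}

// A naked single is a candidate set with exactly one bit set.
bool naked_single(uint32_t cands, int& digit) {
    if (cands != 0 && (cands & (cands - 1)) == 0) {  // power of two?
        digit = __builtin_ctz(cands) + 1;            // index of the lone bit
        return true;
    }
    return false;
}
```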
But bitsets used to be essential for automating chess. A 64-bit word represents a board, and a set of words represents the positions of each type of piece: one word for white pawns, etc. Another word shows the places threatened by the white pawns, and ORing the threat words for all the pieces shows every threatened place.
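[A sketch of the pawn-threat word, assuming one common mapping, bit 0 = a1 up through bit 63 = h8; the file masks stop shifts from wrapping between files:]

```c++
#include <cstdint>

constexpr uint64_t FILE_A = 0x0101010101010101ULL;
constexpr uint64_t FILE_H = 0x8080808080808080ULL;

// Squares attacked by white pawns: each pawn threatens up-left and up-right.
uint64_t white_pawn_attacks(uint64_t white_pawns) {
    return ((white_pawns << 7) & ~FILE_H)   // up-left; a-file wraps to h: mask it off
         | ((white_pawns << 9) & ~FILE_A);  // up-right; h-file wraps to a: mask it off
}
```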