IMHO it would be better to capitalize the OTG structure field names according to the standard. This would turn "avalidoven" into "AvalidOvEn", which makes much more sense: it's the "A-side session valid override enable bit", which allows the software to override the "Avalid" signal with the value from AvalidOvVal.
The post discusses bit-fields in the context of implementing some part of the USB OTG specification. The author makes a joke about USB having bits for “a valid oven”.
Ah, Ada. Reminded me of my high school years playing with Turbo Pascal before moving to C/C++. First concurrent language I ever worked with. Thought that feature was brilliant, so much nicer than pthreads. I took a small threaded C project of mine, replaced the pthreads code with Ada, and linked the C logic to it using GNAT/GCC. Then along came Go, et al., and my interest in Ada unfortunately faded.
I'm actually looking forward to the next few years, as AdaCore seems to be working on memory ownership-based guarantees à la Rust (at least partially), and this might make Ada (or SPARK in a first pass) a good candidate for truly fearless concurrency :-D.
I recall that Herb Sutter's C++ Metaclasses proposal has an example of a bitfield metaclass [1]. Zig's design is reasonable if it also makes use of those non-2^k-bit integer types in other places, and indeed it does (e.g. alignment is commonly `u29`). However, if that is not doable, I think Sutter's design (an auto-generated overlay on normal integers) makes more sense.
I have several questions. If I have two bool bit fields next to each other, how are they ordered within a given byte? Even a field of fewer than 8 bits can span a byte boundary. Do the low bits go into the next byte, or the high bits? The blog post only mentions fields greater than 8 bits.
If a field greater than 8 bits is byte-aligned, it is represented in memory with the endianness of the host. If the field is not byte-aligned, it is represented in memory in big-endian.
So is a 12-bit field byte-aligned if it starts at the "first" bit of a byte (whatever that is)? Is there a way to manually specify the endianness of a field? There are certainly cases where a byte-aligned big-endian field is needed on a little-endian host.
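To make the ambiguity concrete, here is a small Python sketch of the two obvious conventions for a 12-bit field followed by a 4-bit field. It is only an illustration of why the question matters; `pack_fields` is a made-up helper and neither branch is claimed to be what Zig actually does.

```python
def pack_fields(fields, msb_first):
    """Pack (value, width) pairs into bytes under one of two bit-order conventions."""
    total = sum(width for _, width in fields)
    assert total % 8 == 0, "this sketch only handles whole bytes"
    acc = 0
    pos = 0
    for value, width in fields:
        assert 0 <= value < (1 << width)
        if msb_first:
            # Earlier fields end up in the higher-order bits.
            acc = (acc << width) | value
        else:
            # Earlier fields end up in the lower-order bits.
            acc |= value << pos
        pos += width
    return acc.to_bytes(total // 8, "big" if msb_first else "little")


fields = [(0xABC, 12), (0xD, 4)]
print(pack_fields(fields, msb_first=True).hex())   # abcd
print(pack_fields(fields, msb_first=False).hex())  # bcda
```

Same fields, same values, two different byte sequences, which is why a spec has to say which one it means.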
I also don't like that keeping track of the fields' address happens in comments. It would be great if there were a language element that would specify the address of a field inside a packed struct, so the compiler could verify its correctness.
Yes, well, now that Rust is aiming for first class support for embedded by the end of 2018, I expect there to be more pressure to find a good solution for bitfields. For that target audience, many will consider the current situation a show stopper.
Bitfield support is far more useful for emulators and for dealing with binary file formats and binary network protocols.
Embedded stuff typically means kernels and drivers. For that, the bitfields are in hardware registers. This is troublesome because the hardware might not act like normal RAM. Issues you will encounter:
1. The hardware demands a particular access size. For example, a nice big array of 4-bit values may need to be accessed only by 16-bit operations because it is on a 16-bit data bus that lacks byte enable lines. Another example is that different access sizes may go to different registers.
2. The hardware takes an abnormal action when accessed. Reading and writing might go to separate registers. Reading could fetch a byte from a FIFO, so the language/compiler is not free to do speculative or repeated reads. Writing could output to a FIFO. Reading or writing could clear bits, possibly in a different register, or initiate a coprocessor action.
3. The bus might allow reordering that can affect the device. For example, the bus might cancel operations that seem to be superseded by later operations (write after write, or read after read) but enforce ordering in other cases. The compiler must respect this in some way, possibly by suppressing optimization or by generating a compile error when the constraints appear to be violated.
You also get ugly layout issues like split fields, but that hits emulators and file formats too.
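Issue 1 can be sketched in Python as a simulation. `Bus16` is a hypothetical stand-in for a device that only accepts whole 16-bit accesses, and the nibble order within a word (lowest nibble first) is an assumption made for the example.

```python
class Bus16:
    """Hypothetical device memory that only supports whole 16-bit word accesses."""

    def __init__(self, n_words):
        self._words = [0] * n_words
        self.log = []  # records every bus access so it can be inspected

    def read16(self, index):
        self.log.append(("r", index))
        return self._words[index]

    def write16(self, index, value):
        assert 0 <= value <= 0xFFFF
        self.log.append(("w", index))
        self._words[index] = value


def read_nibble(bus, i):
    """Read the i-th 4-bit value using one 16-bit read."""
    word = bus.read16(i // 4)  # four 4-bit values per 16-bit word
    return (word >> (4 * (i % 4))) & 0xF


def write_nibble(bus, i, value):
    """Write the i-th 4-bit value via a read-modify-write of the whole word."""
    assert 0 <= value <= 0xF
    shift = 4 * (i % 4)
    word = bus.read16(i // 4)
    word = (word & ~(0xF << shift) & 0xFFFF) | (value << shift)
    bus.write16(i // 4, word)
```

Note that the read-modify-write in `write_nibble` is exactly what goes wrong under issue 2: if reading the word has side effects, updating a single field is no longer harmless.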
Erlang's bit syntax is indeed brilliant, but it's a bit different from C's bit fields: bit fields are both mapping and storage, while Erlang's bit syntax is only mapping (parsing and serialisation). E.g. your bit-wide flag is going to expand to a full integer (this is Erlang, so infinite precision & heap allocated) at runtime, which may or may not be desirable.
Furthermore, IIRC an Erlang pattern can be either matching or creation; you can't "abstract" a pattern to do both, so if you need to both parse and generate, your binary thing will need to be written twice (at least).
This may or may not be an issue, but at the end of the day it's not quite equivalent to C's bitfields. It's very convenient though.
Could you give an example? E.g. if you have a 3-bit signed integer V followed by two 1-bit flags (F1 and F2), followed by a 3-bit unsigned integer L, followed by binary data D of length L bytes, followed by a little-endian signed integer I over two bytes, how do you parse it?
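For reference, here is what parsing that layout by hand could look like in Python. It is a sketch under one assumption not stated in the question: fields are packed most-significant-bit first within the header byte. The function name `parse` and the returned tuple are made up for the example.

```python
import struct


def parse(buf):
    """Parse V:3 (signed), F1:1, F2:1, L:3 (unsigned), D:L bytes, I:16 (signed, little-endian)."""
    head = buf[0]
    v = head >> 5                 # top 3 bits
    if v >= 4:                    # sign-extend the 3-bit two's-complement value
        v -= 8
    f1 = bool((head >> 4) & 1)
    f2 = bool((head >> 3) & 1)
    length = head & 0x7           # low 3 bits: payload length in bytes
    d = buf[1:1 + length]
    if len(d) != length:
        raise ValueError("buffer too short for payload")
    (i,) = struct.unpack_from("<h", buf, 1 + length)
    return v, f1, f2, length, d, i


sample = bytes([0b10110010]) + b"hi" + struct.pack("<h", -2)
print(parse(sample))  # (-3, True, False, 2, b'hi', -2)
```

The manual shifts, masks and sign extension are the part that a declarative bit syntax would take over.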
I went to look this up... I think actually the features I describe are more features of Wireshark which provides a Lua API, which is what confused me.
That Erlang syntax reminds me of Python's 'struct' [0], except it's generating the unpacking logic at compile time, am I right? Yes, that is indeed very cool. It doesn't even seem that hard to do; I wonder why other languages don't have a similar facility!
> That Erlang syntax reminds me of Python's 'struct' [0], except it's generating the unpacking logic at compile time, am I right?
Kinda, however there are a few superior items to the bit syntax:
* In Erlang, the bindings are in the expression itself (whether packing or unpacking), which makes the correspondence easier to see.
* Because the bindings are embedded and set left to right, it becomes possible to use one binding as a parameter to the next pattern, e.g. `L:3, D:L/binary` will first parse 3 bits as an unsigned integer L, then extract the next L bytes as a binary substring.
* Also, Erlang's bit syntax can do sub-byte matching; I don't think Python's struct can.
* And Erlang's bit syntax is much clearer and more orthogonal (modulo knowing the convenient defaults), e.g. where extracting a little-endian 32-bit signed integer with struct is "<i", with Erlang it's :32/integer-signed-little (32 units, integer, signed, little-endian; the default unit size for integers is the bit). More verbose, but also completely explicit and more flexible, and very easy to change.
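On the Python side of that comparison, a short sketch using only the standard library: the terse `struct` format code, a more explicit spelling with `int.from_bytes`, and the manual masking needed for sub-byte fields. The sample bytes and variable names are made up for the illustration.

```python
import struct

raw = bytes([0xFE, 0xFF, 0xFF, 0xFF])

# Terse: "<" is little-endian, "i" is a 4-byte signed integer.
(terse,) = struct.unpack("<i", raw)

# More explicit spelling of the same decode.
explicit = int.from_bytes(raw, byteorder="little", signed=True)

print(terse, explicit)  # -2 -2

# struct's smallest unit is a whole byte, so sub-byte fields
# have to be split out by hand with shifts and masks.
flags_byte = 0b10100110
high3 = flags_byte >> 5    # top 3 bits
low5 = flags_byte & 0x1F   # bottom 5 bits
print(high3, low5)  # 5 6
```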
That is wonderful ... I’ve been looking for something like this for years. Don’t suppose you know off the top of your head whether there are any Erlang compilers for Java, and if they’re any good?
I wrote ocaml-bitstring (based on Erlang) and it also works by generating the code to handle the bits at compile time. It's fully general too, e.g. if you describe a structure of 1 bit followed by an array of bytes then it will add the necessary bit shifts automatically when reading the bytes. Or if you describe a struct where everything is aligned then at compile time it will generate the most efficient code. You can even have variable length bitfields (a huge headache when implementing, but it does deliver the best possible code based on what it can prove at compile time).
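To show what "1 bit followed by an array of bytes" involves, here is a runtime sketch in Python. It is not how ocaml-bitstring is implemented (that library generates the shifts at compile time); `read_bits` and `parse_flag_then_bytes` are made-up helpers, and bits are assumed to be numbered most-significant first.

```python
def read_bits(buf, bit_offset, n_bits):
    """Return n_bits starting at bit_offset (MSB-first numbering) as an unsigned int."""
    end = bit_offset + n_bits
    if end > len(buf) * 8:
        raise ValueError("read past end of buffer")
    whole = int.from_bytes(buf, "big")
    shift = len(buf) * 8 - end  # number of bits to the right of the field
    return (whole >> shift) & ((1 << n_bits) - 1)


def parse_flag_then_bytes(buf, n_bytes):
    """Parse a 1-bit flag followed by n_bytes of payload that is not byte-aligned."""
    flag = read_bits(buf, 0, 1)
    payload = read_bits(buf, 1, 8 * n_bytes).to_bytes(n_bytes, "big")
    return bool(flag), payload


# Flag bit 1, then the bytes b"AB", then 7 bits of padding.
print(parse_flag_then_bytes(bytes([0xA0, 0xA1, 0x00]), 2))  # (True, b'AB')
```

Every payload byte straddles two input bytes here, which is the shifting a compile-time generator would emit for you.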
Hm. I like the approach except for the overloaded meaning of packed. In C, packing means byte-aligned packing, rather than bit-packing. These seem like separate concepts. (Both are useful! I just think bit-packing should get its own special name.)
In my experience that's referred to as padding, not packing. A packed struct is what you get when you remove all of the padding. Packing is therefore removing padding, not adding it.
I'd say "packing" is ambiguous and could refer to either bit-aligned or byte-aligned data. But if your fields are less than 8 bits wide, I would assume "packing" means bit-packing.
I can't remember what programming language it was, but I recall one that had managed to eliminate primitive numeric and bitwise logical operations by treating them as arrays of bits (booleans).
Also, this needs a (2017) tag.