I think it's funny. C was originally invented in an era when machines didn't have a standard integer size - 36-bit architectures were in their heyday - so C's integers (char, short, int, and long) only have a guaranteed minimum size that can be taken for granted, but nothing else, in order to achieve portability. But after the computers of the world converged on multiple-of-8-bit integers, the inability to specify a particular integer size became an issue. As a result, in modern C programming the standard practice is to use uint8_t, uint16_t, uint32_t, etc., defined in <stdint.h>; C's inherent support for different integer sizes is basically abandoned - no one needs it anymore, and it only creates confusion in practice, especially in the bitmasking and bitshifting world of low-level programming. Now, if N-bit integers are introduced to C, it's kind of a negation-of-the-negation and we complete a full cycle - the ability to work on non-multiple-of-8-bit integers will come back (although the original integer-size independence and portability will not).
While contemporary implementations are most commonly tailored to use (UNSIGNED-BYTE 8), (UNSIGNED-BYTE 16), (UNSIGNED-BYTE 32), and (UNSIGNED-BYTE 64) along with their signed counterparts, our language allows one to freely specify and use integer types such as (UNSIGNED-BYTE 53) that could - in theory - be optimized for on architectures with word sizes that are unusual by today's standards.
This also comes from the fact that Common Lisp was specified at a time when there were no real standardized word sizes, so the standard had to accommodate different machine types on which a byte could mean different and mutually exclusive things.
There's something fundamentally different and mistaken about C's original implementation of variable integer sizes though.
People often describe C as "portable assembly", but despite this, integer sizes varying on different platforms results in non-portability of anything those programs _produce_. That is, a "file", or bit stream (not byte stream!) produced by one machine may be incompatible with another. The original integer-size independence is decidedly not portable.
That was probably less of a problem when it was rare to send data from one physical machine to another machine, let alone one of another type. But now the world is inter-net-worked and we have all sorts of machines talking to each other all the time.
Making the interfaces explicit reduces errors. These days we even have virtual machines and programs running at different bit widths on the same machine, and emulated machines running different ISAs on the same machine!
I'm also part of what I'm sure is a small number of users who believe that using "usize" in Rust should be a lint error requiring a manual override, and who think endianness should be explicit too. Heck, it should be a compiler error to write a struct to a socket if it contains any non-explicit values!
Pascal has it too. I always found it quite natural to specify the range I want in the data type and not as a precondition in the function. Another big advantage is that you can avoid a good deal of potential off-by-one errors if you define your data types appropriately. For example, the following definition in Pascal would be much less error prone than the corresponding definition in C[1]:
var
  weekday: 0..6;
  monthday: 1..31;
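For comparison, a rough sketch of what the C side typically looks like (hypothetical typedef names; the range only lives in comments and manual checks, not in the type itself):

#include <assert.h>

typedef unsigned char weekday_t;   /* intended range: 0..6  */
typedef unsigned char monthday_t;  /* intended range: 1..31 */

void set_weekday(weekday_t *out, unsigned value) {
    assert(value <= 6);   /* the range check is manual, not enforced by the type */
    *out = (weekday_t)value;
}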
For a quarter of a century I have wondered why no one seems to miss that feature. I really hope we will get it in C one day. Even more so, I hope that the proposals for refinement types in Rust[2] will one day be resolved and implemented.
And GNAT (the GCC Ada front-end) will use a biased representation for range types when packing tightly:
with Ada.Text_IO; use Ada.Text_IO;
procedure T is
   type T1 is range 16..19;
   type T2 is range -7..0;
   type R is record
      A    : T1;
      B, C : T2;
   end record;
   for R use record
      A at 0 range 0 .. 1;
      B at 0 range 2 .. 4;
      C at 0 range 5 .. 7;
   end record;
   X : R := (17, -2, -3);
begin
   Put_Line (X'Size'Image); -- 8 bits
end T;
Yes, this is kind of funny. My understanding is that Thompson and Ritchie deliberately left out support for non-power-of-two word sizes, which other languages had at the time because CPU manufacturers were adding a couple of bits at a time to new models (get this year's 14-bit CPU, two more bits than last year's!).
To make a simpler, more elegant, more portable language, they decided to settle on power-of-two word lengths. This is similar to how Unix came about, leaving out the cruft and complexity of the over-engineered Multics.
24-bit floats ("FP24") were a thing in some ATI graphics cards (R300 & R420) 15+ years ago, at least. Some DSPs have also used (some still do?) a 24-bit word width.
These odd word widths are anything but common, though.
As you mention, the fundamental integer types have guaranteed minimum sizes (or, pedantically, a range that matches the following sizes if two's complement is used):
* >=8 bits: char (CHAR_BIT is exactly 8 in POSIX)
* >=16 bits: short and int
* >=32 bits: long
* >=64 bits: long long
The C99 typedefs like uint16_t have to be mapped internally to one of the underlying types. For sizes that have no matching underlying type, the implementation simply omits the typedefs.
However, don't forget the more flexible C99 typedefs int_leastN_t and int_fastN_t. Both give you a type of at least N bits, where the "fast" one chooses whichever type is most convenient for the processor and the "least" one picks whichever is smallest. (For instance, int_least16_t is probably short, and int_fast16_t is probably int.)
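A small sketch for illustration (the concrete underlying types are implementation choices, so the comments describe what's typical rather than guaranteed):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint_least16_t a = 40000;  /* smallest type with at least 16 bits; typically unsigned short */
    uint_fast16_t  b = 40000;  /* "fast" type with at least 16 bits; often unsigned int */
    printf("least16: %zu bytes, fast16: %zu bytes\n", sizeof a, sizeof b);
    return 0;
}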
Well, it's not lost in C itself. But in the practice of modern C programming it's often sacrificed in favor of integers of exact sizes (uintN_t), and many programs perform bitwise operations assuming an exact integer size. Per C99, these types are guaranteed to have exactly N bits across all implementations, but they are included only if the implementation supports them. So programs using them are standard C, but not 100% portable - there is no requirement in C to implement the exact-width integers.
Modifying most such programs shouldn't be difficult, though (there is uint_leastN_t); alternatively, C compilers could be modified to treat the extra bits as if they don't exist, to allow existing programs to work again.
This is the other way round though; code using "int" is a disaster for portability, because you don't know what you're actually getting. So people use uintN_t to get something specific which behaves the same way on different platforms - i.e. portable.
You can always #define or typedef uintN_t to a machine type. You can't re-typedef int. You can #define it, but people will hate you.
Even worse than int is long. So much old Unix code that was (incorrectly) using long to store pointer-sized values got broken on other 64-bit platforms.
That makes it completely impossible, while reading code, to tell what the size of int is even on your own system. And it worsens header file include order problems.
AFAIK the last machine in widespread production that handled multi-length integers was the PDP-10/20 and its clones which essentially died around 1984. I say "around" because though DEC canceled the 20 line, some clones remained (that was Cisco's original business plan, for example)
There was a series of Polish mainframes called Odra with 24-bit integers, which also essentially died out in the 80s, but some of them were still used until 2010 in some specialized railway station software, and there was a short series of faster replacement processors for them, called SKOK, made in the late 80s and early 90s.
Of course they weren't "widespread" for most meanings of that word :)
You can get that experience today with the C compiler for the EZ80 chip. The short is 16-bit, and the long is 32-bit, so logically the int is 24-bit. You can use it as int24_t if you prefer.
An EZ80 is used in some of the TI graphing calculators.
I'd love to know if there's any use for this beyond FPGAs. This just seems to be another case of porting the complexity of RTL design into C syntax so that they can claim they have an HLS product that compiles C to gates. It's not C to gates if you had to rewrite all your C to manually specify the bit widths of every single signal. I really wonder how far we can keep going before the naming police break into Intel headquarters and rename all their marketing material to "Low Level Synthesis".
Arbitrary bit-width integers are great for writing computer emulator code. There's a ton of odd-width counters and registers in microchips, and being able to map those directly to integer variables instead of having to do a "bit-mask-dance" after each operation would at least increase readability (and probably also add a bit of type safety).
(Zig also has arbitrary bit-width integers up to 128 bits, but other than that I haven't seen this outside of hardware description languages.)
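As a sketch of what this buys you, assume a hypothetical emulator with a 12-bit program counter; the masked version is what you write today, and the typed version relies on Clang's _ExtInt (as described in the article) making the 12-bit width part of the type:

#include <stdint.h>

#define PC_MASK 0x0FFFu
static uint16_t pc;                        /* 12-bit counter stored in 16 bits */

static void step_masked(uint16_t offset) {
    pc = (pc + offset) & PC_MASK;          /* the "bit-mask-dance" after each update */
}

static unsigned _ExtInt(12) pc12;          /* the width is part of the type */

static void step_typed(unsigned _ExtInt(12) offset) {
    pc12 = pc12 + offset;                  /* wraps at 12 bits, no explicit mask */
}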
Does the compiler guarantee constant time? If not, it's still useless for cryptography. If it does, then it becomes useless for regular work, because a plain bigint will kick its ass on performance, especially when doing division.
Tangentially related: C++ does have algorithmic complexity guarantees. For instance, std::unordered_map at() and operator[] are guaranteed to be average case complexity of O(1), and std::nth_element is guaranteed to be average case O(n).
The complexity requirements for container member functions are often more vague and implicit, but I believe those are also not expressed in terms of time either; it's some other implicitly implied operation. In [2], for example, this operation is the construction of a single object.
Another tangentially related part of C++ is <chrono>, but I do not think there are strict requirements there for precision. At most there is a monotonicity requirement for steady_clock[3].
Constant-time division is super slow for sure but it's not needed for cryptography.
Modulo is also similarly slow but rarely needed as well. For example in elliptic curve cryptography it's only needed at the very beginning of a computation when hashing to the elliptic curve or when producing a secret key from a random input key material.
In terms of speed, LLVM's iXYZ wide-integer code is usually faster for basic operations (modular addition, subtraction, multiplication); the big slowness is modular inversion, where constant-time inversion is about 20x slower than GMP inversion.
Plain bigints in general, including GMP, are bottlenecked by memory allocation, while the iXYZ types from LLVM are purely stack-based.
Regarding constant time, I know there have been petitions to provide a constant-time flag to LLVM, but no guarantees so far. Unfortunately you have to take an after-the-fact verification approach today, unless you drop down to assembly (see https://github.com/mratsim/constantine/wiki/Constant-time-ar... for a couple of constant-time verifiers that are available).
> or it's own JIT compiler that uses the MULX, ADCX and ADOX instructions.
Does LLVM's bigint implementation use those instructions? Last I tried LLVM can not handle the required data-dependencies on individual carry bits, making it impossible to use ADCX/ADOX in LLVM without inline assembly.
LLVM properly generates ADC code when you use __addcarry_u64 contrary to GCC[1] which generates really dumb code with setc/mov in and out of carry.
However __addcarryx_u64 which is supposed to generate ADCX and ADOX is not properly handled and only generates ADC.
I think you are mixing this up with a comment on the GCC mailing list about GCC not having an adequate representation of carry, much less a representation that can separate carry and overflow chains[2].
There's a nearly identical issue in LLVM [1]. So neither are currently able to handle ADCX/ADOX in their IR and can only support it through inline assembly.
I was hoping LLVM could maybe bypass this IR limitation for the iXYZ types and generate the optimal instruction sequence, but from your assembly it looks like that doesn't happen either.
> Does the compiler guarantee constant time? If not, it's still useless for cryptography.
Side-channel resistant algorithms are only required when you are handling sensitive data. This is often not the case when you are verifying signatures, verifying proofs or generating succinct proofs of computation without a zero-knowledge requirement.
I'm somewhat confused. Constant with respect to what, exactly? You can't have constant-time operations on arbitrary-bit-length integers, and once you fix the one parameter you have, I fail to see what 'constant' means.
Constant time operations in the context of cryptography means that the runtime of the operation is not dependent on the data being manipulated.
As an example, suppose you were checking a 256-bit number for equality. This would be unacceptable:
bool eq256(uint32_t a[8], uint32_t b[8]) {
    for (int i = 0; i < 8; ++i) {
        if (a[i] != b[i]) return false;
    }
    return true;
}
Why? Because depending on the data this function takes longer or shorter, and timing attacks might be used to figure out secret data. Instead you need an implementation like this:
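Something along these lines (a branch-free sketch):

bool eq256_ct(uint32_t a[8], uint32_t b[8]) {
    uint32_t diff = 0;
    for (int i = 0; i < 8; ++i) {
        diff |= a[i] ^ b[i];   /* accumulate differences without branching */
    }
    return diff == 0;
}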
This implementation always takes the same amount of time regardless of the contents of a and b.
It goes further than this as well: you're not supposed to take branches based on secret data, nor access memory locations based on secret data, as those can be recovered through the branch predictor and the cache respectively.
While downvoted at the time of my response, this answer is correct. The optimizing compiler makes no timing guarantees, so it's not required to use any of the constructs in your second answer at all. If you need guaranteed timing, you cannot use standard C (instead use assembly or compiler extensions).
A proper implementation uses the correct compiler intrinsics and hints to cause such operations to be constant-time. Timing problems are a fairly well-known problem in cryptography and constant-time implementations are often required for safe cryptography.
Oh right, that's what's being meant. I confused it with constant time as in O(1).
Wouldn't most implementations of fixed length integers have constant time though? It barely seems worth optimizing unless your integers vary massively in size, at which point using fixed size integers is clearly suboptimal.
Network protocols immediately came to mind for me. They love packed structs of weird bit sizes because anything that's not payload is overhead on every message for the rest of time. Fields can also be large, e.g. VLANs are 12 bits, VXLAN ids are 24 bits, MAC addresses are 48 bits, IPv6 addresses are 128 bits so it's not just limited to a couple of small sized bitflag style things.
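For instance, the obvious temptation is to lay a header out with plain bit-fields, something like this sketch (field widths from 802.1Q, names made up):

struct vlan_tag {
    unsigned pcp : 3;    /* priority code point */
    unsigned dei : 1;    /* drop eligible indicator */
    unsigned vid : 12;   /* 12-bit VLAN id */
};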
I'm pretty sure the exact layout is not guaranteed in this case, so while the above code may work in many cases it's not possible to represent all data structures like this, particularly if the bits are not byte or machine word aligned.
I don't think the standard even guarantees which bits in an integer the bitfields will be stored. That's important for network protocols.
I think the bigger issue in regards to networking is ordering, not the (lack of) packing. I don't think the ordering is guaranteed and probably depends on CPU architecture (e.g. big vs little endian), not necessarily compiler.
In practice, little-endian machines have the first bit field in the least-significant bits of the data type they fit into (usually but not always an int), and big-endian machines have the first bit field in the most-significant bits.
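A quick way to see this (a sketch; the output is deliberately ABI-specific, which is the whole point):

#include <stdint.h>
#include <stdio.h>

union probe {
    struct { unsigned first : 1; unsigned rest : 31; } bits;
    uint32_t raw;
};

int main(void) {
    union probe p = { .bits = { .first = 1, .rest = 0 } };
    /* Typical little-endian ABIs print 0x00000001 (first field in the LSBs);
       typical big-endian ABIs print 0x80000000 (first field in the MSBs). */
    printf("0x%08x\n", (unsigned)p.raw);
    return 0;
}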
"so it's not just limited to a couple of small sized bitflag style things."
The examples I gave are integers you may have to compare or manipulate, not just bitfields to copy.
I also disagree that this is aimed at large integers: the examples of the syntax in the article are commonly <16 bits, sometimes <8, while the examples I cited had non-power-of-2 use cases up to 48 bits.
You can already do this in C though, if EVER SO SLIGHTLY wasteful in non-practical terms.
If you need a 23-bit object you just structure it to be that. It's a couple of AND or SHIFT ops when accessing, but so what? Even for 100Gbit networking you aren't going to max out even a slightly appropriate CPU.
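Something like this sketch, for a 23-bit field packed into the low bits of a 32-bit word (hypothetical names):

#include <stdint.h>

#define FIELD23_MASK 0x7FFFFFu   /* low 23 bits */

static inline uint32_t get_field23(uint32_t word) {
    return word & FIELD23_MASK;                              /* one AND on read */
}

static inline uint32_t set_field23(uint32_t word, uint32_t value) {
    return (word & ~FIELD23_MASK) | (value & FIELD23_MASK);  /* mask and merge on write */
}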
The same argument could be made to get rid of all numeric types except the largest. That's likely how it would compile down on platforms without 23-bit types anyway, just handled automatically based on the target. I think the point of such a feature is to abstract you from thinking about whether the machine has a native 23-bit type, the same way you don't think about whether the machine has a native 64-bit type or a hardware float type today. Also, when you do this manually you're now responsible for tracking the actual type and such. Beyond that, you also want to do operations on these fields, not just store them and ignore them: does the IP match this policy? Have I learned this MAC? Is this TOS in an allowed range or does it need to be bleached? You're constantly pulling these out and putting them back in; the above work isn't a one-time thing.
100 Gigabit networking eats more CPU than you think, especially if you're actually looking at headers. It's an enormous portion of cloud CPU usage and a big reason networking is still driven by what's easy to put in an ASIC vs running the easier thing in software.
I was wondering that, and share your skepticism of autotranslation (it basically never works, and the only reason people like it is that the HDLs are stuck in the 80s).
But I think the "no automatic promotion or conversion" combined with "will error if combined with different width" could actually make extint(8) and extint(16) useful - it's a massive hint to autovectorisers and lets you generate the SIMD instructions for those widths.
Doubly so if they make sure never to write the words "undefined" where they mean "implementation-defined" for extint. At the moment normal arithmetic in C (x = x+1) is potentially undefined behaviour.
High level synthesis works perfectly fine, just not from C. HDLs have chugged along too, it's just the toolchains are ridiculously expensive and risky to change. That's why hardware tech stacks lag behind the state-of-the-art.
I share the skepticism of high level synthesis from C as being a bad motivation. The workflow is more like metaprogramming, and C is terrible at that.
Clash[1] is one HLS language which generates HDL from a large subset of Haskell. The language is Haskell, actually, but not every construct can be synthesized; e.g. recursive data types and FFI calls have no HDL equivalent. Clash hooks into the compiler (GHC) and generates HDL from the intermediate representation. You can also run the Haskell code directly to see the simulated output.
That is very cool. Alas, most of the other hardware engineers I work with can hardly write C, let alone Haskell, so I very much doubt I'll ever be able to use this at my job. But I like to see that there is innovation in this space and that people are trying new things.
It provides another way to represent individual register fields without using bitfields. And probably gives you stronger guarantees about what happens when the bitfield overflows.
It also provides a way to pass those values around without passing the whole register struct around.
There's also the obvious compression use case. Assuming the rest of your code is sufficiently robust, you can shave off all the excess bits from your data storage. There may be a performance penalty, but you won't have to deal with low level ops or alignment issues. Most real world big data will exceed 32 bits (i.e., the identifiers will exceed 32 bits), but is nowhere close to 64 bits. The benefit is more meaningful if your data now fits in a cache/fast memory whereas it didn't before.
> It's not C to gates if you had to rewrite all your C to manually specify the bit widths of every single signal
Ideally, for FPGA design you only have to use the special bit widths for the interface of a module. The implementation can be in normal, wider C types. The compiler can optimize these operations to smaller bit widths by realizing that the higher input bits are zero- or sign-extended and the higher output bits are unused. You can help the optimizer by giving some variables smaller bit widths, but there's no need to rewrite everything.
I implemented this once for a C-to-hardware compiler and it worked quite well. The compiler had a lot of built-in types, all signed and unsigned integers from 1 to 64 bits wide, named __int1..int64. See 'extended integer types' in the manual: http://valhalla.altium.com/Learning-Guides/GU0122%20C-to-Har...
Well, encryption research. RadioGatún (SHA-3’s direct predecessor), for example, allows the bit width to be any number between 1 and 64, so this will allow us to see how, say, 29-bit integers work with this algorithm.
Most cryptographic algorithms (notably RC5 and RC6, but also Rijndael/AES) can be extended into 128-bit word size variants, and having guaranteed support for 128-bit integers in C would be useful for seeing how these variants behave and for running programs to evaluate their security margin.
> if a Binary expression involves operands which are both _ExtInt, rather than promoting both operands to int the narrower operand will be promoted to match the size of the wider operand, and the result of the binary operation is the wider type.
Although it's static size, it could be a building block for bignum arithmetic. The last time I tried, compilers were pretty bad at optimizing generic code for bignum addition, although it's pretty easy to hand-optimize in assembly.
Generic code for getting the high 64-bit part of an unsigned multiplication of two 64-bit values. Can be useful for fixed-point math, for example.
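A small sketch of that, using the __uint128_t extension that Clang and GCC already expose (an _ExtInt version would look much the same, just with a 128-bit _ExtInt as the intermediate type):

#include <stdint.h>

/* High 64 bits of a 64x64 -> 128-bit multiplication. */
static inline uint64_t mulhi_u64(uint64_t a, uint64_t b) {
    return (uint64_t)(((__uint128_t)a * b) >> 64);
}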
Note that the spec[1] requires that this tops out at an implementation-defined size of integer, so you're likely not getting out of writing bignum code yourself (and even if implemented, the bignum operations may well be variable-time and thus unsuitable for any kind of cryptography). Making the size completely implementation-defined also sounds like it'll be unreliable in practice; I feel like making it at least match the bit size of int would be a worthwhile trade-off between predictability for the programmer aiming for portability and simplicity of implementation.
Both Clang and GCC already support a 128-bit integer type, so it's certainly possible that "implementation-defined" will end up being 128-bit or 256-bit for x64 targets on common compilers (provided MSVC plays along).
LLVM already supports completely arbitrary integer sizes up to 2^23-1 in its IR (https://llvm.org/docs/LangRef.html#integer-type), and I think it can “lower” any integer size to fit what the hardware actually supports. So if Clang doesn't add an artificial constraint on top, in theory you could use a one million-bit integer size if you wanted to?
Since I had to implement something like that in Rust for a base32 codec [1] a few years ago, I really like the idea. Although my main concern was ensuring that invariants are checked by the type system, which might not be as much of a concern in C with its implicit conversions?
Judging by the motivation section, the motivation is primarily FPGAs, which I guess is why they want to allow these sub-int-sized bit widths. You might come up with some custom C-programmable operator that is only 3 bits wide, where before you were presumably forced to use the smallest available power-of-2 word size, which would waste resources. So I think the idea is actually that this is for code which is not supposed to be portable at all, but rather hyper-optimised for custom devices.
I'm very much in support of this. One thing I like about Zig[1] is that integers are explicitly given sizes. I've been playing with it recently, but I'm waiting for a specific "TODO" in the compiler to be fixed.
> These tools take C or C++ code and produce a transistor layout to be used by the FPGA.
Hmm, I haven’t been following that but it seems that...
> The result is massively larger FPGA/HLS programs than the programmer needed
And there it is.
Really seems odd to me to try and force procedural C into non-linear execution of FPGA. Like it seems super odd, and when talking about changes to C to help that... I really don’t get it.
This isn't what C is for. What is the performance advantage over Verilog? How many people want n-bit ints in C when automatically handled structures work well for most people?
Maybe I’m just not seeing the bigger picture here and that example was just poor?
Not to mention that the first statement is simply false...
The final result is a bitstream that determines which LUTs (lookup tables) and BRAM (memory/block RAM) to use on the chip, and how they should be connected/routed.
The FPGA fabric itself is made of transistors, but your C/C++ (HLS) or HDL code is not directly controlling these transistors. This is what makes FPGAs so flexible relative to ASICs.
There is no performance advantage over Verilog. The reason why (some) people want C is because Verilog and VHDL are unquestionably terrible languages (they weren't even intended to be HDLs, that use case came later) which are damn near impossible to write complex systems in without spending most of your time writing bug-prone boilerplate. Every big name IC designer shop ends up wrapping them in layers of metaprogramming to make them palatable.
So, since those languages suck, people familiar with the procedural side of things end up asking for C. Which is an even bigger impedance mismatch than Verilog, but since you need a smarter backend to even begin to implement it, it can make life easier by that alone.
Personally, I prefer stuff like nMigen, which is basically Python metaprogramming a synthesis-oriented subset of Verilog constructs. Compiles down to Verilog behind the scenes.
"Verilog and VHDL are unquestionably terrible languages (they weren't even intended to be HDLs, that use case came later)"
They were intended to be HDLs (for simulation of hardware), but they were never intended to be automatically translated into gates/schematics (i.e., synthesized)
First of all, I suppose that it will be possible to make them unsigned (just like for standard types). Is this correct?
Also, what's the relationship between the standard types and the new _ExtInts? Is _ExtInt(16) equivalent to short, or are they considered distinct types requiring an explicit cast?
> In order to be consistent with the C Language, expressions that include a standard type will still follow integral promotion and conversion rules. All types smaller than int will be promoted, and the operation will then happen at the largest type. This can be surprising in the case where you add a short and an _ExtInt(15), where the result will be int. However, this ends up being the most consistent with the C language specification.
For instance, what if I choose to replace short by _ExtInt(16) in the above? What would be the promotion rule then?
Note that it was already possible to implement arbitrary-sized ints for sizes <= 64 by using bitfields (although it's possible that you could fall into UB territory in some situations; I've never used that to do modular arithmetic).
Edit: Ah, there's this notion of underlying type: one may use the nearest wider type to implement a given size, but nothing prevents using a larger type, for instance:
struct short3_s {
    short value:3;
};

struct longlong3_s {
    long long value:3;
};
I don't know what the C standard says about that, but clearly these two types are not identical (sizeof will probably give different results). What will it be for _ExtInt? How will these types be converted?
Another idea: what about
struct extint13_3_s {
    _ExtInt(13) value:3;
};
Will the above be possible? In other words, will it be possible to combine bitfields with this new feature?
I guess it's a much more complicated problem than it appears to be at first.
Much of my time is spent writing Mentor Catapult HLS for ASIC designs these days.
Every HLS vendor or language has their own, incompatible arbitrary bitwidth integer type at present. SystemC sc_int is different from Xilinx Vivado ap_int is different from Mentor Catapult ac_int is different from whatever Intel had for their Altera FPGAs. It's a real mess.
I'm hoping this is another small step to slowly move the industry into a more unified representation, or at least if LLVM support for this at the type level could enable faster simulation of designs on CPU by improving the CPU code that is emitted. What probably matters most for HLS though are the operations which are performed on the types (static or dynamic bit slicing, etc).
I'm in the same boat. After having played with all the other vendor libraries, I think I like ac_datatypes the most. It's been really fast and the Catapult is a pretty good engine. Can I ask what industry you're in? I'm in telecom.
> While the spelling is undecided, we intend something like: 1234X would result in an integer literal with the value 1234 represented in an _ExtInt(11), which is the smallest type capable of storing this value.
That “smallest type capable of storing this value” is a disappointing approach, IMHO. It’d be a lot more powerful to just be able to pass in bit patterns (base-2 literals) and have the resulting type match the lexical width of the literal. 0b0010X should have a bit-width of 4, not 2.
"smallest type" doesn't go far enough for HDL languages - consider a 4-bit counter in verilog
reg [3:0] r;
r = 4'hf;
r = r + 1;
if (r == 0) ....
If r is really 8 bits, r+1 will have a non-zero value ....
However all the LLVM people may be saying here is that they're providing the minimal support for arbitrary size math and expect language implementers to generate the masking where required (ie that r+1 above is really (r+1)&4'hf )
I'll note that for Verilog in particular the standard Verilog C-level APIs for accessing data imply that integers are not stored contiguously, instead they're stored in 32-bit chunks with a min size of 32-bits for 1 to 32-bit values - a 33-64 bit value will be stored in 2 non-contiguous 32-bit words ('packed' values are different from this). To be useful any back end support needs to be able to understand stuff stored this way.
I think your proposal for 0b0010X makes an excellent addition to the 1234X proposal. Has it already been discussed by the working group? If not, you should email someone to ask them to consider it!
Would 0X12 then be a 12-bit integer with the value zero or a hexadecimal `int` literal with a base-10 value of 18? Does this work for other bases (0X12X12)?
I'm not sure why they picked a letter which can already occur in integer literals rather than one of the many unused letters. Given the focus on FPGAs and HDL it's also worth noting that X is commonly used in binary or hexadecimal constants in HDLs to denote undefined or "don't care" values, which could lead to confusion. Rust integer literal syntax would be perfect here (1234u11 or 1234i11) since it already includes the bit width and is compatible with any base prefix.
At some point they need to branch off and not call it C anymore. C should stay relatively small -- small enough that a competent programmer could write a compiler and RTS for it.
My opinion is C and C++ need a divorce. So that C can be modernized with features that make sense in the context of the language. And not constrained as a broken subset of C++.
I think it's starting to happen because C++ has become so grossly Byzantine. C refuses to relinquish a bunch of niche applications. The heyday of OOP is past. 10 years ago the attitude was C is going to die any day now. Now it more like since C isn't dying it needs improvements. And none of them are backports from C++ nor make sense in C++.
C99 already isn't a subset of C++ (although I think it took until C++14 to finally give up on that), so any alignment is courtesy now, and it's recognized they'll deviate where necessary.
A lot of the world still runs on C99, and a lot of (toy or academic) compilers are written for C0 (a simple, small, safe C subset). Even when a new C version gains more features you can still develop against C99, C0, or whatever version you prefer.
The choice is often an illusion. You only get to control the version for code you write, and only if you are programming it in isolation. As soon as you work on a larger team project, which may also include third-party code, you no longer dictate what version of C is being used.
I think the most commonly used languages, with and without standards - C, C++, JavaScript/Wasm, Python, Java, etc. - should standardize new primitive type representations together (with hardware people included).
If you have different representations in different languages it just creates unnecessary impedance mismatch. It would be better for everyone if you could just pass these types from language to language.
C++ can have arbitrary-width integers as library types; it would not be that big of a deal IMHO. If `optional`, `variant` and `any` (and maybe soon, `bit`) are not in the language itself, no reason why n-bit-integer should be.
(Of course, this is written from the "we can jerry-rig the existing language to do what you want" perspective with which so much is achievable efficiently in C++.)
Boost Multiprecision [0] is an example of such a library type. It offers a compile-time arbitrarily wide integers (with predefined types up to 1024 bits) and a C++ wrapper around the GMP or MPIR libraries, which supports arbitrary sizes at runtime (not sure how it's implemented, but probably on top of an array of ints or BCD (binary-coded decimals)).
C++ has had `optional`, `variant`, and `any` since C++17. All of these types originated (for C++ standardization) in Boost. I'd caution against using `any`, though. From personal experience, the runtime overhead is quite high, and holding any non-none type is a dynamic allocation. Performance is far better with `variant`, at the development cost of needing to know all the types you're going to support at compile time.
So let's say we have a, b, and c. a is 16 bits; b and c are 14 bits.
a + (b + c) => ExtInt(17)
(a + b) + c => ExtInt(18)
Now obviously this is a trivial example, but it highlights the fact that unless you're actually willing to carry the true ranges around in your type system, your calculation of bit widths is going to vary depending on which operations are done in which order with which intermediary variables.
I see, it's now the same problem as with floats: addition is not commutative anymore. But isn't that the same problem in HW too? With the right order you save transistors.
Well, it's not quite a problem with whether the operations are commutative - both of those phrasings of the problem will give the correct answer in hardware 100% of the time. The only difference is one made an efficient decision about the order of operations with knowledge of how bit growth rules work and the ranges of the inputs.
You do have the same problem in hardware, and dealing with it is the hardware designer's job. The difference is that RTLs don't claim to be high-level languages. This is an instance where there's a high-level intention and a low-level implementation, and the high-level synthesis tool has just ported new language constructs into the high-level language to do low-level optimisation, rather than actually doing the synthesis optimisations that are expected.
I love Erlang for the ability to deal with _bits_. To see this in a compiled language would be wonderful. Of course, you can get down to the bit level with bitwise logical operations, but to be able to express it more naturally would be a great boon to people writing low-level network stuff, and will probably reduce programming errors.
Congrats Erich! One thing I'd be curious about is the ergonomics (or lack of) of explicit integer promotions and conversions for these types, as I find the current rules for implicit integer promotions a little confusing and hard to remember.
”Likewise, if a Binary expression involves operands which are both _ExtInt, rather than promoting both operands to int the narrower operand will be promoted to match the size of the wider operand, and the result of the binary operation is the wider type.“
I don’t understand that choice. The result should be of the wider type, yes, but, for example, multiplying a _ExtInt(1) by a _ExtInt(1000) should take less hardware than multiplying two ExtInt(1000)s. So, why promote the narrower one to the wider type?
A lot of people don't know this, but `BigInt`s are supported in modern JavaScript; integers of arbitrarily large precision.
Try in your browser console:
2n ** 4096n
// output (might have to scroll right)
1044388881413152506691752710716624382579964249047383780384233483283953907971557456848826811934997558340890106714439262837987573438185793607263236087851365277945956976543709998340361590134383718314428070011855946226376318839397712745672334684344586617496807908705803704071284048740118609114467977783598029006686938976881787785946905630190260940599579453432823469303026696443059025015972399867714215541693835559885291486318237914434496734087811872639496475100189041349008417061675093668333850551032972088269550769983616369411933015213796825837188091833656751221318492846368125550225998300412344784862595674492194617023806505913245610825731835380087608622102834270197698202313169017678006675195485079921636419370285375124784014907159135459982790513399611551794271106831134090584272884279791554849782954323534517065223269061394905987693002122963395687782878948440616007412945674919823050571642377154816321380631045902916136926708342856440730447899971901781465763473223850267253059899795996090799469201774624817718449867455659250178329070473119433165550807568221846571746373296884912819520317457002440926616910874148385078411929804522981857338977648103126085903001302413467189726673216491511131602920781738033436090243804708340403154190336n
To use, just add `n` after the number as literal notation, or can cast any Number x with BigInt(x). BigInts may only do operations with other BigInts, so make sure to cast any Numbers where applicable.
I know this is about C, I thought I'd just mention it, since many people seem to be unaware of this.
Not yet, but I believe babel and others just transpile/polyfill it by having it fall back on a string arithmetic library for working with integers of arbitrary precision.
That will never be totally reliable as 1) javascript is dynamically typed 2) javascript doesn't support operator overloading. Nonetheless, there are attempts.
> Update: Now it can convert a code using BigInt into a code using JSBI (https://github.com/GoogleChromeLabs/jsbi). It will try to detect when an operator is used for bigints, not numbers. This will not work in many cases, so please use JSBI directly only if you know, that the code works only with bigints.
So clearly, if llvm is used as a backend for js, this feature will come in handy.
On a side note, apparently it will also be useful for the Rust folks, who have user-implemented libraries to emulate C-like bitfields and implement bigints.
C++ has sped up the pace of its releases, but I don't have a sense of where C is. I didn't realize until I looked it up just now that there's a C18, although I gather that this is even smaller a change than C95 was.
Safe to say that a feature like this would be standardized by 2022 at the earliest?
I just found out about C18 thanks to your comment, I was still at C11. Thanks. Anyway I think you're right, except for the fact that I don't like when language designers release versions too quickly. I don't know the situation in the C++ land, but as an example I think that Java took the wrong way.
There's something to be said for the new Java approach of releasing often, with a stable LTS release every now and then, even if Oracle is muddying the waters with their licensing. The only release after 8 that interests me right now is 11. Meanwhile, the features of Java 12, 13, and 14 are available for people who do want to experiment with them.
I think we'll see this implicitly with C++. C++11 and the mostly non-controversial updates in C++14 comprise "modern" C++, whereas adoption of C++17 seems to be a bit slower.
Since C++11, the ISO committee has been aiming for a new standard release every 3 years. So far, they've kept this cadence up. I don't recall if C++20 is actually out yet, but I know the feature set was finalized last year, if not out yet, it's probably just due to editorial issues (I've not been using C++ for work the last few years, so my knowledge might be a bit dated).
> but as an example I think that Java took the wrong way.
I wonder what the right way is, then? Java is apparently too fast for you, and yet it gets improvements so slowly that its market share is getting eaten by other JVM languages moving much faster.
If it was even slower it could as well be put directly next to the dusty COBOL and RPG boxes in the IBM attic.
I think that language designers should release a new version when they actually have something new, not because somebody decided that there has to be a new version every x months. Also, I think that the popularity of other JVM languages, e.g. Kotlin, derives not from their quick releases but from the fact that they observed what programmers didn't like about Java and designed the language from the start keeping those problems in mind and addressing them right from the beginning.
So, if I have an array of extint(3), does it pack them nicely into 10-per-32-bit-word? Or 21-per-64-bit-word? Will a struct with six extint(5) fields fit into 4 bytes? What about just a few global variables of extint(1)? Will they get packed into a single byte? Did I miss where this is covered?
I quoted this language below: "_ExtInt types are bit-aligned to the next greatest power-of-2 up to 64 bits: the bit alignment A is min(64, next power-of-2(>=N)). The size of these types is the smallest multiple of the alignment greater than or equal to N. Formally, let M be the smallest integer such that A * M >= N. The size of these types for the purposes of layout and sizeof is the number of bits aligned to this calculated alignment, A * M. This permits the use of these types in allocated arrays using the common sizeof(Array)/sizeof(ElementType) pattern."
But to be honest I don't understand what it's trying to say. If bit width N = 3, the next power of 2 is 4, so would that mean that "bit alignment(?)" A = 4? Then M = 1 is the smallest integer such that A * M >= 3. Then the size of the type would be 4 bits? That wouldn't fly with sizeof.
The title is a little misleading, since _ExtInt is just an extension of Clang, not a standard. GCC and Clang both have some hidden features that are not in the standard.
It is on the standards track, though, even if N2472 was not completely accepted it seems like there is a process for this (or something very much like it) to become a standard.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf: "_ExtInt types are bit-aligned to the next greatest power-of-2 up to 64 bits: the bit alignment A is min(64, next power-of-2(>=N)). The size of these types is the smallest multiple of the alignment greater than or equal to N. Formally, let M be the smallest integer such that A * M >= N. The size of these types for the purposes of layout and sizeof is the number of bits aligned to this calculated alignment, A * M. This permits the use of these types in allocated arrays using the common sizeof(Array)/sizeof(ElementType) pattern."
The object size has to be at least the alignment size so that arrays work properly--&somearray[1] needs to be properly aligned, and that only works if the object size is a multiple of the alignment: sizeof myint >= _Alignof(myint) && (sizeof myint % _Alignof(myint)) == 0.
As the proposal says, the bit alignment of these types is min(64, next power-of-2(>=N)). (Of course, the alignment can't be smaller than 8 bits, which the proposal fails to account for.) Assuming CHAR_BIT==8, it follows that:
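* _ExtInt(3) occupies 1 byte (5 padding bits)
* _ExtInt(13) occupies 2 bytes (3 padding bits)
* _ExtInt(33) occupies 8 bytes (31 padding bits)
* _ExtInt(65) occupies 16 bytes, aligned to 8 (63 padding bits)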
So the amount of padding can be considerable. But that doesn't matter much. What they're trying to conserve is the number of value bits that need to be processed, and in particular minimize the number of logic gates required to process the value. Inside the FPGA presumably the value can be represented with exactly N bits, regardless of how many padding bits there are in external memory.
Where does the spec say that it does that? As far as I can tell C only allows objects to have sizes in whole number of bytes, and that includes booleans.
Although a _Bool can be used for a bit-field (with a size of 1 bit), you can't use sizeof on a bit-field.
A byte is CHAR_BIT bits, where CHAR_BIT is required to be at least 8 (and is exactly 8 for the vast majority of implementations).
The word "byte" is commonly used to mean exactly 8 bits, but C and C++ don't define it that way. If you want to refer to exactly 8 bits without ambiguity, that's an "octet".
I think you worded this pretty well. One thing I'd add (and it annoys me about C & C++) is that the size guarantees for integral types boil down to sizeof(char) == 1 (a char being CHAR_BIT bits) and sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long). sizeof(T*) (for any T) is not well defined either, and can be OS/compiler specific. That makes cross-platform 32/64-bit support painful, especially because there were no fixed-width integer types before C99 & C++11. And even though those standards define types like int32_t and int64_t, the exact-width types are optional; only the "least"/"fast" variants are required, and those merely have to be large enough to hold that many bits. So, on a hypothetical 40-bit CPU, int32_t might not exist at all, and int_least32_t could very well be 40 bits, if that's the natural "word" size for the CPU.
The devil is always in the details, and the devil is very, very annoying...
Absolutely a true statement, but it should also be noted that WG 14 tends to be more accepting of proposals that have working extension(s) in a major compiler.
Which vendor fully implemented Annex K? For several years after C11 was published no vendor fully implemented Annex K, not even the sponsor, Microsoft. I haven't checked in awhile so maybe things have changed.