
I just hope it's not too late to make the ABI represent Boolean false with ~0.

We won't get another chance to do this right for a long, long time.




This would utterly break the expectations of anyone coming from another platform, and a substantial amount of existing C code. (Whether that C code was allowed to make that assumption or not, it did.)

Leaving aside whether the behavior you propose qualifies as "right": no, I don't think there is any chance to change the encoding of "bool", any more than you can redefine the size of "char" or the value of CHAR_BIT. The world uses 8-bit bytes, the world uses two's complement (C recently wrote that into the standard), and the world uses 0 for false and 1 for true.


> This would utterly break the expectations of anyone coming from another platform,

Which platform?

I come from x86, ppc, mips, and arm, and on all these platforms a "true" bool is sometimes ~0 and sometimes 1. For example, all SIMD operations on all these ISAs use ~0 for true, and I've fixed hundreds of bugs where people expected true to be represented by 1.

Even RISC-V Vector ISA uses `~0` for true. So if anything, what would be completely retarded is for a new ISA to sometimes use 1 for true, and sometimes use ~0. That's super confusing, and people get hit by it all the time.

Beyond the confusion of using different values for different operations, even for simple scalar code, it means that you can't just "logical and" with true to mask all bits, which is a super useful thing to do (so useful, that this is why all SIMD ISAs use ~0, and why 1 for true wouldn't be an option for SIMD).
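
To make the masking point concrete, here is a minimal C sketch. The macro and function names are made up for illustration and are not taken from any ISA manual: ANDing a value with an all-ones "true" keeps every bit, while ANDing it with a "true" of 1 keeps only bit 0.

```c
#include <stdint.h>

/* Hypothetical names for the two conventions discussed in this thread. */
#define TRUE_ALL_ONES UINT32_MAX   /* true == ~0 */
#define TRUE_ONE      UINT32_C(1)  /* true == 1  */

/* "Logical and" of a value with a truth value, done as a bitwise AND. */
static inline uint32_t and_with(uint32_t x, uint32_t truth) {
    return x & truth;
}
```

With `TRUE_ALL_ONES` the result is `x` itself. With `TRUE_ONE` it is only the lowest bit of `x`, so an extra conversion step is needed before the value can be used as a mask.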

> (Whether that C code was allowed to make that assumption or not, it did.)

Citation needed. In C, using an integer in a logical operation returns false if the integer is 0, and true otherwise. Avoiding this is _super hard_, so I doubt your claim that C code is relying on this. Also using 0 or 1 when converting to bool is handled by the compiler, and required to convert to the false and true representations of bool, where multiple true representations are allowed.
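
A small C sketch of the rules being referred to, with hypothetical function names: any non-zero integer is true in a condition, conversion to `bool` is normalized by the compiler, and the logical operators themselves yield exactly 0 or 1.

```c
#include <stdbool.h>

/* Any non-zero integer counts as true when used as a condition. */
static int is_truthy(int x) {
    return x ? 1 : 0;
}

/* Conversion to bool is value-based: non-zero becomes true, 0 becomes false.
   This differs from a narrowing integer cast, which just drops high bits. */
static bool to_bool(int x) {
    return (bool)x;
}

/* The result of a logical operator is an int that is exactly 0 or 1. */
static int double_not(int x) {
    return !!x;
}
```

Note that `(bool)0x100` is true even though `(unsigned char)0x100` is 0, which is why the conversion has to be done by the compiler rather than by truncation.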

If you are serializing or deserializing raw bools, most code doing this serializes 1 bit per bool. Code that serializes 4 bytes mostly uses integer types, and when reading those back, testing against 0 is the most common thing to do. Code doing this needs to deal with endianness and whatnot anyway.


There is a concrete advantage to using a scheme where 0 is true and all other values are false: error codes. True means the operation succeeded correctly, or false because of insufficient permissions, false because the file doesn't exist, false because of an I/O error, false because of a timeout, etc. bash uses this scheme.

However, this ship has sailed for general purpose programming languages. 0 is false, all other values are interpreted as true, operations that create a bool create 1. As you say, that's just how the world works.

False = ~0 is just zany though.


> There is a concrete advantage to using a scheme where 0 is true and all other values are false: error codes. True means the operation succeeded correctly, or false because of insufficient permissions, false because the file doesn't exist, false because of an I/O error, false because of a timeout, etc. bash uses this scheme.

That works perfectly fine if you write the type as an integer type and define 0 as "no error". You can't call that a C "bool" though.


> You can't call that a C "bool" though.

In hardware, C _Bool is just a scalar integer type (1 byte wide almost everywhere these days).

If you define 0 as false and everything else as true, you can emit much better machine code for all scalar comparisons. For example, when doing scalar == 0 or scalar != 0 (e.g. in null pointer checks), the scalar itself already encodes the result. The test is a nop, and with a "branch on non-zero" instruction, you can just directly branch.

If you define true to be one specific value, you need to actually test whether the scalar is zero, and give it that value otherwise. That goes from zero instructions to often two instructions (e.g. if the hardware comparison returns 0xffff and you need to convert that to 1).


> You can't call that a C "bool" though.

Correct. It was not my intent to imply otherwise. As the person I replied to said, and as I reiterated, that ship has sailed.


Just change your interpretation of 0 for a return value. It does not mean success, it means there was no error. Let's not confuse error codes (enums) with booleans, because they aren't.


Perhaps the type you want would be called “status” instead of bool. Zero meaning success, a few bits encoding severity, and the rest available to identify specifically the error. You would then have a macro or inline called “success” to convert it to a Boolean for flow control.

Welcome to VMS!
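
A rough C sketch of such a "status" type follows. The bit layout (low 3 bits for severity, the remaining bits for an error identifier) and every name here are assumptions for illustration only; this is not the actual VMS condition-value format.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t status_t;            /* hypothetical status type */

#define STATUS_OK      ((status_t)0)  /* zero means success */
#define SEVERITY_BITS  3u             /* assumed width of the severity field */
#define SEVERITY_MASK  ((1u << SEVERITY_BITS) - 1u)

/* Pack a severity and an error identifier into one status word. */
static inline status_t make_status(uint32_t severity, uint32_t error_id) {
    return (error_id << SEVERITY_BITS) | (severity & SEVERITY_MASK);
}

/* Convert a status to a Boolean for flow control. */
static inline bool success(status_t s) { return s == STATUS_OK; }

static inline uint32_t severity_of(status_t s) { return s & SEVERITY_MASK; }
static inline uint32_t error_id_of(status_t s) { return s >> SEVERITY_BITS; }
```

Callers would then write `if (!success(s)) { ... }` and inspect `severity_of(s)` or `error_id_of(s)` only on the failure path.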


From a purely pragmatic point of view, this is probably a bad idea. My take is that new architectures/ABIs get a certain "budget" for doing "weird" things. Every choice you take that isn't the same as existing mainstream architectures (and notably everything that's not the same as x86) is extra work to get the software ecosystem to support your new architecture. So you really want to pick and choose what you're spending your weirdness budget on, because if you blow your budget then the result is that too much software will fail to support your new architecture in a timely way. Another example of this is that there's an argument that an ascending stack would be better than a descending stack -- but almost everybody (HPPA being the only exception I can think of offhand) has a descending stack, so you have to really really believe in the merits of ascending stacks to pick that over "just do what the rest of the world does".


Presumably you mean that false should be represented with 0 and true with ~0? And the motivation is maybe to be able to toggle boolean values with one instruction?

Has there been any discussion on this matter on risc-v mailing lists or somewhere? If the RISC-V Base ISA is now ratified, I think it is too late to change such things.

Edit: seems to be too late. According to [0], SLTx instructions "set the destination register to one or zero depending on whether the relation is true or not."

[0] https://www.imperialviolet.org/2016/12/31/riscv.html


> And the motivation is maybe to be able to toggle boolean values with one instruction?

You can already do that: use XORI with immediate 1 to toggle.
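
In C terms, and with made-up function names, the toggle looks like the sketch below. A compiler targeting RISC-V would typically be able to lower the first function to a single `xori`, though that depends on the compiler and flags.

```c
/* Toggle a truth value stored as 0 or 1: flips only bit 0. */
static inline unsigned toggle01(unsigned b) {
    return b ^ 1u;
}

/* For comparison: with true stored as all ones, toggling is a bitwise NOT,
   which is the same as XOR with -1. */
static inline unsigned toggle_all_ones(unsigned m) {
    return ~m;
}
```

Either way the toggle is one XOR; only the immediate differs.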


Fun fact: RISC-V does not even have a machine not instruction. In assembly it is just an alias for xori with an immediate of -1.


Fun fact, neither does AArch64, it uses ORN with xzr instead.


Yes, I was hasty. Boolean true should be ~0.


Converting true=1 to true=~0 is trivial: negate the value, or subtract from zero.
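
A minimal C sketch of that conversion, in both directions. The function names are hypothetical; the arithmetic relies on unsigned wraparound, which C defines.

```c
#include <stdint.h>

/* 1 -> 0xFFFFFFFF, 0 -> 0: subtract from zero (unsigned arithmetic wraps). */
static inline uint32_t to_all_ones(uint32_t b01) {
    return (uint32_t)0 - b01;
}

/* Going back the other way: keep only the lowest bit. */
static inline uint32_t to_zero_one(uint32_t mask) {
    return mask & UINT32_C(1);
}
```

This assumes the input to `to_all_ones` is already exactly 0 or 1; any other value would not produce a clean mask.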


Unfortunately, we lost the chance decades ago when C became popular.


Forth uses ~0. I know this has no weight to speak of, I'm just saying there are a few pockets of sanity in the world.


Not necessarily. The UNIX C syscall APIs all return 0 on success, and anything non-zero is generally considered failure (usually -1). Even if it's wrong in C, it could possibly be more efficient in assembly to change the truthiness assumption around.


Hack Clang to emit ~0 for true, recompile your kernel and userspace, and see how much of it breaks. I'm almost certain the system won't boot.


> Hack Clang to emit ~0 for true, recompile your kernel and userspace, and see how much of it breaks.

The ABI of your platform might require special values for true and false, e.g., the SysV64 ABI explicitly requires 0 for false, and 1 for true.

So you would need to define a new platform for doing these tests, and then to test some code, you would need to port it to this new platform. AFAICT, if you port that code correctly, everything would work, and if something doesn't work, then you didn't port that code correctly.

So this experiment feels moot.

> I'm almost certain the system won't boot.

Clang can't compile the Linux kernel, so I hope you mean some other kernel. Otherwise, without a kernel, the system won't even boot :P


Look at doom's source code for an idea of how often a bool is not just 0 or 1. I doubt doom is unique.

Spoiler: doom uses 3 values for bools: 0 for false, 1 for true, and -1 for not sure/unknown/error/something else


why would you possibly want that? and in what size? byte? long?


I think it's helpful for SIMD operations, in some way that I can't quite remember.


The reasoning there would be that you could use bitwise operations to mask inputs from two vectors into one (think vectorized conditional move), which effectively allows you to easily achieve performant branch predication.

However, you can trivially achieve this the normal way by including simple instructions that allow you to select or combine vector inputs. If that's not a path they want to go down, it's also doable to take a slight performance hit and achieve the same with the use of multiplication instructions.
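
Both approaches can be sketched in scalar C, applied element by element the way a vector unit would. All names are hypothetical, and real SIMD code would use the target's intrinsics or select instructions instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Blend using an all-ones / all-zeros mask: picks a where mask is ~0, else b. */
static inline uint32_t select_mask(uint32_t mask, uint32_t a, uint32_t b) {
    return (a & mask) | (b & ~mask);
}

/* Same selection from a 0/1 condition, paid for with multiplications. */
static inline uint32_t select_mul(uint32_t cond01, uint32_t a, uint32_t b) {
    return cond01 * a + (1u - cond01) * b;
}

/* Element-wise conditional move over arrays, with no branch per element. */
static void blend(const uint32_t *mask, const uint32_t *a,
                  const uint32_t *b, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = select_mask(mask[i], a[i], b[i]);
    }
}
```

The mask version needs only AND, OR and NOT, while the 0/1 version needs two multiplies and an add per element, which is the "slight performance hit" mentioned above.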


+10000000 you deserve to be on the front.



