
Optimizing compilers that don't allow disabling all optimizations make it impossible to write secure code with them. You have to do it in assembly instead.



Disabling all optimizations isn't even enough; fundamentally, what you need is a much narrower specification of how the source language maps to its output. Even -O0 doesn't give you that, and in fact it is often counterproductive (e.g. you'll get branches in places where the optimizer would have removed them).

The problem with this is that no general purpose compiler wants to tie its own hands behind its back in this way, for the benefit of one narrow use case. It's not just that it would cost performance for everyone else, but also that it requires a totally different approach to specification and backwards compatibility, not to mention deep changes to compiler architecture.

You almost may as well just design a new language, at that point.


> You almost may as well just design a new language, at that point.

Forget “almost”.

Go compile this C code:

    #include <stdlib.h>

    void foo(int *ptr)
    {
        free(ptr);
        *ptr = 42;  /* write through a freed pointer */
    }
This is UB. And it has nothing whatsoever to do with optimizations — any sensible translation to machine code is a use-after-free, and an attacker can probably find a way to exploit that machine code to run arbitrary code and format your disk.

If you don’t like this, use a language without UB.

But djb wants something different, I think: a way to tell the compiler not to introduce timing dependencies on certain values. This is a nice idea, but it needs hardware support! Your CPU may well implement ALU instructions with data-dependent timing. Intel, for example, reserves the right to do this unless you set an MSR to tell it not to. And you cannot set that MSR from user code, so what exactly is a compiler supposed to do?

https://www.intel.com/content/www/us/en/developer/articles/t...


It isn't just UB to dereference `ptr` after `free(ptr)` – it is UB to do anything with its value whatsoever. For example, this is UB:

    #include <assert.h>
    #include <stdlib.h>

    void foo(int *ptr)
    {
        assert(ptr != NULL);
        free(ptr);
        assert(ptr != NULL);  /* UB: uses ptr's now-indeterminate value */
    }
Why is that? Well, I think because the C standard authors wanted to support the language being used on platforms with "fat pointers", in which a pointer is not just a memory address, but some kind of complex structure incorporating flags and capabilities (e.g. IBM System/38 and AS/400; Burroughs Large Systems; Intel iAPX 432, BiiN and i960 extended architecture; CHERI and ARM Morello). And, on such a system, they wanted to permit implementors to make `free()` a "pass-by-reference" function, so it would actually modify the value of its argument. (C natively doesn't have pass-by-reference, unlike C++, but there is nothing stopping a compiler adding it as an extension, then using it to implement `free()`.)

See this discussion of the topic from 8 years back: https://news.ycombinator.com/item?id=11235385

> And you cannot set that MSR from user code, so what exactly is a compiler supposed to do?

Set a flag in the executable which requires that MSR to be enabled. Then the OS will set the MSR when it loads the executable, or refuse to load it if it won't.

Another option would be for the OS to expose a user space API to read that MSR. And then the compiler emits a check at the start of security-sensitive code to call that API and abort if the MSR doesn't have the required value. Or maybe even, the OS could let you turn the MSR on/off on a per-thread basis, and just set it during security-sensitive processing.

Obviously, all these approaches require cooperation from the OS vendor, but often the OS vendor and the compiler vendor are the same (e.g. Microsoft) – and even when that isn't true, compiler and kernel teams often work closely together.
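
A minimal sketch of that second option, with an invented API name (nothing like this exists today):

    #include <stdlib.h>

    /* Hypothetical OS call reporting whether data-operand-independent
       timing is enabled for the current thread. Invented for
       illustration; no OS exposes this today. */
    extern int os_doit_mode_enabled(void);

    void secure_section_begin(void)
    {
        if (!os_doit_mode_enabled())
            abort();  /* refuse to run timing-sensitive code without it */
    }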


> Set a flag in the executable which requires that MSR to be enabled. Then the OS will set the MSR when it loads the executable, or refuse to load it if it won't.

gcc did approximately this for decades with -ffast-math. It was an unmitigated disaster. No thanks. (For flavor, consider what -lssl would do. Or dlopen.)

> Another option would be for the OS to expose a user space API to read that MSR. And then the compiler emits a check at the start of security-sensitive code to call that API and abort if the MSR doesn't have the required value.

How does the compiler know where the sensitive code starts and ends? Maybe it knows that certain basic blocks are sensitive, but it’s a whole extra control-flow analysis to find the beginnings and ends.

And making this OS dependent means that compilers need to be more OS dependent for a feature that’s part of the ISA, not the OS. Ick.

> Or maybe even, the OS could let you turn the MSR on/off on a per-thread basis, and just set it during security-sensitive processing.


> How does the compiler know where the sensitive code starts and ends?

Put an attribute on the function. In C23, something like `[[no_data_dependent_timing]]` (or `__attribute__((no_data_dependent_timing))` using the pre-C23 GNU extension syntax).
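
As a sketch, annotating a function with that (hypothetical, not-yet-implemented) attribute might look like:

    #include <stddef.h>

    /* Hypothetical attribute; no mainstream compiler implements it. */
    [[no_data_dependent_timing]]
    int ct_memeq(const unsigned char *a, const unsigned char *b, size_t n);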

> And making this OS dependent means that compilers need to be more OS dependent for a feature that’s part of the ISA, not the OS. Ick.

There are lots of unused bits in RFLAGS, I don't know why Intel didn't use one of those, instead of an MSR. (The whole upper 32 bits of RFLAGS is unused – if Intel and AMD split it evenly between them, that would be 16 bits each.) Assuming the OS saves/restores the whole of RFLAGS on context switch, it wouldn't even need any change to the OS. CPUID could tell you whether this additional RFLAGS bit was supported or not. Maybe have an MSR which controls whether the feature is enabled or not, so the OS can turn it off if necessary. Maybe even default to having it off, so it isn't visible in CPUID until it is enabled by the OS via MSR – to cover the risk that maybe the OS context switching code can't handle a previously undefined bit in RFLAGS being non-zero.


I am not talking about UB at all. I am talking about the same constant-time stuff that djb's post is talking about.


Execution time is not considered Observable Behavior in the C standard. It's entirely outside the semantics of the language. It is Undefined Behavior, though not UB that necessarily invalidates the program's other semantics the way a use-after-free would.


This is pretty persnickety and I imagine you're aware of this, but free is a weak symbol on Linux, so user code can replace it at whim. Your foo cannot be statically determined to be UB.
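
For instance, under glibc, linking something like this into the executable interposes on libc's definition, after which the foo above never actually frees anything (a sketch; a real interposer would replace malloc and friends too):

    #include <unistd.h>

    /* Overrides libc's free for the whole process: every free()
       becomes a no-op (and a leak). */
    void free(void *ptr)
    {
        (void)ptr;
        static const char msg[] = "free ignored\n";
        write(2, msg, sizeof msg - 1);
    }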


Hmm, not sure; I think it would be possible to mark a function with a pragma as "constant time", and the compiler could verify that it indeed is. I also think it wouldn't be impossible to teach it to convert branched code into branchless code automatically in many cases. Essentially, the compiler pass must try to eliminate all branches, and the code generation must make sure to only use ops with data-independent timing. It could warn or fail when it cannot guarantee this.
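
For example, the kind of rewrite such a pass would have to perform, sketched by hand (today you write the second form yourself and hope the optimizer doesn't turn it back into a branch):

    #include <stdint.h>

    /* Branchy version: timing reveals which arm was taken. */
    uint32_t select_branchy(uint32_t cond, uint32_t a, uint32_t b)
    {
        return cond ? a : b;
    }

    /* Branchless version: same result, computed via a mask. */
    uint32_t select_ct(uint32_t cond, uint32_t a, uint32_t b)
    {
        uint32_t mask = (uint32_t)0 - (cond != 0);  /* all-ones iff cond is nonzero */
        return (a & mask) | (b & ~mask);
    }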


[[clang::optnone]]


"Optimizing compilers that don't allow disabling __all__ optimizations"


It’s not well-defined what counts as an optimization. For example, should every single source-level read of a memory location go all the way through the cache hierarchy down to main memory, instead of, say, keeping the value in a register? That would be awfully slow. But questions like that are one reason UB exists.
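
A sketch of the difference; almost nobody wants the second behavior for ordinary variables:

    /* The compiler may load *flag once and keep it in a register, so
       this loop can legally be compiled as if flag never changes
       (another thread writing it would be a data race, hence UB). */
    int spin(const int *flag)
    {
        int n = 0;
        while (*flag)
            n++;
        return n;
    }

    /* volatile forces every source-level read to actually happen. */
    int spin_volatile(const volatile int *flag)
    {
        int n = 0;
        while (*flag)
            n++;
        return n;
    }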


Or writing code that relies on inlining and/or tail-call optimization to run at all without exhausting the stack... We've got some code that doesn't run when compiled at -O0 because of that.
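
A minimal example of the phenomenon: most compilers turn this into a loop at -O2, while at -O0 every call gets a stack frame.

    /* Tail-recursive sum of 1..n. With tail-call optimization this
       runs in constant stack space; without it the recursion depth
       is n, so a large n (tens of millions, say) typically crashes. */
    long sum_to(long n, long acc)
    {
        if (n == 0)
            return acc;
        return sum_to(n - 1, acc + n);  /* tail call */
    }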


do these exist? who's using them?


If your "secure" code is not secure because of a compiler optimization it is fundamentally incorrect and broken.


There is a fundamental difference of priorities between the two worlds. For most general application code, any optimization is fine as long as the output is correct. In security-critical code, information leakage through execution time and on-chip resource usage matters, which essentially means you need to get away from data-dependent memory access patterns and control flow.
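
Getting away from data-dependent accesses usually means touching everything and selecting with arithmetic; the usual idiom for a table lookup, for instance, is something like:

    #include <stdint.h>
    #include <stddef.h>

    /* Constant-time table lookup: reads every entry regardless of the
       secret index, so the memory access pattern leaks nothing. */
    uint8_t ct_lookup(const uint8_t table[256], uint8_t secret_index)
    {
        uint8_t result = 0;
        for (size_t i = 0; i < 256; i++) {
            uint8_t diff = (uint8_t)i ^ secret_index;
            /* mask is 0xFF iff diff == 0, computed without a branch */
            uint8_t mask = (uint8_t)(((uint32_t)diff - 1u) >> 24);
            result |= table[i] & mask;
        }
        return result;
    }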


Then such code needs to be written in a language that actually makes the relevant timing guarantees. That language may be C with appropriate extensions but it certainly is not C with whining that compilers don't apply my special requirements to all code.


That argument would make more sense if such a language were widely available, but today in practice it isn't, so we live in the universe of less ideal solutions. It also doesn't really respond to DJB's point anyway: his case here is that the downstream labor cost of compiler churn exceeds the actual return in performance gains from new features, and that a change in policy could give security-related code a more predictable target without requiring a whole new language or toolchain. For what it's worth, I think the better solution will end up being something like constant-time function annotations (not stopping new compiler features), but I don't discount his view that, human nature aside, maybe we would be better off focusing compiler development on correctness and stability.


> his case here is that the downstream labor cost of compiler churn exceeds the actual return in performance gains from new features

Yes but his examples are about churn in code that makes assumptions that neither the language nor the compiler guarantees. It's not at all surprising that if your code depends on coincidental properties of your compiler that compiler upgrades might break it. You can't build your code on assumptions and then blame others when those assumptions turn out to be false. But then again, it's perhaps not too surprising that cryptographers would do this since their entire field depends on unproven assumptions.

A general policy change here makes no sense because most language users do not care about constant runtime and would rather have their programs always run as fast as possible.


I think this attitude is what is driving his complaints. Most engineering work exists in the context of towering teetering piles of legacy decisions, organizational cultures, partially specified problems, and uncertainty about the future. Put another way "the implementation is the spec" and "everything is a remodel" are better mental models than spec-lawyering. I agree that relying on say stability of the common set of compiler optimizations circa 2015 is a terrible solution but I'm not convinced it's the wrong one in the short term. Are we really getting enough perf out of the work to justify the complexity? I don't know. It's also completely infeasible given the incentives at play, complexity and bugs are mostly externalities that with some delay burden users and customers.

Personally I'm grateful the cryptographers do what they do, computers would be a lot less useful without their work.


The problem is that preventing timing attacks often means you have to implement something in constant time. And most language specifications and implementations don't give you any guarantee that particular operations happen in constant time and won't be optimized away.

So the only possible way to ensure things like string comparison don't have data-dependent timing is often to implement it in assembly, which is not great.

What we really need is intrinsics that are guaranteed to have the desired timing properties, and/or a way to disable optimization, or at least certain kinds of optimization, for a region of code.
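
For reference, the usual C idiom for a timing-safe comparison is below; the problem is precisely that nothing in the standard forbids a compiler from rewriting it back into an early-exit loop:

    #include <stddef.h>

    /* Accumulates all differences with OR instead of returning at the
       first mismatch, so the loop runs the same time either way. */
    int ct_memeq(const unsigned char *a, const unsigned char *b, size_t n)
    {
        unsigned char diff = 0;
        for (size_t i = 0; i < n; i++)
            diff |= a[i] ^ b[i];
        return diff == 0;
    }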


Intrinsics which do the right thing seems like so obviously the correct answer to me that I've always been confused about why the discussion is always about disabling optimizations. Even in the absence of compiler optimizations (which is not even an entirely meaningful concept), writing C code which you hope the compiler will decide to translate into the exact assembly you had in mind is just a very brittle way to write software. If you need the program to have very specific behavior which the language doesn't give you the tools to express, you should be asking for those tools to be added to the language, not complaining about how your attempts at tricking the compiler into the thing you want keep breaking.
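
One hypothetical shape such tools could take (the names here are invented, not real builtins): intrinsics whose timing contract the optimizer is forbidden to break.

    #include <stdint.h>
    #include <stddef.h>

    /* Invented declarations, for illustration only: */
    uint32_t __ct_select(uint32_t cond, uint32_t a, uint32_t b);
    int      __ct_memeq(const void *a, const void *b, size_t n);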


The article explains why it is not as simple as that, especially in the case of timing attacks. Here it's not just the end result that matters, but how it is computed. If any code can be changed to anything else that gives the same results, this becomes quite hard.

Absolutist statements such as this may give you a glowing sense of superiority and cleverness, but they contribute nothing and are not as clever as you think.


The article describes why you can’t write code which is resistant to timing attacks in portable C, but then concludes that actually the code he wrote is correct and it’s the compiler’s fault it didn’t work. It’s inconvenient that anything which cares about timing attacks cannot be securely written in C, but that doesn’t make the code not fundamentally incorrect and broken.


It's secure code we use.

I'm sure you know who DJB is.


Why is knowing who the author is relevant? Either what he posts is correct or it is not, who the person is is irrelevant.


If you have UB then you have a bug, and there is some system that will expose it. It isn't hard to write code without UB.


It is, in fact, pretty hard as evidenced by how often programmers fail at it. The macho attitude of "it's not hard, just write good code" is divorced from observable reality.


Staying under the speed limit is, in fact, pretty hard as evidenced by how often drivers fail at it.


It's more complex than that for the example of car speed limits. Depending on where you live, the law also says that driving too slowly is illegal, because it creates an unsafe environment by forcing other drivers on, e.g., the freeway to pass you.

But yeah, seeing how virtually everyone on every road is constantly speeding, that doesn't give me a lot of faith in my fellow programmers' ability to avoid UB...


Some jurisdictions also set the speed limit at, e.g., the 85th percentile of drivers' speed (https://en.wikipedia.org/wiki/Speed_limit#Method) so some drivers are always going to be speeding.

(I'm one of those speeders, too; I drive with a mentality of safety > following the strict letter of the law; I'll prefer speed of traffic if that's safer than strict adherence to the limit. That said, I know not all of my peers have the same priorities on the road, too.)


And to be specific, some kinds of UB are painfully easy to avoid. A good example is strict aliasing: simply don't do any type punning. Yet people still complain that it's the compiler's fault when their wanton casting leads to problems.
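
A sketch of the distinction:

    #include <stdint.h>
    #include <string.h>

    /* UB: reads a float through an incompatible pointer type,
       violating strict aliasing. */
    uint32_t bits_of_ub(float f)
    {
        return *(uint32_t *)&f;
    }

    /* Well-defined: memcpy does the same punning legally, and
       compilers typically lower it to a single register move. */
    uint32_t bits_of_ok(float f)
    {
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        return u;
    }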


People write buffer overflows and memory leaks because they are not careful. The rest of UB covers things I have never seen, despite running sanitizers on a large codebase.


Perhaps you’re not looking all that hard.


Sanitizers are very good at finding UB.


Sure. That's just a function of how much UB there is, rather than them catching it all.


Only if developers act like grown-ups and use all the static analysers they can get hold of, instead of acting as if they know better.

The tone of my answer reflects what most surveys report about the actual use of such tooling.


Do you know what UBSAN is? Have you used it?


Yes, my CI system runs it with a comprehensive test suite.
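
For the curious, a minimal example of what it catches, built with cc -fsanitize=undefined -g demo.c (gcc or clang):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        int x = INT_MAX;
        x += 1;  /* UBSan reports: signed integer overflow */
        printf("%d\n", x);
        return 0;
    }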



