C99 doesn't need function bodies, or 'VLAs are Turing complete' (lemon.rip)
296 points by lsof on Aug 4, 2022 | 249 comments



Thanks to being yet another attack vector, plus funny stuff like this post, VLAs got demoted to optional in C11.

Additionally, Google spent several years paying to clean all VLA occurrences out of the Linux kernel.

https://www.phoronix.com/news/Linux-Kills-The-VLA


>Thanks to being yet another attack vector, plus funny stuff like this post, VLAs got demoted to optional in C11.

Sadly, the C committee doesn't really understand what was wrong with VLAs and a sizable group of its members wants to make them mandatory again:

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2921.pdf ("Does WG14 want to make VLAs fully mandatory in C23")


What's wrong with VLAs is their syntax. They really shouldn't use the same syntax as regular C arrays; otherwise they would be fine, maybe with a scary enough keyword. They are more generic than alloca too: alloca is scoped to the function, while VLAs are scoped to the innermost block scope that contains them.


Syntax, no protection against stack corruption,...


You can corrupt the stack without VLAs just fine. What else?


VLAs make it a lot easier to corrupt the stack by accident. Unless you're quite a careful coder, stuff like:

  void f (size_t n)
  {
    char str[n];
leads to a possible exploit where the input is manipulated so n is large, causing a DoS attack (at best) or a full exploit at worst. I'm not saying that banning VLAs solves every problem though.

However the main reason we forbid VLAs in all our code is because thread stacks (particularly on 32 bit or in kernel) are quite limited in depth and so you want to be careful with stack frame size. VLAs make it harder to compute and thus check stack frame sizes at compile time, making the -Wstack-usage warning less effective. Large arrays get allocated on the heap instead.
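
For reference, a rough sketch of the pattern we use instead (the parameter names, the cap, and the copy are made up for illustration, not taken from the snippet above):

  #include <stdlib.h>   /* malloc, free */
  #include <string.h>   /* memcpy */

  #define MAX_STACK_STR 256              /* hypothetical cap, sized to the stack budget */

  void f (size_t n, const char *src)
  {
    if (n <= MAX_STACK_STR) {
      char str[MAX_STACK_STR];           /* fixed size, visible to -Wstack-usage */
      memcpy(str, src, n);
      /* ... */
    } else {
      char *str = malloc(n);             /* large inputs go to the heap */
      if (str == NULL)
        return;
      memcpy(str, src, n);
      /* ... */
      free(str);
    }
  }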


> stuff like ... leads to a possible exploit where the input is manipulated so n is large

The same is true for most recursive calls; should recursion also be banned in programming languages?


When writing secure C? In most cases, absolutely.


That's not really a fair comparison though. Recursion is strictly necessary to implement several algorithms. Even if "banned" from the language, you would have to simulate it using a heap allocated stack or something to do certain things.

None of this applies to VLA arguments.


It's not strictly necessary precisely because all recursions can be "simulated" with a heap allocated stack. And in fact, the "simulated" approach is almost always better, from both a performance and a maintenance perspective.


This is simply nonsense. In cases with highly complex recursive algorithms, "unrecursing" would make the code a completely unmaintainable mess, requiring an immensely complicated state machine, which is why something like Stockfish doesn't do that in its recursive search function even though the code base is extremely optimised. And yes, some algorithms are inherently recursive, and don't gain any meaningful performance from the heap stack + state machine approach.


> In cases with highly complex recursive algorithms, "unrecursing" would make the code a completely unmaintainable mess, requiring an immensely complicated state machine

Nothing about it is "immensely complicated". Rather than store your recursion state in a call stack, you can store it in a stack of your own, i.e. a heap-allocated container. The state of a cycle of foo(a,b,c) -> bar(d,e,f) -> baz(g,h) -> foo(...) becomes expressible as an array of tagged union of (a,b,c), (d,e,f) and (g,h).

And there is nothing inherently unmaintainable about this approach. I would hope that it's a commonly taught pattern, but even if it's not, that doesn't make it impossible to understand. Picking good names and writing explanatory comments are 90% of the battle of readability.
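
To make the pattern concrete, here's a rough sketch for a single self-recursive function (so the "tagged union" degenerates to a plain node pointer per frame); the names are made up:

    #include <stdlib.h>

    struct node { int value; struct node *left, *right; };

    /* recursive original, for comparison:
       long sum(struct node *t) { return t ? t->value + sum(t->left) + sum(t->right) : 0; } */

    long sum_iter(struct node *t)
    {
        size_t cap = 64, top = 0;
        struct node **stack = malloc(cap * sizeof *stack);  /* heap-allocated "call stack" */
        long total = 0;

        if (stack == NULL)
            abort();
        if (t != NULL)
            stack[top++] = t;
        while (top > 0) {
            struct node *n = stack[--top];
            total += n->value;
            if (top + 2 > cap) {                            /* grow instead of overflowing */
                cap *= 2;
                stack = realloc(stack, cap * sizeof *stack);
                if (stack == NULL)
                    abort();
            }
            if (n->left != NULL)
                stack[top++] = n->left;
            if (n->right != NULL)
                stack[top++] = n->right;
        }
        free(stack);
        return total;
    }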

> which is why something like Stockfish doesn't do that in its recursive search function even though the code base is extremely optimised

I can't speak for what Stockfish devs do, as I have no insight into which particular developers made what set of tradeoffs in which parts of their codebase. But it doesn't change the reality that using your own stack container is almost always more performant and more extensible:

1. Your own stack container can take up less space per element than a call stack does per stack frame. A stack frame has to store all local variables, which is wasteful. The elements of your own stack container can store just the state that is necessary.

2. Your own stack container can recurse much farther. In addition to the previous point, call stacks tend to have relatively small memory limits by default, whereas heap allocations do not. In addition, you can employ tricks like serializing your stack container to disk, to save even more memory and allow you to recurse even farther.

3. Your own stack container can be deallocated, shrunk, or garbage collected to free memory for further use, but a call stack typically only grows.

4. Your own stack container can be easily augmented to allow for introspection, which would require brittle hackery with a traditional call stack. This can be an extremely useful property, e.g. for a language parser resolving ambiguities in context sensitive grammars.

> And yes, some algorithms are inherently recursive, and don't gain any meaningful performance from the heap stack + state machine approach.

Using a heap-allocated stack container is recursion. It is the state machine. The only fundamental implementation difference between the approach I describe and traditional recursion is that the former relies on the programmer using an array of algorithm state, and the latter relies on the runtime using an array of stack frames.


First of all, my original point was only that comparing recursion to VLAs was not a reasonable comparison, not to make some profound point about your favourite way to implement recursion. So, chill dude.

1. This is often irrelevant, depending on your priorities. Taking stockfish as an example again, memory is not what's at a premium, search nodes is. The search space is inherently intractable. You're never gonna be able to recurse meaningfully deeper by shaving off some stack space, because the breadth of the tree grows exponentially in the number of plies. The only form of optimisation that helps here is caching and various heuristics to avoid searching certain nodes at all

2. You know you can change the stack size, right? This is what Stockfish does for its search threads. No need for fancy dynamic allocation here. Also, have fun watching shit get interesting when you have to realloc() ncpu huge chunks of contiguous memory balls deep into a search, when the engine is in time trouble... Sometimes resizing allocated memory is simply not an option.

3. Again this is not always relevant. Stockfish needs its big stacks all the time. So big whoop.

4. This Stockfish does need to do, which is why it keeps an array of structs(never resized) for that purpose. But Stockfish also needs to make decisions about when things move between the different stacks, which is why it uses recursive calls despite having a stack on the heap also.

5. It's obvious I am aware of this. My original comment literally said that banning recursion would force you to implement recursion manually anyway using a state machine. Like, dude, you're literally repeating my own comment back at me as if I didn't already know it. What's up with that?

The point here is: yes, in some specialised cases it might be preferable to implement the recursion yourself if the problem calls for it. But other times, and I'd argue most of the time, this is not necessary. So just use the already available abstraction provided by the language. Your line of reasoning is a bit like arguing for not using C at all, because it will be slower than assembly in some cases. Sure, write hand-optimised assembly in your hot paths if you need to, but most people don't. Abstractions are generally our friends; they help us write clearer, more concise code.


> It's not strictly necessary precisely because all recursions can be "simulated" with a heap allocated stack.

This just moves the problem from a stack blowout to a heap blowout.

> And in fact, the "simulated" approach is almost always better, from both a performance and a maintenance perspective.

I am unsure about the performance, but turning recursive code implementing a recursive procedure into iterative code which has to maintain a stack by hand cannot possibly improve readability unless the programmers involved are pathologically afraid of seeing recursive code.


Computers have many GiBs of heap space. Your thread has a few MiB of stack. Tell me. Which is the bigger problem?

This is also ignoring the fact that the memory usage for recursive algorithms is higher because there's a bunch of state for doing the function call being pushed onto the stack that you just don't see (return address, potentially spilling registers depending on the compiler's ability to optimize, etc). Unless you stick with tail recursion, but that's just a special case where the loop method would be similarly trivial. Case in point: I implemented depth first search initially as a recursive thing and blew out the stack on an embedded system. Switched to an iterative depth first search with no recursion. No problem.

OP said “it’s the only way to solve certain problems”. That’s clearly not true because ALL recursive algorithms can be mapped to non recursive versions.

I never got the fascination with implicit recursion. It's just a slightly different way to express the solution. Personally I find it usually harder to follow / fully understand than regular iterative methods that describe the recursion state explicitly (i.e. time and space complexity in particular I find very hard to reason about for recursion).


Definitely not.


For the "can be", the performance, or the maintenance perspective? With an example please...


You’ll find it difficult to beat a construct that has dedicated language support. Trying to fake function calls without actually making function calls is difficult and going to have poor ergonomics.


MISRA C bans recursion for instance.


Doesn't a similar DoS risk (from allowing users to allocate arbitrarily large amounts of memory) also apply to the heap? You shouldn't be giving arbitrary user-supplied ints to malloc either.


> Doesn't a similar DoS risk (from allowing users to allocate arbitrarily large amounts of memory) also apply to the heap?

DoS Risk? No one cares too much about that - the problem with VLAs is stack smashing, which then allows arbitrary user-supplied code to be executed.

You cannot do that with malloc() and friends.


VLAs don’t smash the stack.


Depends on the number you put inside, and the linker settings for stack size.


No.


How does a huge VLA corrupt the stack? If there's not enough space but code keeps going then isn't that a massive bug with your compiler or runtime?


Okay. How do you tell the kernel that? Sure, the kernel will have put a guard page or more at the end of the stack, so that if you regularly push onto the stack, you will eventually hit a guard page and things will blow up appropriately.

But what if the length of your variable length array is, say, gigabytes, you've blown way past the guard pages, and your pointer is now in non-stack kernel land.

You'd have to check the stack pointer all the time to be sure, that's prohibitive performance-wise. Ironically, x86 kind of had that in hardware back when segmentation was still used.


I think the normal pattern is a stack probe every page or so when there's a sufficiently large allocation. There's no need to check the stack pointer all the time.

But that's not my point. If the compiler/runtime knows it will blow up if you have an allocation over 4KB or so, then it needs to do something to mitigate or reject allocations like that.


> I think the normal pattern is a stack probe every page or so when there's a sufficiently large allocation.

What exactly are you doing there, in kernel code?

> But that's not my point. If the compiler/runtime knows it will blow up if you have an allocation over 4KB or so, then it needs to do something to mitigate or reject allocations like that.

Do what exactly? Just reject stack allocations that are larger than the cluster of guard pages? And keep book of past allocations? A lot of that needs to happen at runtime, since the compiler doesn't know the size with VLAs.

It's not impossible and mitigations exist, but it is pretty "extra". gcc has -fstack-check that (I think) does something there.


> What exactly are you doing there, in kernel code?

In kernel code?

What you're doing is triggering the guard page over and over if the stack is pushing into new territory.

> Do what exactly? Just reject stack allocations that are larger than the cluster of guard pages? And keep book of past allocations? A lot of that needs to happen at runtime, since the compiler doesn't know the size with VLAs.

Just hit the guard pages. You don't need to know the stack size or have any bookkeeping to do that, you just prod a byte every page_size. And you only need to do that for allocations that are very big. In normal code it's just a single not-taken branch for each VLA.
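
Very roughly, the probing looks like this (purely a conceptual sketch: a real compiler, e.g. gcc with -fstack-clash-protection, emits the probes while it extends the frame rather than afterwards, and the page size comes from the target, not a macro):

    #include <stddef.h>

    #define PAGE_SIZE 4096                       /* assumed page size */

    void use_vla(size_t n)
    {
        char buf[n];
        if (n > PAGE_SIZE) {                     /* the single not-taken branch for small n */
            /* touch one byte per page, starting from the end nearest the old stack
               pointer (high addresses, on a downward-growing stack), so a guard page
               is hit before anything beyond it can be written */
            for (size_t off = n; off > 0; off = (off > PAGE_SIZE) ? off - PAGE_SIZE : 0)
                ((volatile char *)buf)[off - 1] = 0;
        }
        /* ... use buf ... */
    }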


That seems to be what -fstack-check for gcc is doing:

"If neither of the above are true, GCC will generate code to periodically “probe” the stack pointer using the values of the macros defined below."[1]

I guess I'm wondering why this isn't always on if it solves the problem with negligible cost? Genuine question, not trying to make a point.

[1] https://gcc.gnu.org/onlinedocs/gccint/Stack-Checking.html


What I'm finding in a quick search is:

* It should be fast, but I haven't found a benchmark.

* There appear to be some issues of signals hitting at the wrong time vs. angering valgrind, depending on probe timing.

* Probes like this are mandatory on windows to make sure the stack is allocated, so it can't be that bad.


I'm mostly interested in it for kernel code though, so the second point at least does not apply, at least not directly. Maybe there is something analogous when preempting kernel threads, I haven't thought it through at all. But interesting.


Because it's on by default in MSVC [0], and we all know that whatever technical decisions MS makes, they're superior to whatever technical decision the GNU people make. /s

Speaking seriously, I too would like an answer.

[0] https://docs.microsoft.com/en-us/windows/win32/devnotes/-win...


One decision Microsoft made was not to support VLAs at all, even after their new found C love.


An attacker would first trigger a large VLA-allocation that puts the stack pointer within a few bytes of the guard page. Then they would just have the kernel put a return address or two on the stack and that would be enough to cause a page fault. The only way to guard against that would be to check that every CALL instruction has enough stack space which is infeasible.


But that's the entire point of the guard page, it causes a page fault. That's not corruption.

Denial of service by trying to allocate something too big for the stack is obvious. I'm asking about how corruption is supposed to happen on a reasonable platform.


Perhaps they're trying to guard against introducing easy vulnerabilities on unreasonable platforms. With VLAs unskilled developers can perhaps more easily introduce this problem. It would be a case of bad platforms and bad developers ruining it for the rest.


An attacker could trigger a large VLA allocation that jumps over the guard page, and a write to that allocation. That write would start _below_ the guard page, so damage would be done before the page fault occurs (ideally, that write wouldn’t touch the guard page and there wouldn’t be a page fault but that typically is harder to do; the VLA memory allocation typically is done to be fully used)

Triggering use of the injected code may require another call timed precisely to hit the changed code before the page fault occurs.

Of course, the compiler could and should check for stack allocations that may jump over guard pages and abort the program (or, if in a syscall, the OS) or grow the stack when needed. Also, VLAs aren’t needed for this. If the programmer creates a multi-megabyte local array, this happens, too (and that can happen accidentally, for example when increasing a #define and recompiling)

The lesson is, though, that guard pages alone don’t fully protect against such attacks. The compiler must check total stack space allocated by a function, and, if it can’t determine that that’s under the size of your guard page, insert code to do additional runtime checks.

I don’t see that as a reason to outright ban VLAs, though.


VLAs give the attacker an extra attack vector. The size of the VLA is runtime-determined and potentially controlled by user input. Thus, the only safe way to handle VLAs is to check that there is enough stack space for every VLA allocation. Which may be prohibitively expensive and even impossible on some embedded platforms. Stack overflows may happen for other reasons too, but letting programmers put dynamic allocations on the stack is just asking for trouble.


I don’t think “may be prohibitively expensive and even impossible on some embedded platforms” is a strong argument for not including it in C. There are many other features in C for which that holds, such as recursion, dynamic memory allocation, or even floating point.


Welcome to the world of undefined behavior. Anything can happen....


I think this is a common misunderstanding about UB. It's not that anything can happen, just that the standard doesn't specify what happens, meaning whatever happens is compiler/architecture/OS dependent. So you can't depend on UB in portable code. But something definite will happen, given the current state of the system. After all, if it didn't, these things wouldn't be exploitable either.


> But something definite will happen, given the current state of the system.

This is only true in the very loose and more or less useless sense that the compiler is definitely going to emit some machine code. What does that machine code do in the UB case? It might be absolutely anything.

One direction you could go here is you insist that surely the machine code has a defined meaning for all possible machine states, but that's involving a lot of state you aren't aware of as the programmer, and it's certainly nothing you can plan for or anticipate so it's essentially the same thing as "anything can happen".

Another is you could say, no, I'm sure the compiler is obliged to put out specific machine code, and you'd just be wrong about that, Undefined Behaviour is distinct from Unspecified Behaviour or merely Platform Dependant behaviour.

Many C and C++ programmers have the mistaken expectation that if their program is incorrect it can't do anything really crazy, like if I never launch_missiles() surely the program can't just launch_missiles() because I made a tiny mistake that created Undefined Behaviour? Yes, it can, and in some cases it absolutely will do that.


I'm aware you can get some pretty crazy behaviours, say if you end up overwriting a return address and your code begins to jump around like crazy. Even that could reproduce the same behaviour consistently though.

I once had a bug like that in a piece of AVR C code where the stack corruption would happen in the same place every time and the code would pathologically jump to the same places in the same order every time. It's worth noting though that when there's an OS, usually what will happen is just a SIGABRT. See the OpenBSD libc allocator for a masterclass in making misbehaving programs crash.

I was never advocating to rely on UB, btw. But yes, UB can be understood in many cases.


You are confusing the C standard and actual platforms/C implementations. A lot of things are UB in the standard but perfectly well defined on your platform. Standards don’t compile code, real compilers do. The standard doesn’t provide standard library implementations, the actual platform does.

Targeting the standard is nice, but if all of your target platforms guarantee certain behaviors, you might consider using those. A lot of UB in the C standard is perfectly defined and consistent across MSVC, GCC, Clang, and ICC.


> A lot of UB in the C standard is perfectly defined and consistent across MSVC, GCC, Clang, and ICC.

Do you have examples of this "a lot of UB in the C standard" which is in fact guaranteed to be "perfectly defined and consistent" across all the platforms you listed ? You may need to link the guarantees you're relying on.


Okay so take the two most complained about UBs, improper aliasing and signed integer overflow. Every compiler I’ve ever used lets you turn both into defined behavior.


These things aren't the default, the compiler may "let you" but it doesn't do it until you already know you've got a problem and you explicitly tell it you don't want standard behaviour. However lets ignore that for a moment:

My GCC offers to turn signed integer overflow into either an abort or wrapping, either of which is defined behaviour, but how do I get MSVC to do precisely the same thing?

Likewise for aliasing rules. GCC has a switch to have the optimiser not assume the language's aliasing rules are actually obeyed, but what switch in MSVC does exactly the same thing?


MSVC doesn’t enforce strict aliasing to begin with. And passing /d2UndefIntOverflow makes signed integer overflow well-defined. Even if you don’t pass it, MSVC is very conservative in exploiting that UB, precisely to avoid breaking code that was valid before they started doing this optimization (2015 or so).


What you are describing is unspecified and implementation-defined behavior [0].

Avoiding UB (edit: in general) doesn't have anything to do with the code being portable and everything with the code not being buggy [1][2].

[0] https://en.cppreference.com/w/c/language/behavior

[1] https://blog.regehr.org/archives/213

[2] http://blog.llvm.org/2011/05/what-every-c-programmer-should-...


Oh really? Then why does every compiler I use have a parameter to turn off strict aliasing?

You cite a source that contradicts you. In the llvm blog post: "It is also worth pointing out that both Clang and GCC nail down a few behaviors that the C standard leaves undefined."


Sometimes a compiler gives a guarantee that a particular UB is always handled in a specific way, but you cannot generalize this to all UB.

Added 'in general' to my comment to make this explicit.


What is undefined about a large VLA? It shouldn't be undefined.

According to wikipedia "C11 does not explicitly name a size-limit for VLAs"


The C standard has no mentions of a program stack. This isn’t undefined behavior.


You shouldn't be writing C if you're not a careful coder.


Yeah, right.

https://msrc-blog.microsoft.com/2019/07/16/a-proactive-appro...

https://research.google/pubs/pub46800/

https://support.apple.com/guide/security/memory-safe-iboot-i...

Maybe you could give a helping hand to Microsoft, Apple and Google, they are in need of careful C coders.


I'm not sure if you intentionally missed my point. Everything in C requires careful usage. VLAs aren't special: they're just yet another feature which must be used carefully, if used at all.

Personally, I don't use them, but I don't find "they're unsafe" to be a convincing reason for why they shouldn't be included in the already-unsafe language. Saying they're unnecessary might be a better reason.


The goal should be to reduce the amount of sharp edges, not increase them even further.


VLAs are unsafe in the worst kind of way, as it is not possible to query whether it is safe to use them. alloca() at least in theory can return null on stack overflow, but there is no such provision with VLAs.


They're not unsafe (in the memory sense) as long as they check for overflow and reliably crash if there is one.


If a lot of platforms don't implement this check reliably, then it's unsafe in practice at this time, even if not in theory.


Who out there has a version of stack checking that doesn't actually check the stack…? If it doesn't check by default, as C doesn't, then it's not "as long as".


And if you're a careful coder writing C, you should give the VLA the stink eye unless it's proving its worth.


Hint, that means nobody should be writing C.


Where is the lie?


Too bad we have all that legacy C code that won't just reappear by itself on a safer language.

That means there are a lot of not careful enough developers (AKA, human ones) that will write a lot of C just because they need some change here or there.


With VLAs:

1. The stack-smashing pattern is simple, straightforward and sure to be used often. Other ways to smash the stack require some more "effort"...

2. It's not just _you_ who can smash the stack. It's the fact that anyone who calls your function will smash the stack if they pass some large numeric value.


They can overflow the stack. They cannot smash the stack.


Fair enough; I had the mistaken idea that the two terms are interchangeable, but apparently stack smashing is only used for the attack involving the stack:

https://en.wikipedia.org/wiki/Stack_buffer_overflow

so, pretend I said "overflow" instead of "smash" in my post.


Useless semantic pedantry at best, and arguably wrong, as there isn't some sort of ISO standard on dumb hacking terms.


Overflowing the stack gives you a segfault. Smashing the stack lets hackers pop a shell on your computer. They are incredibly different. VLAs can crash your program, but they do not give attackers the ability to scribble all over the stack.


> Overflowing the stack gives you a segfault.

Maybe. If the architecture supports protected memory and the compiler has placed an appropriately sized guard page below the stack. If it doesn't then overflowing the stack via a VLA gives you easy read and write access to any byte in program memory.


If your architecture does not support this then you’re at risk whenever you make a function call.


Backpedal harder!


I’m not backpedaling. If your environment has guard pages and does probing then it protects equally well against VLAs and function calls overflowing the stack. If it has neither then both are liable to overwrite other memory. Obviously I would prefer that you have the protections, or some sort of equivalent, but they have nothing to do with VLAs.


Unless they happen to be enjoying kernel space.


There is no difference.


What about not adding even more ways how we should avoid using C?


> What about not adding even more ways how we should avoid using C?

That's a mute point for C's target audience because they already understand that they need to be mindful of what the language does.


What the heck. It's "moot", not "mute".


I'm curious, are there accents in which those two words are homophones? Given the US tendency to pronounce new/due/tune as noo/doo/toon I can imagine some might say mute as moot but I can't find anything authoritative online.


According to Wikipedia, East Anglia does universal yod-dropping, so mute/moot would be homophonic. (See https://en.wiktionary.org/wiki/Appendix:English_dialect-depe...).

Personally, I haven't come across anyone who pronounces 'mute' without the /j/.


They are not perfect homophones. There is a slight i (IPA j) in "mute".

https://en.wiktionary.org/wiki/mute

https://en.wiktionary.org/wiki/moot


That is like saying that if sushi knives are already sharp enough, there is no issue cutting fish with a samurai sword instead, except at least with the knife maybe the damage isn't as bad.


The difference between the largest sushi knives and a katana is more about who wields them than the blade involved.


One ends up cutting quite a few pieces either way.


When you really need a samurai, nothing less will do. Arguably most of us need sushi chefs these days.


> That is like saying that if sushi knives are already sharp enough (...)

No, it's like saying that professional people understand the need to learn what their tools of the trade do beyond random stackoverflow search on how to print text to stdout.

It seems you have an irrational dislike of C. That's perfectly ok. No need to come up with excuses though.


It doesn't seem, I do.

Ever since I got my hands on Turbo C++ 1.0, back in 1993, I see no reason why we should downgrade ourselves to C.

At least C++ gives us the tools to be a bit more secure, even if tainted with C's copy-paste compatibility.

You will find posts from me on Usenet, standing on the C++ frontline of C vs C++ flamewars.

No one is making excuses; it should be nuked, unfortunately it will outlive all of us.


I like your comparison of a C programmer with a samurai.


Including that most of them end up doing Seppuku on their applications.


while we're hugging them from behind


It's more like the C programmer is a sushi master. They can make a delicious, beautifully crafted snack. But if the wrong ingredients are used you'll get very sick.


VLAs are no more unsafe than standard C is for stack corruption.


Just one more attack vector to add to the list; who's still counting?


It’s not an additional attack vector.


    int A[100000000];
Also has no protection.


The only result of banning VLAs is to force everyone to use alloca, which is even less safe.

exhibit A: https://lists.freedesktop.org/archives/mesa-commit/2020-Dece...

exhibit B: https://github.com/neovim/neovim/issues/5229

exhibit C: https://github.com/sailfishos-mirror/llvm-project/commit/6be...

etc etc


Nobody is forced to use alloca, which is not less safe, only equally disastrous. Just use malloc, already.


ah yes, why didn't I think of it, let me just try:

    #include <cstdlib>
    #include <span>
    
    __attribute__((annotate("realtime")))
    void process_floats(std::span<float> vec) 
    {
      auto filter = (float*) malloc(sizeof(float) * vec.size());
    
      /* fill filter with values */
    
      for(int i = 0; i < vec.size(); i++)
        vec[i] *= filter[i];
    
      free(filter);
    }

    $ stoat-compile++ -c foo.cpp -emit-llvm -std=c++20
    $ stoat foo.bc
    Parsing 'foo.bc'...

    Error #1:
    process_floats(std::span<float, 18446744073709551615ul>) _Z14process_floatsSt4spanIfLm18446744073709551615EE
    ##The Deduction Chain:
    ##The Contradiction Reasons:
     - malloc : NonRealtime (Blacklist)
     - free : NonRealtime (Blacklist)

oh noes :((


Here's a nickel, kid.

The bullshit about "my embedded system doesn't have dynamic memory" is just that: bullshit. You either know how big your stack is and how many elements there are, and you make the array that big. Or you don't know and you're fucked.

You can't clever your way out of not knowing how big to make the array with magic stack fairy pretend dynamic memory. You can only fuck up. Is there room for 16 elements? The array is 16. Is there room for 32? It's 32.


I think the parent comment was about malloc not being real-time? Not about storage space.

Though I do wonder why there can't be a form of malloc that allocates in a stack like fashion in real time to satisfy the formal verifier?


Real time also generally means your input sizes are bounded and known, otherwise the algorithm itself isn't realtime and malloc isn't the reason why.

But strictly speaking the only problem is a malloc/free that can lock (you can end up with priority inversion). So a lock-free malloc would be realtime just fine, it doesn't have to be stack growth only.


> Real time also generally means your input sizes are bounded and known, otherwise the algorithm itself isn't realtime and malloc isn't the reason why.

I think you meant to say something else? Real-time is a property of the system indicating a constraint on the latency from the input to the output—it doesn't constrain the input itself. (Otherwise even 'cat' wouldn't be real-time!)


If your input is not bounded you can't know in advance the time needed to process it. In other words, you cannot be realtime.

`cat` can be realtime, but only by fixing the size of its internal buffer where it reads to and writes from. In this case it can in theory bound the time needed to process the fixed block of input.

But if for some reason `cat` tried to read/write by lines of unknown in advance size, it would fail to be realtime.


I think we're not disagreeing on the actual constraints, but the terminology. The "internal buffer" is not part of the system's "input". It's part of the system's "state".


We're talking about function inputs, not system inputs.


Real-time is a property of the whole system though (?) not individual functions. But even if you want to reframe it to be about functions, small input is neither necessary nor sufficient for it being "real time". Like your function might receive an arbitrarily large n-element array a[] and just return a[a[n-1]], and it would be constant time. Again, the size is simply not the correct property to look at.


> Real-time is a property of the whole system though (?) not individual functions.

As I see it, it leads to a contradiction. `cat` has unbounded time of execution. It depends on the size of its input. It means that `cat` cannot be realtime. But this logic leads us to the conclusion that no OS kernel can be realtime, because it works for an unbounded time.

It is nonsense. Realtime is about latency: how much time it takes to react to an input. `cat` may be realtime or may not be; it depends on how we define "input". We need to define it in a way that it is bounded. I mean 4 KB chunks of bytes, for example.

> Like your function might receive an arbitrarily large n-element array a[] and just return a[a[n-1]], and it would be constant time.

No. Receiving an n-element array is an O(n) operation. It needs to be copied to memory. We can of course pick one function that just gets a pointer to an array, but if real-time is a property of the whole system, then this system needs to copy the array from the outside world into memory. And that is an O(n) operation. So for any latency requirement there exists an N such that when n>N this requirement will not be met. So an unbounded array as an input is incompatible with real-time.


> Though I do wonder why there can't be a form of malloc that allocates in a stack like fashion in real time

I think that's basically what the LLVM SafeStack pass does -- stack variables that might be written out of bounds are moved to a separate stack so they can't smash the return address.


how would you implement that in the face of multiple threads? you can't use TLS as it will have to initialize your stack on first access of your malloc_stack in a given thread, which may or may not be safe to use in real-time-ish contexts (I think it's definitely not on Windows, not sure on Linux)


> how would you implement that in the face of multiple threads?

I imagine you could allocate it at the same time as you allocate the thread's own stack?


> You either know how big your stack is

that's in most systems I target a run-time property, not a compile-time one


> wants to make them mandatory again

What does 'mandatory' mean? Like if I write a C compiler without them... what are they going to do about it?


Code that complies with the standard will be rejected by your compiler. The effect would probably be that few people would use your compiler.


Most mainstream compilers aim to be standards compliant


This usage of VLAs is once again mandatory for compilers to support, since C23. From Wikipedia: "Variably-modified types (but not VLAs which are automatic variables allocated on the stack) become a mandatory feature".


Oh well...


At least it is still optional to allow for stack allocated VLAs. Which is the attack vector you mentioned.


Tell me, what’s wrong with variably modified types specifically?


Sizeof, which used to be one of the very few things you could blindly do to an arbitrary (potentially UB-provoking) expression, can now have side effects.
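
Concretely, a small sketch (the parenthesised comma expression is the size of a VLA type, so evaluating the sizeof evaluates the printf):

    #include <stdio.h>

    int main(void)
    {
        int n = 3;
        /* the size expression of a variably modified type is evaluated at run time,
           so this sizeof prints "hi" as a side effect and yields 3 * sizeof(int) */
        size_t s = sizeof(int[(printf("hi\n"), n)]);
        printf("%zu\n", s);
        return 0;
    }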


I once implemented FizzBuzz using this trick https://www.reddit.com/r/C_Programming/comments/qqazh8/fizzb...


Wow, this is wizardry.


I haven't thought about C in years - that one where the author passes in a printf as a char array element and de-references it to execute it ... gives me chills. Y'all kids have fun - I'm going to stick with my VM over here and call it a day. This makes even modern JS look sane by comparison. Excellent write up too.


Nit: the char array doesn’t get dereferenced at all. The entire computation happens as a side effect of computing the length of the array. This is more of a case where there’s an unexpected expression context that can be abused for fun.


The fact the array element isn't in quotes bothers me to some degree - with or without, it's crazy, but the fact that the compiler just accepts it as a language construct as opposed to a value makes me want to drink another beer. But again I'm thinking dereferencing here - and as you said that's not the case.


It’s not an array element :) The printf statement is part of the length expression, i.e. it’s setting the length of the array to the result of the printf call. So it is indeed a “value”.

This isn’t much different from writing something like this in JS:

    var a = [];
    a[console.log("Hello"), 1] = 42;
except that this indexes the array as opposed to setting its length.
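
For completeness, a minimal C sketch of the construct being described (it relies on C99 evaluating the size expressions of variably modified parameters when the function is entered; the names are made up):

    #include <stdio.h>

    /* the bracketed expression is the parameter's array length; it runs when f is
       called, and afterwards x has decayed to a plain int* anyway */
    void f(int x[printf("hello from the length expression\n")])
    {
        (void)x;
    }

    int main(void)
    {
        int dummy[1] = { 0 };
        f(dummy);
        return 0;
    }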


I spent the past 10 minutes figuring out what to search for (C is very rusty here) regarding C arrays and initialization. Now that you point it out, it seems obvious. It certainly wasn't obvious when I read it though. This is some really obtuse use of a language here - hilariously so really. The code in the array is executed as a function that determines the array size and I see that now - thanks. If someone on my team did this for any reason I'm not sure if I'd shoot them or put them in charge of something more important. If nothing else - THIS is why code reviews exist. Still it's an impressive use of a compiler. Thanks for nit picking - this has been very entertaining!


The feature is called variable length arrays (VLAs)

See https://www.ibm.com/docs/en/i/7.4?topic=arrays-variable-leng...


> I haven't thought about C in years ... I'm going to stick with my VM over here and call it a day.

What's your VM written in?



Except for most of deno that's written in Unsafe C++.

https://v8.dev/


A JS runtime is a lot more than the core engine, or else nodejs is just a thin wrapper over V8 as well, which is obviously absurd. So without a count of the source lines for each language's portion of the implementation (very rough but acceptable first approximation of how much is implemented in each), you can't claim C++ is doing most of the heavy lifting and be taken seriously.

PS. the worst C++ code is still a hell of a lot more safe than the best C code. This is easy to convince yourself of by noting that C++ mostly supersets C and only deviates to add more safety and static checking, not less. So it's a strict improvement over C, not by a lot, but an improvement nonetheless.

Here are other examples for VMs not written in any of those 2 braindead languages though:

[1] Bun : JS runtime in Zig. https://bun.sh/

[2] Squeak : Smalltalk VM written in Smalltalk. http://www.vpri.org/pdf/tr1997001_backto.pdf

[3] PyPy : Not actually a single VM but an entire framework/toolchain to write VMs, most famous of which is one for Python. The language used to write VMs for the framework is a subset of Python called RPython. https://doc.pypy.org/en/latest/

[4] Maxine Virtual Machine : a JVM written entirely in Java. It's not the only one. https://en.wikipedia.org/wiki/Maxine_Virtual_Machine ; https://news.ycombinator.com/item?id=15733645

Modern VM research is far beyond the 50 year old assembler that thinks itself a programming language. The future is here, just not very evenly distributed.


> PS. the worst C++ code is still a hell of a lot more safe than the best C code.

What? For safe code you need to be able to audit it for logic errors, which you can't do if you can't understand what it does by reading it.

That's why especially fancy code is banned in projects like Linux.


[flagged]


The only one coping here is that who doesn't understand VMs and language research enough and retreats into childish "it was just a joke lol" remarks when out of their depth.

Incidentally, I write all 3 of C, C++ and x86 assembly, which is why I understand how braindead and unfit for human usage they are.


I didn't say I was "just joking", but calling C++ "unsafe" was definitely something I think was funny to say.

> Incidentally, I write all 3 of C, C++ and x86 assembly, which is why I understand how braindead and unfit for human usage they are.

I can't take you seriously when you make statements like this. You have to be trolling.


Probably not C regardless. How many widely used VMs are in C? Python is about it, right?


Java's? Or at least it used to be.


OpenJDK seems to be more C++ than C according to GitHub, anyway. In particular the "interesting" stuff (so HotSpot) looks to be all C++.


It started as C. It’s interesting to watch people post comments, as though C vs C++ makes the distinction in “safety” in a code base.

The take away I’m getting is most have programmed in neither.


These people are just regurgitating HN memes and then flailing when someone prods a bit deeper. C is basically the Donald Trump of HN. All rational thought goes out the door at the mere mention. In fact, you’re socially rewarded for joining the idiot mob.


That printf exploit is quite impressive I think.


There are multiple myths about VLAs. Please know that VLA is about the typing, not about the storage.

The line:

    typedef int T[n];
is the essence of VLA-ness, not `int A[n]`. The array of VLA type can have any kind of storage one wants.

It can be stack

    T a;
It can be heap

    T *a = malloc(sizeof *a);
It can be even infamous alloca()

    T *a = alloca(sizeof *a);
Please stop talking about this "VLA is a stack-based vector" crap because it means that one does not understand what VLAs are about. I admit that automatic VLAs are pretty much always wrong, but this use case is a tiny bit of the real functionality of VLA types.

VLAs were added to the language to handle multidimensional arrays.

And they really shine at this task. Even C++ has no good alternative for it. Vector of vectors is a really crappy data structure.

A few examples when working with square matrices:

- allocation on stack

    float A[n][n];
- allocation on heap

    float (*A)[n] = malloc(sizeof(float[n][n]));
    ...
    free(A);
- indexing

    A[i][j]
- passing to functions:

    void add(int n, float A[static n][n], float B[static n][n], float RES[restrict static n][n]);

    ...
Please show me something as simple, effective, self-documenting and elegant in C++.


Similar to making all your computations in the expressions for default arguments in python (and/or C++) and leaving the function bodies empty. Fancy, but not that mindboggling. What may surprise people is how and when these expressions are evaluated since they differ between languages.


> What may surprise people is how and when these expressions are evaluated since they differ between languages.

What could possibly be surprising about reusing the exact same object on every function call, especially when you default to an empty list, set or dict. Getting an actual empty list is as easy as defaulting to None and explicitly checking for it, so there isn't even a reason to do it any differently. /s

Who the hell thought that this was sane behavior in a language with mostly mutable objects?


I don't think it's a matter of someone thinking it was sane behavior; it's just what falls out from the rest of the rules of the language. It seems that any attempt to fix it would be convenient but inconsistent.


You had me in the first half.


> Similar to making all your computations in the expressions for default arguments in python

Python default arguments are evaluated only once at function definition, not every function call. It's the source of one interesting WTF for anyone who assumes otherwise:

  def foo(a=[]):
    a.append(1)
    return a

  >>> foo()
  [1]
  >>> foo()
  [1, 1]
  >>> foo()
  [1, 1, 1]


It's not quite the same as default arguments, in that default arguments are evaluated at run time, whereas array lengths for C++ (and not C) need to be a compile-time constant expression.

    #include <stdio.h>

    // "puts" makes "f" not constexpr.
    constexpr int f() { return puts("hello"); }

    // error: size of array 'argv' is not an integral constant-expression
    int main(int argc, char *argv[f()]) {}


As you point out, VLAs in C are evaluated at run time too, making it very similar to abusing default arguments.


It might not be mind-boggling, but it is highly unusual to people who "think in C". This is not C! This kind of stuff is why purists stick to C89.

I'm uneasy about variable-length arrays at the best of times. It hadn't even occurred to me that you might have a variable-length array as a function argument.


> This is not C! This kind of stuff is why purists stick to C89.

This personal assertion holds no water.

People hold/held onto C89 because compilers like Visual Studio's C compiler failed to support anything beyond C89 for ages.

https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-...

It's my understanding that Microsoft adopted a role in the C standardization committee that was akin to sabotaging any update beyond C89.


Microsoft saw no value in supporting C when C++ is a better option [0], they caved in because Microsoft Loves Linux (alongside key FOSS projects) implies loving C as well.

Note that they aren't supporting the optional annexes and stuff like C atomics aren't supported.

[0] - https://herbsutter.com/2012/05/03/reader-qa-what-about-vc-an...


VLAs in a prototype are sensible, because they decay to a pointer and thus function purely as documentation.

In the function definition, it seems to basically also decay to a pointer, except that the compiler adds basically an assertion that the value in the brackets is greater than zero to the top of the function. That actually seems quite weird to me.

I'd have been fine with the VLA acting like a proper local VLA with respect to things like sizeof, in which case the assertion makes sense. Or I'd have also been fine with it totally decaying to a pointer, making it just documentation. This half-way in between state is quite weird.
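
A small sketch of that half-way state (made-up function name):

    #include <stdio.h>

    void g(int n, float v[n]);              /* prototype: [n] is pure documentation */

    void g(int n, float v[n])
    {
        /* the parameter has decayed: sizeof v is sizeof(float *), not n * sizeof(float) */
        printf("%zu vs %zu\n", sizeof v, n * sizeof(float));
    }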


Note that sizeof is useful on the inner dimensions of multi-dimensional array arguments: the following prints the value of m.

    void foo(int n, int m, double x[n][m]) {
        printf("m = %zu\n", sizeof(x[0]) / sizeof(double));
    }


Serious question, but what "purists" remain on ANSI C? Even Linux has abandoned it as of 5.19.

The only times I use it are when I'm writing something that requires ridiculous portability even to bizarre platforms and compilers. C99 is far more ergonomic in comparison, even if you have to avoid using some of the truly terrible design decisions that came along for the ride.


> Serious question, but what "purists" remain on ANSI C?

It means nothing. It's a baseless attempt at an appeal to authority that has no merit.


Embedded probably.


That's where I work. I haven't seen c89 in years outside those "extreme portability" projects like Linux and curl.


> purists stick to C89

But the true Scotsman… uh, purist sticks to the K&R C as C89 has already lost some of the original elegance and simplicity. (Never mind that it’s a challenge since the general community has moved on.)


Oh, but designated initializers are so useful!


or RIIR is suddenly preferable.


C++ doesn't allow accessing other function arguments in a default argument, so in some sense the VLA trick is more powerful.


They mention that while loops are limited because C doesn't have tail recursion. However, in my experience gcc and clang are pretty decent at tail recursion. It happens via the -foptimize-sibling-calls setting, which is enabled by default on -O2 or higher.

The caveat is that the standard doesn't guarantee these optimizations. But there are some non-standard __attribute__ declarations that can help with that.
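
For example, Clang's musttail attribute (not standard C, so this is a sketch assuming a compiler that supports it) turns the hope into a guarantee:

    /* the attribute makes the compiler either emit a real tail call or fail to compile */
    static unsigned long fact_acc(unsigned long n, unsigned long acc)
    {
        if (n <= 1)
            return acc;
        __attribute__((musttail)) return fact_acc(n - 1, n * acc);
    }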


Well, I meant that loops are limited within the normal constraints of the C standard, which doesn't offer tail call elimination, but indeed as you point out it is easy to work around it in practice using clang or gcc optimization passes.


> opiltimization

Is this some portmanteau for “compile time optimization”?


On the topic of "compile time optimization" ... There was once a merry bunch of very excellent and brilliant crack smokers at Google who created a project called GWT. The raison d'etre of the project was simply "We know you can write JS but Java has more guardrails - why don't you write Java and we'll transpile it down to better JS than you're smart enough to write because JS basically sucks and is always changing and browsers are F'd".

Before it was all said and done you could enable a "deep compile" option that would even look at your java code for cases that would never execute - rarely execute (you're a dumb human right?) - etc and build an opinionated JS runtime around it for performance and size (size complexity? Space-Time-Trade-Off).

I was enamored with GWT for quite some time and developed a high level of skill using the tool. I still miss it frankly. People who seek to write one language and execute another are fundamentally insane. Even though this is basically necessary with a lang like C (one step above shifting bits and understanding instructions etc) those people still amaze me.

I'm glad I wasn't born with the requisite intellect to go down such rabbit holes myself. This makes me feel dumb and be OK with it at the same time.


> People who seek to write one language and execute another are fundamentally insane.

That would be people who use compilers.


You're technically right, which is the best kind of not invited to happy hour.


I wrote more or less the same reply but opted not to post it - well done on you sir!


Nah, it was just a typo. But I like your idea!


Excellent, didn't know that you could reference previous function arguments :)

Though I do feel that this is something different from VLAs (no critique of the article though!). In my head at least the key factor about VLA is that you can allocate an array on the stack when the size isn't known at compile time. But that is not what the author does.

A VLA as specified in a function definition or a declaration (such as: void f(int n, float v[n]);) is known at compile time. And even if trickery is used to do a runtime calculation of [n], the array itself is not allocated on the stack, and is thus also not subject to the common criticism of VLAs.

So, at least in my opinion, the arguments (discussed in this thread) regarding VLAs are mostly orthogonal to the syntax (ab)used here.


But why? Why is this VLA parameter defined this way? It seems totally bizarre and unnecessary, but I suppose it must have been added to the standard to solve some kind of problem? Is the proposal for this feature available and gives some insight?


Allowing arbitrary expressions allows self-documenting signatures:

  void concat_strs(
    int str1_len,
    const char str1[str1_len],
    int str2_len,
    const char str2[str2_len],
    char out_str[str1_len + str2_len]
  );
  
  void manipulate_array(
    array_dim dim,
    int arr[dim.x][dim.y]
  );
Supporting things like printf() was probably not specifically desired, but it would be difficult to define it in such a way that it accepts all reasonable expressions and doesn't accept any "unreasonable" ones.


Tip: if writing new functions that take things that are not strings, try to avoid C’s mistakes with strncpy and putting “str” in the name.


Those are strings! Null-terminated strings are not special.


It’s only a string of it comes from the null-terminated region of France. Otherwise, it’s just sparkling character arrays


I speculate it's for consistency with VLAs that are not function arguments, which is the more common use for VLAs.

The original purpose of VLAs is to let you stack-allocate an array with a length that is not known at compile time. In ANSI C you must heap-allocate such arrays, using malloc.
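
Roughly this contrast (a minimal sketch):

    #include <stdlib.h>

    void c99_style(int n)
    {
        double a[n];                        /* C99 VLA: run-time length, automatic storage */
        (void)a;
    }

    void ansi_style(int n)
    {
        double *a = malloc(n * sizeof *a);  /* C89: a run-time length means malloc/free */
        if (a == NULL)
            return;
        /* ... */
        free(a);
    }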


Seems like it was to avoid the introduction of constant expressions (constexpr) like in C++. That's also why C++ doesn't have this issue: even though it can also take expressions, the expressions must be resolvable at compile time.


You can try trawling through the document log[1], but a big part of the C99 deliberations is either inaccessible or (like the drafts) explicitly kept secret from anyone but the committee.

[1] https://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_lo...


This kind of insanity is exactly why I don't use stuff like C anymore.

It's a landmark and incredible language, but it's chock full of potholes and foot-shotguns (footguns that blow off your entire leg). Even huge companies can't get it right, which makes sense. It started as a language basically without guardrails, and because of the extreme deference to the holy backward compatibility, it's more or less always going to be that.


I think that's more of an issue with C++.

Usually developers stick to known patterns and subsets. Sure, you can write "clever" code that's more appropriate for the IOCCC than as system-critical code, but that's the case with most programming languages.

There is such a thing as "bad code" and "good code". A lot of code linters can catch "code smells", and features can often be disabled at the compiler level.

On the other hand, not having to resort to inline assembler or writing 500x the amount of code in some cases is why we have such features. Complex features are useful in some (complex) cases, but are not to be overused.


> It's a landmark and incredible language, but it's chock full of potholes and foot-shotguns (footguns that blow off your entire leg).

There are very few footguns in C, compared to (say) C++ (or Python).

I can guarantee you that no employed C developer is putting IOCCC type code into shipping products. Pick any language you like, and turn up the code golfing to 11, and you'll be equally horrified.


> Pick any language you like, and turn up the code golfing to 11, and you'll be equally horrified.

Challenge accepted. I pick the Go language, here are my answers:

https://codegolf.stackexchange.com/users/7815?tab=answers

people complain about how verbose Go is, but to me that just means it's readable. Even golfed, it's pretty clear what is going on.


I agree. The incredibly semantics-hostile optimizer ("undefined means I can do anything", whereas in old C "undefined" just mean you were no longer sure what number was the result of an overflow) just takes the cake.


What old C do you mean? I can't think of any version where undefined had a defined meaning


It's not that the behavior was defined by the C standard, but that you could confidently predict what a given compiler would generate, for a given platform, so it was quite normal to write programs which made productive use of officially-undefined behavior when that was the behavior you wanted. (It may well have been defined by that particular compiler's documentation.)

Nowadays, people are used to thinking entirely inside the abstraction provided by the language spec, but that was not the case in the '80s and early '90s. The boundary between the C language model and the underlying machine architecture was porous. You would include snippets of assembly language in your C code, if you wanted to do something more quickly than you thought the compiler could do it; and your C code would take full advantage of your knowledge about the underlying memory layouts, calling conventions, register usage, and so forth.

It did not really matter that there were gaps in the "C machine" abstraction because nobody was really programming against the "C machine"; they were programming against their actual hardware, using C as a tool for generating machine code.


you could confidently predict what a given compiler would generate, for a given platform

But then you're no longer writing Standard C. You're writing compiler-flavoured C, which is another source of footguns for projects that outlive the compiler (or version) they were originally written for. Which is fine, as you say, for embedded-style projects that only target one processor model and one compiler version. But I don't think that applies to many of today's projects.


> But then you're no longer writing Standard C.

Well, of course not - it didn't exist yet! Or, if it existed, your compiler didn't support it yet; or, even if you had upgraded to a newer compiler which did support it (not a given back then), your codebase and development style preceded the standard, so you continued in the existing style.


Not defined as part of the standard, but before compilers got as smart about their optimizations, it was easier to have behavior that was technically undefined but could be reasoned about in practice.

Now that compilers know cleverer optimizations, undefined behavior is often impossible to reason about because the compiler can change your logic into something else that is more optimal and is equivalent to your logic only in well-defined cases.
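A hedged illustration of the kind of thing this bites people with (not from the article; the function name is made up):

  int will_overflow(int x)
  {
      /* Looks like a runtime overflow check, but signed overflow is UB,
         so an optimizer may assume it can never happen and fold the
         whole expression to 0, deleting the check. */
      return x + 1 < x;
  }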


Modern compilers also come with UBSan. Either run it through that, or just always build with it enabled.
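For example, with gcc or clang, something along these lines (exact flags vary by toolchain; `prog.c` is a placeholder):

  cc -g -fsanitize=undefined -fno-sanitize-recover=all prog.c -o prog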


You can read the C89 rationale here[0], but in general the point of undefined behavior was (and maybe still is, I didn't check other rationales) partly to let implementations not bother with catching hard-to-catch errors for things that could actually happen, and partly to allow for implementation extensions for things they didn't want to or couldn't define.

In addition, the entire idea of introducing undefined, unspecified and implementation-defined behaviors was to let existing implementations do, for the most part, whatever they were already doing while still being standards conformant (ok, the rationale's exact words are to "allow a certain variety among implementations", but in practice C compilers already existed in 1989 and the companies behind them most likely wanted to call them "C89 conformant" without having to make significant changes).

C89 didn't define undefined behavior because that wouldn't make sense, but it did define what the term means, and going by the C89 rationale about what it was meant to be used for, the idea clearly wasn't the extremist "breaking your code at the slightest whiff of UB because optimizations" but rather "letting you do things that we can't or don't want to define while keeping our own hands clean".

The "letting you" bit is important which is why they have the distinction between "strictly conforming program" and "conforming program" (i.e. minus the "strict") - which essentially has the former only be for "maximally portable" programs and the latter being "whatever conforming implementations accept", with conforming implementations being any C implementation that can compile strictly conforming programs - regardless of any extensions the implementation may have as long as these do not affect the strictly conforming programs.

In other words, it was the C89 Committee's way of saying "a (conforming) C program is basically anything a C compiler compiles, as long as said C compiler also compiles C programs that adhere to the strict conformance we defined here" - which, BTW, flies in the face of the entire idea that introducing a single instance of "undefined behavior" makes the entire program not "valid C" anymore (after all, a program with undefined behavior can still be a conforming program as long as it is accepted by a compiler that also accepts strictly conforming programs).

This is the sort of circular self-referencing logic you get when committees try to standardize something that already has a bunch of not necessarily compatible implementations while also trying to not ruffle the feathers of the companies behind them too much.

It'd be an amusing tale if only some people (who you can ignore anyway) didn't get into fights about what page x, paragraph y, verse z of the Standard[1] says, and decades later funnel that logic into compilers (which are somewhat harder to ignore) that break existing working code while Bible thumping their standards book whenever someone goes "WTF, this thing used to work before i upgraded the compiler"[3]

[0] http://www.lysator.liu.se/c/rat/title.html

[1] Capitalization Intentional

[2] Yes, C was considered one at some point :-P

[3] "No, it is not valid C, it couldn't have worked. You clearly imagined it."


C has many problems, but I don't see this as one. Very few people even realize this functionality exists, and I don't see anyone promoting its use.


> Side note: even though we can't return, main is the exception to the rule that reaching the closing } of a function returning non-void is verboten,

Isn't it legal to fall off the end of a non-void function, and only UB to use the return value?


Correct.

> If the } that terminates a function is reached, and the value of the function call is used by the caller, the behavior is undefined.


Interestingly C++ makes this UB.

> Otherwise, flowing off the end of a function other than main or a coroutine ([dcl.fct.def.coroutine]) results in undefined behavior.

https://timsong-cpp.github.io/cppwp/n4868/stmt.return#2.sent...

edit:

This also manifests in compiler optimizations, at least in gcc

https://godbolt.org/z/GhxjdWqGb


> that declaration is functionally equivalent, and compatible to the following:

Nice to see some vindication - I remember saying this and IRC pedants trying to argue up and down that they're absolutely not the same


Wowzers, I thought that was going to be some template re-write expansion trick. Nope, just willing to run nearly arbitrary code if you wrap it correctly (not a complaint!).


What is the best alternative when I want to allocate some reasonably sized array on the stack and don’t want to always reserve the worst case size? alloca?


Reserve the maximum every time.


That has cache friendliness implications though, doesn't it? Seems better to just cap the size of the allocation and use a VLA.
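Roughly this kind of pattern (a sketch; the cap value and function name are made up for illustration):

  #include <stdlib.h>

  #define STACK_CAP 4096                        /* made-up cap, in elements */

  void process(size_t n)
  {
      if (n > 0 && n <= STACK_CAP) {
          int buf[n];                           /* VLA: only as large as needed */
          /* ... use buf ... */
      } else {
          int *buf = malloc(n * sizeof *buf);   /* too big for the stack: use the heap */
          if (!buf) return;
          /* ... use buf ... */
          free(buf);
      }
  }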


You don't need an alternative. VLAs are perfectly appropriate for that use case.


This is just more insanity than I can witness in one day.


the fact that you can do recursion before even entering the function is amusing. not THAT strange though, i imagine the compiler just gloms a preamble onto the executing function's stack frame.

amusing to think that the goal of them is probably to make it easier to avoid buffer overruns, but then they can just be extended themselves to cause similar problems anyway.


You are entering the function. It's not in the function body in the source file, but the code is almost certainly inserted at the beginning of the function in the compiled output.


>The astute reader might point out that these two versions of sum are not equivalent because the recursive definition may cause a stack overflow for large enough values of n. This is unfortunately true, and the major hurdle for the practicality of disembodied C, but does not preclude Turing completeness (an ideal Turing machine has infinite memory at its disposal).

It does preclude Turing completeness. The stack has an upper bound: at most the number of distinct pointer values times the word size.


Does it actually? Or does the number of elements on the stack /that you have taken the address of/ have that upper bound?

That is, could you build a compliant C implementation that implements a stack using an infinite memory (e.g. in a delay loop between the user and a mirror moving away through space), where each entry in the stack contains a conveniently sized value (word, byte, whatever) and a tag? If the tag is clear, the value is contained directly in the infinite memory; if the tag is set, then the value is indirected into a fixed-sized memory. On taking the address of an item on the stack, if it is not already indirected, relocate it and indirect.

Of course the mechanism for doing stack unwinding on return needs to use relative operations ("drop five elements") and not chase frame pointers, but that seems trivial.

I admit I'm not super-familiar with all the details of the C specification, but it's not obvious to me that this would violate spec, and it would be Turing complete.


Yes it does. C is defined in terms of implementation-defined, but finite-sized, pointers and integers, so the computational model of C is a finite state machine. But the size of the state space is so friggin' HUGE that that argument is completely irrelevant for all practical purposes.


Is this true? C is defined in terms of the operations that can be performed on objects in memory, which includes the concept of memory addresses and address manipulation; but I'm not aware of anything that would require that objects that do not have their address taken have unique addresses, or addresses at all. This is the same use of the as-if principle that allows so many local variables to live their entire life in registers. An infinite stack to support unlimited recursion seems possible within the C machine.


The C standard specifies that there is a constant UINTPTR_MAX.


Yes. But, as I mention in the proposed implementation, this limits the number of unique addresses in an implementation. Values that do not need addresses do not consume this limited resource. There would be a limit on the number of items on the stack that can have their address taken, but I can't see why this would limit the number of items on the stack.

In e.g. the fizzbuzz implementation mentioned in another comment [1], even without tail call optimization the stack growth implied by the recursive call does not require the number of /addressed/ items on the stack to grow. (Thinking about it, I believe this is an equivalent statement to "the tail call optimization is valid.")

[1] https://old.reddit.com/r/C_Programming/comments/qqazh8/fizzb...


This is such a fantastic write up. Also, a missed opportunity for a smashing submission to ioccc


This is only tangential to the article:

C isn't Turing complete without `fseek` (as far as I can tell).

Turing completeness requires you to be able to read/write from an infinite tape (essentially infinite memory). This isn't possible in C, because `sizeof` is a constant expression, thus limiting the size of any type, and importantly also of pointer types, to a finite number, thus making the addressable memory finite.

From what I can tell, the only way you could theoretically access an infinite tape is using `fseek` with `SEEK_CUR`.

There might be more shenanigans possible with `stdio.h`, but I'm pretty sure C can't be Turing complete without the `stdio.h` functions (assuming no special language extensions).

If anybody is wondering, the preprocessor isn't Turing complete either, because you might theoretically have infinite memory, but then you only have finite recursion, which also isn't enough for Turing completeness. File iteration can have infinite recursion, but can also only carry over finite state between `#include`s.

Edit: I'm not talking about any real world implementation, but rather about the theoretical bounds of the C abstract machine.


You are pedantically correct and wrong. Even with fseek there are only so many atoms in the universe, and you'd eventually run into limited memory. Rather than go so esoteric as to say that only programs structured around fseek are Turing complete, we generally just make the jump from "languages able to use arbitrarily sized RAM" to "assuming infinite RAM" and say you're Turing complete enough for most purposes.


> Even with fseek there are only so many atoms in the universe and you'd eventually run into limited memory

I should've explained that I'm not talking about a physical implementation, but about the theoretical bounds of the C abstract machine.


Talking theoretically, since size_t is defined as:

> size_t can store the maximum size of a theoretically possible object of any type (including array).

In a proper theoretical Turing machine, size_t would allow you to create and address arbitrarily large arrays too.


`sizeof(size_t)` and `size_t x; sizeof x` must be constant expressions. It is possible to create a C implementation where the size of a pointer is arbitrarily large, but it can't grow/isn't infinite.

For any given C implementation, it is thus trivial to theoretically solve the halting problem (assuming no stdio.h stuff is used).


Speaking so abstractly that it becomes absurd (which can also be used to prove your point, but bear with me here): since `size_t` is required to hold the size of any object, and a Turing machine has infinite memory, size_t would also need to be infinitely big, and `sizeof(size_t)` would be a constant returning infinity.

So I don't think it having to be a constant is a problem so much as having to deal with infinitely big numbers inside of infinite memory (which may or may not be a contradiction, depending on the axioms used to define a Turing machine. I still need to work my way through The Annotated Turing).

Of course, things do get a lot simpler and grounded using IO functions like you said earlier.


I don't think an infinitely large integer constant is a construct that makes sense.

As far as I'm aware, there isn't an instance of an infinitely large integer; even in mathematics, there are finite integers and there is the concept of infinity. (And size_t is defined as an unsigned integer type.)


I'd argue you could use non-standard integers that are unable to be reached in a finite amount of steps but aren't infinity and are still "integers".


> unable to be reached in a finite amount of steps but aren't infinity

I don't see how such numbers can exist/be used/defined in any meaningful way. Do you have an example of such a number?


The issue of memory bounds is commonly handwaved away. Note that your desktop computer is technically not Turing complete either, since it only has access to a finite amount of memory+disk storage, and is thus a (very large) finite state machine since there are only a finite number of states it can be in.


If you want to talk about this kind of technicality, here are three other avenues through which I believe Turing completeness of C can be achieved:

1. As https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm helpfully points out, "If two objects have identical bit-pattern representations and their types are the same they may still compare as unequal", and "[implementations] may also treat pointers based on different origins as distinct even though they are bitwise identical". It can thus be argued that, with a rather strict definition of provenance (i.e. how pointer values may be constructed), one could construct an implementation where every byte of every pointer is tagged with additional unbounded data that lets the whole pointer be properly dereferenced (I say every byte so that you're able to convert the pointer to an integer and convert it back to the same valid pointer), and where, as allowed by DR 260, pointers that have identical observable bit-pattern representations might not compare equal if they point to different objects. Thus, that implementation could create an unbounded amount of pointers.

2. Spawning off another thread and using infinite recursion (C has no defined recursion depth limit, and if you want to argue about automatic variables needing addresses or whatever and that the as-if rule doesn't cover that, there's `register`, which makes address-less variables) to have two pushdown automata that can together work as a Turing machine.

3. Using variadic arguments. With `va_copy`, one can iterate through variadic arguments multiple times, and thus `va_list` can be used to implement a mechanism equivalent to pointers without actual pointers (any sane implementation probably uses a pointer for `va_list` somewhere in it, but that's not a problem here) and thus make a pushdown automaton with 2 stacks (rough sketch of the `va_copy` part below).
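A minimal sketch of just the "iterate twice with `va_copy`" mechanism that point 3 leans on, not the actual two-stack construction (the function name and numbers are made up):

  #include <stdarg.h>
  #include <stdio.h>

  /* Sum the n int arguments twice, using va_copy to get a second,
     independent cursor over the same argument pack. */
  static int sum_twice(int n, ...)
  {
      va_list ap, ap2;
      int total = 0;

      va_start(ap, n);
      va_copy(ap2, ap);

      for (int i = 0; i < n; i++) total += va_arg(ap, int);   /* first pass  */
      for (int i = 0; i < n; i++) total += va_arg(ap2, int);  /* second pass */

      va_end(ap);
      va_end(ap2);
      return total;
  }

  int main(void) { printf("%d\n", sum_twice(3, 1, 2, 3)); }   /* prints 12 */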


Let me address 2. and 3. first, because 1. is probably correct.

2. The problem with that is that you may have infinite recursion, but you can only pass finite amounts of memory between recursion steps. This has also been addressed by https://cs.stackexchange.com/a/60978.

3. I don't follow, how could you create a growing `va_list`?

Edit: Actually, I think I do now, so you essentially create a sort of linked list of `va_list`s?

1. This is probably correct. You'd need to be able to round trip through a void*, but that could still be implemented to retain the tagged data. I thought of pointer tagging more in terms of marking addresses with certain memory protection properties, but what you suggest should be possible. There may still be some wording disallowing it, but I couldn't find it.


> 2. The problem with that is that you may have infinite recursion, but you can only pass finite amounts of memory between recursion steps. This has also been addressed by https://cs.stackexchange.com/a/60978.

If I have two stacks (by spawning another thread), don't I have two PDAs? That's what that Stack Exchange post indicates, and IIRC two PDAs are equivalent to a Turing machine.


> This isn't possible in C, because `sizeof` is a constant expression, thus limiting the size of any type, and importantly also pointer type, to a finite number, thus making the addressable memory finite.

Who said it had to be finite in theory? `size_t` is defined in terms of the implementation, not in terms of bit-widths.

Sure, size_t is finite in practice, but then again, so are all other programming languages. In theory, there is no upper limit on size_t.


`size_t` is defined as

> the unsigned integer type of the result of the sizeof operator (https://port70.net/~nsz/c/c11/n1570.html#7.19p2)

`sizeof` returns the following:

> If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant. (https://port70.net/~nsz/c/c11/n1570.html#6.5.3.4p2)

So `sizeof(size_t)` must be a concrete integer constant for any given C implementation, it can't change at runtime.

As far as I'm aware, there isn't an instance of an infinitely large integer; even in mathematics, there are finite integers and there is the concept of infinity.


> So `sizeof(size_t)` must be a concrete integer constant for any given C implementation, it can't change at runtime.

I didn't say it has to change at runtime, I said that there is no upper limit on it in theory, because the upper limit is specified in terms of the implementation, and a theoretical implementation with no upper limit on memory means that there is no upper limit on size_t.

Besides, I also said:

> Sure, size_t is finite in practice, but then again, so are all other programming languages.

Using your definition of Turing complete (no upper limit on pointers, references or addresses), I cannot think of one popular language that would satisfy that definition. Can you?


The easiest example would be brainfuck, which doesn't have addresses; you just move the tape head. I'm not that versed in how other languages are defined, but does e.g. JavaScript even expose pointers to the user? From my understanding of the language, an implementation could allocate infinitely many objects without breaking the language spec.


See "Subtleties of the ANSI/ISO C standard" <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1637.pdf>, section V "C is not Turing complete". They discuss file I/O briefly too, but I don't follow their reasoning.


Yes, this is basically my argument.

They address that `ftell` must return a finite value, but don't mention that `ftell` can fail for specific FILEs:

> If successful, the ftell function returns the current value of the file position indicator for the stream

But FILEs don't necessarily have a "file position indicator":

> If a file can support positioning requests (such as a disk file, as opposed to a terminal), then a file position indicator associated with the stream is positioned at the start (character number zero) of the file, unless the file is opened with append mode in which case it is implementation-defined whether the file position indicator is initially positioned at the beginning or the end of the file.

They don't seem to discuss the topic further.

> Therefore we will here only consider C programs without I/O. (Our argument can be adapted to show that the programs that restrict I/O to reading stdin and writing stdout are Turing incomplete as well, but we will not do so here.)

So they don't address the problem of IO completely.


I'm sure someone more knowledgeable will point out the flaw, but couldn't Turing's machine definition be changed from infinite tape, to arbitrarily large tape, and thus solve this technicality?


Let me put it that way.

Given any theoretical C implementation and a strictly conforming program (that doesn't do IO), I can, in theory, trivially solve the halting problem.

Since the possible memory is constant for any given implementation, I can use the following algorithm:

1. Execute one instruction of the program

2. If the program terminated, goto 4.

3. Insert the entire program state into a hash table.

   If it is already in the hash table, goto 5.

   Otherwise, goto 1.

4. The program terminates.

5. The program never terminates.

The above algorithm solves the halting problem of any C program with a constant memory bound in finite time.
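As a toy, runnable illustration of that enumeration trick (the one-byte "program state" and the step rule are entirely made up; a real C implementation's state space is astronomically larger, but still finite):

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  /* A "program" whose entire state is one byte, stepped by a made-up rule. */
  static uint8_t step(uint8_t s) { return (uint8_t)(s * 5u + 1u); }

  static bool halts(uint8_t s)
  {
      bool seen[256] = { false };    /* the "hash table" of all possible states */
      for (;;) {
          if (s == 0)                /* made-up termination condition */
              return true;
          if (seen[s])
              return false;          /* a state repeated: it loops forever */
          seen[s] = true;
          s = step(s);
      }
  }

  int main(void)
  {
      printf("%s\n", halts(7) ? "halts" : "never terminates");
  }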


Your argument sort of imploded on itself… your claim that only a file can be considered tape is insane, because memory is just as usable as tape and you can keep extending the memory available to you using sbrk(). If you're going to claim the memory is finite, well, so are files. (Also, lseek is not part of any C spec.)

And yes, neither is infinite, so every computer is just a DFA, and lseek changes nothing. But given the amount of memory that exists, we approximate and say they are Turing machines, since there is enough memory to do most of what we need.


My argument is that `fseek` (not `lseek`) is the only way for standard C to access an infinite tape, because it allows relative seeking in a file. `fseek(file, 1, SEEK_CUR)` to advance and `fseek(file, -1, SEEK_CUR)` to go back.
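In sketch form, the tape-head primitives might look like this (hedged: whether a conforming implementation may back such a FILE with unbounded storage is exactly the point under discussion, and opening/error handling is omitted):

  #include <stdio.h>

  static FILE *tape;   /* assumed to be opened elsewhere in update mode */

  static int tape_read(void)
  {
      int c = fgetc(tape);
      fseek(tape, -1, SEEK_CUR);      /* step back: stay on the same cell */
      return c;
  }

  static void tape_write(int c)
  {
      fputc(c, tape);
      fseek(tape, -1, SEEK_CUR);      /* step back: stay on the same cell */
  }

  static void tape_left(void)  { fseek(tape, -1, SEEK_CUR); }
  static void tape_right(void) { fseek(tape,  1, SEEK_CUR); }

Only relative positions are ever used, so no absolute offset needs to fit in a finite integer type.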


The tape wouldn't have to be provided as a single flat file. You could, for instance, use something like a pair of files, one for control and one for data:

    fputc('L', tape_control);   /* tell the tape device to move the head left */
    c = fgetc(tape_data);       /* read the symbol under the head */
    fputc(d, tape_data);        /* write the new symbol d under the head */
or a single file with controls disjoint from the tape symbols.


This would be possible, but it feels like cheating :-)

Idk how to express this properly, but I feel like there is a difference between this and the fseek approach.

In a similar vein, you could say that a certain kind of undefined behavior controls the tape, or writing to a specific part of memory does.

I suppose this would fall under adding additional compiler extensions.


> … or writing to a specific part of memory does.

Memory-mapped devices are pretty common. C's original platform controlled tape drives by writing to memory: https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/sys/dev/tc...


Pointer++

Pointer--

These do the same to pointers…


No, pointers always have a finite size, because sizeof is a constant expression. A theoretical fseek isn't bound by this; the compiler could implement it with an infinite tape.


A pointer is just a number that points to addressable memory. Pointers don't even point to actual memory these days anyways. They're just glorified indices into OS tables that eventually map to actual memory. You can increase a pointer to be 128 bits big, or 256 bits big if you wanted to. It doesn't change anything, because it's just an index.

Here's a clarification on Wikipedia that expands on this:

> nearly all programming languages are Turing complete if the limitations of finite memory are ignored.

And files are just as finite as a pointer. You can't address a location in a file bigger than what fits in sizeof(void*) bytes on a modern computer. If you wanted to, you would need to add special support[1], and doing this is equivalent to what you would do to support a bigger memory range for pointers.

I'm curious, do you think other languages are Turing complete? As far as I know, all modern programming languages use a 64 bit addressable memory space maximum.

[0]: https://en.m.wikipedia.org/wiki/Turing_machine

[1]: https://stackoverflow.com/questions/4933561/what-is-the-rela...


> You can increase a pointer to be 128 bits big, or 256 bits big if you wanted to. It doesn't change anything, because it's just an index.

Yes, you can increase the pointer size arbitrarily, but it is always constant for a single C implementation; you can't increase the pointer size at run time.

Essentially, you recompiling and rerunning the program while choosing implementations with successively larger pointer sizes is Turing complete, but that requires you as a special actor, and I'm only concerned with what a single implementation can do.

To put it another way, solving the halting problem is trivial for any C program plus a fixed C implementation that doesn't have special language extensions.

> I'm curious, do you think other languages are Turing complete?

The easiest example would be brainfuck, because you can just move the tape around in it, and a theoretical implementation could trivially be hooked up to a theoretical infinite tape.

> As far as I know, all modern programming languages use a 64 bit addressable memory space maximum

I don't know enough about how other languages are defined, but I could imagine that Java allows for an implementation with infinite memory, because it doesn't impose any requirements on how the pointers are implemented under the hood.


You’re spouting nonsense. File offset is guaranteed to be expressible by the integral type off_t. Thus also limited.


You are aware that `off_t` isn't in the C standard?

The standard has `fgetpos`:

> The fgetpos function stores the current values of the parse state (if any) and file position indicator for the stream pointed to by stream in the object pointed to by pos

> If a file can support positioning requests (such as a disk file, as opposed to a terminal), then a file position indicator associated with the stream is positioned at the start (character number zero) of the file, unless the file is opened with append mode in which case it is implementation-defined whether the file position indicator is initially positioned at the beginning or the end of the file.

Meaning `fgetpos` need not work for files that don't support "positioning requests", while relative offsets using `fseek` could still work.


ok, that is just more nonsense...

but let's go with that. fgetpos stores offset in an fpos_t object...in memory...memory is finite as you admit...thus the length of this fpos_t object must be finite...thus there is a limit to how many bits fpos_t may contain, thus it can address only finite file length...


> ...thus there is a limit to how many bits fpos_t may contain

Yes, but my point was that `fgetpos` need not work for all FILE pointers; take, for example, `stdin`.

The standard quotes from my above comment show that a FILE pointer need not support "positioning requests", so fgetpos need not work.


You could just make a doubly-linked list that dynamically grows new nodes (using malloc()) at either end on demand in order to implement an infinite tape.

(Of course on a real machine the malloc() will fail at some point.)
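A hedged sketch of that doubly-linked tape idea (the names are made up; cells are created lazily as the head walks past either end):

  #include <stdlib.h>

  struct cell { struct cell *left, *right; int sym; };

  static struct cell *extend(struct cell *from, int to_right)
  {
      struct cell *c = calloc(1, sizeof *c);
      if (!c) abort();                       /* the inevitable finite-memory failure */
      if (to_right) { from->right = c; c->left  = from; }
      else          { from->left  = c; c->right = from; }
      return c;
  }

  static struct cell *move_right(struct cell *head)
  { return head->right ? head->right : extend(head, 1); }

  static struct cell *move_left(struct cell *head)
  { return head->left  ? head->left  : extend(head, 0); }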


My point was that this isn't possible, because the addressable memory, while possibly humongous, is always finite, because the size of pointers is constant. So at some point `malloc` must return the same address; even if you theoretically had infinite memory available, there is no way to address it in C.


Ah, good point!



