I'm actually kinda sad that VLAs didn't work out in general (they have been downgraded to an optional feature in the latest C standards). For all the downsides, they make working with matrices in C much easier.
Yeah, VLA is one of my favorite features in C. I would love it if it were viable to use them for arbitrarily large temporary arrays. They lead to much cleaner code. Instead of
{
float *x = malloc(n * sizeof*x);
...
free(x);
}
you do simply
{
float x[n];
...
}
In image processing, you often need large temporary images, but it is dangerous to distribute code such as the above unless you play with the stack limits from outside your program.
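One common compromise is to use the VLA only when the size is known to be small and fall back to malloc otherwise. A minimal sketch (the threshold and the function name `sum_of_squares` are made up for illustration):

```c
#include <stdlib.h>

/* Illustrative cutoff (assumption): keep stack usage under ~4 KiB here.
 * Real code would pick something well below typical stack limits. */
#define STACK_ELEMS (4096 / sizeof(float))

/* Sum of squares of the first n integers, using a scratch array:
 * a VLA when n is small, a heap allocation otherwise. */
double sum_of_squares(size_t n) {
    if (n == 0)
        return 0.0;
    /* A size-1 dummy VLA when n is too big, so the declaration stays legal. */
    float stack_buf[n <= STACK_ELEMS ? n : 1];
    float *x = (n <= STACK_ELEMS) ? stack_buf
                                  : malloc(n * sizeof *x);
    if (!x)
        return -1.0;
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        x[i] = (float)(i * i);
        total += x[i];
    }
    if (x != stack_buf)   /* only free what actually came from the heap */
        free(x);
    return total;
}
```

This keeps the clean VLA syntax in the common case while staying safe for adversarial or unexpectedly large sizes.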
It's non-standard (although we have tried to get it added to the next Cxx standard), but you can use __attribute__((cleanup)) here. In systemd code it is common to write this as:
_cleanup_free_ float *x = malloc (...);
Since cleanups are supported by both GCC and Clang, it's not a real problem to use them on Linux, BSD and macOS. We recently added them to libvirt, have used them in libguestfs for a long time, and, as I mention above, they have been used in systemd for years. They are also applicable to many other cases, such as automatically closing files at the end of the current scope.
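For readers unfamiliar with the attribute, here is a minimal re-creation of a systemd-style helper (names assumed; GCC/Clang extension, not ISO C). The cleanup function receives a pointer to the variable going out of scope, so it frees `*p`, not `p`:

```c
#include <stdlib.h>

/* Called automatically when a _cleanup_free_ variable leaves scope. */
static void freep(void *p) {
    free(*(void **)p);
}
#define _cleanup_free_ __attribute__((cleanup(freep)))

/* Demo: the buffer is freed automatically on every return path. */
int sum_first_squares(int n) {
    _cleanup_free_ int *x = malloc(n * sizeof *x);
    if (!x)
        return -1;          /* nothing leaks: freep(NULL) is a no-op via free(NULL) */
    int total = 0;
    for (int i = 0; i < n; i++) {
        x[i] = i * i;
        total += x[i];
    }
    return total;           /* no explicit free() needed */
}
```

The attractive part is that early returns and error paths cannot leak, which is exactly where manual free() tends to be forgotten.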
It's unfortunate that C and C++ kept the mindset that automatically memory-managed variables are stored on a SMALL stack, with the consequence that exceeding that small and unknown size crashes your program.
For variable sized arrays, there is already a bit of overhead in sizing the allocation. Would it really have been impossible to move large allocations to the heap with automatic free at exit of the scope?
> It's unfortunate that C and C++ kept the mindset that automatically memory-managed variables are stored on a SMALL stack
This is not a property of the language; it's a property of the runtime it's running on.
In a normal userspace program under Linux (at least), you can stack allocate megabytes and the stack will dynamically resize (note that shrinking the stack might not happen in a timely manner).
But in kernel space, there's a conscious decision to keep the stack small and not dynamic. While it's not impossible to have a dynamic stack in kernel space, that has a lot of implications.
>For variable sized arrays, there is already a bit of overhead in sizing the allocation. Would it really have been impossible to move large allocations to the heap with automatic free at exit of the scope?
I'm not a compiler writer, so I'm not sure. I think it wouldn't be too difficult, but it goes very much against the exact control you normally expect from C. And violating expectations like this is usually not good.
It would also produce bigger functions in a somewhat opaque manner, because you would need the code to deal with malloc and free and stack allocation at the same time.
You just can't recurse, and the same function can't be called from an ISR and from the main context at the same time. Basically, nothing is re-entrant. Otherwise it's quite fine. To be sure, you shouldn't be doing those things anyway if you have 64 bytes of RAM and 512 words of ROM. :D
And even if it has a heap, the heap might not be available. In a kernel, the function might have been called in interrupt or atomic context; in userspace, the function might be part of the implementation of the heap itself, or a low-level function called before the heap is initialized.
C++ has std::vector, which doesn’t have any implicit size limitations. I haven’t kept up with the details, so this may have changed, but during the discussions for what became C++11, the committee decided not to include VLAs in the official standard because the library already included std::vector.
The problem with vector is that it always heap-allocates. The ideal approach is the one where small allocations are on the stack, and large ones are on the heap, without the API user having to do anything about it.
C++ has had a proposal for std::dynarray floated for a while, which would basically have semantics allowing stack allocation, but wouldn't mandate it, leaving it as a quality of implementation issue (so it would be legal to implement it on top of vector, but it was assumed that compilers would go for optimizations given the opportunity). It didn't pass, unfortunately.
I guess in theory, you could pull it off with a very clever allocator. Bloomberg’s BDE has an allocator along those lines ( https://bloomberg.github.io/bde/classbdlma_1_1BufferedSequen... ), but I don’t think it made it into the standard (memory_resource header).
My point is only that runtime-sized arrays may have made it into the C standard, but they aren’t in the C++ standard.
The word "large" does not really mean anything here. You are always using the stack, and there is no portable way to check whether it is full or not. That is unfortunate.
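There is indeed no ISO C way, but on POSIX systems you can at least query the configured limit. A sketch (note it reports the soft limit, not how much stack is already in use, so it's only a rough guide):

```c
#include <sys/resource.h>   /* POSIX getrlimit(), not ISO C -- exactly the problem */

/* Return the soft stack limit in bytes, -1 on error, -2 if unlimited. */
long stack_limit_bytes(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return -2;          /* ulimit -s unlimited */
    return (long)rl.rlim_cur;
}
```

Even with this, you only learn the ceiling for the whole thread, which is why distributed code usually avoids relying on large stack allocations at all.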
VLAs date back all the way to Algol 60. Most Algol-family languages inherited that trait, and many relied on it heavily. C (actually, B) excluded them because it was deliberately lower-level than was the norm for Algol family before.
The big difference is that in the Algol 60 family bounds checking was enforced.
"Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to--they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law."
-- C. A. R. Hoare, in his Turing Award lecture
The C and B designers had another point of view regarding security, or the lack thereof.
The point was that, the way VLAs were designed in C99, they were a motorway to stack corruption, which led to them being downgraded to an optional feature in C11 and to C++ never even considering them.
I don't see anything in the wording of C99 around VLAs that requires them to always be allocated on the stack. That implementations did so regardless is, arguably, a defect on their part, and a quality-of-implementation issue.
Well, the whole point of VLAs was supposed to placate the math/numeric crowd somewhat, and they generally expect things to be a little bit higher-level than what's otherwise mandated by the C "zero overhead" philosophy. Since VLAs are opt-in, code that cares more about perf could always skip them and not pay the tax.
But yes, it was probably too naive to expect that to work out.
There's no reason why VLAs have to be allocated uniformly on the stack. The C standard doesn't say that "auto" = stack. It just says that it has to be automatically deallocated when it's out of scope.
But even as implemented today, in practice it all works fine if you're not working with really large arrays, and in a lot of scenarios you aren't. This isn't really any different from using recursion when you can reasonably assume that the recursion depth is not going to be large. Sometimes it's just easier and clearer all around.
That is the main reason why so many CVEs in C code have memory corruption as their cause; in practice, things do not work fine.
Which was the main theme at Linux Kernel Security Summit 2018.
Just for info, if you don't want to bother watching Google's talk: 68% of Linux kernel exploits were caused by out-of-bounds errors, including those caused by VLA misuse.