I think that structure will be helpful, and you can also easily pass substrings....

nicoburns · on Aug 27, 2022

It boggles my mind that the C standard hasn’t officially added support for this. It seems like such a small change that would dramatically improve the quality of C code.

dezgeg · on Aug 27, 2022

It would not be a small change if you wanted to actually make them usable for the standard library, i.e. add a second 'struct bytes' variant of EVERY function that currently accepts a NUL-terminated string.

No, just calling a conversion function beforehand (which would need heap allocation) would never be accepted by C programmers, because of the overhead.

Then there is inertia: how many would really want to port their application to a different string type? Not to mention, all the libraries you're using would also have to be converted.

wongarsu · on Aug 27, 2022

You can make the read-only conversion "free" by storing both the size of the string and a terminating NUL after the actual string contents. So the string "bar" would be { length: 3, contents: ['b', 'a', 'r', '\0'] }. All your functions dealing with the "rich" string type work the same using the length, except they have to be aware of the need to preserve the terminating 0, and if you want to pass a read-only string to a legacy function you can just pass its contents.

Of course that also generates a giant foot gun because you might manage to get the old and new strlen to disagree on the size, because one reads the length field and the other searches the \0.
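
For illustration, a minimal sketch of that layout in C. The names (struct rich_str, rich_str_init, rich_str_cstr, rich_str_consistent, rich_str_free) are hypothetical and not taken from any existing library; this is just one way the idea could look, with the foot gun made checkable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical "rich" string: explicit length plus a NUL-terminated buffer. */
struct rich_str {
    size_t len;  /* number of bytes, not counting the terminator */
    char *data;  /* holds len + 1 bytes; data[len] is always '\0' */
};

/* Copy n bytes from src and append the terminator.
 * Returns 0 on success, -1 if allocation fails. */
static int rich_str_init(struct rich_str *s, const char *src, size_t n)
{
    assert(s != NULL && src != NULL);
    s->data = malloc(n + 1);
    if (s->data == NULL) {
        s->len = 0;
        return -1;
    }
    memcpy(s->data, src, n);
    s->data[n] = '\0';
    s->len = n;
    return 0;
}

/* Read-only "conversion" for legacy functions: no copy, no allocation. */
static const char *rich_str_cstr(const struct rich_str *s)
{
    return s->data;
}

/* The foot gun: returns nonzero only while strlen() and the length
 * field still agree, i.e. there is no NUL inside the contents. */
static int rich_str_consistent(const struct rich_str *s)
{
    return strlen(s->data) == s->len;
}

static void rich_str_free(struct rich_str *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

After rich_str_init(&s, "bar", 3), rich_str_cstr(&s) can go straight to puts() or fopen(). Writing a '\0' into the middle of s.data leaves s.len at 3 while strlen() reports less, which is exactly the disagreement described above.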

pjmlp · on Aug 27, 2022

Inertia is no excuse. If something like SDS were adopted by WG14, C applications could eventually be migrated to it, slowly.

As it is, it will never happen.

ranger_danger · on Aug 27, 2022

SDS is cool, I use it in my projects and have also extended it further with some new features... but I worry about the amount of dynamic allocation going on, it would be nice to have an alternative solution such as using stack or pool allocations, or being able to declare SDS string literals at compile-time (probably only possible in C++). Every time I notice that I need to replace a const char* with an SDS I just know I'm slowing things down and adding more complexity.

pjmlp · on Aug 27, 2022

It doesn't have to be 1:1 equal to SDS; the point is to at least have WG14 do something, anything at all, instead of continuing to push the agenda of C being a Swiss cheese of security exploits, to the point that everyone is adopting hardware memory tagging as the ultimate mitigation.

matheusmoreira · on Aug 27, 2022

For some reason C seems to be going through standard updates as often as C++ now. Not only that, they've added absolutely huge stuff like threading primitives as well as an insane generic macro thing.

Surely they could add better designed structures and functions for dealing with memory.

dralley · on Aug 27, 2022

Is that really so big a deal, considering they've done that multiple times already anyway? How many versions of strcpy, printf, and other string handling functions are there already?

morelisp · on Aug 27, 2022

"I mean it's one mutable string type Michael, what could it take to standardize, 48 pages not including the allocator model?"

duped · on Aug 27, 2022

Copy C++. The underlying string buffer must be null terminated, but still carry its length.

nine_k · on Aug 27, 2022

They were afraid of the C++ situation where there are 3-4 different string implementations in every large project.

pjmlp · on Aug 27, 2022

Hence why something like SDS should be part of the standard.

jstimpfle · on Aug 27, 2022

Nul termination is not only in C but also in some file formats. It has some advantages - apart from modest space savings for short strings, it means that you can read a string from some given location to the end - without any out of band data (length) that necessarily has to be stored in a different, agreed on location. This is a very valuable property.

I'm not saying don't use length fields, I'm saying use nul terminators where possible and use length fields where needed. And, they are not mutually exclusive.

And C doesn't need a standardized length-delimited string structure, in my opinion. Nul terminators do the job fine for most Standard APIs (which take only short strings), and those APIs can receive length fields as separate function parameters where required.
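
A small sketch of both points together, assuming a file-format-style table of back-to-back NUL-terminated strings. The helper name table_strlen is made up for this example. Each string is found from its offset alone, and the buffer size is passed as a separate parameter so the search cannot run past the end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the NUL-terminated string that starts
 * at buf[off], searching no further than buf_len bytes into buf.
 * Returns (size_t)-1 if off is out of range or no terminator is found. */
static size_t table_strlen(const char *buf, size_t buf_len, size_t off)
{
    assert(buf != NULL);
    if (off >= buf_len)
        return (size_t)-1;

    const char *start = buf + off;
    const char *end = memchr(start, '\0', buf_len - off);
    return end != NULL ? (size_t)(end - start) : (size_t)-1;
}
```

With a table such as "\0main\0printf", offset 1 yields the 4-byte string "main" and offset 6 yields "printf", with no per-string length stored anywhere; a truncated table is reported as an error instead of being read out of bounds.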

shadowofneptune · on Aug 27, 2022

A case against:

• On the most widely used architectures, reading a string is much easier if the string is a known length. x86 has its string instructions, ARM has its Load Multiple instructions.

• Even with length-prefixed strings, many uses of short strings are with string literals and so the length does not need to be stored anywhere.
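
To illustrate the second bullet, a sketch assuming a hypothetical struct str_view and STR_LIT macro (neither is standard C): for a string literal the compiler already knows the length via sizeof, so nothing has to be counted at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pointer-plus-length view of a string. */
struct str_view {
    const char *ptr;
    size_t len;
};

/* Build a view from a string literal. The "" s concatenation makes the
 * macro fail to compile for anything that is not a literal, and
 * sizeof(s) - 1 is a compile-time constant (the -1 drops the NUL). */
#define STR_LIT(s) ((struct str_view){ "" s, sizeof(s) - 1 })

/* Checked by the compiler, not at run time (C11 static_assert). */
static_assert(sizeof("bar") - 1 == 3, "literal length is a compile-time constant");
```

Something like struct str_view v = STR_LIT("bar"); then gives v.len == 3 without ever calling strlen().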

jstimpfle · on Aug 27, 2022

I agree. We're not disagreeing :)