Stop Memsetting Structures (anmolsarma.in)
191 points by unmole on April 27, 2019 | 263 comments



Does that also zero the padding that may exist between elements of the structure (or at its end)? Because if not, not memset'ing will open you up to all kinds of info leaks, especially if you plan to send that struct over the network (or to another process, or from kernel to userspace...).

EDIT: Actually I have to apologize, because I’m not certain what the C standard says about whether the padding bytes can change when changing members of the structure (6.2.6.1):

6 When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values. 51)

If that means that the padding can actually also change when just assigning individual members (which I think it does, but footnote 51 only calls out copying the whole struct by assignment, and I’m not 100% sure that “including a member object” means what I think in this context), then that means that copying an unpacked structure across information boundaries is technically always unsafe, because any change of a value in the struct might leak information by changing the padding to undefined values. It’s safe with explicitly packed structs, though (which should be the majority of the cases, as you’d have a representation independent of the architecture), but then so is what the author describes. I’d appreciate it if someone could clear this up! memset’ing can still help you get a structure quickly into a well-defined state, though.


Nope it doesn't:

  //GCC:  -O0
  //MSVC: /Od
  #include <string.h>
  #include <stdio.h>
  struct S { unsigned char x; int y; };
  /* y[1] is the first padding byte between x and y (assuming int is 4-byte aligned) */
  int f(int f) { struct S a; memset(&a, f, sizeof(a)); unsigned char y[2]; memcpy(&y, &a, sizeof(y)); return y[1]; }
  int g() { struct S a = { 1, 2 }; unsigned char y[2]; memcpy(&y, &a, sizeof(y)); return y[1]; }
  int main(int argc, char *argv[]) { f(0xDD); printf("%#x\n", g()); f(0xFF); printf("%#x\n", g()); }
It definitely seems like the author didn't realize this fact, or I'd have expected it to be addressed in the article.


I am aware of the fact that uninitialized padding can have nasty side effects: https://lwn.net/Articles/417989/

But yeah, I should have mentioned the caveats too. I will update the post once I'm back at my computer.


I'm confused what I'm supposed to look for in that article (do you mean you were the author?) but anyway -- I feel like calling this merely a "caveat" gives the wrong impression? It seems a bit like telling people to mix bleach with vinegar and then saying "Oops, sorry, I did know that it produces chlorine! I forgot to mention that caveat." The caveat isn't just a side note for the margins, it's a critical reason why people avoid this sort of thing.


Failing to clear padding in structs is _really_ not often a problem. It's pretty much only a problem when copying un-packed structs from kernel-space to user-space.

So if you're not working on an OS kernel somewhere in the syscall path - really not a common activity even among C programmers - it's not a problem. If you are passing a struct from one part of your program to another part, or to a library you use which is not isolated or sandboxed in some way, it's not a problem. If you pass a struct to a _more_ privileged context like a syscall, it's not a problem. If you are passing structs over a network connection or writing to a file, you'll pack them and be aware of every byte anyway, in order to be compatible with another implementation reading them.

If you _are_ passing a struct from a more privileged context to a less privileged context, then yeah you have to memset().


I agree, but the frequency isn't the issue. You could say that's where my analogy breaks down, and if that's your point then maybe it does (although how often do people want to mix bleach with vinegar?), but that entirely misses the point. The point was that the failure mode, when it does come up, is a critical one, and if you're going to criticize a safe coding practice and tell people to switch to a potentially dangerous one, it behooves you to explain to them this very fact. It's just irresponsible to tell people to switch to a dangerous practice without making them aware of that fact and telling them how to deal with it.


> If you pass a struct to a _more_ privileged context like a syscall, it's not a problem.

Simply not true; that context could pass a copy of that structure to a less privileged context other than the original one, without doing anything about the padding.

(Well, you could argue that it's the privileged kernel's fault: it should meticulously floss the structure between the members to clear the padding to zero while preserving the values.)

Someone upthread mentioned "sending over a network": that's an example. Sending over a network (or local socket/pipe) begins with pushing it down to a more privileged context, which then assiduously places all the bytes into a buffer that is blasted out on the wire. In this case, the more privileged context has no idea what the structure even is; it's just a blob of bytes.


If the author is only going to add "caveat", then that sounds like word games and being unwilling to admit that the blog post's advice is wrong (for the security risks and other reasons pointed out here in the comments)


Technically I don't think memset forces the compiler to clear the padding and keep it clear either though. The magic of the C abstract machine.


Why? memset followed by memcpy must return the same byte everywhere. Otherwise what would memset even mean?


I feel like accessing the bytes that correspond to padding might be implementation-defined, but I couldn't find this in the standard.


Parent comment was edited to include this quote:

_6 When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values. 51)_

So sure, a memcpy after a memset ought to preserve all the bytes, but if they are updated through a struct which contains padding then the values in the padding bytes will be left unspecified.


I don't think it is. It wouldn't make sense since memcpy is basically supposed to convert between arbitrary binary data.


From a practical point of view, yes, but I would be very surprised if the C standard had any accommodation for memcpy being a way to leak the value of a structure's padding.


You clearly haven't understood the thread.


How so? Did I miss something important?


Do you mean in the sense that a compiler can say "I know what memset is supposed to do" and then decide that even though sizeof returns a particular size, it might ignore that and set fewer bytes when it does a substitution for the memcpy call?


Yes. As long as the behavior observable by a conforming C program is the same, the compiler is allowed to change anything else.

Even if memset in a vacuum is guaranteed, look at the quote by anyfoo. "When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values."

If the compiler knows you're going to write to the members right after the memset, it can put the padding back to its previous values. And by that I mean as far as your code can tell it put it back, but in actuality it never zeroed the padding to begin with.

And since there are performance benefits to pretending, now you have a situation where a "helpful" compiler and a malicious compiler have the same effect: data can get leaked.
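
To make that concrete, here is a hypothetical sketch (the struct and function are invented, not from the article): every member is assigned right after the memset, so a compiler that tracks this could, in principle, legally leave the padding bytes untouched.

  #include <string.h>

  struct msg { char tag; int value; };   /* typically 3 padding bytes after 'tag' */

  void msg_init(struct msg *m)
  {
      memset(m, 0, sizeof *m);   /* intent: clear the padding too */
      m->tag = 'A';
      m->value = 42;
      /* After these member stores the padding bytes take unspecified values
         again (the 6.2.6.1 quote above), so as far as this code can tell,
         the compiler may never have zeroed them in the first place. */
  }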


As soon as a pointer to said memory is passed to an extern function in another translation unit, the compiler can't prove anything about how it's used, which is the case in pretty much all of the examples mentioned in this thread.

Also, type-punning is a thing. memset is byte-oriented/memory-oriented. Just because you're using it to zero a struct of a particular kind doesn't mean that's the only way the memory will be accessed. Just because reading the padding of some struct is undefined behaviour doesn't mean accessing those bits by some other means is also undefined.

A compiler usually can't eliminate a call to memset() in most practical cases of initialization (where memory leakage is also a concern) because they almost always pass a reference to a routine in another translation unit. Something like memzero_explicit() can be used anyway -- but you're massively overstating the relevance of compilers eliminating dead stores done via memset(). It's much more of an issue for post-destruction memory sanitizing (which is the primary use case for memzero_explicit) than it is for compiler f*ckery when memset() is used for initialization.


> As soon as a pointer to said memory is passed to an extern function in another translation unit, the compiler can't prove anything about how it's used, which is the case in pretty much all of the examples mentioned in this thread.

You would have to call such a function between the memset and the first time you write to a member. Otherwise the compiler is allowed to say "I put the padding back, and you can't prove otherwise".

> Just because reading the padding of some struct is undefined behaviour doesn't mean accessing those bits by some other means is also undefined.

It's not always undefined, but it says very clearly that the value of padding becomes unspecified.

> they almost always pass a reference to a routine in another translation unit

> you're massively overstating the relevance of compilers eliminating dead stores done via memset

Unless inlining happened, or link-time optimization, or, or...

If the compiler zeroes the memory most of the time, that makes it even scarier. Because all your tests come back clean and safe, then four years later a macro changes and suddenly you're leaking data all over the place.

I don't think I'm overstating the relevance at all. Any security feature that could disappear because of a reasonable, trying-to-help optimization is one that should have a bright red warning label. And this is such a feature. It doesn't require a "sufficiently smart" compiler, and it doesn't require a malicious compiler. This is the kind of thing that can break by accident and ruin everyone's month.


> This is the kind of thing that can break by accident and ruin everyone's month.

Not in practice. Compilers make use of undefined behaviour to optimize things that are widely applicable and profitable. No real compiler does what you're saying and no future compiler is likely to without explicitly being asked to.

I agree that, by the letter of the spec, you're right, but you're still most certainly overstating the relevance.

I'm not arguing that this isn't a real problem or that people shouldn't use memzero_explicit() (or similar) where security is on the line, as I already said several times in another sub-thread -- I'm just saying that this kind of thing is extreme language-lawyering beyond the realms of probability. It's still not an excuse to be lax, but let's be realistic about the actual likelihood of it happening.


It's just a type of dead store elimination. And objects are initialized so often that I could easily see it happening in the future. But I guess we'll just disagree on how likely it is.


> It's just a type of dead store elimination

It's not just any, typical kind of dead store elimination. It also would require other optimization passes that I can assure you no mainstream compiler actually does. You can disagree with me if you want, but you're simply wrong.


Lots of mainstream compilers already have passes that check if every field in an object is definitely assigned. They use this to provide errors or warnings.

That step is 90% of the work. Once you can do that, it's straightforward to assess that every field is assigned after a memset, with no intervening reads, and then remove the memset.


> That step is 90% of the work

I spent several years working on a production grade compiler and I can assure you it's not. But keep just making things up off the top of your head if it makes you feel smart.


Would you care to explain why?

Let's look at a basic common case, as a checklist.

1. All fields are definitely assigned.

2. No functions are called except intrinsics.

3. Nothing reads from the object on any control path.

4. Memset happens between creation and first assignment.

We agreed that step 1 is a solved problem, right? Are any of 2-4 difficult? Did I miss any prerequisites for the optimization?

Once you can prove 1-4, isn't the optimization pass as simple as looking for memsets applied to structs, checking 1-4, then deleting the call?


If you're relying upon the value of padding, you're already into undefined behavior.


If you're either assigning or reading the value that is stored in the padded area of a class or struct, you are in undefined behavior, in any version of the standard you choose. So, care to explain the downvotes?


Although I did not downvote, I suspect your comment is being downvoted because it appears to overlook the risk of data leaking to an adversary through the padding bytes.


If you pass partially uninitialized objects (including padding) between kernel space and user space, there's a chance that the (less privileged) user space code can recover information it shouldn't be able to see, regardless of what the spec says about undefined behaviour. Such information may include secrets that previously occupied the memory where a new, unrelated struct now resides. The same can also apply to passing objects between different machines.

All "undefined behaviour" means is that the spec can no longer guarantee anything about the execution of a program once a constraint is violated. It doesn't say anything about the practicality of actually doing it on any given implementation.

FWIW I didn't downvote you, but your reasoning about undefined behaviour seems overly simplistic. It should never be used to explain away security concerns.


It shouldn't be used to explain away security concerns, but it shouldn't be used to overstate security fixes either.

You can memset to 0, and on your compiler it might be secure, but it's not enough to keep your security guaranteed against future compilers.

There's a reason functions like SecureZeroMemory exist. In a similar situation, you try to prevent leaking secure secrets by zeroing memory before releasing it. But the compiler sees you never use the variable again, and optimizes away the zeroing.
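
A sketch of that classic case (hypothetical names, not code from this thread): the buffer is never read after the zeroing, so the compiler is free to treat the memset as a dead store and drop it.

  #include <string.h>

  void handle_secret(void)
  {
      char key[32];
      /* ... derive and use the key ... */
      memset(key, 0, sizeof key);   /* never read again: a removable dead store */
  }   /* the secret bytes may still sit on the stack here */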

The message to take away is not "be less paranoid". It's "be more paranoid".


Right, but I didn't make a case that memset() is any more secure, did I? The parent comment was talking about undefined behaviour as if it's some kind of universal get-out clause.


Your last paragraph strongly implies that hermitdev was using undefined behavior to "explain away" security concerns.

But the problem is that the security concerns and fixes are all undefined here.

So the initial comment is still very right. If you're relying on the padding to be anything in particular, you're in trouble.


> But the problem is that the security concerns and fixes are all undefined here.

No one in this sub-thread has mentioned (or implied) any "fix". You appear to be putting words in my mouth.

> So the initial comment is still very right.

Of course it's right, but it's also misleading when read in the context of the parent comment.

> If you're relying on the padding to be anything in particular, you're in trouble.

No shit.


> No one in this sub-thread has mentioned (or implied) any "fix". You appear to be putting words in my mouth.

I'm talking about the comments hermitdev was replying to that were treating memset as a 'fix'.

And 'fix' is shorthand for the opposite of "open[ing] you up to all kinds of info leaks". I don't think that's putting words in anyone's mouth.


I guess we are misunderstanding each other's point. I just found hermitdev's comment to be misleading (despite being correct), but perhaps it's just my reading of it.

To be fair though, memset() usually IS a fix. As mentioned by the kernel memzero_explicit() docs:

> usually using memset is just fine (!)

-- https://www.kernel.org/doc/htmldocs/kernel-api/API-memzero-e...

A conforming C compiler can't just remove memset() as it pleases. The case that most often requires memzero_explicit() is when zeroing an object after destruction, because the compiler thinks it can statically determine that it's a dead store. It very rarely happens that a compiler elides a memset() used for initialization.

I'm not sure why you seem to think that memset() can just be dropped at will for no reason whatsoever or that it's somehow always undefined behaviour.


A conforming C compiler can remove a memset that has no side effects.


You mean it can remove a memset() that doesn't cause the observable behaviour to change?

For the sake of argument, can you show me some example code where it would be conforming to remove a memset() call? Preferably a realistic example and not a Google'd copypasta. Because it's all too easy to just regurgitate things you heard and think you understood, but not so easy to demonstrate it yourself.


I agree that it would be rare for a memset() used for initialization to be removed, but this is a good recent paper that gives lots of examples of when stores are eliminated in practice:

Dead Store Elimination (Still) Considered Harmful

https://cseweb.ucsd.edu/~klevchen/yjoll-usesec17.pdf


That doesn’t check whether that memset call writes 0xFF bytes in the padding.

The compiler could also determine that (&a.x)[1] is undefined behavior in C (reading past the end of an object), and take advantage of that.

The way to check that is by disassembling the code, and even that doesn’t say what a C compiler should do, only what these particular compilers do.


I feel like you're missing the point (the gcc -O0 was there for a reason) but I updated it. Is this better?


Sending structures that include padding or that you don’t know the exact layout of over the network is another thing you shouldn’t do, period.

Code doing that may break with a compiler change or a compilation flag change on one end of the connection, even if you can guarantee both ends use the same CPU.


This kind of issue happens all the time with structs that a kernel, driver etc. expose to userspace: https://j00ru.vexillium.org/papers/2018/bochspwn_reloaded.pd...

Not sending the structure over the network is insufficient, and rigorous memset is a clear solution to this problem.

Also, C structure layout for a specific ABI is strictly determined; if that weren’t the case you couldn’t compile two C compilation units with different flags/compilers and expect them to work.


Also don't expect memset() to actually be a library call at all.


And most importantly, don't expect memset to be executed at all. Most -O2 compilers happily optimize the memset away if its side effects aren't observable. Therefore there exist the "secure" memset variants, which are the insecure variants plus a simple compiler barrier, so the call isn't optimized away.

And there are the very few really secure memset_s variants with a real memory barrier, which guarantees that the memset is executed in load and store order and flushes the cache, so that Spectre attacks are mitigated. Security-relevant crypto libraries like OpenSSL and its variants don't care about that, because memory barriers are "too slow". Only the Linux kernel does it properly. It's still a mess.
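
For illustration, a minimal sketch of the compiler-barrier flavour, using a GCC/Clang-style asm statement (an invented helper, not any particular library's implementation, and only a compiler barrier, not the memory barrier described above):

  #include <string.h>
  #include <stddef.h>

  static void zero_explicit(void *p, size_t n)
  {
      memset(p, 0, n);
      /* Compiler barrier: tells the compiler the memory may still be read,
         so the memset above cannot be eliminated as a dead store. */
      __asm__ __volatile__("" : : "r"(p) : "memory");
  }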


you can write your own memset that bypasses all of that. It's just 2 lines of code.


And that will be slower than calling memset() unless you have a really smart compiler.


I don't trust any memset at all. Literally every single memset I looked at was either broken, too slow or insecure. Mostly the well-known glibc, FreeBSD, msvcrt and compiler implementations.


That's quite a bold claim. Care to explain further?


I already explained it. No one implements memory barriers in their safe variants, only compiler barriers. And the trivial variants work byte-wise, not word-wise, which is usually 8x slower.


> Also, C structure layout for a specific ABI is strictly determined

That's true, but when sending stuff over the network you don't know if the receiver uses the same ABI.


Changing compiler or flags won't change structs, unless you do something crazy. If it did, you couldn't link C libraries compiled with one compiler with programs compiled on another.

You can't be sure you can send such structures to different OSes or CPUs however.


> If it did, you couldn't link C libraries compiled with one compiler with programs compiled on another.

You cannot do this unless the compilers have the same ABI, which is not part of the standard.


It depends upon the architecture and compiler. One C compiler for the MC68000 might have int as 16 bits (since the MC68000 had a 16-bit external bus) but another might have int as 32 bits (since the MC68000 can handle 32 bits internally). This was more of an issue during the 80s and 90s, when there were more commercial C compilers available than today.


That is what stdint is for.


> Sending structures that include padding or that you don’t know the exact layout of over the network is another thing you shouldn’t do, period.

What if you do know the exact layout and it happens to have a gap between fields?


Tbh I'm not a network person, but my naive brain says instead of casting to char* and writing to the buffer, couldn't you just write out individual fields to a buffer so you control every byte? That seems smarter and safer.
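
Something like this hypothetical sketch (the struct and helper are made up): each field is written out byte by byte, so neither padding nor host byte order ever reaches the wire.

  #include <stdint.h>
  #include <stddef.h>

  struct msg { uint8_t op; uint32_t len; };   /* in memory: padding after 'op' */

  /* The 5-byte wire format has no padding at all; the length is sent big-endian. */
  size_t msg_serialize(const struct msg *m, uint8_t out[5])
  {
      out[0] = m->op;
      out[1] = (uint8_t)(m->len >> 24);
      out[2] = (uint8_t)(m->len >> 16);
      out[3] = (uint8_t)(m->len >> 8);
      out[4] = (uint8_t)(m->len);
      return 5;
  }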


It depends. It was common practice for years in games to just cast your networked data to an array and send it over the network. Nowadays it's much more in fashion to pick a serialization library to sit in between, as PCs are faster. But one might choose not to, depending on the problem space. Embedded devices might not have the spare cycles or the space for a full serialization library.

So...you're not wrong, but it's basically the same reason people use C over something with more ergonomics.

Edit: the last time I wrote a networking library, I also just cast to raw structures, mainly because it was less work (didn't want to integrate a serialization library on both sides of the link). This was fine for a while (and very fast), but also hid some bugs in the protocol, and made sending arrays of data very ugly. Eventually we put protobuf on top, and things ended up much nicer.


How does sending raw memory to the socket work if you have a struct filled with pointers?


I think they simply don't do that with structs that contain pointers?


Indeed, this. If you have data with pointers, you either need to flatten (read: copy) your structure into something appropriate for serialization, or just get a proper serialization layer.


I guess there could be some fun workarounds with ptrdiff_t instead of pointers. "My child is 1000 bytes that way in memory" etc.


That's a pretty classical file serialization trick to pack a bunch of heterogeneous data in one file. A header/manifest starting the file, with an array of names, offsets, and sizes.


Actually works somewhat well for mmap'ed access, too, by using a macro or function to "dereference" "pointers" (offsets).
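
A rough sketch of what that can look like (the names are invented): offsets are stored relative to the start of the mapped blob and "dereferenced" through a macro.

  #include <stdint.h>

  struct node {
      int32_t  value;
      uint32_t next_off;   /* offset from the start of the blob; 0 means "none" */
  };

  /* Turn an offset back into a pointer, relative to wherever the blob is mapped. */
  #define BLOB_AT(base, off, type) \
      ((off) ? (type *)((char *)(base) + (off)) : (type *)0)

  /* usage: struct node *next = BLOB_AT(base, n->next_off, struct node); */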


Or even an OffsetPtr class with overloaded arrow and dereference operator, if you swing that way. :)


What’s the runtime performance of that in-practice? Does x86, x64 and ARM have support for relative offset pointers without any performance impact? And do compilers use them?


Yes, address generation can at the very least do [reg + constant], and on x86, much much more. These are bread and butter optimizations constantly used by compilers all over the place, because this is how field access works.


Do you write a separate serialization function for every single struct and sub-struct you send over a network? And then continue maintaining every single one to keep it consistent whenever a field is added/removed/modified in any struct?

(Though I should mention that macros actually do make this doable, but they're hardly pretty, and I have yet to see many people embrace this approach, if they realize it's possible at all.)


> Do you write a separate serialization function for every single struct and sub-struct you send over a network? And then continue maintaining every single one to keep it consistent whenever a field is added/removed/modified in any struct?

Yes? Of course you do that? Or pick one of a hundred libraries that do that for you - protobuf, cap'n proto, flatbuffers, etc...


That's the protobuf approach, but things like Capnp are different, right? Isn't its wire format its in-memory format? I thought it was closer to the "straight memcpy" approach.

Protobuf serialization functions can turn into a lot of generated code too (read: big binaries) and stomp all over your icache, because they're different for each type. More "data-driven" approaches using type-generic functions with some runtime type-specific data (like proto descriptors&reflection) can be a good tradeoff for code that isn't as "hot". Typically more branchy than a memcpy or a type-specific function though.


There are dozens, if not hundreds of serialization libraries. None of them are going to compare to something like sending a raw struct on the wire, if you need absolute performance. Probably the biggest advantage of things like grpc, flatbuffers, capnproto, etc, are they provide serialization libraries for many different languages. Sending a raw struct will be up to the implementer to get the other side correct.

And by sticking with the serialization libraries with higher performance, like flatbuffers and capnproto, you also lose a little bit of future flexibility, since it's typically harder to change the format of these structures compared to higher level libraries.


Author of Cap'n Proto here...

> None of them are going to compare to something like sending a raw struct on the wire, if you need absolute performance.

I don't know about that.

For simple flat structs containing integers only (no pointers), Cap'n Proto is nearly identical to raw structs.

For complex structures with pointers, sending raw structs doesn't work.

Admittedly you may suffer unwanted code bloat linking in the Cap'n Proto library if all you really want to do is send a flat, raw struct.

> And by sticking with the serialization libraries with higher performance, like flatbuffers and capnproto, you also lose a little bit of future flexibility, since its typically harder to change the format of these structures compared to higher level libraries.

I disagree with this. Cap'n Proto's compatibility story is essentially the same as Protobuf and JSON: You can add new fields and old programs will ignore them.

(I suspect Flatbuffers is also similar but I haven't looked at it in a long time.)


That's what I do these days for simple stuff, or for stuff which needs to talk to nonstandard hardware (eg. talking to a PLC, where I need to know the exact bit layout of the packet in order to unpack it manually on the other side).

If you're doing a large project on supported platforms it's probably worth using a pre-existing serialisation library like protobuffers or something.


There is an attribute that lets you specify alignment per-struct, which means you will know the exact layout.
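
Presumably something like __attribute__((packed)) or #pragma pack is meant; here is a sketch using the GCC/Clang spelling (non-standard; MSVC uses #pragma pack(1)):

  #include <stdint.h>

  struct wire_hdr {
      uint8_t  version;
      uint32_t length;   /* no padding before this field when packed */
  } __attribute__((packed));

  /* sizeof(struct wire_hdr) == 5 here; typically 8 without the attribute. */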


But it is still potentially problematic if not all of your clients have the same endianness. Memcpy and exact layout still fail you there.


RDMA (Infiniband) does DMA to/from the network. Why should we not use RDMA?


Hmm, that concern is not just limited to RDMA; any kind of shared memory communication across a trust boundary is vulnerable to info leaks in the padding of the struct. That does seem like a real issue.

Not sure what's the appropriate mitigation. Having to memset is a drag, and it's easy to forget to do it in some place or another. I'm tempted to say that you should only use packed structs for shared memory communication across trust boundaries.


Well, Infiniband is intended mainly for supercomputers. So security is of no real concern within the Infiniband network, as that is implicitly a trusted environment. Access control to that network is the main security boundary.

Edit: Also, lots of simulation codes I am aware of tend to send flat arrays anyway. So I'd say that your concerns are mostly theoretical in nature.


RDMA, both roce and infiniband have some significant limitations, specifically at higher rates. But they're the best we have right now for direct dma.


Serialization is a non-trivial problem. Especially when dealing with different OS, network stacks, and architectures. protobuf alleviates this quite a bit.


No, but the follow up article is called Stop Memcpying Structures. And the follow up to that Stop Memcmping Structures. :)


You're probably not serious, but why waste time and complexity copying structure elements into a buffer (which likely needs to be memset/bzero'd!), if you can just directly use the struct, all consistent with the standard?

C is not a very high-level language. Apart from where undefined behavior is called out, you can and should assume how data is represented on a low level. It was made for writing operating systems and their kernels after all, where interfacing in a bit-accurate way is common.


I'm joking because endless flame wars have already been spent debating this issue. But I realize not everyone has heard them yet, so let's strap on the football shoes and bring out the horse carcass one more time!

There are platforms on which NULL, integer 0 and floating point 0 are not the same. Hence memsetting won't do the right thing. On modern platforms they are, but still, you don't want to get the Standard C Weenies on your back - they're even more annoying than the Rust Evangelism Strike Force. You don't want demons coming out of your nose, do you? :)

They do have a point, though. Suppose you are handling a struct you thought was memset but was not (your stupid team mate used those damn initializers!), then when serializing it you will get undefined bytes in the stream, possibly triggering hash values to mismatch, causing mysterious rebuilds and other stuff.

Wrt field accesses, yes those can overwrite padding bytes. Suppose a is an 8 bit char field, but padded to 4 bytes. Then "foo->a = 22" could be translated to "MOV [RSP+16], EAX", which overwrites the 3 padding bytes with undefined values. The solution, as you mention, is to use packed structs, but with those, field accesses are much less efficient.

Bottom line: code whichever way you want. :) Me, I like to live in the fast lane from time to time and those memset/cpy/cmp functions are sooo handy. :)


By the standard, the null pointer must be equivalent to 0.


Only if you use conversion within the language: void *p=0;

If you said memset(&p, 0, sizeof(p)) then that's not guaranteed to work on all implementations.


gizmo686's comment [1] explains the subtle difference:

> [...] The technicality of the memset example is that it does not set the bits of the pointer by referring to it as a pointer, so the requirement that 0 behave as if it was a null pointer does not apply.

[1] https://news.ycombinator.com/item?id=19767316


The memory representation of the null pointer is implementation-defined. The C99 standard only guarantees that an integer constant expression with the value 0, implicitly or explicitly cast to a pointer type, will give a null pointer. It does not guarantee that a null pointer cast to an integer type (other than _Bool) will give the value 0; all pointer-to-integer casts are implementation-defined. It also does not guarantee that a non-constant expression which evaluates to 0 will produce a null pointer if cast to a pointer type.

In short, there may exist counter-intuitive but nonetheless standards-compliant platforms where tests such as "assert((uintptr_t)(void *)0 == 0);" or "int x = 0; assert((void *)x == (void *)0);" would fail.


Did you mean this as a joke? I can't find a follow-up article (this one was written today).


Please explain how you send a struct over the network without UB and leaking the padding.


Use a packed struct (e.g. __attribute__((packed))). You may take a performance hit due to lack of alignment, but that's the judgement call you have to make.


__attribute__((packed)) is not part of the standard, and therefore doesn't quite answer the question. I believe the standards-compliant way of doing this is that "you can't".


Would this work

  struct S { char data; char pad0; char pad1; char pad2; };
  int main() { struct S dataStruct; printf("%s", dataStruct.data); return 0; }
and give me a 4-byte struct or would the compiler optimize it all away and leave me with a 1-byte struct?


Since you don't actually access the size of the structure, I see no reason why it matters. Also, FWIW, your code has undefined behavior because you call printf with the wrong type.


Yes on the UB, I'm passing a "char" into a "char *", sorry about that. I don't pass lone char data that often.

Well, I chose this example specifically to ask whether the resulting in-memory layout would still be what sizeof() would report (which I believe would be 4 bytes), even if the code never accesses the total size of the struct or any of the padding members.

It makes a difference in the memory footprint, which can become important, if you're programming a device with just 64-bytes total RAM.


> performance hit due to lack of alignment

This isn't true of modern processors.


Sure you get free unaligned access for scalars on x86, but unaligned arrays are still trouble if you use SSE (which basically everyone does).


By knowing its exact size with sizeof(), and using that and a cast to uint8_t* to both bzero() it out before use and memcpy() it when sending it. Unless the C standard says that the compiler may change the padding bytes of the existing storage?


To be clear, it is something like this, which is not bad at all to write:

  struct st s;
  bzero(&s, sizeof(s));
  // ...
  write(fd, &s, sizeof(s));
I’ve dealt with many systems that have tons of unnecessary serialization logic, and found them slower and buggier than just letting the C spec define the data format.

Having said that, it is possible to get this sort of thing right with C++ and templates (thanks to inlining).

Higher-level languages without compile time code generation (like Java and Python) usually make serialization a train wreck in comparison with “send raw structs” or “make the [C++] compiler generate and typecheck the machine code that does the serialization”


There's enough wiggle room in the standard that two compilers would be happy making two compliant but still different physical layouts for the same struct declaration. I'm not even worrying about cross-platform issues or bitfields yet.

You can get away with directly writing a struct for simple structs, but eventually it will bite you.


Packed structs are reasonably standard (even gcc and visual studio can be made to pack in the same way using the same syntax these days).

The remaining issue is endianness, but you can configure ARM to match x86 at boot so shrug.

Also, if I tell any reasonably experienced developer my protocol sends a packed struct with these fields:

  uint8_t version
  uint16_t reserved
  uint8_t op
  int32_t status
  uint32_t data_length
  // void data[]
They’ll be able to write a correct deserializer in pretty much any language, and can probably write it in ~ 10 lines of code.

I agree that packed structs can be tricky, but they’re much better than the vast majority of serialization libraries I’ve dealt with (including all of the open source ones, and I’ve used all the usual suspects).
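
For instance, a sketch of such a deserializer in C, assuming the sender uses network byte order for the multi-byte fields (which the description above doesn't actually specify):

  #include <stdint.h>
  #include <string.h>
  #include <arpa/inet.h>   /* ntohs, ntohl (POSIX) */

  struct msg_hdr {
      uint8_t  version;
      uint16_t reserved;
      uint8_t  op;
      int32_t  status;
      uint32_t data_length;
  };

  /* buf holds the 12-byte packed header: 1 + 2 + 1 + 4 + 4 bytes. */
  void msg_hdr_parse(const uint8_t buf[12], struct msg_hdr *h)
  {
      uint16_t u16;
      uint32_t u32;
      h->version = buf[0];
      memcpy(&u16, buf + 1, 2); h->reserved    = ntohs(u16);
      h->op = buf[3];
      memcpy(&u32, buf + 4, 4); h->status      = (int32_t)ntohl(u32);
      memcpy(&u32, buf + 8, 4); h->data_length = ntohl(u32);
  }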


> The remaining issue is endianness, but you can configure ARM to match x86 at boot so shrug.

ARM in BE mode is pretty rare / nonexistent. Small MIPS chips still float around in mostly low-end networking gear, though, and those do tend to run in BE.


My experience with packed structs is there really aren't any gotchas. It either works or it doesn't.

The big annoyance is network people all use big endian probably just to spite everyone else.


Why would it be to spite everyone? Big endian is network order by definition. It's hard-coded into every asic in switches and routers for decades. If you don't use big endian your packets won't even make it to where you want. Of course, your own payload can do whatever you want as long as you know with 100% certainty nobody else will ever want to talk to your application and get confused.


Consider the LoRaWAN spec, which was written in the last 5 years. The stack is designed to be implemented in software on small 32 bit microcontrollers, which are _all_ little endian. And it's big endian.

The people that wrote that spec knew that and went with big endian anyways.

Also I work with people that do network hardware. They couldn't care less about little or big endian. Makes no difference to them.


Someone needs to pick one or the other: you can't have it both ways.


Yeah they are picking the side everyone else abandoned 25 years ago.


In the "..." part you presumably meant to include writes to members of s, and those writes are allowed to change padding bytes and leak information.


A small quibble: bzero() is not defined by the C standard. memset() is.


Obligatory pet peeve note: sizeof is a unary operator whose argument is written in parentheses if it is the name of a type. It is not a function.


All answers to this must be wrong as it's simply not possible to do it. You need to take endianness of each field into account, and you cannot do that by treating the struct as a single blob.


But if you're slamming in-memory structs over the wire, you're using packed structs. You explicitly don't want padding when you use structs that way.

Whether you should do this in 2019 is another question, but sometimes it's still the easiest way to implement a binary protocol.


And you're not writing code for big-endian machines to interoperate with little-endian machines.

It's easy to write binary protocol code that way but it isn't portable.


It goes without saying that you have to get the endianness right when you store the values. That's what the endian macros are for.


My favourite trick with designated initialisers is using them with array indices and enums, e.g.

  enum color {RED, GREEN, BLUE};
  char *color_name[] = {
    [RED] = "Red",
    [GREEN] = "Green",
    [BLUE] = "Blue"
  };

  printf("%s",color_name[RED]);
This means you don't have to ensure ordering is the same between the enum and the array, which can be a pain for more complex tables. It's a shame C++ doesn't support this, more details: https://eli.thegreenplace.net/2011/02/15/array-initializatio...


Why would you ever want to do something like this in c++?

What you are implementing is a map. Just use std::map.


Lookup time is widely different if you use std::map. But you can use std::vector in a similar fashion.


_Because it’s 2019!_

It really irritates me when people give arguments like this. I seriously don't care what year it is, I care that the code I write works. A lot of projects still use C89 because they want the widest portability. C99 isn't very new, but especially in the embedded and other niche industries, C89 (often with some extensions) is all you get.

That said, "= { 0 };" is almost always usable instead of a memset() and works in C89 too.

_And finally, there is absolutely no reason to check if a pointer is NULL just before calling free() on it_

This is advice I agree with. The nullcheck is in free() itself (IMHO a good design decision --- especially along error-handling and cleanup paths.)


> but especially in the embedded and other niche industries, C89 (often with some extensions) is all you get.

Embedded and niche industries are forcing their programmers to work with 30 year old tooling? (Instead of 20 year tooling?)

I think at some point, writing yourself a transpiler from C99 to C89 becomes profitable… (The JS community did it for a lot less than that w/ babel…)


_Embedded and niche industries are forcing their programmers to work with 30 year old tooling? (Instead of 20 year tooling?)_

Why does how old it is mean anything at all? I personally choose C89 because of its wider compatibility, and I don't think I'm any worse off because of it. As the saying goes, "it's not about the tool, but how you use it." Incidentally, the best programmers I've worked with have also been the ones with the most conservative choice of tools.

_I think at some point, writing yourself a transpiler from C99 to C89 becomes profitable… (The JS community did it for a lot less than that w/ babel…)_

If anything, I think the JS community could learn a lot from the "C community" (a very loosely defined term...) --- the huge amount of churn with anything related to JS and web stuff in general turns me off.


I remember reading the docs for a library targeting ARM Cortex processors that crowed about it being C89 compatible.

I pretty much spit on my keyboard.

I also knew the code base was going to be kinda trashy and it was.


Damn straight! That's why I exclusively use punch cards, for that good ol' reliability. Don't need no new fangled language crap! Sugar is for eating, not for syntax!

If you're actually stuck with C89 you have my sympathy, like all the people stuck still maintaining FORTRAN and COBOL. But that's obviously a niche case with extreme limitations, it doesn't really need to be brought up every time someone mentions C99. Or C++11 for that matter.


COBOL and Fortran are indeed still widely used, however I claim there is way more C89 than those two together. The point about C89 is that it covers a large area - all common Unix/Linux APIs and libraries are at least C89 compatible, if not written in C89, and as the GP mentioned, embedded often depends heavily on such compilers. Thus usage goes far across industries, with new libraries being created each day. COBOL and Fortran are much more confined to their fields (legacy business and science respectively), where more and more logic is built around those old systems using newer languages.


C89 compatible but not written in C89, which is the point. And almost certainly just C89 compatible because C99 didn't change the ABI, so that happened "for free". But the actual source, the thing people actually write in, is not C89. Linux heavily leverages new language and compiler extensions. It doesn't handicap itself with only using C89, because that's just silly.


Actually, I had some code that passed null to free(), and it segfaulted on some old version of Red Hat. (Not RHEL, Red Hat.)


That's either your platform's problem, in which case you should be very worried, or (more likely) you had some other undefined behavior somewhere else and it ended up biting you inside free.


I was once forced to change the initialization of all the structs in an old codebase that used them for almost everything.

The senior programmer that requested the change came from the application world and used the same argument.

When I told him that the compiler gave lots of warnings about being compatible with C99 through an extension, he asked me to just disable the warnings and keep on going. We did some testing and pushed the update (we had remote devices connected via GSM)

Two weeks later we were pushing a remote downgrade to the previous version because we started to detect anomalous bugs everywhere.


  #include <stdio.h>

  struct abc { int a; int b; int c; };

  int main() {
    struct abc b = {2};
    printf("a b c = %d %d %d\n", b.a, b.b, b.c);
    return 0;
  }

  % c99 test.c
  % a.out
  a b c = 2 0 0


https://stackoverflow.com/questions/10828294/c-and-c-partial...

> The C and C++ standards guarantee that even if an integer array is located on automatic storage and if there are fewer initializers in a brace-enclosed list then the uninitialized elements must be initialized to 0.

It does work for zero-initialization.


Absolutely! This is why I'll iterate through all and manually set each and every single element. You can even do it in one line, too: for (unsigned int i = 0; i < 3; i++) {b[i] = 2;}

EDIT: Just noticed your code example uses structs. My example is for arrays, for which these observations are also correct.


I just had to google it: https://stackoverflow.com/questions/1912325/checking-for-nul...

There is history to this practice. And also, foresight: 1) I may want to redefine free(). 2) If I do, my free() does not null-check. 3) Which is why my main code would null-check.

I'm an embedded developer, I never do much application work, and I used to do that with C++, so I'm not too familiar with the heap allocation intricacies of C. But I'm quite sure that "delete" in C++ e.g. definitely needs to be null-checked (unless they changed that in some newer C++ revision, I always try to code version-agnostic, and so far, have always managed).

In practice, I find it better to manually null-check pointers before deleting their contents, it either gets optimized away anyways or is an extra redundancy in code- or platform-migration.


_I may want to redefine free(). 2) If I do, my free() does not null-check._

I think redefining standard functions to do something very unexpected is very poor decision, especially when it means you now have to burden all its callsites with an extra check.

_But I'm quite sure that "delete" in C++ e.g. definitely needs to be null-checked (unless they changed that in some newer C++ revision_

No, it was never needed in C++ either (which given its desire to be backwards-compatible with C, is not surprising.) From the C++98 standard section 5.3.5, paragraph 2: "if the value of the operand of delete is the null pointer the operation has no effect."


Yes, you're right about delete. I still do it, and will still do it, though. I have my reasons.

No, I wouldn't redefine free(), I'd use my own terminology, but I'd search the code for it and replace it verbatim, and seeing that it apparently is standard practice for both C++ and C to null-check their pointers before deletion, I have all the more reason to do so, because my code assumed that it didn't.

Delete shouldn't null check, it should delete the memory at the given location, no matter what the address is. It should be up to the implementation to check that the pointer is valid.

EDIT: I should maybe add that the next line of code is in my case almost always "pointer = NULL" following "free(pointer)". The null-check is not just a technicality of checking of pointer-validity ("Pointer not null?"), but also code-syntactics ("Has this item not been deleted yet?").


What are your reasons?


They're in the edit. Precedence of code-flow indicators (I said "code-syntactics" in my original quote, that's wrong, sorry) is more important to me than keeping away redundancies.

What I am advocating for, is "Before you work on a pointer, check its validity", regardless of what follows after. It's just a general guideline, do it, even if you think it's senseless and redundant. Ever since I've adopted those basics, I've virtually reduced my problems with pointers to zero.

There's danger in accidentally not null-checking, but none in accidentally double-checking. Therefore, the implementation should take care of this (too), or at least consider implications.

I can't really take risks there.

The original author of the article is asking me to "stop" this, which could put a lot of things at stake. No, thank you.


C++ explicitly allows `delete` to be called on a null pointer.

E.g. see https://en.cppreference.com/w/cpp/language/delete


Am I the only one who finds this

    int yes=1;
    setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof (yes)) ;
much cleaner than this

    setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &(int) {1}, sizeof(int));

?


But it's C99, so you can do:

    setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &(int) {true}, sizeof(int));
I think this makes the intent clearer.


Beyond the readability aspect, I really dislike that the programmer is required to make sure that the inline datatype matches the type passed to sizeof. If the API ever changes, e.g. to using a long, it has to be changed in two places, and there will be no compiler error if it isn't. Using a separate variable takes care of that automatically.


I agree anything is better than a magic number, but really only if the variable is properly descriptive.

What are we saying 'yes' to?


> What are we saying 'yes' to?

The flag SO_REUSEADDR (SO = Socket Option); really, I'd say we're setting the flag / setting the flag to true, but true is a bit of a reserved word, which I'm guessing is why they went with "yes".


Definitely agree that calling it `reuse_local` makes more sense, but OP was just quoting the blogpost.


And the blog post was just quoting Beej.


Whatever you pass the yes to.


No, you are not the only one.

The author of that code is trying to make it explicit -- "yes, I want to allow reuse of local addresses". Some will prefer this, some won't.

I personally like the use of the variable, though I think it's not worth getting all bent out of shape about it one way or another. But I guess we all have our pet peeves, and I know I'm guilty of getting all bent out of shape about things that ultimately don't matter much.


Code is read many times more than it is written. If someone scrutinizes code for clear intent, I don’t think that should be considered a “pet peeve” or some quirk to be explained away.


I wasn't suggesting that any scrutinizing of the code for clear intent should be considered a "pet peeve". I was saying I see both sides in this particular case, and that in my opinion developers (myself included) often make a way bigger deal out of things than they really are -- when the larger part of their offense toward a particular choice is often more about personal taste than it is anything else (thus the reference to pet peeves).

Again, I like the use of the variable, because in my opinion it makes the code a bit clearer, but I also don't think it is the end of the world if someone chooses the option without it.


I feel that code that needs to be read a lot is badly named, commented or dodgy.


No, you're not, the second part is like random json data inside C source.


I disagree with this author on both counts.

1. Zero filling is theoretically unnecessary, yes. But theory is not often reality. Zero filling is a defensive measure that protects you from other code that makes unhealthy assumptions. Code external to yours may make assumptions about what lies in your padding, either by doing memcmp's, or adding new fields in a later version of the struct and intending it to be ABI compatible between modules. I'll prefer safe and evolvable code over your nitpick about how it's initialized, thank you very much.

2. Consider yourself lucky that you only use free() in your code. In more complex code bases, custom allocators are used. They do not always follow the same semantics as are dictated for free(). Maybe instead of saying "this perfectly valid thing drives me nuts", ask why people do it. It's not because they don't know better.


> Code external to yours may make assumptions about what lies in your padding, either by doing memcmp's, or adding new fields in a later version of the struct and intending it to be ABI compatible between modules.

The content of the padding between members can arbitrarily change during assignments of members, so code relying on the content of it is broken anyways. In the case of an extended struct both methods are equally broken if you don't recompile your module with the updated definition of the struct. Your memset will still have the old size of the struct as the third argument.


If you use:

  sizeof(struct_instance)
the compiler will automatically update the size passed to memcpy. (I like passing the instance to sizeof and not the type, just in case the type of the thing being copied changes in a future version of the code.)


That is exactly what I said:

> ...if you don't recompile your module with the updated definition of the struct...

And once you recompiled, the C99 version with the pretty initializer is fixed as well, so there is no advantage in using memset regarding this.


The compiler may elide memset if it's not possible to access the value in a standards-defined way, so your defense might not actually be doing anything.


> Cue the Rust Evangelism Strike Force chiming in to say that there is really no reason to be writing new C code in 2019.

Even as a member of the Force (RIIR 1st brigade), I laughed pretty heartily at this. :)


Same. But also, I feel like most of The Force's efforts are directed at C++. I could be wrong.


C++ programmers need to be converted. C programmers just need sympathy.


We don't really need sympathy, we get paid a lot and can afford to drown our sorrows.


At least until the next zero-day caused by a buffer overrun. (Kidding, kidding.)


I know you're kidding, but C programmers also get paid to provide security fixes. There's plenty of C code out there that doesn't interface with untrusted data/input, or does it in a safe way. So I'm not really in fear of losing my job over a simple buffer overflow.


Rust programmers need Haskell.


Judging by posts here on HN, we'll eventually get there ;)


Most C programmers I know are generally happy.

C++ programmers seem 'stressed'


They're happy until someone fuzzes their stuff.


I'll take unit testing and static and dynamic analysis tools for $800.


Rust is a big language, like C++. It feels too complicated compared to C.


I've only written a little of each, but my impression is that C has mostly remained a focused, effective language for careful systems programming. While C++ has become a sprawling cthulhu of leaky abstractions and other nasty foot-guns and people only use it because "it's as fast as C but sorta feels like a higher-level language". And that therefore Rust is a strict improvement over C++, but not necessarily C.


I thought this way too until I worked with competent C++ devs on a modern code base. (By my definition of “competent,” no one ever encountered by the Rust brigade bloggers is competent, fwiw).

At some point, I realized that C++ is a language for implementing new languages in a way that is backward compatible with existing programs.

Templates are Turing complete (lazily evaluated, purely functional, and side-effect free), so whatever your language of choice can do can be backported to C++ (granted, painfully, but probably less painfully than throwing out all your legacy code and starting from scratch).

On top of this, C++ supports zero-cost abstraction, so it will get very close to low level C performance (including constant propagation, loop unrolling, type-unsafe transformations, hoisting, inlining, etc, etc) even if you throw piles of lambdas and encapsulated methods at it.

The costs, of course, are that the compiler is slooooowww, and you inherit all the memory safety warts of C by default. (Emphasis on “by default”, because memory safe subsets of C++ are perfectly workable.)


> Templates are Turing complete (lazily evaluated, purely functional, and side-effect free), so whatever your language of choice can do can be backported to C++ (granted, painfully, but probably less painfully than throwing out all your legacy code and starting from scratch).

This does not logically follow, since templates are evaluated at compile time and hence have restrictions on what they can do.

> C++ supports zero-cost abstraction, so it will get very close to low level C performance (including constant propagation, loop unrolling, type-unsafe transformations, hoisting, inlining, etc, etc) even if you throw piles of lambdas and encapsulated methods at it.

You do have to be careful with what you're doing, though.

> the compiler is slooooowww

I mean, it's not like the Rust compiler is fast either…


"since templates are evaluated at compile time and hence have restrictions on what they can do"

This statement doesn't make any sense. A Lisp program can be fully converted to run at compile time; it'll just be lacking in I/O opportunities.


You do realize that I/O is two thirds of "input, processing, output": what computers do?

What app do you use that takes no input? Or produces no output? Let alone both?


Any reasonable compiler will time out before you can do anything too fancy at compile time.


Templates are not quite side-effect free, although it takes some effort to run into that. See http://b.atch.se/posts/constexpr-counter/ for a delicious exploit of that fact.


The problem is that (in my experience) it's very hard to reach that level of competence. I certainly haven't, so I feel safer with Rust.


Honestly I tend to use C++ as C with easy strings for user feedback.


It's not about size, it's about predictability and explicitness. Rust has features, though it also won't do anything you don't tell it to do, and its guarantees hold throughout. You don't have to know and understand how something works to be effective, just the language's guarantees. All the abstractions have zero cost anyways. If you want to write "like C but with guarantees" that's totally fine, and you can safely and predictably interface with anyone else's libraries.

tl;dr: while big, it reduces your cognitive load through strong guarantees. Think of it more as progressive disclosure of complexity.


That makes no sense - C++ eliminates many of the memsafety concerns that are problematic in C


It also occasionally obfuscates some of the safety issues under the guise of "smart" classes.


C99 designated init doesn't work for C code that also needs to compile in C++ mode (...yet, a limited subset of designated init is coming in C++20, clang also seems to be more relaxed about this and allows the full C99 designated init also in C++ right now).

For code that only needs to compile as C, I agree. It's one of the best (if not the best addition) to the C language. A nice addition would be default values for struct members in the declaration with compile-time constants.


My C++ is rusty (no pun intended), but if your code also has to compile as C++, I think this will work instead of memset:

  struct addrinfo hints = {0};
If C++ requires

  struct addrinfo hints = {};
instead, you could create a macro, and write

  struct addrinfo hints = ALL_ZEROES;
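For what it's worth, such a macro could be as simple as something like this (just a sketch; the only thing it papers over is the C vs. C++ spelling of the empty initializer):

  #ifdef __cplusplus
  #define ALL_ZEROES {}   /* C++: value-initialization of all members */
  #else
  #define ALL_ZEROES {0}  /* C: first member zeroed, the rest implicitly */
  #endif

  struct addrinfo hints = ALL_ZEROES;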


Sure. C++ has initializers ({1,2,3}), it just doesn't have C99 designated initializers ({.a = 1, .b = 2, .c = 3}). This is one of the main pain points around C++ not being a true superset of C.


G++ actually says “sorry” when it bails on that.

The other (bigger) pain point is that malloc returns void*, and C++ requires a cast.


Don't use C99 initializers, or not directly. Write a macro:

  struct point {
    double x, y;
  };

  #define init_point(x, y) { (x), (y) } /* C90 */

  #define init_point(x, y) { .x = (x), .y = (y) } // C99

  struct point p = init_point(0.5, 1.5);
The macro is superior because macro calls are checked for number of arguments. If you have a structure in which certain members must be initialized (to a non-default value), and such members can be added over time, that will save your butt. How? Because when you add a new member that must not be zero/null, you add a parameter to the macro. Then, the compiler will tell you all the places where the macro is called with too few parameters.

Designated initializers eliminate some of the error-prone aspects of C90 style, but don't help enforce the discipline of initializing all members, or a certain subset of members.

Functional abstraction with checked arguments beats fancy new syntax.


Another reason to not memset structures: In the code

    struct foo {
        void * bar;
    } baz;
    
    ...
    
    memset(&baz, 0, sizeof(struct foo));
    assert(baz.bar == NULL);
it's possible for the assertion to fail, since NULL is not guaranteed to be represented in memory by zero bytes.


This is a weak reason. In practice, machines that do not represent NULL as zero no longer exist and new designs of this nature would break so much existing code, it would be difficult for them to gain wide adoption in the marketplace. The C standard hasn't decided to require NULL be represented as zero yet (and may never, to preserve compatibility with machines it used to have defined behavior on), but most C code runs on machines where NULL is represented with zero bits.

(You're absolutely correct that the standard does not require NULL be represented with zero bits, though.)


I feel like a compiler is within its rights to optimize out that check if it so wishes, though.


> it's possible for the assertion to fail, since NULL is not guaranteed to be represented in memory by zero bytes.

Technically correct (which is of course the best kind of correct) but you'd be hard pressed to find a system in 2019 where NULL != (void*)0. The most recent machines with non-zero NULL in the C FAQ entry on the matter (http://c-faq.com/null/machexamp.html) date back to the mid '90s.


>NULL != (void* )0.

This expression is false in any C99 compliant system:

Section 6.3.2.3 [0]:

> An integer constant expression with the value 0, or such an expression cast to type void * , is called a null pointer constant. [1] If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function

> Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall compare equal.

The technicality of the memset example is that it does not set the bits of the pointer by referring to it as a pointer, so the requirement that 0 behave as if it was a null pointer does not apply.

[0] http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf

[1] In text footnote: The macro NULL is defined in <stddef.h> (and other headers) as a null pointer constant; see 7.17

EDIT: Formatting


The literal `0` is guaranteed to be the null pointer value when used as a pointer, so by definition

    NULL == (void*)0.
But given

    void *p = 0; 
    intptr_t i = 0;
it is not guaranteed that `memcmp(&i, &p, sizeof(p)) == 0`.


> it is not guaranteed that `memcmp(&i, &p, sizeof(p)) == 0`.

Even less intuitively, it is not guaranteed that `(void*)i == p` since `i` is not an integer constant expression, even if the value is known to be 0.


> you'd be hard pressed to find a system in 2019 where NULL != (void *)0

You'd be hard pressed to find a system at any time where NULL is not equal to a compile-time constant zero. ;-)


[flagged]


You're missing the point. "(void *)0" is guaranteed to be a NULL pointer, even when said pointer is not represented in memory by zeroes.


Nah, NULL is guaranteed to be == (void * )0 by the standard. The allowed divergence is in how (void * )0 is represented as bits in memory, i.e., memcmp(zeroes, (ptr = NULL), sizeof(ptr)) == 0?


Technically no such system exists ;)

But talking about the actual bit representation. A zero value as a special pointer with the magic property of being invalid is a convention for C on x86, right?

I assume it has to do with it being fast to check the zero flag in the EFLAGS register when doing checks for null pointers.


That will never happen on current or future platforms, and is not a big concern. Or perhaps it'd be better to say that if it ever does happen on a future platform, then a few memset calls are going to be the least of your problems when porting legacy C code to that platform.

However, what has bitten me is memsetting structures that I later turn into full-fledged classes in C++. Oops, there went the VMT.

Designated initializers are very nice, as is the ability (in C++) to provide initial-value assignments that run before the constructor.


I do wish there was a compact way to tell a C++ class to zero-init all of its pointer/numeric members.


What's wrong with:

  T foo = {};


I mean in the constructor... Can one do *this = {}?


...are there any remaining architectures where NULL is not zero?

The examples listed here are all historical:

http://c-faq.com/null/machexamp.html


It does not matter if there are any architectures like that. Relying on that fact is still undefined behavior. The compiler is allowed to produce any code it wants.

If you use memset to initialize a structure containing pointers with 0 and then test whether the pointers are NULL, the compiler could assume that the pointers were not properly initialized and remove the if completely, or actually even remove the whole function.


In the real world, what matters is rarely what a theoretical compiler is allowed to do, but what specific compilers actually do. UB only matters when compilers actually "exploit" it.


Are you accepting bug reports for such issues in tarsnap? :)


Yes. If you look at the commit history you'll find that we've fixed a few of them already.


Yeah, I saw the comment and big block of bulls in bad tar. Here's two from memory: parent pointer in tree_entry, although it seems unused anyway. buff pointer in the link hash thing. (My bad, I looked at it on my laptop, which is not here anymore. I can drop an email later.)


Thanks! I'll try to remember about this when I'm back at my laptop but an email will make sure I don't forget.

FWIW I'm less strict about standards compliance in the libarchive-derived code, since that frobs lots of unportable bits anyway.


Used to read your blog several years ago, and really enjoyed it. Good seeing your name pop up again.


You are right in theory, but I have yet to come upon a compiler that will use something other than zero for null.


So if you're zeroing out a struct like so

  struct S { size_t a; void *p; };

  foo = (struct S) {0};
Does the standard require the compiler to set any pointers to NULL, or will everything just be implicitly zero-filled, possibly leaving the pointers not properly initialized to NULL?


The pointers are required to be initialized to NULL in your example (even if the representation is different from memset of zero).


Most interesting to me is that

  &(int) {1}
is equivalent to

  int one = 1;
  &one
How did I not know about this? I'll be using it from now on.


The article gives very dangerous advice, especially for networking code. Padding will not be overwritten and when transmitting the struct (provided there is no attribute packed or the compiler ignores it) information will leak.

Yes, someone will point out that there are languages that do not have such problems. However, when still using C, one has to be extremely careful...


I am reasonably certain that any code that happens to transmit padding bytes (or observe them in any way) is abusing at the very least implementation-defined behavior.


> While we’re on the topic of little annoyances, another pattern I often see is using a variable just to pass what is effectively a literal to setsockopt():

  int yes=1;
  setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof (yes)) ;
> Which can instead be written as:

  setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &(int) {1}, sizeof(int));
>

I'm sorry, no it cannot. Or, rather, not without

1. Switching C90 code to C99; or

2. Switching C++ code to C99.

Good luck with 2.

GCC and Clang support this in C++ as an extension; then you're not writing in standard C++ any longer.

1 is a nonstarter in a project that mandates C90.

In other words, there may be valid reasons why the code is that way other than ignorance of compound literals.


> Elements that are not specified are initialized as if they are static objects: arithmetic types are initialized to 0; pointers are initialized to NULL.

That's actually not true. It is the case when initializing arrays, but for structs (and unions) any non-named member has "indeterminate value" (see paragraph 9, section 6.7.8 of ISO/IEC 9899:1999; unchanged in C11 and C18). Well, at least according to the standard - GCC, Clang and MSVC all happen to make it zero, but it's not guaranteed by the C language.

The trick with avoiding a variable when passing an address to a literal value is nice though, I didn't know that. There are some places in my code that could use it ;)


Para 9 is talking about members that do not have names in the struct definition. Missing initializers are covered later:

21 If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.


C11 6.7.9.19 says "all subobjects [of an object initialized with a initializer list] that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration."


The memsetting code is more obvious for older programmers and probably more compatible too. I don't want to start a flame or something, but if you want shiny new programming features and initialisation you should probably look away from C and try JavaScript or something similar. One of the reasons we love C is that the same code compiles and works cleanly on the latest Linux, a FreeBSD 4.11, or a PIC16.


A new edition of K&R, edited to address the new features that have been added since C89. I suppose some might argue that there isn't much need for a new edition because anyone who can grok C89 could probably grok the rest themselves through Internet-based self-teaching, but I would pay good money to have all of those resources consolidated into one printed resource.


After debugging undefined padding bytes breaking per-byte pod type hashes, I'll hang on to my memset, thank you very much! :)


This is mostly a matter of taste, but while the first example makes sense, the second example looks less clear: for someone quickly skimming through the code, a variable which has a somewhat clear meaning, "yes" (which could be improved on), is replaced with a magic number that has no intrinsic meaning.


I agree. I’m also not a fan of doing `sizeof(typename)` as it can hide intent


I'll stop memsetting structures as soon as you support designated initializer lists in C++.


Apparently it's in C++20, so feel free to stop memsetting whenever you update your C++ compiler next (Clang needs a flag to enable it; GCC has had it as part of its C++2a support for two years).


It is C99 code.

It is worth noting that many C projects go for extreme portability. That's why C is used in the first place, and sometimes C99 is too much. I still sometimes work with compilers that barely support C89. That's for the aerospace industry BTW. There are even some libc that don't accept free(NULL).

If you know the platform you are writing for is not ancient, that's fine but if you are writing portable code, you have to realize these ancient platforms are still alive.

I think ditching K&R C is fine though. I've seen some of it but never encountered a case where it was the only thing the compiler supported.


> There are even some libc that don't accept free(NULL).

That’s pretty much the completely opposite of portable, since it violates the C standard.


The "extreme portability" he's talking about is portability of projects to compilers and libc implementations that behave badly, possibly in standards-violating ways like this.


Interesting, I was under the impression that compound literals of form (type){value} were illegal and the correct form was (type []){value} but apparently not!


There were a few old free implementations which misbehaved when given null. Maybe they are all gone now, but there was a reason to do it.


Indeed, I recall getting messages along the lines of "error: freeing a null pointer" or something like that. On the other hand, what are the chances of some freshly-written C being built with an old compiler? (possibly moderately high in the embedded world?).


Last I checked C99 initializers require constants, so they don't generalize, whereas the pattern shown does. I'd prefer if the initializers pattern did allow variables on the right-hand side for this reason.


Not true, designated initializers accept the same kind of expressions as ordinary variable initialization (which is limited to constants at global scope, but arbitrary when initializing local variables).
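For instance, something like this is fine at block scope (a sketch using the usual POSIX sockets types), even though port is a runtime value:

  #include <arpa/inet.h>
  #include <netinet/in.h>

  struct sockaddr_in make_addr(in_port_t port) {
      struct sockaddr_in addr = {
          .sin_family = AF_INET,
          .sin_port   = htons(port),                    /* non-constant expression: fine for a local */
          .sin_addr   = { .s_addr = htonl(INADDR_ANY) },
      };
      return addr;
  }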


Oh? I think I reviewed a few Linux kernel patches this week that used memsets to fix up messed-up aggregate or designated initialization, since memset is the most concise way of initializing the padding.


Frankly, use the style that is idiomatic to your code base is what is the most important. Keeping things consistent is what matters.

And if you start a new project, just use the style you prefer and enforce it :)


-Wclass-memaccess is a warning emitted by newer g++ versions. It’s led me to use safer/better, if less convenient, options. (That being said, it’s only applicable to C++.)


We use Crystal for networking code, and it has nice C binding semantics. Haven't had to drop down to pure C much because Crystal just works, and doesn't have the learning curve or un/boxing hoops of rust. It's basically Ruby-like but with rudimentary type-inference, like what Ruby 3 wants to be but Crystal already exists... and orders-of-magnitude more performant.


Interestingly enough, Clang (but not GCC) appears to embed memset for you despite the code using initializers: https://godbolt.org/z/khJQLV

The author of the article should file a bug report. After all, if they wanted to memset, they'd just do it themselves.


The compiler knows the representation of NULL on the targeted platform and if it is safe to use memset. It is perfectly valid for the compiler to make this transformation if the platform represents NULL as zero.


My post is sarcasm; I just find it humorous that one's efforts to eliminate explicit memset don't stop the compiler from going one step backwards and embedding it just the same. If you replace the initialization with memset, it still generates identical code for both Clang and GCC.


I don't think the point was to eliminate "code that looks like memset at the assembly level", but to eliminate memset at the source level, which the author considers cleaner.

Under the covers, one would hope a good optimizing compiler could choose the same assembly for both, which may very well look "memset like" since fewer, wide writes are generally more efficient.


Ugh, what do you do when the structure definition is updated and you now have uninitialized values?

If there's no explicit initializer, I think memset, calloc, or = {0} is just fine, thanks.


Unclear to me why you've been downvoted; I've seen many cases where a structure has had elements added which would end up uninitialised. A memset(&foo, 0, sizeof(foo)) seems pretty reasonable to me.


Probably because the article specifically addresses this issue, and C99 specifically initializes any members that are not named in the initializer to either zero or NULL.


If you only use super simple structures, maybe. It is quite common to zero out a structure and use only parts of it. It is also quite common to have to programmatically fill in the structure.


> It is quite common to zero out a structure and use only parts of it.

That point is addressed in TFA.


memset() heads off the dreaded "I added more fields, but forgot to add the initializers everywhere" bugs. Well, it will still be a bug, but at least it won't be a heisenbug.


The point of the article is that C99 initialization does the same.


Because of C spec idiocy it isn't even sufficient to use memset anymore; you need to use memset_s.

But “stop memsetting” is clearly nonsense if you care about security.


Surely you only need memset_s if you're doing this for security reasons, as if memset is optimized away it's because the compiler can prove the observable semantics are the same.


Nope, memset will often be “optimized” such that padding is not zeroed. It’s not because the compiler sees that padding isn’t observed, it’s because reading from padding is undefined. Because reading from padding is undefined the write (also undefined) by definition has no observable side effects.

memset_s is the only way to guarantee that padding is actually zeroed (well, memset_s or bzero_s, obviously).

It’s critically important for people to understand that if the C spec says doing X is undefined, then current compiler writers interpret this as allowing them to do anything they want, even when it clearly introduces security vulnerabilities.
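For anyone wanting to try it: memset_s lives in the optional Annex K of C11, so it has to be requested and feature-tested. A rough sketch (struct name made up):

  #define __STDC_WANT_LIB_EXT1__ 1
  #include <string.h>

  struct msg { char type; int len; };   /* typically has padding after 'type' */

  void clear_msg(struct msg *m) {
  #ifdef __STDC_LIB_EXT1__
      memset_s(m, sizeof *m, 0, sizeof *m);   /* call may not be optimized away */
  #else
      memset(m, 0, sizeof *m);                /* fallback, with the caveats above */
  #endif
  }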


are you sure that the padding is not zeroed by memset?

[1] says that memset_s exists because it is not optimized away if it just clears a variable that will not be used after the call (unlike memset)

On the other hand: people who do this for security reasons say not to trust memset_s, as it is 'an optional feature of c11 and not really portable' [2]

[1] https://en.cppreference.com/w/c/string/byte/memset

[2] https://www.cryptologie.net/article/419/zeroing-memory-compi...


This [2] is a dangerous post from a security guy, describing memset_s with just a compiler barrier as "secure". In the age of Spectre only a memset_s with a memory barrier is secure, and currently all memset_s implementations but my safeclib memset_s are insecure. Esp. the ones with "Secure" in their name.


I don't think memset is required to zero out any padding, because the effects of this are not observable.


The function memset gets the length of the area to overwrite, how is it to know where the padding is?


The compiler knows the layout of your structure and can choose to elide memset calls or replace them with equivalent code.


Right, but zeroing padding only matters for security reasons. If your memset isn't being done for security reasons, then memset() is better than memset_s().


I just realized you might be thinking of security in the sense of keys/partial keys.

Things that matter:

* vtable and code pointers: leaks the aslr slide

* data pointers: leak heap location

* type ids: can be used to spoof objects

* length and size fields: can give you information about heap layout

Etc etc.

All of these get attacked.


The standard (and compilers that conform to it) have no sense of security baked into them. They don't know that they should wipe certain buffers because as far as it's concerned the C "abstract virtual machine" provides no way of accessing such data. Of course, as a real software engineer you know that you will make mistakes and these will frequently allow implementation details to leak in ways that are exploitable; however, there is no standards-compliant way to protect against this because as far as it is concerned you always write correct code. There are ways to influence the compiler to do certain things that help in the case of mistakes (memset_s, __attribute__((packed))) but these do not and cannot solve the underlying problem because they are by definition non-portable: they change things that are implementation-defined and not standards-defined.


If you are processing data from the network, any uninitialized read gets attacked. My experience is padding (both inside a struct, and outside - array alignment for instance) gets attacked.

The only safe option is to initialize everything.

I say this having spent more than a decade working on software that is always being attacked.


You shouldn't be sending structs over the network, since their specific layout depends on the compiler (and version). At least make them packed if you're gonna do undefined behavior...

But you should just do it the right way and have a wrapper that deals with endianness differences and such:

    struct S my_struct;
    // ...
    push_int64(&output_stream, &my_struct.value1);
    push_string(&output_stream, my_struct.value2);
    push_int32(&output_stream, &my_struct.value3);
And on the receiving side:

    struct S my_struct;
    next_int64(&input_stream, &my_struct.value1);
    next_string(&input_stream, my_struct.value2, sizeof(my_struct.value2));
    next_int32(&input_stream, &my_struct.value3);
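(push_int32 and friends are made-up names here; one of those wrappers might look roughly like this, always writing big-endian bytes so the host byte order doesn't matter:)

    #include <stddef.h>
    #include <stdint.h>

    struct stream { unsigned char *buf; size_t len; };   /* assumes buf is large enough */

    void push_int32(struct stream *s, const int32_t *v) {
        uint32_t u = (uint32_t)*v;
        s->buf[s->len++] = (u >> 24) & 0xFF;   /* most significant byte first */
        s->buf[s->len++] = (u >> 16) & 0xFF;
        s->buf[s->len++] = (u >>  8) & 0xFF;
        s->buf[s->len++] =  u        & 0xFF;
    }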


The problem is not sending structs across the network.

The problem is processing any data from the network. You have to assume that data is malicious, and will attack any weakness.

I’m not talking about serialisation at all - I agreed entirely that sending struct padding across the net is less than optimal :)


> You have to assume that data is malicious, and will attack any weakness.

The point is that the standard cannot help you here because malicious data attacks things outside the scope of the standard.


The standard gives compilers leeway to handle “undefined” behaviour however they want.

The original purpose of “undefined behaviour” was to deal with the myriad different platform behaviours of the era (remember, at that point in history 1 byte was not necessarily 8 bits).

At some point the compiler writers decided to extend the effect of that to “if this behaviour is undefined in the spec then we are allowed to treat the code as doing anything at all” - so we went from “this may do different things on different platforms” to “the compiler can emit code that does something other than the clear intent or the platform-specific behaviour”.

This change in mindset from the compiler devs has been necessary to improve performance in the SPEC benchmarks, and in other micro-benchmarks created primarily to justify the arbitrary definition of “undefined behaviour”.

To give an idea of how capricious the compiler devs are: they implemented optimizations that ended up breaking SPEC (it has undefined behavior), and so rolled those optimizations back. Yet when they literally add security bugs to code that has been secure for a decade, their response has been “well, you’re relying on undefined behaviour”.

The whole nonsense can be summed up as follows: according to the insane definition of “correct” used by the compiler writers,

  struct { char c; int i; } a, b;

  bzero(&a, sizeof a);
  bzero(&b, sizeof b);

  if (something) return memcmp(&a, &b, sizeof a);
  return 3;

could legitimately be optimized to a function that just returns 3, because the writes to padding in bzero are undefined, as are the reads in memcmp. Therefore the memcmp is undefined, and the compiler can assume undefined behavior cannot occur, therefore that branch can be removed.

Whether the compiler does do that today isn’t relevant: the definition of “correct” used by compiler developers is that that is valid behaviour from them.

The reality is that modern c/c++ developers need to treat the compiler as an adversary, because of a nonsensical definition of “correct code generation” assumed by the compiler writers.


Believe me, I know how compilers exploit undefined behavior and how this can lead to security issues. While you may disagree, there are certain useful optimizations that all but require the compiler assuming that undefined behavior cannot occur. Yes, occasionally compilers end up doing somewhat stupid and annoying things in the process or go "overboard" (Clang is notorious for this–I think some old version would just stop writing out code if it hit undefined behavior, so control would just run off the bottom of the procedure into the next one) but in general there are good reasons behind this choice being made.

> The original purpose of “undefined behaviour” was to deal with the myriad different platform behaviours of the era (remember at that point in history 1 byte was not necessary 8 bits).

FWIW, CHAR_BIT is implementation defined (not undefined) to be at least 8.

> Therefore the memcmp is undefined, and the compiler can assume undefined behavior cannot occur, therefore that branch can be removed.

I am fairly certain that this memcmp is "implementation-defined" (namely, reading the value of padding bytes is valid but not guaranteed to produce anything useful), which allows the compiler to pick whether it wants to emit that branch (but it must still return some value). If the memcmp was actually undefined the compiler would be "justified" in producing code that wiped your hard drive.


Undefined behavior != implementation defined.

Also, is it that much harder to do this (C++ for example):

    struct S {
        char c = 0;
        int i = 0;
        bool operator==(const S &other) const {
            return c == other.c && i == other.i;
        }
    };
    
    S a, b;
    if(something) return !(a == b);
    return 3;
Bam, no undefined behavior and you don't have to remember to manually zero out the structure everywhere.

If you're using memcmp on an object, you should be asserting that std::has_unique_object_representations<T> is true for it (unfortunately this is only available after C++17).

Also, if you are okay with anonymous field names:

    std::tuple<char, int> a, b;
    if(something) return a != b;
    return 3;



