Yes the code is bad but does not need to be completely rewritten. It should interpret the data as an array of uint8_t (no casting to structure/integer pointers!), use simple helper functions to read out (u)int(8/16/32) values (there was a post on HN about this recently, https://commandcenter.blogspot.com/2012/04/byte-order-fallac...), and be careful to check things like sizes.
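For example, something along these lines (a sketch; the names are just illustrative):

#include <stdint.h>

/* Read a 16-bit big-endian (network order) value from a byte buffer. */
static inline uint16_t read_u16_be(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Read a 32-bit big-endian value; the casts keep the shifts in unsigned
   arithmetic instead of a promoted (signed) int. */
static inline uint32_t read_u32_be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  | (uint32_t)p[3];
}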
The code is also wrong because of strict aliasing. This is a real problem, your program can in fact exhibit undefined behavior because of this (it happened to me).
Some time ago I wrote some code to make these things simpler in C++. It allows you to define "structures" (which however are not real structs, it's an abstraction) and then directly access an arbitrary buffer through setters/getters. The code is here with documentation: https://github.com/ambrop72/aipstack/blob/master/src/aipstac... . In fact this is part of my own TCP/IP stack and it includes DHCP code (here are my DHCP structure definitions: https://github.com/ambrop72/aipstack/blob/master/src/aipstac...).
Interestingly (and pedantically), the code from the otherwise excellent "Byte order fallacy" article technically contains undefined behavior in C on machines where int is 32 bits wide (which is almost everything nowadays).
Consider this example of extracting a 32-bit int encoded as little endian:
#include <stdio.h>

int main(void)
{
    unsigned char data[4] = { 0, 0, 0, 128 };
    unsigned int i;

    // the line of code from the article:
    i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);
    printf("%x\n", i);
}
Compiling this with either gcc or clang using "-fsanitize=undefined" triggers this runtime error:
test.c:9:61: runtime error: left shift of 128 by 24 places cannot be represented in type 'int'
That happens because, even though the bytes are unsigned as the article requires (using uint8_t instead of unsigned char has the same problem), they are promoted to (signed) int before the left shift as part of the usual integer promotions. When the left shift by 24 happens, it puts a 1 into the sign bit of the int, which is undefined behavior. Later the int is converted to unsigned (relying on implementation-defined behavior, which luckily is sane on all popular compilers), but by then it's already too late.
The solution is to change the code slightly:
i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | ((unsigned)data[3]<<24);
This converts the byte from unsigned char (or uint8_t) to unsigned int before the shift, rather than letting it be promoted to (signed) int, so it can be safely shifted left by 24.
Yeah casting to struct pointers is a bright red flag. It also rarely saves you anything since packed structs are slow, and you still have to validate the contents.
You're right, these streams should be interpreted as uint8_t to avoid undefined behavior when shifting, masking, and ORing, and anyone doing this should tuck it behind some abstraction layer. It should never bleed into parsing.
In my second year at Google, I fixed undefined behavior that appeared in essentially every Google binary, because a core library did this without caring about the alignment of the memory in question. Thank god with PODs you can just memcpy.
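For example, to read a single 32-bit field out of a possibly misaligned buffer, something like this is enough (a sketch; the name is made up):

#include <stdint.h>
#include <string.h>

/* Copy one 32-bit field out of a byte buffer that may be misaligned.
   memcpy avoids both the alignment trap and the aliasing UB of
   dereferencing a casted pointer. */
static uint32_t load_u32(const unsigned char *buf, size_t offset)
{
    uint32_t value;
    memcpy(&value, buf + offset, sizeof value);
    return value; /* note: still in whatever byte order the wire used */
}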
Fair, but people do the packed struct thing because they're either lazy (don't want to memcpy individual fields) or they're trying to avoid a perf hit from memcpy. That perf hit is on the scale of using packed structs, except you only pay it once. So it's really not worth it.
I've written a good deal of very high performance code that does a great deal of parsing of such 'packed structs', and I intentionally do not use memcpy, for performance reasons. For one thing, note that many compilers never inline calls to memcpy--it's always a function call. In this kind of code, that alone is sufficient reason to avoid using it for accessing individual fields.
Sigh. Of course I say that, and then try to reproduce it on godbolt, and can't, at least not with a trivial example. So maybe that problem is not as widespread as I thought.
Yeah I'm completely ok with reading into a packed struct. I think you may as well and indeed in most cases it's more explicit and succinct. I think using that internally leads to problems, so generally I advocate (if you're going to do this) using those as "packets" or whatever that are just PODs. You can validate those, then build higher level, internal structs from them. Generally I find whenever you see packed structs anywhere but the "I got this from the network/filesystem" level, the devs aren't considering taint analysis, which is a red flag to me. You shouldn't be dealing with external data outside a specific, well-defined layer.
As an array of uint8_t? I'm pretty sure that's wrong, and that it should be char instead. The usual OS network APIs are char-based, and the fact that the network is technically about "octets" shouldn't matter behind those APIs. Or is my thinking wrong?
There's a different consideration: as far as I know, it's an open question whether uint8_t* and int8_t* have the same special aliasing rules that apply to char* and unsigned char*. A strict reading suggests that they do not. So there may be cases where using uint8_t* is UB but using unsigned char* is not.
I think that all compilers consider this to be a bit of a defect in the relevant standards, and they'll compile your code the way you expect.
On virtually all modern computers chars are 8-bit and casting the address to uint8_t* is perfectly fine (due to specific provisions in the standards that I am too lazy to find). And you need unsigned so that expressions like x + (y << 16) do the right thing. With that in mind, I use uint8_t as opposed to unsigned char to emphasize that it is 8-bit and also for less typing :)
> On virtually all modern computers chars are 8-bit and casting the address to uint8_t* is perfectly fine (due to specific provisions in the standards that I am too lazy to find).
uint8_t is guaranteed to always be an octet, and the stdint types (uint8_t, int32_t, uint32_t, etc.) are also the recommended types to use when you want to make the size and signedness of the data clear. An array of uint8_t is definitely correct, and it's also the right thing to do in 2018. An array of char is something you should only use to store human-readable characters.
Many OS network APIs are "char based" because the C standard requires that sizeof(char) be exactly one byte, so char used to be a convenient shorthand for a raw byte, but that's no longer the best choice now that the stdint types exist.
One of the things I was not sure about is whether the size of uint8_t is guaranteed to be 1 as well, and whether that means the two are pointer-compatible. And if so, where is this specified?
As far as I know, C99 makes no requirements about the byte width of int8_t, only about its bit width. The relevant bits are in section 7.18.1.1.
There are few architectures and runtimes where this is relevant anymore, and the answer is really best given on a case-by-case basis.
I think it's technically possible to make the two pointer-compatible on architectures with wider bytes. Last time I used one it was Analog Devices' SHARC, and I think they were pointer-compatible, but that was a long time ago, so take it with a grain of salt.
Downvoters, my question was whether int8_t* and char* are interchangeable. I can't see why that's a bad question to ask. Given "char* buf", there might well be a difference between "(uint8_t) buf[1]" and "((uint8_t*)buf)[1]".
Now, upon further research it looks like uint8_t always has size 1 (as does char), but I'm still not positive that the second expression is technically valid.
> my question was whether int8_t* and char* are interchangeable
They are not. Not every "C" compiler complies with whatever standard you care to cite. There are DSP platforms[1], for instance, where char is 16 bits. Obviously any hypothetical pointer arithmetic on such a platform must differ between (u)int8_t and char.
The intent of (u)int8_t is that such a platform will either a.) correctly implement (u)int8_t and operate on exactly 8 bits as intended or b.) fail to compile. This is demonstrably different from assuming char is 8 bits and yet compiling when it is not.
I don't think I disagree, but could you articulate how strict aliasing makes this code wrong? Maybe I'm not looking at the right code, but it's not immediately obvious from the post to me.
That code looks somewhat wonky and seems either legal to me, or illegal for non-aliasing reasons. In principle, taking a pointer to a blob of bytes and casting it to a pointer to your struct should be sound if you can assume that the blob of bytes was originally a valid object of that struct type. I can see how you might get it wrong if that's not in fact how the blob was originally formed or if you get some alignment details wrong, but I can't immediately think of any aliasing-specific problems.
The problem is that it is "potentially unsafe" depending on what OTHER code does. For example, if some other code happens to manipulate the same bytes as an array of uint16_t (but not char/uint8_t) it's wrong (like in that example). Similarly if some other code manipulates the same bytes as some other struct type, it's also wrong. Yes it is possible to do it this way and not have undefined behavior, but you are leaving trap doors open that you don't even know about.
C's strict aliasing rules don't allow you to cast a pointer to a char array to an arbitrary type T, regardless of whether it has correct alignment and contains a valid representation of T. They do allow you to memcpy from a char array to a T variable (assuming a valid representation).
And yes, it does actually produce broken code at runtime when optimizations are enabled on some compilers, unless you explicitly opt out of the optimizations that depend on strict aliasing.
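Concretely, the difference looks roughly like this (a sketch; struct foo and its field are placeholders, not code from systemd):

#include <stdint.h>
#include <string.h>

struct foo { uint32_t field; };            /* placeholder wire format */

void example(const char *buf)              /* buf holds bytes off the wire */
{
    const struct foo *p = (const struct foo *)buf;
    uint32_t a = p->field;                 /* the cast compiles, but this access
                                              violates the effective-type rules
                                              (and may also be misaligned) */

    struct foo f;
    memcpy(&f, buf, sizeof f);             /* allowed: copies the representation */
    uint32_t b = f.field;                  /* well-defined */

    (void)a; (void)b;
}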
In fact the C aliasing rules don't say a single thing about casting. You are free to cast at will. What matters, as far as aliasing is concerned, is the effective type of an object for access through an lvalue. And the effective type is.. it depends. But it is not the same as the type of the pointee of whatever pointer you might or might not have.
Note that arrays are not lvalues, so the effective type of an object is never an array of char or any other array type. You can, however, access any object with a character type, and that does not change the effective type of said object. The standard explicitly permits this!
That means I could allocate some memory, copy an object in there, make a pointer-to-array-of-char, pass the pointer onwards, and let the next guy in row cast this pointer-to-array-of-char into a pointer-to-the-type-of-the object I previously copied, perfectly legal. And if I didn't copy an object, it is still perfectly legal. A random piece of memory has no effective type unless it's gained one due to a previous access through a non-character type.
Yes, you're right. When I said "cast a pointer to an array", I meant it more colloquially - as in declaring a variable of type char[...], taking a pointer to the first element, and casting that. And yes, it's about access rather than the actual cast, but in practice a cast to anything but void* is virtually always the first step to accessing it as the cast-to type (excepting some legacy POSIX APIs that use char* for this, because they predate the existence of void*).
And yes, there's the exception to the usual rules that lets you access T via char*, and then there's the "common initial sequence" rule with structs. Suffice it to say that it's complicated, but that things that people usually think "just work", actually don't.
FWIW, I'm of the general opinion that the C (and C++) memory model is formulated in such vague terms that no-one really knows what it is. We have some sort of conceptual consensus, that kinda sorta works because everybody makes the same assumptions (that aren't really warranted by the standard, but are "common sense"). But once you start digging into things like lifetime and object type - in C++ especially - and coming up with weird corner cases, things break down pretty quickly.
Yeah, there's a certain level of hermeneutics involved in arguing about undefined behavior, it's a great source of unending entertainment because no one can prove you wrong _definitely_.
I've long had daydreams of a C or C++ implementation that went out of its way to dynamically track all those ephemeral distinctions described in the standard but not commonly made concrete in compiler diagnostics or emitted code. Maybe this implementation would take some liberties with regards to the prescribed space/time complexity of certain operations, but that'd be okay, since it's pedagogical and not for production use.
Wouldn't it be amazing to have every bit of undefined behavior at runtime (except for some of the more esoteric ones, I guess) result in diagnostic messages outlining the bad state that the participating objects are in and how they got there?
Like, at runtime, keep track of the dynamic type stored at each byte of memory, and then have debug logging output as the runtime evaluates the strict aliasing rules for each memory access until it finds a clause that makes it valid. Then we mere mortals can actually have arguments about these kinds of things and forward "falsifiable" theories!
I feel like setting this up in some sort of C interpreter can't be fundamentally nearly as hard as the inner workings of current-day optimizing compilers, so I keep thinking that someone must have done this already...
That's interesting - I had such thoughts as well, way back after my first experience writing significant quantities of C++ (03 back then) in the industry.
And yes, I think this would require an interpreter, and some kind of shadow memory as you describe (that keeps track of type and lifetime metadata associated with bytes). In fact, I think it would even have to be multi-level segmented memory, with segment per each object (including subobjects!), to fully enforce rules such as out-of-bounds array access, and comparing pointers to different objects.
It could be an interesting exercise. And if written in a verifiable language, the result could, in theory, be declared the specification for the memory model.
Nonetheless, it's quite commonly used in popular code. E.g. I was looking at some dynamic string libraries, and a popular one (sds) casts a struct header from a char * and modifies its values. This code is used in Redis.
And if you look at its build scripts, I bet there's -fno-strict-aliasing there.
All popular compilers let you opt out of these kinds of optimizations. But it's no longer standard C at that point - and it's specifically because of strict aliasing rules (which were being discussed in this thread).
The bugs you get when you break the rules without opting out of standard compliance are not theoretical, either. Just google for "strict aliasing bug" to see numerous horror stories.
> The byte order of the computer doesn't matter much at all except to compiler writers and the like
This statement is too generic to mean anything helpful.
I think the point our friend Rob Pike is making is that you could write general purpose code which accomplishes the task, and on a sufficiently advanced compiler, the swap may be eliminated where applicable; but when the point is made this way, it surely just pisses off people who know better.
C programmers are not (quite) children, you don't need to eradicate nuance to keep people reading.
The article isn't really about the bug but about the (lack of) quality in the code. Spoiler alert: He doesn't like it.
"This code has no redeeming features. It must be thrown away and rewritten yet again. This time by an experienced programmer who know what error codes mean, how to use asserts properly, and most of all, who has experience at network programming."
If the rest of the code looks anywhere near similar to the examples he shows he’s right, this is terrible code and it should never have been allowed in a serious project in this day and age.
It seems like this day and age is precisely when this sort of code is normal. We had a brief window of hopeful optimism in the mid to late 90s and even into the early 2000s, but since then it seems like code quality in general has declined tremendously.
It's not even really a replacement for NetworkManager; it's designed to be used to provide basic networking within containers, for systemd-nspawn. It can be used on host systems, but it's deliberately fairly lightweight and designed for fairly simple configurations in containers, rather than as a full fledged replacement for other networking management daemons.
I agree with most of the critique, but I don't quite get the part about the use of assert.
The code in question quite clearly doesn't use C's standard assert macro, but a custom macro called assert_return (which I guess checks for a condition, and returns from the function instead of aborting the program).
So basically:
if (!condition) {
    return err;
}
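Or, spelled out as the macro itself presumably is (just my guess, not systemd's actual definition):

#define assert_return(condition, err)          \
    do {                                       \
        if (!(condition)) {                    \
            /* possibly log something here */  \
            return (err);                      \
        }                                      \
    } while (0)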
How else would one check for invalid input to the function?
I don't know the details of DHCP, but there is a difference between values supplied by the programmer and those supplied by the user. According to the author, checking if a pointer is not-null is checking a programmer error (EINVAL) so it makes sense to use asserts. But the optval string in the example is supplied by the user so it is insane to use asserts to check for null-termination.
Asserts should only be used to signal "this code is broken" but that is not how they are used here.
A long while ago the GTK+ library used to suffer from a similar affliction. Mainstay GTK+ programs such as GEdit and Totem would print lots of warnings on the console. Since the GTK+ developers used the same facility (g_warning) for reporting both coding errors and user errors like "file not found," debugging code was hard. Was the warning "supposed to be there" or was it a bug? GTK+ programs are much cleaner these days, even if some programs still fail to make the distinction between user and programmer errors.
I may have a different understanding than the author, but for me the difference in the name means a lot. The fragment you wrote is clear and simple. But if I see assert_..., then I immediately have questions like: Where's the message printed and does it have enough context for debugging at that point? Is this affected by NDEBUG? Will it run in release builds? Does assert_se() kill the whole application?
Assert already has a meaning and using it in this context is not great.
I can see the argument to be made about overloading the word "assert" in the language though.
i.e. "assert" meaning "check something in this code is algorithmically correct"
Whereas when you're checking user input, use a very different term - i.e. validate or something.
So a fix would be to rename the macro to be obvious about its usage - "validate_return" or "valid_or_return" or something.
To be fair, I can see the problem, because I do sometimes tend to write code comments saying "assert everything makes sense" when checking user input, and "assert" itself doesn't properly convey the outcome if it fails - i.e. "abort_if" or "crash_if" would be a much more descriptive term.
In my opinion diluting existing, well-established terms is extremely dangerous and harmful for a project. If you're diluting established terms - like assert - you're kinda forcing readers to stop assuming anything about your names.
If assert isn't assert, and if return values are also abused, why can I assume get is get, sort is sort and so on? Sort might check inputs, get might delete the customer account, who knows at this point. And that makes anything but a tiny code base pretty much unmanageable, or at least very very hard to learn.
I used assert_* for code bugs and verify_* for the user-input checks; my projects would have a verify.h/.c pair that was pretty short/sweet/easy for the next dev to grok.
At one point an older coder who I looked up to even complimented and adopted this pattern. 1998 was a good year.
I sometimes use a bunch of Ensure*() methods in C# if I need to validate input data. They throw exceptions if a condition fails. If I only do this once, I don't create those methods.
One drawback is that the stack trace has these Ensure names in it.
Probably returns a value if the assert works, and aborts otherwise.
I would go so far as to say that using asserts in input validation is a good thing, in very limited quantities. I use them in places where the input should have been validated so long ago, so many functions back, that if something isn't as I expect here, I seriously fucked up somewhere else and the whole system has a good likelihood of being compromised by something unrelated to the condition I'm asserting. In that scenario I'd rather log and abort.
In this case, though, you're not using asserts to validate. You're using asserts to capture the post-validation guarantees in your code. Basically, you assert that your validation code did the right thing and rejected invalid input through appropriate mechanisms - not that the input was valid.
That's what systemd does. For instance, the article uses a couple of asserts in dhcp6_option_parse_domainname(). By the time that function is getting called, the values should have already been validated, and the asserts (which are at the very top of the function) are asserting that the validation code did the right thing.
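In other words, something with this shape (a sketch; the function names only echo the ones in the article and are not the real code):

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Inner layer: callers are required to have validated the input already,
   so a failure here is a programming error, not a user error. */
static int parse_domainname(const uint8_t *optval, size_t optlen)
{
    assert(optval);
    assert(optlen > 0);
    /* ... actual parsing ... */
    return 0;
}

/* Outer layer: deals with untrusted input and reports errors normally. */
int lease_get_domains(const uint8_t *optval, size_t optlen)
{
    if (optval == NULL || optlen == 0)
        return -EINVAL;          /* bad input is an error, not an assert */
    return parse_domainname(optval, optlen);
}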
Then why do those "asserts" have return error codes associated with them? A failing assert shouldn't use normal error reporting mechanisms, and it most certainly shouldn't return a specific error code implying that assertion failures can (and should) be handled.
For the same reason read(3) returns EBADF if you pass it a bad file descriptor. Why doesn't the libc wrapper around the syscall just abort() if I give it a bad file descriptor?
In systemd-networkd, the "asserts" assert the pre-conditions of a successful call to that function (in dhcp6-option.c). It's up to the caller (in sd-dhcp6-lease.c) to validate the users input, and ensure that it's passing valid arguments to the functions it's calling.
Building software in layers, and making use of "internal" libraries is a valid software engineering strategy for managing complexity.
> For the same reason read(3) returns EBADF if you pass it a bad file descriptor. Why doesn't the libc wrapper around the syscall just abort() if I give it a bad file descriptor?
Because, unfortunately, POSIX requires it to do that. But that's a bad decision on the part of POSIX, and we shouldn't perpetuate that when we have the choice. Assertions that can be "handled" are not assertions - they're errors. By returning an error code, you're saying "it's okay to pass bad things here without checking, and check the error code afterwards". If it's not actually okay, because validation is not reliable, then you're just enabling writing bad, unreliable code that shouldn't even be possible to write.
The "bad descriptor" check in read() is a good example. You can never rely on EBADF unless you fully control every line of code that ever runs in your process, because a descriptor might be accidentally valid - it might be a descriptor that was closed, but then got reused for something else (and POSIX explicitly requires reuse of file descriptors). So checking for EBADF to e.g. detect closed descriptors and do something about that will mostly work, but it will sporadically break by doing something on the wrong descriptor. Therefore, it would be strictly better if read() just called abort(), and thereby forced the caller to use some other mechanism to keep track of what's valid and what isn't - the one that the caller can implement reliably, if it can.
In D, we call the two functions "assert" and "enforce". The point is that assert is the wrong term; to say "assert(condition)" says "I assert that condition should hold here". If there's an input event that causes condition to be false then your assert is not just false but wrong, since you assert something that is not necessarily true.
So this is a whole rant, and it's pretty much spot on, probably, but the reader is left with a huge elephant of a question in the room: what is the proper way to do it, then? It has samples all over the place of what's bad and says those should be rewritten. Why not show some C/C++ code samples of the proper way to do it, then?
I wonder why the author chooses to write buf[offset]*256 + buf[offset+1] instead of the more common method of loading bytes using bitwise arithmetic, e.g. b[k]<<8 | b[k+1]. That is far clearer and avoids potentially magic numbers. Does Javascript not have bitwise arithmetic? Why would you write C code with Javascript in mind?
The silver bullet is to use the standard library that provides appropriate abstractions, such as byte streams with functions like "read a 32-bit little-endian integer". Nobody should actually be writing code like the above in most cases; and in the very very few cases where they have to (e.g. when forced to program in an environment where these haven't been provided), they should write it exactly once as a reusable function, complete with appropriate tests.
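For example, a small bounds-checked reader on top of a byte buffer could look like this (a sketch; the cursor type and names are invented for illustration):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct cursor {
    const uint8_t *p;
    size_t left;
};

/* Read a 16-bit big-endian value, refusing to read past the end of the buffer. */
static bool take_u16_be(struct cursor *c, uint16_t *out)
{
    if (c->left < 2)
        return false;            /* caller treats this as a parse error */
    *out = (uint16_t)(((uint16_t)c->p[0] << 8) | c->p[1]);
    c->p += 2;
    c->left -= 2;
    return true;
}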
I mean, I don't actually know what the correct way to do this is, because it's been a long time since I thought about endianness. I just wrote down roughly what I remembered from the article. Also, of course there is no silver bullet. Everyone knows that RCEs are still achievable against Python or Java servers. But there are still better and worse ways to do things, and there are certain categories of error that you cannot make in Python and Java that you can make in C, where the opposite is rarely the case.
The main issue to me is that conversion, parsing, and validation are scattered throughout the code when they should be in separate, local, sequential steps. You should never be in a situation where you're halfway through allocating memory for an internal structure when suddenly you realize your input is malformed, for three reasons:
- In order to make a good decision about what to do next, you need a lot of context: what you need to clean up, what any calling function expects you to do, any internal state you need to update, etc. You either end up passing a lot of extraneous context info around, or you make everything global.
- Alternatively you bail with a status code and mutter "yolo" to yourself, or you throw an assert -- but crashing a system daemon with input is the essence of a DoS, so only the truly ignorant or lazy resort to it.
- Building a reasonable framework is hard when all your concerns are entwined like this. If at some point you decide asserts on input are bad, it's very hard to change that convention in this code. Essentially all your function signatures and your entire call graph change. There's an extraordinary amount of technical debt here; just look at all the places Lennart had to change the size/offset stuff. The logic for this is scattered all over the place.
People have focused on some of the tactical issues -- asserts on input, byte order conversion, etc. -- but the real issue is that when you write code in this style those things don't jump out at you because you have to do them everywhere. If your code had a better separation of concerns, as soon as you saw 'offsetof' in a parsing function you'd immediately see a red flag. It's so much easier to think in these specific contexts, where you're assured previous steps are complete, rather than in every single function having to get the entire conversion/validation/parsing flow correct every single time, keeping the entire thing in your head always.
As an aside, I find it a little ironic that I found it hard to parse the headline "Systemd is bad parsing". No, Systemd is not bad parsing, Systemd is parsing badly.
If this were not buried in systemd, but was part of a standalone dhcpd package, would it have been done better or caught earlier?
One of the major criticisms of systemd is that it wants to take over all system management functions. In order to do that it needs to offer at least minimally functional versions of all the services it pushes out. It should not be news to anyone that trying to do everything instead of integrating with existing projects leads to sloppiness.
This incident is evidence suggesting that other serious issues are lurking in systemd as a result of the actions naturally arising from their policy.
I would only expect that if the people rewriting everything were good programmers and said they were rewriting it to be secure. Most rewrites are done for non-security reasons and don't improve security.
Programmers should give appropriate attention to security regardless of whether they are doing either original work or a rewrite, and in the latter case, regardless of the reasons for the rewrite. I would not regard those who do otherwise as good programmers. Clearly, a DHCPv6 client within Linux' system administration suite is a case where security is an issue of the first magnitude.
Programmers should do a lot of things that most of them do not do. The statement was "I would expect..." and I replied that it doesn't make sense to expect those things because the vast majority of programmers don't know or care about security at all. We still see the same trivial SQL injections being written again and again on today's web, often by 20+ year programming "veterans".
One of the more popular DHCP clients is 'dhcpcd', which was created and is maintained by Roy Marples, the creator of OpenRC. It's fine when someone wants to work on DHCP clients and init systems, unless that init system is systemd?
(And for the record, systemd integrates wonderfully with dhcpcd.)
dhcpcd is not solely or even primarily used with OpenRC, it just happens to be created by the same person. What systemd is doing is something else -- not recreating one thing because there is a lack of implementation diversity or the existing implementations are subpar, but reimplementing everything even when there are multiple existing reasonable implementations, just because.
Rewriting everything is fine, as long as it's not in C or a similarly insecure-by-design programming language, it doesn't invent DSLs on top of INI/YAML/JSON files, and it uses a proper architecture with clearly defined boundaries and protocols, allowing custom application-specific management instead of forcing everyone into the author's narrow worldview full of incorrect assumptions. So, as long as you are not doing what systemd does.
Centralized management/monitoring in general sucks. People don't seem to understand the problem and keep reinventing new centralized architectures suitable for their own special cases, instead of addressing that on architecture level.
>> One question: any smell of intentional obfuscation for this vulnerability, or is it impossible to tell?
I always think that anytime there's an interesting bug in ANYTHING. I'm not an expert, but I would think it would be nearly impossible to tell the difference between "someone made a mistake" and "someone made this look like a mistake". I would think any decent programmer with malicious intent could do that anytime they wanted.
I suspect in advance of the 1990s it was a problem to people who specifically cared about such things, and would read or write papers on the topic. In the 1990s it started to become a problem for a lot of people who really only cared about putting their programs on a widely-accessible network. ;)
For some further, semi-amusing support for the title's assertion, save all of your work, close important programs, and then run 'sudo systemctl --dry-run reboot'
The article declares that "What you should do instead is parse data the same way as if you were writing code in JavaScript," followed by "pointer arithmetic is wrong" (paraphrasing). That seems a bit extreme...
gcc runs in -fno-permissive mode by default; it will give you a warning if you try to implicitly convert between pointers of different types. (Turn warnings into errors with -Werror so that you have to deal with these messages.)
Now what I don't understand is that for C you are not allowed to disable this: -fpermissive is not allowed for C, whereas for C++ it is. Does anyone know the reason why? The compiler error says "command line option "-fpermissive" is valid for C++/ObjC++ but not for C" - why does the compiler make this distinction?
I typically disable ipv6 unless I specifically plan to offer/use it as a public service. Just more things to keep an eye on for no benefit to the actual business.
Just like me and many other sysadmins. What amazes me is that the people who designed IPv6 still don't get it and seem surprised that it hasn't taken over the Internet yet.
A lot of mobile phones access the internet via IPv6 by default, and IPv4-only sites that want to reach them require non-trivial compatibility infrastructure like carrier-grade NAT or more complicated shenanigans. It seems like some sites are starting to identify these shims as a huge stumbling block in their continuous work to improve latency/quality-of-service.
I think Facebook in particular made a bunch of noise in that direction and celebrates how prioritizing IPv6 has let them make huge improvements in load times etc, but at this point you're as likely to find good information about that on Google as I am so I'll spare you any links.
So while IPv6 might not have taken over the Internet in any particularly observable manner so far, I think it's fair to say it has huge momentum at this point and has buy-in way beyond some IETF nerds and protocol fetishists.
> It seems like some sites are starting to identify these shims as a huge stumbling block in their continuous work to improve latency/quality-of-service.
I'd be curious to know how much time you can actually save because really, honestly, in the case of Facebook this particular aspect should be the last of their concerns as far as page load times are concerned.
Hundreds of milliseconds per TCP (or UDP) session. The CGNAT boxes are handling hundreds of thousands of flows. And they're part of failover clusters so each flow has to be shipped to the other cluster members and acknowledged before starting the flow. And these things are often overloaded and adding tens of milliseconds delay to each packet passing through. Because whoever heard of an ISP spending more than "just enough" on hardware?
You see, my point is that these milliseconds are negligible in the case of a website like Facebook that takes 20 seconds to fully load on a reasonably modern machine.
You may already be behind an overloaded CGNAT then. Over IPv6 on Comcast cable, with the Edge browser, Facebook is fully loaded for me in 3 seconds. Surface tablet in power-save mode so this isn't some overpowered desktop monster.
That's very interesting. I've just measured it in private mode, all plugins off, from the moment of clicking login till the time the page load indicator stopped running. 13 full seconds.
I mean, we're probably not talking about a dozen milliseconds out of 20 seconds total, we're talking about extra milliseconds for every request or roundtrip or w/e constituting those 20 seconds. It's an extra cost that scales with the bloat you're decrying, it doesn't become insignificant next to it!
I honestly think the creators of IPv6 thought their customer was all the folks buying devices and forgot their real customers were all the sysadmins. They totally failed at making it an easy and rewarding jump from IPv4 to IPv6. Enthusiastic sysadmins would have got the conversion done years ago.
... which is what all those people say over and over who don't have the slightest clue how one could possibly build a system that has an "easier or more rewarding jump". It's always just a completely unsubstantiated "it should have been easier" (when nothing about it is actually difficult) and "it should simply have been backwards compatible" (with no word about how this inherent impossibility should have been accomplished).
Every criticism of IPv6 is downvoted into oblivion, but it doesn't change the fact that great technology is adopted by the target audience quickly if there is this much effort behind it. IPv6 hasn't been, because it is painful and doesn't make things easier for the folks who have to implement it. It is difficult, and it is different, and they did a poor job of dealing with the conversion from how IPv4 is used (NAT). Sun spent money on Java and got huge adoption because it solved some serious problems with C++. The amount of broken IPv6 code should have been a clue.
So, you are doing exactly what I said: You say it should have been easier and (more) backwards compatible. With absolutely no word about how that could possibly have been done, nor really even about what the supposed problems actually are, other than a vague "difficult and different". Absolutely nothing of substance.
No, you aren't getting it. The "supposed problems" start with most networks using NAT and a real lack of a conversion or easy bridge strategy. Are you actually saying a network of any size is easily converted to IPv6 with all services coming up fine? It's work, and utter disdain for the folks who have to do the work just adds to the problem. They built a second system with all the problems of feature creep when what we needed was some more address space.
Also, what broken IPv6 code?
Check the Windows 10 Spring Creators Update for the latest examples, but a simple search on broken IPv6 code would have yielded the results.
> The "supposed problems" start with most networks using NAT and a real lack of conversion or easy bridge strategy.
First of all, NAT is neither a part of IPv4 nor specific to IPv4. IPv4 works just fine without NAT, apart from the lack of addresses, obviously, and there is nothing in IPv6 that prevents you from doing NAT. It's a stupid idea, but you can do NAT with IPv6 if you insist.
But I guess more importantly: IPv6 was invented when NAT wasn't all that common yet, so really, the original choice was between deploying IPv6 and installing NAT everywhere, and people chose to deploy NAT ... to then complain that there is no transition mechanism from the NAT dead end they chose to the IPv6 alternative they had right from the start?
And also, I don't really see the problem. Why would you need a special transition mechanism? Just don't use NAT?! There is relatively little actual benefit to using NAT anywhere, other than working around the address shortage.
> Are you actually saying a network of any size is easily converted to IPv6 with all services coming up fine?
Why would you possibly want to "convert to IPv6"? You just roll out IPv6 in addition to IPv4 and enable IPv6 on one service at a time and fix up any problems you see!?
NAT is not something that should be preserved, it is 100% a hack that fundamentally breaks the nature of ipv4 and introduces a shitload of statefulness in an otherwise stateless protocol. You can achieve (effectively) identical security semantics with a network firewall, NAT is not something that should be propagated forward.
How about some specifics rather than rhetorical noise: HOW exactly do you propose that converting a network should have been made easier? No matter what, it would involve upgrading every network service and every endpoint. This problem is not avoidable; if it were, we could just use the same mechanism to make IPv4 last longer.
Regardless of your feelings on NAT, it exists, it's deployed, and it needs to be part of the conversation from the start. If you want people to move off NAT, then it really needed to be gradual. Look at Perl 5 to 6 for how bad this stuff can go (heck, even Python 2 -> 3 was painful).
How about you add the additional 96 bits to the protocol (why 128 bits - that's boil-the-ocean numbers) and then the protocol sees all the 0.0.0.0 numbers as a shortcut for a standard 32-bit range (I suppose with bits above and below for eventual address expansion), while the : notation is the real full address. Then my day-one experience is to install the new software that is 128-bit aware and leave all my configs alone, since I have already been assigned those addresses in the IPv4 address space. Then we can do some phase 2 to get rid of the evil NAT and do other cleanup.
There are a lot of people much smarter than myself, and I cannot believe someone didn't have a great suggestion to cause a minimum of impact on all the folks running systems in the field. I hate burn-the-bridges system conversions.
> Regardless of your feelings on NAT, it exists, its deployed, and it needs to be part of the conversation from the start.
Why?
> If you want people to move off NAT, then it really needed to be gradual.
Why?
> How about you add the additional 96-bits to the protocol
Yeah, that's a common variant of "they should just have made it backwards compatible".
> (why 128-bits - that's boil the ocean numbers)
That is not how efficient routing works. You cannot expect to fill the complete address space unless you spend excessive amounts of resources on the routing equipment, which has exactly one effect: Boiling the oceans (and/or slowing down the network). In order to keep routing efficiency up, you need a far larger address space than the theoretical minimum required to contain all nodes, because that is how you keep routes aggregated, and therefore the routing tables in the core of the network small, so you can use cheap equipment to move massive amounts of data at low prices.
> and then the protocol sees all the 0.0.0.0 numbers as a short cut for a standard 32-bit range (I supposed with bits above and below for the eventual address expansion) while the : notation is the real full address. Then my day one experience is install the new software that is 128-bit aware and leave all my configs alone since I already have been assigned those in the IPv4 address space. Now we can do some phase 2 to get rid of the evil NAT and clean other clean up.
In other words: You don't have the faintest clue what you are talking about, what you are describing makes no sense, though it sounds somewhat like the common anti-IPv6 crackpot theories.
If you add even a single bit to the address, you have a protocol that is unavoidably completely impossible to interoperate with IPv4. IP is not some markup language, it's a frame format. IPv4 has a fixed sequence of bytes forming the packet header, and in a fixed position in that sequence there are 32 bits for the destination address. Every single IPv4 router and host on the planet uses exactly those 32 bits to make routing decisions. If you insert a bit somewhere, that just moves everything following that bit one bit to the right and therefore makes everything following that bit incomprehensible to every single IPv4 router and host on the planet. The only solution to that is to add support for the additional bit to every single router and host. Sounds familiar?
And it's not just that you have to upgrade every single router and host to complete the transition, it's also that a node that supports the additional bit cannot use that to talk to any node that hasn't been upgraded yet. If you only have an address with the additional bit, and you send a packet to someone who has only a legacy address and doesn't support the new format, they cannot send you a reply, because in order to do so they would have to know how to send a packet to your extended address, and every router along the path would have to be able to do so as well.
So, the only thing that remains that could theoretically have been done would be to reuse the existing IPv4 address space allocations as IPv6 allocations. But first of all, there is no advantage to doing so. The difficult part is adding the protocol support, address allocation isn't difficult. And then, there is a very good reason to start with fresh allocations for IPv6: Routing table fragmentation. The global IPv4 routing table now has ~ 700,000 entries as most ISPs or larger companies have tons of prefixes that they have collected over the decades. The global IPv6 routing table has ~ 50,000 entries, because most ISPs and companies have only one prefix. That means that a pure IPv6 backbone router needs less than a tenth of the TCAM of an IPv4 router, which should considerably reduce costs/improve throughput.
> There are a lot of much smarter people than myself, and I cannot believe someone didn't have a great suggestion to cause a minimum impact on all the folks running systems in the field.
Just as I said: You have no idea how it could have been done, you just know it should have been done better, because doing it better is easy, even though you don't have the slightest clue how.
> I hate burn the bridges system conversion.
Well, great then that IPv6 isn't one of those. You just deploy it in parallel with IPv4, gradually enable IPv6 everywhere, and at some point when you think IPv4 isn't worth it anymore, you switch it off.
It's not like that. People here can come up with plenty of ideas for how to do better, but they have no say in it or even the ability to somehow influence it. It's those large corps that push for crappy ideas, as they always need them to sell more stuff, raise barriers to entry, hurt competition, etc. The rule of thumb here is that if it doesn't bankrupt a corporation like Cisco, it's not a good enough idea.
This article started out really well, saying you shouldn't e.g. byte-swap as part of the overall task at hand, then wanders into comparisons with other languages. I had hoped it was instead going to say that you should use some verified parsing tool to unpack data. All the other languages, at some level, will byte swap! It is the choice of the level of abstraction (or lack thereof) which is the systemd bug here, not the choice of a C family language IMNSHO.
He talks like the way he writes code has been established as best practice for decades, and if he can just write enough blog posts and convince enough people, maybe we can solve this problem once and for all.
I think the only way to have high assurances is to program under a system that doesn't have these classes of errors. I don't care if it's a c static checker or a new language or what.
Unfortunately the long term solution to these problems in system programming seems to be rust. But it's going to take decades for it to be established and I'm going to look like a language hipster member of the rust evangelism strike force every time I bring it up.
But seriously, it's been decades and decades and we're still writing buffer overflows. What is the long term final nail in the coffin for buffer overflows?
> But it's going to take decades for it to be established...
I wouldn't be so pessimistic. Software is becoming increasingly complex and many people are starting to see that it is starting to get a little ridiculous. All we need is a tipping point in the form of a well designed kernel that works well. It doesn't need any hype, no HN announcements, nor major corporate backing. It just has to work and work well. That's how Linux got started. It filled a vacuum. We still have a vacuum in the form of crusty OS designs patched over with layers of code miles deep. That vacuum is the need for a new clean system for the networked distributed resource internet age.
What we need are more passionate hackers who work on this for the pure enjoyment and self satisfaction instead of selfish ego fueling; meaning don't code with the end goal of getting something on the front page of HN/Reddit or super git committer. Just code for the sheer enjoyment and challenge of it. Besides when you post these ambitious projects you get all sorts of jealous reactionary posts that attempt to pick apart your ideas and stomp on them. That can discourage people from making true progress.
What. That's a horrible idea. Not only would that require your hardware to understand abstractions several layers above what it actually operates on, it would also mean it'd have to check bounds of - for example - an array on every instruction accessing it. That is something compilers can do at more strategic places.
Please keep that complexity out of my CPUs, they're buggy enough already. At least with software we can fix issues later, which isn't a given with hardware.
LoadAddressIfSmallerThan makes for a long opcode name, but it doesn't look much more complex than your usual load. The added complexity is negligible.
And, of course, the added complexity of doing that in two instructions is also negligible. The entire problem seems to be that arrays are on a more abstract level than C operates. It looks like entirely a language problem.
Having the CPU do the checking within an instruction saves you instruction issue events. ARM have an interesting solution to this, pointer authentication: https://news.ycombinator.com/item?id=14479438
> several layers above what it actually operates on
The instruction set is something of an abstraction bottleneck - C is tied to machines that look something like a PDP-11, while a lot of the stuff that the machine is doing is invisible at the instruction set layer.
Pointer authentication is not a form of bounds checking. It solely exists to prevent a malicious attacker using a bug to directly overwrite pointers (and in practice, it's mostly limited to return addresses right now). Code which does buggy pointer arithmetic will still create valid pointers to data outside the intended buffer.
Turns out the x86 already has some bounds checking support; see chapter 17 in volume 1 of Intel's developer manuals and the BOUND and BND* instructions.
I worked (a long time ago) on Burroughs mainframes, which did bounds checking of arrays (and in fact even type checking), because the whole machine was built around ALGOL.
I am not sure whether the experience from these machines was negative. It may be that when you move beyond one-dimensional arrays, bounds checking becomes more complicated.
Perhaps performance suffers too. Languages which do bounds checking often have intelligent compilers which can skip the bounds check in a variety of conditions. So even if the hardware is faster, it may be doing unnecessary checks. See for example : https://www.ardanlabs.com/blog/2018/04/bounds-check-eliminat...
I also believe that bounds checking in hardware is not worth it. Memory accesses dominate the runtime such that a few extra instructions for bounds checking is negligible.
Segmentation was expensive (loading an 80286 segment register was very slow and the segment table was impractically small) but could express "offsets between 0 and 42 are valid". Paging is much coarser and suffers from false negatives, because x[y] might accidentally walk off the end of x, skip a guard page, and land on a mapped page from another object.
I'm curious; what does your memory hardware and cpu instruction set look like in this scenario? What's the interface between the cpu and memory? (rough idea, not asking for a whitepaper here :) )
Rust isn't the long term solution either. I like Rust, but in terms of fully provably secure systems, it's a stopgap until the usability of fully verified approaches like ATS, F-star, et al. reaches a point where they can be economical.
Also what the other person said about burning down x86 and starting over.
Those are all things that could fix the problem. But let me ask this another way. Imagine that you travel into the future a decade or three and a computer scientist tells you "pretty much no one gets buffer overflows anymore because of X".
It seems unlikely to me that the answer is going to be ATS, or "we burned down x86".
Ok, sure, if you only focus on buffer overflows. Those are the low-hanging fruit of software security. If in future decades we can say "professional software is pretty much secure because of X", X is absolutely not Rust. And I don't think we'll still be using x86.
> "The other thing that you shouldn't do, even though C/C++ allows it, is pointer arithmetic. Again, it's one of those epiphany things C programmers remember from their early days. It's something they just couldn't grasp until one day they did, and then they fell in love with it. Except it's bad. The reason you struggled to grok it is because it's stupid and you shouldn't be using it. No other language has it, because it's bad."
Honestly, this is a pointless rant. "Do this, but don't do that (because that's stupid)".
> "No other language has it, because it's bad"
The reason you "don't have pointer arithmetic in other languages" is - it's been abstracted away from you, for your convenience, so that you can just concentrate on your business logic and not have to worry about the rest. You may not deal with pointer arithmetic, but the people developing your JVM/CPython/V8/Chakra sure do. Author is simply ranting about the possibility of abusing/misusing the mechanisms offered by C/C++.
Programming as a profession is strange because we each do such different things but all tools are available to us all the time. Imagine if this were true in the physical world; you would read articles written by accountants like "Dynamite Considered Harmful" with rebuttals titled "Why Dynamite Matters" by construction engineers...
You see this all the time here on HN. "Interpreted languages are useless" coming from the systems and embedded people. "You can't call yourself a developer if you're not an expert in [FOTM web framework]" from the web devs. "Mutability is evil and reassigning variables is a sign of incompetence" from the functional programming sect.
It's almost as if things aren't really black or white, there's usually plenty of grey in-between. Things like use-cases, scenarios, trade-offs, intricacies. I find it awkward that a security shop would publish an article using that language - I would personally be embarrassed.
The rest of the comment is good, it adds to the argument, and I even agree with it, but there is no reason for this to be here. It doesn't add anything; it's just like calling names. From the HN guidelines:
> Don't be snarky.
> When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."
EDIT: To be downvoted like so for reminding people of the guidelines, HN sure is changing. I guess no good thing lasts forever.
Do you just take issue with the specific claim that no language should have pointer arithmetic, or do you disagree with the more general thrust of the post?
Sure, it's ranty and, like, maybe closer to op-ed than computer science publication, but I think even among the rarefied breed of people developing your JVM/CPython/V8/Chakra, you could easily find approval for the notion that pointer arithmetic is bad(*) and should be avoided as much as possible.
Array-indexing in C and C++ _is_ pointer arithmetic! It's syntactic sugar. Array-indexing in safe languages is bounds-checked iff the compiler can't prove the index value to always be in bounds.
In case anyone doesn’t already know, this isn’t a figure of speech. array[5] literally translates to *(array + 5) in C. Since + is commutative, that means 5[array] is a valid, equivalent expression.
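A trivial demo, for anyone who wants to see it compile:

#include <stdio.h>

int main(void)
{
    int array[6] = { 10, 11, 12, 13, 14, 15 };
    /* all three expressions name the same element */
    printf("%d %d %d\n", array[5], *(array + 5), 5[array]);  /* prints "15 15 15" */
    return 0;
}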
Additionally, I don’t think it’s reasonable to extrapolate from the mess that is systemd to all C/C++ code. Some fantastically reliable systems have been developed with them.
The rather less strongly titled "Systemd-networkd memory corruption" pointing to the much more detailed Ubuntu bug has also been on Hacker News for a day.
Some of these complaints are more "waah, I don't like C and what people use pointers for!", which is a shame, because some are pretty reasonable concerns.
systemd works, that's why people use it. If it needs improving, then improve it, or file these complaints as issues. Somebody will probably agree and at least look into them, rather than a gaggle of strangers pointing and laughing.
I have to admit I stopped reading after the combined section on byte swapping and casting external data to an internal structure. Perhaps the author of the code should have used ntohs() instead of be16toh(), but the effect is the same? There aren't any conditionals in ntoh...
The argument against byte swapping suggests that you shouldn't take a value of a numeric type and apply some (maybe no-op) permutation to it to produce another value of the numeric type. Instead, you should only ever think about byte order at the step where you have an array (or stream, or...) of individual bytes and are converting it to a value of a numeric type (larger than uint8).
I think this is also the main point of the post you linked: the "bad" example has a numeric value and then performs byte swapping on it, the "good" examples only ever construct "correctly ordered" numeric values.
If you can possibly get to a place where you have a uint16 but the value is "in the wrong byte order", you've done something wrong already.
Admitting the possibility of values with the wrong byte order into your system casts every single value into doubt. Of course it's possible to resolve that doubt on a case-by-case basis by carefully identifying the right places to insert ntohs()/htons() calls, but to me that sounds like creating extra chores for yourself while working against the language.
This philosophy is similar to Python 3's hard separation of bytes and strings. Bytes are raw data you get off the wire and which are encoded in a certain way. Once decoded into a string, you have a clean, consistent internal representation that behaves as you expect a string would. If you're juggling (or even thinking about) byte encodings past the parsing stage and before the output stage, you've done something wrong.
Yeah, it generalizes to an argument for using types to encode assumptions about data, so you have static certainty/remove dynamic uncertainty. I'm strongly in favor of all arguments that let me remove sources of uncertainty from my mental model, since enough stuff I'm uncertain about always remains.
Yes, agreed. It is a step backward to manipulate and recombine bytes. "Extracting an integer from a packet, then changing its internal format" is exactly what is being done, and thinking at that abstraction level is better than playing with individual bytes. And suppose you hoist those operations into a separate function instead of doing it for every field? Well, congratulations on reimplementing ntohX. If the boundary between internal and external data is not clear, that's a bigger issue that could cause incorrect or unnecessary data transformations, whether we're talking about byte order, string sanitizing, unit conversions, etc.
The problem is that the higher-level abstraction is broken. Specifically, the abstraction provided here is "extract an integer of unspecified size and endianness from a byte stream" (note that the size of short isn't guaranteed to be anything specifically, other than 2 <= sizeof(short) <= sizeof(int)). And ntohs is another abstraction which amounts to, "make the endianness correct for the current system, assuming that it was originally big-endian". Not that it doesn't work - it's just not well-designed, and too easy to use incorrectly.
A proper abstraction is the one where you specify both the size and the endianness at the point where you extract the value - i.e. where you just call a function like read_int16_be(byte_stream).
Who the actual fuck cares about native byte swapping instructions. We're talking about DHCP code, there is no computer in the universe on which parsing of DHCP packets is even noticeable as a blip on the performance chart.
We're talking about safe ways to write parsers for byte streams. The original article used systemd's DHCP bug as a motivating example, but none of this is specific to DHCP.
The blogpost you link seems to basically recommend the same as the author, except they only talk about the conversion step and not about where it should take place too?
That post by Rob Pike is arguing that you should not need to check the endianness of your system when swapping. Not that you should never byte swap. Systems and protocols have different endianness, which needs to be handled by programs where endianness has not been abstracted away.
Which is the same thing the submission says? Extract the data using a function that produces the value "mathematically", so it's always correct for the host machine. His example uses multiplication instead of shifts like Pike, but that's a minor stylistic difference.
Rob Pike is saying that your code should never care/check what the host byte order is; if you use the libc FOOtoh() functions, you aren't caring what the host byte order is, you're just converting something to it.
Put another way, Pike is saying that libc should have implemented be16toh() like:
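#include <stdint.h>

/* (a sketch of the idea, not actual libc source: assemble the value from
   the bytes of its representation instead of branching on host byte order) */
uint16_t be16toh(uint16_t big_endian)
{
    const unsigned char *p = (const unsigned char *)&big_endian;
    return (uint16_t)((p[0] << 8) | p[1]);   /* no #ifdef on host byte order */
}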
The issue, I think, is when the swapping is done. If you cast raw bytes into a structure, you can forget to swap. If you read fields individually with helper functions that swap when needed, you can't.
Whether there's an endianness conditional anywhere in there or not is a bit of pedantry, I think.