The Development of the C Language (1993) (bell-labs.com)
255 points by gus_leonel on July 9, 2023 | 325 comments



Original title: The Development of the C Language

Editorialized title: “C is quirky, flawed, and an enormous success” – Dennis Ritchie

HN Guideline:

> [...] please use the original title, unless it is misleading or linkbait; don't editorialize.


My career is in full stack web development but I program in C in my master's degree coursework and as a hobby. Every time I have to peel back a decade's worth of CSS to move a button on a webapp I daydream of moving to a career in C. Is the grass actually greener on the other side?


I'm an embedded Linux engineer and I love C and Linux in particular. However, I'm considering dipping my toes in non-embedded stuff for a while, particularly full-stack development and wondering if the grass is greener on the other side from me, haha!


Having made the flip the other way, I highly recommend listening to your instincts.

Neither side is objectively better or worse, but having experience in both has changed how I approach problems.


I did that... Then committed myself back to embedded a couple of years later.


Why is that? I'm an embedded Linux guy, but I'm learning full stack all the time and plan to be able to do both, though it's really hard.

Embedded Linux pay is fine but not great; not sure how it compares to full stack jobs.

Full stack at least is more remote-friendly, since it doesn't require hands-on work with hardware (which isn't remote-friendly), and things built full stack are potentially more scalable.


I am going sort of the embedded full-stack way, like including microcontrollers and FPGAs along with embedded Linux. Most of my hobby projects are heavy on electronics and ham radio. I'd rather spend more time getting a better understanding of RF than getting better at React, TypeScript, etc. Don't get me wrong, if you need a good UI it seems way more versatile to have a web server and a REST API than a Qt GUI, but getting better at that just doesn't feel nearly as fulfilling as learning one more corner case in antenna design... to each his own


Depends on how much you would enjoy doing GUIs in C: Win32, Motif, X Athena Widgets, GadTools, GEM, Gtk 4, ...

You would appreciate that webapp button.


I'd love to have a career in C too. To me it seems like Linux kernel development is the most obvious C programmer career path. Anyone know of alternatives?


Pretty much any low-level library in userspace as well, though there's a (very slow) move towards more security-focused languages/variants in those areas (probably also true of kernel space). Though C++ creeps in as well (we see a lot of "C with templates" coding).


Maybe whatever Red Hat is working on. Last I read, there are only 2 people paid to work full time on GTK. I'm not sure if they are looking, though.


I started on front end work early in my career (mid nineties), moved to full stack and now write C full time for embedded (STM32) and Linux (RPI & Nvidia). I've essentially been digging deeper and can't seem to put the shovel down.

I don't think I would have appreciated it at different times in my career, but for me, right now, I'm loving every minute of it.

The biggest issue I've faced (beyond the obvious issues of getting anything to work at all), is how to organize the concepts.

The "Data-Oriented Design" folks have had a huge impact on that. Specifically, talks from Andrew Kelley, Mike Acton and the book by R. Fabian.

The second thing is registers. Just toss the HAL mess, pick up the Reference Manual and start poking registers. It's so much more enjoyable (and reliable) for firmware work.
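
To illustrate, a minimal bare-metal sketch (the addresses and bit positions below are placeholders, not copied from any particular Reference Manual):

    /* Blink an LED by poking memory-mapped registers directly, no HAL.
       Substitute the real addresses/bits from your part's Reference Manual. */
    #define RCC_AHB1ENR  (*(volatile unsigned int *)0x40023830u)
    #define GPIOA_MODER  (*(volatile unsigned int *)0x40020000u)
    #define GPIOA_ODR    (*(volatile unsigned int *)0x40020014u)

    int main(void)
    {
        RCC_AHB1ENR |= 1u << 0;                                  /* enable GPIOA clock */
        GPIOA_MODER  = (GPIOA_MODER & ~(3u << 10)) | (1u << 10); /* PA5 = output */

        for (;;) {
            GPIOA_ODR ^= 1u << 5;                                /* toggle the pin */
            for (volatile int i = 0; i < 100000; ++i) { }        /* crude delay */
        }
    }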

I don't know about job opportunities as I'm running my own hardware business, but if you're feeling pulled in this direction, I highly recommend taking a closer look.


I program in C (HFT) and I love it


That is interesting. Most HFT I hear uses C++. Why C?


The people who were there at the start were more proficient at C


I’m also a “full-stack” web developer and would love to transfer to a career in C (or Rust also?). There’s such a wide variety of interesting problems to be solved over many applications.

I’m not sure how I would begin making a career transfer. Would anyone happen to have any advice / experience on this? I would be really grateful!

(based in UK if that helps)


The best way to do it (if you can) is to get paid for making the change.

First, study C and/or Rust on your own. Maybe do a personal project or two.

Second, find something at work where Rust or C would bring some benefits. Tell your boss that you think this would work better in a language like Rust or C, and explain why. Volunteer to try to do it. (Note: Appearing too eager at this point might be a mistake.)

Do the second step a few times and you become the Rust/C expert. And you get paid as you do the work that helps you get better!


Also know your boss. I know I cannot suggest such a thing to my boss. Generally, the more "serious" your company is with an established stack the less likely you're going to get a green light to do anything not blessed by management.

If OP plays it wrong they can also look like a tone-deaf dunce. Definitely try to read the room.


Thanks so much for your advice! I agree, having the opportunity to use the language as part of your daily-work helps massively, and is one of the best but most difficult options to realize. I'm aware this will come with some sacrifices (pay reduction, longer hours to catch-up) which I'm willing to make while I become more competent.

As for the second point, that is a great suggestion. However, I'm very limited by my current working environment (regulation, corp. restrictions, etc.) so it becomes a little more difficult.

I believe I will just have to push very hard for option one and continue to study areas of interest in my spare time. I'm reading xv6: a simple, Unix-like teaching operating system which is helping me grasp some practical applications using C.


It depends on what you like. I do Linux kernel development, so I program in C and am currently learning Rust.

The kernel has a lot of its own data structures and functions, so my work revolves around using those instead of the built-ins.


I'm currently working on a large-scale commercial app in C. You have to be very careful with memory and safety, but overall it's so much better than writing apps in JavaScript, Java, or any other language I can think of.


That's very fun, very demanding, very well paid (when you have experience and know what you are doing) (as much as you can with C)


Very well paid? Where the hell is that? I mostly do embedded and IMO the pay just isn't as good as web dev stuff.


Both a specific request and a general proposal for a norm here on HN: can you be more numerically explicit when you say "well paid"?

Software development is a very segmented world. In the various social circles I'm connected to, I know devs who were thrilled to finally be making 6 figures 7 years out of college (in line-of-business software in a regional hub) and devs who were disappointed not to have crossed 400 k$/yr in that same time span (at FAANG in the bay area).


CSS can control a web browser (software). C can control a computer (hardware).

That is how I see it. Others may see it differently.


I've been doing embedded almost exclusively in C for 15 years. Never had a dull day.


Related:

The Development of the C Language (2003) - https://news.ycombinator.com/item?id=19338525 - March 2019 (10 comments)

The Development of the C Language - https://news.ycombinator.com/item?id=15134903 - Aug 2017 (22 comments)

The Development of the C Language* by Dennis Ritchie (1996) - https://news.ycombinator.com/item?id=11973627 - June 2016 (1 comment)

The Development of the C Language (1993) - https://news.ycombinator.com/item?id=10749358 - Dec 2015 (28 comments)

The Development of the C Language - https://news.ycombinator.com/item?id=3439843 - Jan 2012 (1 comment)

The Development of the C Language - https://news.ycombinator.com/item?id=2258287 - Feb 2011 (7 comments)

The Development of the C Language - https://news.ycombinator.com/item?id=726519 - July 2009 (1 comment)

The Development of the C Language (Dennis Ritchie) - https://news.ycombinator.com/item?id=365080 - Nov 2008 (1 comment)


"Although the first edition of K&R described most of the rules that brought C's type structure to its present form, many programs written in the older, more relaxed style persisted, and so did compilers that tolerated it. To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions."

Since 1979! And people keep complaining about being forced to use static analysis for C in their build pipelines.

"I know better", yeah, sure.


You don't need a separate linter for this stuff today: proper compilers (anything but MSVC, basically) have the most important type-related warnings in the default warning set, and it always makes sense to bump warnings to the highest level (both in C and C++, btw). Also, because I know this will be brought up: implicit conversion of a void* to other pointer types is a feature, not a bug ;)
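
A concrete example of that last point:

    /* Implicit void* conversion: fine (and idiomatic) in C, but a C++
       compiler would reject the missing cast. */
    #include <stdlib.h>

    int main(void)
    {
        int *xs = malloc(16 * sizeof *xs);   /* no (int *) cast needed in C */
        free(xs);
        return 0;
    }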


Even enabling -Wall is a debatable point of view in some circles.

VC++ is actually quite good: it has SAL, /analyze and SFIR. It's also much better than many other compilers, when looking beyond the big three.

Implicit conversions are a common source of errors. Certainly a nice feature for pentesting.


MSVC is almost completely silent at the default warning level though, both in C and C++ mode, at least in my experience (it might have gotten better in very recent versions; I wouldn't have noticed, since the first thing I do is bump the warning level to /W4 anyway).


they should simply use c++, like k&r did to compile their example code in their 2nd ed - see the preface to the book if you don't believe me.

i will never understand why C programmers get so upset about C++. of course, the latest revisions of C introduce some new features not in C++, but nothing really major. if you want good type checking, compile your C code with C++, and fix all the type errors you will get.


> compile your C code with C++

You're most likely not aware (as most C++ coders unfortunately aren't), but this advice is useless today, since it's not possible to compile modern C code with a C++ compiler; the two languages have diverged too much since around the mid-90s.

A C++ compiler only accepts a "common C/C++ subset", but this subset hasn't been updated to include C features added after ca. 1995.

Better advice is to simply use the highest warning level and enable warnings-as-errors; this gives you mostly the same type checking as in C++ (minus the void* conversions, but that's how it should be, since a void* is basically an "any*").


I've dreamed about a simplified C++ that is better/safer than C but much simpler than existing C++. Call it C+: a subset of C++ that enhances C without bringing in all those C++ complexities that, 99% of the time, I don't need in daily coding.


There are many new languages that describe themselves as taking C, adding some good stuff (slices, defer), removing some of the bad stuff (textual includes) and adding some of their own spice. One might have the right mix of features for you: Zig, Odin, C3, Hare, Jai.


There is such a language called D.


sadly no time to pick up a new language these days. will just cherry-pick C++ features, build my own subset of it and call it C+ myself; this seems the quickest way to get daily coding done fast for me.


Petzold did the same in his highly acclaimed Windows 3.x book, also with a note in the preface regarding type safety.

Likewise, Microsoft introduced the windowsx.h header file to improve type safety when using C for Windows 3.x applications.


> Petzold did the same on his highly acclaimed book

takes me back. i thought it was crap. i used to work at The Instruction Set (one of the UK's biggest tech training companies at the time) and everyone hated the Windows/C course based on Petzold. my boss came up to me (somehow i was the windows guy in a unix company) and said "we need a new windows course", and i said "OK, i need a framemaker license and to work at home for a week" - worked out great.

this was early 90s, i suppose?


Around 1991.

I mostly programmed on Windows 3.x with TPW and TC++, alongside OWL. Those nice Borland manuals.

As far as C programming for Windows 3.x is concerned, the "Programmer's Introduction to Windows 3.1" was a much better book for me.

Mainly due to its coverage of windowsx and message macros.

It's on the Internet Archive.

https://archive.org/details/programmersintro00myer


> Around 1991.

how do you know that?


Because you asked me about the date I got the book?


> they should simply use c++, like k&r did to compile their example code in their 2nd ed - see preface to book if you don't believe.

So what? And GCC allegedly has been C++ for many years. Please take a look at the repo and tell me why this means anything (besides language wars being a waste of time).

My personal experience with C++ is that I seem to always end up peeling off my nice abstractions again later. Most of what it offers hasn't stuck for me, at least for systems programming. There's a lot of bad C code I've had to work on over the years, but overzealously architected C++ codebases take the crown for inflicting the most pain for sure. One recent experience was when I replaced 4 files and 200 lines of C++ classes with 4 lines of straight C code. Not even a function was necessary. And that was one of the less bad experiences because it was actually possible to fix.

In my most recent attempt to be open-minded about it I've ended up keeping a few short methods, which can be nice for code brevity at the call site, and there is less of a tax in having to come up with naming schemes. But I have otherwise found classes (and in particular methods) to be painful for two reasons: all the procedures operating on the class have to be declared in the class (or as static methods in a friend class, but then they have to be declared there). This includes private methods and is just one more level of annoyance for a small (and debatable) syntactic convenience. It's pretty f***ing bad to have to expose implementation details to the outside (it extends transitively to implementation types used in your private methods etc.), and that is a big reason why C++ projects have infamously longer compile times compared to C projects.

Another problem with methods is that it seems you can't define them as having "static" linkage, at least not with MSVC. I suppose this can increase link times and prevent the compiler from making some optimizations.

One other thing I did was trying to buy more in to RAII, for example doing ref-counting in an automated way. It's another area where I feel I've lost a lot of control over what happens (it's hard to get it right), and my codebase is slowly deteriorating.

Another big problem that I personally see is implicit "this". C++ would already be a much better language without this. It's a bad tradeoff IMO (and Python was right to make it explicit), I can't see a benefit of not typing "this->". It is misleading while reading, and frequently having to change method parameter names just to be able to access both is a real annoyance. (From which code style rules like "m_" prefix arise, which typically don't get followed 100% -- so you'll see locals with m_ and members without it -- and which add two more characters, making the implicit this even more useless).

After a few months I am back to writing simple structs with none of this counter-productive (at least for small teams) access protection, and simple plain functions.

That's only scratching the surface of C++ (it's only about "C with classes" so far). I do know "modern C++" to a degree, and when digging deeper into the trends from the last 1-2 decades it gets much worse. I've been following what the C++ committee is up to these days and it seems they are stubbornly penny-wise but pound-foolish. One recent example -- they have now improved the type inference of "this" !!with an added new syntax!! [0] so you can have "easier" CRTP patterns. Does anyone but the most extreme freaks still understand what actually happens there or is the C++ audience mostly an army of copy&paste coders?

As a proficient C programmer, it's so much easier to be annoyed about most of C++'s features, because they break so quickly when put under stress, and just being a bit more explicit with C-style code seems to often lead to more maintainable code and actually not that much more code, sometimes even less (not having to deal with all the crazy abstractions).

That all said, throwing in the occasional templated function or class for good measure can be incredibly powerful, much better than dozens or hundreds of lines of C macro generator hacks. It can be useful also when working with IDEs. But it's a slippery slope and mastering it is hard.

[0] https://www.sandordargo.com/blog/2022/02/16/deducing-this-cp... . There is also a youtube video somewhere.


my point was that k&r used c++ for its superior type checking (and because the C++ compiler could actually compile C89 code, which no commercial compilers at the time could) - obviously in a book about C they did not use C++ features. so i am not sure what you are going on about here.

i learned assembler and fortran, then abandoned them for C, and then abandoned C for C++. i did all that progression because it self-evidently made me more productive.


Couldn't care less about the type checking differences (apart from using abstract classes with virtual methods, but that's not C syntax), they're quite small and the C++ way is in fact an annoyance e.g. when interfacing with straightforward void pointer APIs. That reminds me of enum class, another feature that brings something nice to the table (properly scoped enum names for better IDE completion) but is almost made unusable by the fact that they can't be easily used with bitwise operators.

The thing about C++ is that many of its features start with a good intention, but most of them are so specific that they have to go down one almost arbitrary route (non-orthogonal decisions, like enum class introducing at least 3 changes at once) and pessimize the other use cases. Good luck refactoring your codebase when you realize you have to change your approach and it's no longer supported by any kind of specialized syntax. That's a problem that C mostly doesn't have -- most of its features are needed when programming a computer, and they're minimal and orthogonal, with few ways to paint yourself in a corner.


> like enum class introducing at least 3 changes at once

changes to what? they don't clash with the original horrible enums at all - all your code that used original C-style enums will still work.


I should say differences -- in behaviour when compared to traditional enums.


well, they are different (and better). but if you want to use old C-style enums, go right ahead - you can do that too. i don't see why you are complaining about a new feature that in no way clashes with an old one.


I do not actually think they are better. They are broken for most of my use cases. Most of the time I'm better off using traditional enums with explicit sizes (a C++ extension that I deem sane), optionally wrapped in a namespace.

The mere existence of all those features costs a lot of time just to understand and navigate them. They can diminish productivity. C++ is a huge language. It has tons of features that suck up a lot of time until you understand the space where the features can be used to good effect.

And "good effect" often means just writing the same thing in fewer characters, or a little more type-safe (which often wouldn't be needed if the design were sane).

And if the requirements slightly change and that good feature breaks, enjoy your rewrite! Or add another set of abstractions or even macros to work around the breakage -- like MS did in case of enum class bitwise operators for example.

That's why many people restrict themselves to C entirely -- no time to waste on finding out why C++ features X, Y, and Z all don't work for the given problem. Time could be better spent than with obsessing over all the ways in which an arbitrary set of features can be abused to write the implementation in fewer lines of code, meanwhile making it harder to read & write. Just think about the algorithm & the data layout, and bang out the code.


People who still write C, honest question: Why?

C is full of quirks. From cryptic "undefined behaviors" to a type system that isn't really a type system (more like "size hints for the compiler"), the language doesn't feel easy to use or debug. Add to this CPP macros (a universally recognized bad idea), a clunky import system, and the lack of a single reference implementation of the compiler/libc, and you have a language that is hard to defend.

Also, documentation is all over the place. If a function isn't described in `man`, I have no idea where else to actually look for it.

I used to think "C presents the most honest representation of the low-level mechanisms of the computer", but... even this is shaky. I've been programming for almost 15 years now, and I don't think I've ever seen a computer where memory is actually a contiguous array of bytes ordered by memory address. The C representation of memory (and all the pointer arithmetic) is not a real representation of your hardware; this too is an abstraction.

So, setting aside the need to maintain 30+ year old code, what would be modern reasons to start a new project in C?


1. It gives me a lot of control over how the program works, which lets me create programs that work faster and use less memory than would be possible in most other languages.

2. Relatedly, it's more explicit than almost any other language. If a line of code doesn't look like a function call, it's not calling anything. There is no hidden control flow. These statements are not true in languages which support operator overloading or exceptions. The only real competitor to C here is Zig.

3. If I give a Linux user the source of a C program, they can probably compile it with the tools they already have. This will most likely be the case 20 years from now too, as long as I keep my C mostly standard-compliant. I'm not sure that code in newer, faster-moving languages like Rust will stay compilable as long.

4. It's a lingua franca. C libraries can be used from most programming languages without too much effort.
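
To make point 4 concrete, the usual pattern is a plain C function compiled into a shared library and loaded from the host language (names below are made up; from Python it would be something like ctypes.CDLL("./libsum.so").sum_i64(...)):

    /* sum.c - build with: cc -O2 -shared -fPIC sum.c -o libsum.so */
    #include <stdint.h>
    #include <stddef.h>

    int64_t sum_i64(const int64_t *values, size_t n)
    {
        int64_t total = 0;
        for (size_t i = 0; i < n; ++i)
            total += values[i];
        return total;
    }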

I probably wouldn't start a large project on a tight deadline in C, but I think it's a great language for writing new command-line utilities and for rewriting tricky algorithmic code from scripting languages. I've gotten 100x and even 1000x speedups from replacing a couple of Python functions with C.

The ease of use is about to improve with the C23 standard, which I'm very happy with. On the other hand, some tricky areas like aliasing are likely to stay tricky forever.


> It gives me a lot of control over how the program works, which lets me create programs that work faster and use less memory than would be possible in most other languages.

While it is true to a degree, I would also add that due to its low level of expressivity, you often have to introduce less efficient solutions simply because of language deficiencies. Things like small string optimizations in C++ are simply not possible in C.

2 is true, but it comes at the expense of poor expressivity; see the former point.

3. Well, will it really compile to what you meant? If you have UB, it might still compile but the semantics of your program could change entirely depending on which compiler and which version you use.

Also, your Python point: well, that’s because you used python in the first place, which is very slow even among scripting languages.


> Things like small string optimizations in C++ are simply not possible in C.

I don't think this is true, I've seen a bunch of libraries implement SSO in C:

https://nullprogram.com/blog/2016/10/07/

https://github.com/stclib/STC/blob/master/include/stc/cstr.h

https://github.com/mystborn/sso_string/blob/master/include/s...
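
The trick in all of these is roughly the same; an illustrative sketch (not taken from any of the linked libraries): short strings live inside the struct itself, and only long ones go to the heap.

    #include <stdlib.h>
    #include <string.h>

    #define SSO_CAP 15                      /* longest string stored in-place */

    typedef struct {
        size_t len;
        union {
            char  small[SSO_CAP + 1];       /* in-place buffer incl. NUL */
            char *heap;                     /* used when len > SSO_CAP   */
        } u;
    } sso_str;

    const char *sso_cstr(const sso_str *s)
    {
        return (s->len <= SSO_CAP) ? s->u.small : s->u.heap;
    }

    int sso_set(sso_str *s, const char *src)
    {
        s->len = strlen(src);
        if (s->len <= SSO_CAP) {
            memcpy(s->u.small, src, s->len + 1);
        } else {
            s->u.heap = malloc(s->len + 1);
            if (!s->u.heap) return -1;
            memcpy(s->u.heap, src, s->len + 1);
        }
        return 0;
    }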


> due to its low level of expressivity, you often have to introduce less efficient solutions simply because language deficiencies. Things like small string optimizations in C++ are simply not possible in C.

You don't have to use an inefficient solution. You can always roll your own optimized solution or use a library. I agree that C++ has some nice string optimizations built into the standard library, but it's not obvious to me that they're always better than the simplicity and predictability of a simple chunk of memory.

Besides, you generally don't write C code in the same way you write C++ but with more primitive tools. You often allocate a buffer once and operate on it; you don't emulate passing strings by value from function to function, doing lots of allocations and deallocations in the process.

> Well, will it really compile to what you meant? If you have UB, it might still compile but the semantics of your program could change entirely depending on which compiler and which version you use.

I'm not sure what point you're making here. If you have bugs in your program then it may work incorrectly, yes, but that's true no matter the language.


> You don't have to use an inefficient solution. You can always roll your own optimized solution or use a library

That’s not true. You for example can’t write a generic, efficient vector implementation in C - the language itself can’t do that. You either have to copy paste the same code for different sizes, or make use of some monstrous hack of a macro. Instead projects use hacks like conventionally placing the next/prev pointer in structs (linux kernel), and the like.

C++ is the de facto language for high performance computing, so I very much question that “you don’t write C as C++ part”, if anything you don’t write C++ as C as that would be inefficient.


Generic things are rarely efficient; the most optimal code tends to be specialized and tailored to specific hardware and/or the kind of data it's operating on.

std::vector (which is a really inefficient way of doing dynamic arrays btw) can be cleanly implemented with macros (see stb stretchy buf) or by splitting the element data from the housekeeping data:

  int append(void *arr, size_t elemsize, size_t *capacity, size_t *size, const void *items_to_add, size_t num_items_to_add);
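
For what it's worth, one way to flesh that out (my own sketch, not the parent's code; I've made the first parameter a double pointer so realloc can move the buffer and hand the new address back to the caller):

    #include <stdlib.h>
    #include <string.h>

    int append(void **arr, size_t elemsize, size_t *capacity, size_t *size,
               const void *items_to_add, size_t num_items_to_add)
    {
        if (*size + num_items_to_add > *capacity) {
            size_t newcap = *capacity ? *capacity * 2 : 16;
            while (newcap < *size + num_items_to_add)
                newcap *= 2;
            void *p = realloc(*arr, newcap * elemsize);
            if (!p)
                return -1;              /* old buffer stays valid */
            *arr = p;
            *capacity = newcap;
        }
        memcpy((char *)*arr + *size * elemsize,
               items_to_add, num_items_to_add * elemsize);
        *size += num_items_to_add;
        return 0;
    }

Callers pass &array, and the element type only shows up at the call site via sizeof.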


How is std::vector inefficient?

Especially since that macro hack from stretchy buf seems to do it in an even more naive way.

Splitting the element data is a different implementation with very different performance characteristics - it’s quite a bad thing if I have to resort to that due to a language inefficiency, especially in case of a language that is supposedly close to the hardware.


There are various constraints on std::vector because of language in the standard which makes concessions for generic use that might not apply to your application. Small vector optimizations aren’t possible in std::vector, also some operations that could be done in-place can’t be. You also give up control of some meta-parameters and allocation strategies that may be more efficient for your use case.


Six arguments - seriously? Avoiding that is the point of generic programming, and probably more efficient there too


You're talking about something that isn't related to efficiency. Copy and pasting, macros, generating code -- none of these preclude producing an efficient solution.

There is nothing in C++ that is inherently more efficient than C.


Except that more efficient solutions can be implemented much more practically? Solutions that you'd need to bend over backwards for in C?


What does that have to do with efficiency? We don't appear to be debating language ergonomics, but the notion that C is somehow inferior to C++ when it comes to performance.


C++ has a lot of compile time programming features that C cannot do practically. There are sometimes alternatives to those mechanisms in C, but they rely on mangling, macros, non-portable tricks, and so on.

On the topic of performance, the best counterargument to C++ from a C perspective would be that hand rolled code generation isn't all that bad in practice. It's just language theorists don't like that approach aesthetically.


> hacks like conventionally placing the next/prev pointer in structs

This is not a hack, it is the way it should be in C.
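
For readers who haven't seen the pattern: the links live inside the object itself, and the containing struct is recovered with an offsetof trick (simplified sketch, not the actual kernel code):

    #include <stddef.h>

    struct list_node { struct list_node *prev, *next; };

    struct task {
        int id;
        struct list_node link;          /* embedded in the object itself */
    };

    /* recover the containing struct from a pointer to its embedded node */
    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    void list_init(struct list_node *head)
    {
        head->prev = head->next = head; /* empty circular list */
    }

    void list_insert_after(struct list_node *pos, struct list_node *n)
    {
        n->prev = pos;
        n->next = pos->next;
        pos->next->prev = n;
        pos->next = n;
    }

    /* usage: struct task *t = container_of(node, struct task, link); */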


Except that that's a linked list and not an array.


You can store your 'list items' in an array and still link to random items in the array - although an index instead of a pointer would make more sense in that case, but what is an index other than a pointer with fewer bits ;) The main advantage being that you don't need to alloc/free individual items.


Great, so now we're writing our own memory allocator?


If you want to call about 10 lines of trivial code a 'memory allocator', then yup, we're totally going to write our own 'memory allocator' ;)
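
Something like this, give or take (illustrative sketch; the payload field is made up), a fixed pool with an index-based free list:

    #define POOL_SIZE 256

    typedef struct {
        int value;      /* payload (whatever the items actually hold) */
        int next;       /* index of the next item, -1 = end of list   */
    } item;

    static item pool[POOL_SIZE];
    static int  free_head = -1;

    void pool_init(void)
    {
        for (int i = 0; i < POOL_SIZE - 1; ++i)
            pool[i].next = i + 1;
        pool[POOL_SIZE - 1].next = -1;
        free_head = 0;
    }

    int pool_alloc(void)                /* returns an index, -1 if exhausted */
    {
        int i = free_head;
        if (i != -1)
            free_head = pool[i].next;
        return i;
    }

    void pool_free(int i)
    {
        pool[i].next = free_head;
        free_head = i;
    }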


> I very much question that “you don’t write C as C++ part”, if anything you don’t write C++ as C as that would be inefficient.

Yeah, I was thinking about your string example when I wrote that. For high performance numerical code, I can see the advantages in using C++.


People who know how to use C rarely if ever have problems with undefined behavior. I personally have written a huge amount of C code and my bugs have never been related to undefined behavior. This is an idea that has been spread to make people even more afraid of using C/C++. While there is a possibility of hitting these problems, in practice it is almost a non-issue.


TBF, did you ever run your code with UBSAN enabled? There's a couple of UB cases which don't trigger any bugs until one of the popular C compilers changes some details in their optimizer passes, and which then only manifest with a specific combination of compiler options.


> People who know how to use C rarely if ever have problems with

I think we call this "No True Scotsman".

In real life lots of people write C because they want to or have to and they generate tons of bugs from bug classes that just aren't present at all in other languages.


> People who know how to use C rarely if ever have problems with undefined behavior.

I think the CVE database would disagree with that statement.


Same here, no problems with undefined behavior. Also, no memory issues either after done with code finalization using Valgrind.


“No memory issues in the tested state space”: that's the only thing Valgrind can say. But it says nothing about how a run with different input would behave; it might just segfault, leak, use-after-free, or hit UB.


That is always the case on any platform. Just because something works on a Mac, it will not necessarily work on a PC, or vice versa. If a language has multiple compilers, you also need to test with different compilers to make sure your code works there too. You're trying to frame this as a C-only issue, when it is a general issue, maybe with different names.


C is an expressive language when you’re not working with strings and memory the way you do in most HLLs. Almost all operators return a value and can be nested in sub-expressions. Assignments and pre/post increments/decrements are expressions. The comma operator evaluates expressions in the given order and returns a value. There is a GNU extension called “statement expressions”, allowing you to define function-like macros.
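
For illustration, a couple of those features in one place (the MAX macro uses the GNU statement-expression extension, so GCC/Clang only):

    #include <stdio.h>

    /* GNU statement expression: a block that yields a value, so each
       argument is evaluated exactly once (unlike a naive ?: macro). */
    #define MAX(a, b) ({ __typeof__(a) _a = (a); \
                         __typeof__(b) _b = (b); \
                         _a > _b ? _a : _b; })

    int main(void)
    {
        int x, y;
        int z = (x = 2, y = 3, x * y);   /* comma operator: z == 6 */
        printf("%d %d\n", z, MAX(x, y)); /* prints "6 3" */
        return 0;
    }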


>If a line of code doesn't look like a function call, it's not calling anything.

In C, if you for example write past the bounds of an array or otherwise do something that causes UB, there is no guarantee that the code you wrote in the source file is actually going to be what's run.

If an attacker can clobber the stack (for example), the control flow you see in the source code and the actual control flow of the program are not the same.

In the worst case, an attacker can get your program to execute arbitrary code of their own choosing!

Maybe some consider this unrelated to the no implicit control flow thing, but I think when UB caused by a trivial mistake can alter your control flow, you have much bigger worries than an operator being sugar for calling a function.

I consider UB and arbitrary code execution exploits to be a case of implicit control flow!


> These statements are not true in languages which support operator overloading

I guess I will never understand the C and Java developers' incredible fear of operator overloading.

Do you have the same reaction to user-defined functions? Because they are exactly the same thing. Is it because of the bad type system that won't let you know what operator you are using?


> I guess I will never understand the C and Java developers' incredible fear of operator overloading.

The answer is in the sentences right before the one you quoted:

> Relatedly, it's more explicit than almost any other language. If a line of code doesn't look like a function call, it's not calling anything. There is no hidden control flow.

Consider the use-cases for C: operating system kernels, hard real-time software, low-level libraries, databases, embedded software. What is a common desire among these? Predictable low-latency and high throughput.

It's much easier to achieve these features if your language does not allow "magic." Implicit allocations, RAII, exceptions, overloaded operators; these are all examples of features which allow a library-writer to inject hidden control flow into your code. This can make it very difficult to analyze why code runs slowly or with unexpected random pauses, not to mention making it much harder to step through in a debugger.


The control flow is the same; you evaluate the parameters, and then evaluate the operator. Just like any other function call, there's nothing implicit or hidden. The only difference is that you can't create other operators with the same name for different types.

And whether something is called or inlined is always decided by the compiler. Modern C doesn't promise you any relation between the way you break down your functions in your code and the actual function calls in the assembly it generates.

So, I keep seeing people complaining about overloading, always with the same reasons, which are patently not valid unless there's some implicit assumption they keep not stating. What is that assumption that breaks the equivalence between user-defined functions and operators?


> Just like any other function call, there's nothing implicit or hidden.

The implicit part is the question of whether an operator is built-in or overloaded. In C, every operator is built-in, so you can look at a block of code and see that there are NO function calls in it. With something like C++, you must treat every operator like a function call.

With C, if I write:

    a += b;
I can be VERY confident that this line of code will execute in constant time. With C++ (or other operator-overloaded language), I cannot. I need to know what the types of a and b are, and I need to go look up the += operator to see what it does for these types (and this is not one universal place, it's specific to the type).

Furthermore, this may be the last line within a particular scope. With C I know that nothing else will happen, and that the control flow depends only on the surrounding scope. With C++, I don't know this! There may have been many objects created within this scope and now their destructors are firing and potentially very large trees of objects are being cleaned up and deallocated, and even slow IO operations running.


> With C++ (or other operator-overloaded language), I cannot

All programming requires people to follow reasonable conventions. In C++ if you make a dereference operator with non-constant time, or an equality operator which doesn't follow equality semantics, the programmer messed up. It's like giving a function a misleading name, like `doThis()` and it doesn't.

Note that Java is filled with these kinds of conventions, such as overloading `equals`. How can you be certain it actually obeys equality semantics? You have to trust the programmer.


If I see `x+y` in C, I know 100% that it'll be ~0-1 instructions, O(1), and will have the lowest latency & highest throughput that a thing can have, i.e. basically completely ignorable for figuring out the perf of a piece of code, or determining what complex things it may do (additionally, it'll hint that the operands are pointers or numbers). For `f(x,y)`, none of those may hold. With operator overloading, f(x,y) and x+y have the exact same amount of instantly tellable facts, i.e. none. x+y becomes just another way to do an arbitrary thing.

In C, if I'm searching for how a certain thing may be called from a given function, I only have to look for /\w\(/ and don't have to ever think about anything else.

Honestly, operator overloading isn't really that bad (especially if an IDE can highlight which ones are), but it's still a thing that can affect how one has to go about reading code that might not even use it.


However, as a novice I found it unintuitive that on an embedded platform without hardware floats, x/y will compile, but to a polyfill with quite a few instructions.


That’s the only caveat. With operator overloading, the scope for what happens on a given line of code expands dramatically. Now your entire dependency graph is part of the search space. Heck, the operator might not even terminate at all!


> That’s the only caveat.

a = b + c;

Is the addition done by itself, so it costs 1 clock cycle? Is it merged into some complex operation so the net cost is less than 1 cycle? Is it completely optimized away at compile time, so it's infinitely faster?

Does the addition trigger some trap, that will run some distant code?

Is the addition by itself? Or are there store and load instructions that can stall for way more than 1000 cycles?

I doubt you can answer any of those questions. All you and everybody else keep repeating is that you can micro-optimize C better because that line, which you expect to take anything from 0 to 2000 cycles, is certain not to do a call-and-return pair, which takes less than 10 cycles. All while the alternative is almost certain to do the exact same, but you would need to check.

Honestly, that argument doesn't make sense; and I keep understanding it as people complaining that they want to micro-optimize a program, but don't know if it's operating on native integers or 10-dimensional hypermatrices.

At the same time, every single person that is good at micro-optimizations looks at the compiled binary as a first step, because C is a high-level language that has little relation to the code the compiler actually creates.

For a long time I just shrugged it away and filed those complaints as "those people don't even know the language they are using". But their universality forces me to consider that there is a reason for complaining, and maybe it's worthwhile to understand. Now, given that this is all the answer I get, it seems quite likely that even the ones complaining don't consciously know what the problem is... But one thing is certain here: the people repeating that execution time is well known didn't actually practice micro-optimizations based on that fact.


Your argument boils down to this: because we cannot look at an operator and have a 100% iron-clad guarantee of the exact sequence of instructions the compiler will ultimately emit, we should throw it all away and just settle for every operator in the language potentially being a function call that might be O(1) or O(n) or even O(2^n). That's called throwing out the baby with the bathwater.

> every single person that is good at micro-optimizations looks at the compiled binary as a first step

That isn't an option when you're writing portable code that runs on many different platforms, some of which may not even exist at the time you're writing it. Furthermore, micro-optimization isn't the only reason operator overloading is bad. The implicit flow control dramatically inflates the search space for what every single operation can do, making all code much more complicated to inspect at a glance. This carries over to debugging, where stepping through code is much more cumbersome when each operation can involve large amounts of indirection.


> Is the addition done by itself, so it costs 1 clock cycle? Is it merged into some complex operation so the net cost is less than 1 cycle? Is it completely optimized away at compile time, so it's infinitely faster?

Those are generic instruction selection/optimization questions, which are always gonna be *additional* complexity to any and all operations everywhere. So there's still benefit in cutting down the complexity elsewhere.

> Is the addition by itself? Or are there store and load instructions that can stall for way more than 1000 cycles?

..those are questions about the loads & stores, not addition. On embedded, afaik loads & stores will be significantly closer in latency to arith too.

> At the same time, every single person that is good at micro-optimizations look at the compiled binary as a first step, because C is a high-level language that has little relation to the code the compiler actually creates.

Yes, but being able to have good intuition is still quite important, because one can think & read code much faster than compile & read assembly.

> the people repeating that execution time is well known didn't actually practice micro-optimizations based on that fact.

The question of operator overloading is mostly about reading code, not writing it. And it doesn't have to be micro-optimization either, any level of optimization will be affected by a call happening where you don't expect one (probably most importantly the kind where you scan over a piece of code to figure out if it does anything suspiciously bad (i.e. O(n^2) or excessive allocations or whatever thing may be expensive in the codebase in question) but it isn't worth the effort diving into assembly or figuring out how to get representative data for profiling the specific thing).

Or you could just be exploring a new codebase and wanting to track down where something happens, where it'd be beneficial to have to just scan through function calls and not operators.


Right, that's definitely quite a strong point against the C operator-function separation. There can be a good argument made for just not providing unavailable operations as operators. But, still, x/y won't touch any of your memory (assuming a non-broken stdlib), so you're still free to skip over it while scanning for a use-after-free or something.


User-defined functions require a function-call preamble and postamble to be added to the machine instructions that execute the function's behavior. Typically this consists of growing the stack, adjusting the required pointers at the top, and then undoing that at the end. In C, the operators defined by the language implementation do not involve any adjustments to the stack frame and do not invoke a 'call' or jump instruction in the assembly. Once operator overloading is possible, this difference immediately becomes blurred.


I would say that C macros have inspired the development of concoctions of far greater magical qualities than, say, RAII. C programmers are not immune to violating the principle of least astonishment.


C functions _typically_ being globally defined, with mostly unique identifiers, is a good thing in terms of code readability.

Of course, C functions can be passed as variables. Or in a wider scope they might be inline, macros, or ifdef'd to different functions. But those cases are _typically_ recognized as undesirable and avoided.

Java's a bit of a different story, which I can't figure out a good way to explain. It's hard to explain problems in large code bases, as a quick example rarely suffices. I've seen more than one bug caused because foo.bar(qux) called a different bar method than the original programmer intended (both because foo's bar was overridden and because qux was a different type than expected).

Don't get me wrong, I would use operator overloading in a heartbeat if I was writing code for a math-y CS coding assignment. It's fine for code that will have a lifespan measured in weeks / months with probably only 2 or 3 people ever looking at it.

Saying what you mean, as clearly and directly as possible, has its perks in certain applications (large code bases, life critical code bases, code bases that will last for decades with dozens of programmers). Otherwise stated, cases where code is going to be read many times more than written.

To answer your question more directly: User definable functions aren't a problem. Re-definable functions are!


> If a line of code doesn't look like a function call, it's not calling anything.

Why is that important to you?


Some of the worst bugs I have experienced are ones where code is executing without it being clear where it is executing. The front end stack is the most awful about this, where at any given moment all kinds of things might happen without notice. A clear, sequential program can be stepped through and understood.


It's important for reading other people's code. When I see a function call then I know that "anything" can happen inside that function, so I better investigate. For anything that's not a function call it is obvious what happens under the hood.

In languages like C++ I potentially need to check every operator if it is overloaded, and find the place where that happens (I think I haven't seen any IDE support to help with 'resolving' overloaded operators, but maybe that has improved in the meantime).


I have not touched C for 20 years, but it makes code easy to read and understand, and it makes debugging and error hunting an easier task.


> If a line of code doesn't look like a function call, it's not calling anything.

except maybe allocating dynamic arrays, or floating point ops if those don't exist in hardware. Then you have signal handlers that can be called on math errors, segmentation faults, etc. So basically every line in your code can implicitly call a function.

> If I give a Linux user the source of a C program, they can probably compile it with the tools they already have.

What is windows.h and why is it missing?

> This will most likely be the case 20 years from now too

What is Xlib.h and what do you mean I have to rewrite the apps front end from scratch?


Thank you for the input, one point that stood out for me was that you prefer to write command-line utilities with C.

Why is that? I mostly use scripting languages in my day-to-day work (Ruby and some Python because of AI) and have found my productivity writing command-line utilities in Ruby is amazing. Do you do it because of performance, ease of use because you're proficient, a mix of both, or something else?


Oh, I don't use C for every command-line utility. If it's something IO-bound like parsing a webpage and downloading a bunch of files linked from it, I write it in Python. The convenience of modules like argparse and requests is very hard to beat, and it would take me a lot longer to do it in C.

I reach for C when performance matters, for example when processing multi-GB files or looking for perceptual hashes that are similar. It can be a difference between minutes and hours of running time.


With C23 there's the #embed feature. Might be super useful for embedded software. How are the toolkits e.g. for ESP32 and TI in terms of C23 compatibility?
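
For reference, usage looks roughly like this (file name made up); #embed expands to a comma-separated list of the file's bytes inside an initializer:

    /* C23: embed a binary file into a const array at compile time */
    static const unsigned char font_data[] = {
    #embed "font.bin"
    };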


> How are the toolkits e.g. for ESP32 and TI in terms of C23 compatibility?

C23 hasn't been released yet so it's hard to talk about compatibility. It's probably going to take a few years before it's widely supported.


> The only real competitor to C here is Zig.

Why only Zig?


It's the only language I'm aware of that takes C's explicitness and pushes it even further: it bans some implicit conversions, and it makes you pass an allocator as an argument to functions which can allocate memory. Most languages choose to go the other way and introduce features like try/catch and operator overloading.


> C is full of quirks. From cryptic "undefined behaviors" to a type system that isn't really a type system (more like "size hints for the compiler"), the language doesn't feel easy to use/debug.

I guess because I just don't agree with this viewpoint at all. I've been writing C on and off for over 20 years now and I simply haven't encountered the amount of distress and pain that I see others deal with, especially when related to memory handling or undefined behavior.

I wrote a piece of software in Win32 C for a gas integration company many years ago that did tons of string manipulation to recalculate reports coming out of another piece of software. It even included a custom built on-disk database which basically ended up being my own version of BDB. Scratch that, I wrote this software twice because my first version was lost in a disk crash and I had to hex dump the database format to recover my original implementation.

Last I recall that software ran at that company for over a decade and probably helped them make millions in revenue. I didn't have a single support ticket and to be honest the last time I talked to the owner I thought they had just stopped using it. I was very surprised that they were still very happy with it and it was working fine.

That's just one of many examples of projects I've built or debugged in C. I've regularly been able to fix issues in OS drivers, large projects like Asterisk, and things like deadlocks in toolkit-based GUI programs. It's actually easier for me to use C than most other programming languages because it's clearer to me what should be happening, especially when dealing with anything systems-related.

That's just my experience. I totally get that others don't share that same experience but to be honest I'm pretty tired of seeing all of the confused hatred for C.


Adding that anything I wrote in C++/MFC at that time is now obsolete.

Everything I wrote in C/Win32 is as fresh as it was 30 years back.


While I understand the sentiment, MFC is still being maintained, and is in fact still the only C++ GUI framework worth using, being shipped in the latest Visual Studio (2022).


Well... I'm in embedded systems. In embedded systems, you almost never change compilers. You usually don't even upgrade the compiler. Whatever the compiler is for a project, that's what it will be for that project forever. And in your case, it sounds like you only compiled that code with one compiler.

But as far as UB goes, that's cheating. We're playing on "easy mode". We know what that compiler is going to do, and that's all we need.

"Hard mode" for UB is when you have to worry about what a different, unknown, perhaps not-yet-written compiler is going to do with your code. What is the absolute worst that a compiler could do, within the rules, to your code? You and I don't worry about this, and it doesn't bite us. People writing library code do have to worry about it far more than we do.

So I agree that the concern is overblown. But I think that maybe we miss that it's a real concern, because it doesn't hit us.


Because everything speaks C.

If you write a library in C, it can be easily exposed to a variety of high-level languages and platforms.

You might argue this is more a property of the C ABI than of C itself, but unless the project is large enough that it's worth doing it in C++ or Rust instead, it's still a very reasonable choice.

Also not everything is web. Sure, if you're writing API endpoints in C you're just shooting yourself in the foot, just use Python or Ruby or Go and call it a day. For things like embedded it's often your only reasonable choice.


And since C has become more than just a language and actually a protocol (See https://faultlore.com/blah/c-isnt-a-language/), you sometimes would need to know the inner workings of C even when you write in other programming languages (C++, Rust, Swift, Zig, even Python, etc...)


So we're gonna be stuck writing a precambrian prototype language till the end of time because there's so much legacy code already written in it? That never seemed to stop people moving on from Pascal, or Perl, or literally all the other languages that are now obsolete.

I really hate how for microcontrollers the only two choices are either C++ or Micropython, I mean how about some fucking middle ground instead of two polar opposites? At least eventually everything will be rewritten in Rust I guess.


> I really hate how for microcontrollers the only two choices are either C++ or Micropython

Why wouldn't you just use C for programming a microcontroller? Sure, it's not a great language for web backends, but microcontrollers are where it shines. You're probably not deploying 100,000 lines to a microcontroller for a personal project, so the lack of certain abstractions isn't going to be that painful. On the other hand, C lets you make the latency and memory usage 100% predictable, which can be a great asset.


Why wouldn't you use assembly for programming a microcontroller? Sure, it's not a great language for web backends but microcontrollers are where it shines. /s

Because as the OP states, it's an objectively (pun intended) terribly abstracted language. There is nothing 100% predictable about C except that you'll eventually get screwed because you didn't account for some random obscure thing that should never have even been possible to do. Any language that allows using static variables can have predictable memory consumption. There is nothing inherent to it that makes it better than a language that works at the same level but built to modern standards, except the piles upon piles of legacy code you can use.


Enable max warning level, use a static analyzer, and ASAN, UBSAN and TSAN (in order of importance), and most problems you listed just disappear. Most importantly though: don't use MSVC if you have the choice.
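
Concretely, something along these lines with GCC or Clang (ASan and TSan can't be combined in one binary, hence two builds):

    cc -std=c11 -Wall -Wextra -Werror -g -fsanitize=address,undefined prog.c -o prog-asan
    cc -std=c11 -Wall -Wextra -Werror -g -fsanitize=thread prog.c -o prog-tsan
    clang --analyze prog.c            # static analyzer pass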


Yeah if you want to kill yourself from frustrations, maybe. I'm not writing microcontroller code for the fucking space shuttle, and I would suspect most people aren't.

C did a ton of things right, but it also did a ton of things wrong. Learning from that and moving on would be the sensible thing to do after 50 years.


> Yeah if you want to kill yourself from frustrations, maybe. I'm not writing microcontroller code for the fucking space shuttle, and I would suspect most people aren't.

You're really exaggerating the problems. Does your negative opinion of C come from experience, or did you listen to the Rust evangelists who have an incentive to make the difficulty appear bigger than it is? Because it hasn't been my experience that C is this huge minefield of bugs that are impossible to explain or debug. You prevent a lot of bugs by actually understanding the language instead of coding by trial-and-error, the remaining bugs usually get caught quickly if you use an advanced compiler like GCC or Clang with the right flags (warnings and sanitizers), and for the occasional bug that slips through, the debugger tends to be helpful.

It's true that C has a bunch of historical footguns like gets and strcpy that you need to avoid. It's a very bad language to learn by trying random things and seeing what works. However, it's possible for a "mere mortal" to write good code. You just need to do more up-front learning than you could get away with in e.g. Python. If you pick a good book and listen to experienced programmers, they will tell you what to do and what to avoid.

And regarding abstraction—you can go very far with just structs and pointers, but you have to do things the C way rather than trying to write Java in C. If it's enough for Linux devs and their millions of lines of code, it will be enough for your personal microcontroller projects.

There is a very promising contender in the low level space that aims to fix some of C's problems, it's a new language called Zig. However, it's at a pretty early stage; even if it catches on, it will be many years from now. Right now, if you want to do low level work, you'll benefit from becoming good at C.


Tell me an alternative which ticks all the checkboxes and I'll switch immediately. C++ isn't it because the committee has completely lost focus since ca C++11, Rust isn't it because they completely forgot about ergonomics, simplicity and elegance on their quest to fix memory safety (and both C++ and Rust suffer from "design by committee").

Zig looks perfect so far, but it's too early to switch over yet.

Any other promising candidates?


You didn't say what the checkboxes are, but... perhaps the 'BetterC' subset of D? https://dlang.org/spec/betterc.html#retained

Or D itself if you don't need a language as minimal as C. D is basically C++ redesigned and now that GCC includes D support by default I wonder whether it'll gain popularity.


Definitely an option, and D is actually one of the languages I haven't seriously looked into yet (or rather, I saw it as a C++ alternative in its heyday ca 2005 and that image stuck in my head - and at that time I hadn't been looking for a C++ alternative).

PS: my main use of C is currently to write platform abstraction libraries with minimal size and runtime overhead, so need to talk directly to operating system APIs, plus WASM is a very important target. The libraries must be usable from other languages via automatic bindings generation (quite simple with a C API). Also for performance-oriented stuff, direct control over memory layout and lifetimes please.

Also personal opinion from 20 years of C++ experience: high level abstractions never pay off in the long run. Simple imperative code always wins when it comes to "malleability".


Ada. 83 or 95


Interesting choice, but Ada is probably even less popular than Zig.

Even just requiring users to integrate my hypothetical Ada library source distribution into their project's build system files would most likely drown me in support tickets ;)


It certainly has more production deployments than Zig might ever get.


We are in Ada 2012 nowadays, with Ada 202x getting finalized.

https://www.adaic.org/advantages/ada-202x/


my point is that either of these versions is a very complete and usable implementation. more recent is even better.


> There is nothing predictable about C except that you’ll eventually get screwed …

This has been the exact opposite of my experience. I’ve been writing C for 10 years and have yet to find a piece of code where I was surprised at what it did. That’s one thing I love about C: it is entirely predictable. If it isn’t, my code is wrong. The language is rigorously specified. It is not hard to avoid undefined behavior.

Contrast that with languages like C++ or Python, which hide gotchas all over the place. In Python, one cannot even rely on a variable being a certain type, and if it isn’t, the program explodes. C++ allows plus to not be the inverse of minus, and allows hidden custom memory allocators (overloading the new operator). Template metaprogramming is borderline sorcery past the simplest of use cases. C++’s interoperability with C is an accident waiting to happen with all the reallocations which can occur without the user being aware.

C lays flat out in front of the programmer all the unpredictable behavior that many other languages implement behind the programmer’s back. Sometimes that’s not desirable, and sometimes it is.


I agree with your point about Python, which is why I'm glad type hints see adoption but dismayed that they're essentially fancy comments that don't enforce the actual runtime types.

The thing is, I'm not convinced avoiding UB is easy. E.g. what's the behavior of the following code?

    int16_t a = 20000;
    int16_t b = a + a;


Agreed on the dismay regarding type annotations. My opinion is that potentially misleading code which gives a sense of safety when none exists is worse than dangerous code. It lowers the programmer’s guards, which can lead to more bugs.

Integer overflow will result, I’m pretty sure. The largest value a signed 16 bit (so, 15 bit) can hold is 32767, IIRC.

I can see where that’s unexpected for people whose brains aren’t wired in powers of 2. This is one area where I think Rust improves upon C, with its availability of overflow detection in arithmetic. It’s unfortunately verbose, but it enables greater safety.


Not quite what I was getting at: On an implementation with 32-bit ints, the code is valid – the values get promoted to 32 bits, added and then truncated to 16 bits. Yet on a platform with 16-bit ints (and microchips & unusual platforms are a frequently stated reason for using C), the addition overflows and results in UB.

Luckily most other languages haven't decided to copy C's implicit promotion rules & target-dependent integer sizes.
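
A minimal sketch of one portable way around it, assuming you want the overflow case handled explicitly on every platform (the helper name is just for illustration): do the arithmetic in a type guaranteed to hold the result, then range-check before narrowing.

    #include <stdint.h>

    int16_t add_i16_checked(int16_t a, int16_t b, int *overflow) {
      int32_t wide = (int32_t)a + (int32_t)b;   /* 32-bit math: no promotion surprises */
      if (wide < INT16_MIN || wide > INT16_MAX) {
        *overflow = 1;                          /* result would not fit in int16_t */
        return 0;
      }
      *overflow = 0;
      return (int16_t)wide;
    }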


Given that all arithmetic autopromotes to int if smaller than int, there's no undefined behavior in this code if int is 32-bits (which is true on most systems).


> Never seemed to stop people moving from Pascal, or Perl or literally all other languages that are now obsolete.

Operating systems written in Pascal are now obsolete. OSs in C are not.

Perl is much easier to replace because fewer things were dependent on it however even here Perl 5.x still pops up all over the place.


Yeah to my great annoyance I did have to grep for an ipv4 address with a perl regex the other day. But for any actual scripting it's basically dead.


> for any actual scripting it's basically dead.

Run "file /usr/bin/* | grep -i perl | wc -l" on your computer. You will be surprised.

EDIT: if you want a histogram for all the types of programs in your system, run this

    file -bL /usr/bin/* | cut -d' ' -f1-3 | sort | uniq -c | sort


Embedded Rust has been a viable option for at least 4 years now and especially so for the past 2 years. I really dislike having to learn the quirks of building, configuring and navigating typical embedded c based projects. They always seem to have an excessive amount of tiny files (in various languages) all over the place with obscure heuristics only the original authors know about. IMO, to build anything new your only reasonable option is to blindly copy and paste an example project and hack away. I’ve never been able to “start from scratch”.

An embedded Rust project is the same as a normal Rust project except that you mark it as not linking the standard library with #![no_std] and you define a main entry point and panic behaviour (there are helper crates for this).

You can still use the core and alloc crates which give you pretty much everything you need in an embedded system like strings and vectors. You also get to use modern tooling like vs code and rust-analyser instead of a different antiquated version of Eclipse for each hardware vendor.

I don’t think that Rust should only be used for big projects. You can use it for small projects and you really don’t need to get complicated with generics for application code. You need to put in the effort to get a fundamental understanding about what the borrow checker is trying to achieve and the rest may be easier than you think.


While it seems Rust supports ARM devices like M0, M4 and of course more powerful chips like those capable of running Linux, there are huge swathes of chips that it doesn't support like 8051, PIC etc.


> At least eventually everything will be rewritten in Rust I guess.

This is the new "Year of the Linux Desktop".


>I really hate how for microcontrollers the only two choices are either C++ or Micropython

There's TinyGo as well. https://tinygo.org/

I'd say that's the middle ground for me.


It is nice, but nowhere near as complete feature-wise as C/C++. The fact that it exists does not mean you can use it to achieve the same thing.


What do you mean nowhere near as complete feature-wise? Go in general, or specifically the TinyGo implementation?

Seems to do exactly what 99% of people need.


Feature parity is fine but support is not quite there. Doesn't support WiFi on NodeMCU boards last I checked.


"Seems" is an outside perspective. There are loads of hardware features that it just doesn't support on various boards, and lots of extra hardware (like sensors) that it has no libraries for. It's not just the MCU/CPU that matters here.


There's a niche doing C++ (vs. straight C) on microcontrollers but the rest are just tinkerer choices.


> So we're gonna be stuck writing a precambrian prototype language till the end of time because there's so much legacy code already written in it?

Yes. Unless somebody steps up and rewrites everything in Rust or Lisp or whatever, that's exactly what's going to happen. Lack of backwards compatibility with existing software will condemn programming languages to irrelevance on day one.


Isn't Lua middle-ground enough? Alternatively you can write it in V and transpile to C.


Mainframes and micro computers don't speak C, unless we constrain ourselves to their UNIX environments.

ChromeOS doesn't speak C, unless you mean shipping WASM libraries. (Not every Chromebook supports exposing the Linux environment).

iOS and Android, kind of speak C, but not if you care to actually ship an app.


I believe that with the arrival of ChatGPT and similar tools, writing code in C will become as easy as in any other language. The AI tools know how to generate good C code, and C is fast by itself. I believe we'll see a lot more code written in C now that we have new tools to analyze C code.


I have grown somewhat tired of these ChatGPT responses. It's a tool...not a panacea. C is a fantastic, albeit somewhat complicated, language. The problem is a C programmer knows the quirks and ChatGPT will dump you some code that could have undefined behavior depending on the compiler. Will ChatGPT always use restrict correctly (for example)?


Why not? You seem to underestimate the ability of AI tools to understand code. Undefined behavior is something that a good AI tool may avoid without major problems.


The issue to me is not the generation of code. It's that the person using it is inexperienced with the given language. We will never be able to place 100% faith in AI. At least in my lifetime. Given that, I think it's a relative danger that is washed away in all the hype. A junior dev copy-pasting code from chatgpt. I couldn't imagine a more dangerous combination.


Junior dev copy-pasting from stack overflow: this is already happening! Whatever bad thing AI tools can do, this is already reality all over the world.


That's not even close to the same thing. Stack overflow posters don't hallucinate solutions, and in all but the most obscure questions, the selected answer will have been reviewed dozens of times over.

With ChatGPT you get exactly what it gives you, and you have to trust it as a source of truth. That's bad.


> writing code in C will become as easy as in any other language.

I look forward to a raft of CVEs over the next decade where ChatGPT is a root cause...


Oh jeez, please don't bring AI into the discussion. AI tools will just repeat all the bad StackOverflow advice and hilariously terrible trial-and-error C code from student assignments.


> The AI tools know how to generate good C code

Are you sure about that? ChatGPT doesn't understand C. It wouldn't even have enough context to reason about UB even if it understood UB.


Microcontrollers exist. Their libraries are written in/for C. The programs running on them are small and need tight, efficient memory management.

I also like the minimalist nature of the language itself. I get that for desktop applications, you usually want more integration with the operating system so you can say "I want a window here and a button here" rather than having to manually build the window from scratch, but that's not something that's a concern in most embedded systems.

I'm operating in a world of voltage inputs and outputs, memory mapped devices, registers, flags, and timings... with almost nothing between me and the hardware. A simple language makes a lot of sense here.


Are the Arduino and ESP32 microcontrollers?

Hint, might check their libraries/SDKs before answering.


Arduino is a platform, not a microcontroller. ESP32 is technically a microcontroller, but it's an SOC... which is not the kind that generally gets used for industrial applications in the field I'm in.

You shouldn't assume I get to choose the platform I'm working on. That's not how it works where I'm at, and if (when) I do get to choose, programming language is unlikely to be near the top of the list of criteria.


Whatever you are forced to chose doesn't make the other options disappear from the market.


The other options aren't relevant to my comment.

If you're going to be pedantic, you need to be both relevant and correct. You are neither.


Don't think I'm too crazy but last time I checked:

1. Yes they are microcontrollers.

2. Yes they use C/C++. (check the libraries/SDKs, 1 layer under the hood it's all .h/.cpp files, and most of the arduino calls are just #defines)


So it isn't only C.


It absolutely is only C on the microcontrollers I'm doing work on.

I don't understand why you're trying to cherry-pick like this.


I wasn't the one making a universal truth out of it.

"Their libraries are written in/for C"


The statement you quoted is true of both of the examples you gave.

Also, you've deliberately chosen a specific interpretation of my statement in order to manufacture an argument that doesn't exist. You should probably avoid doing that in the future.


It is not, because they use C++.

I avoid whatever I feel like.


Atmel and Xtensa have libraries in C.

It's almost like I have some domain knowledge that you don't. Imagine coming in here with examples that aren't even microcontrollers as if that "debunks" what I said above. Like somehow magically I can just switch to a whole different platform. No problem, just crank out a new board spin and swap my whole toolchain over so I can... what... use a non-standard version of C in the arduino IDE for production code? If you think THAT'S a viable option, you've lost your mind.

Why continue to double down when you clearly have no idea what you're talking about?

> I avoid whatever I feel like.

You should "feel like" avoiding inventing arguments that hinge on misinterpretations of other people's statements. The fact that you don't makes you a problem.

Meanwhile, I can't avoid C even if I "feel like it"... because I write code for microcontrollers... which have libraries that are written in and for C.


"1. Yes they are microcontrollers.

2. Yes they use C/C++"

So how it is?


The things you mentioned aren't both microcontrollers, and they use C.

Why continue to double down when you clearly have no idea what you're talking about?


So are you taking back the original answer?

Those are not my words.


No.

The things you mentioned aren't both microcontrollers, and they use C.

Why continue to double down when you clearly have no idea what you're talking about?


With all respect I think there’s a kind of false dichotomy implicit in your comment.

The availability of new tools with significant advantages over the old tools is almost always a reason to consider the new tools for certain use cases, but the new tools are rarely just strictly better on literally everything, there are generally now use cases when you say “the new tool is a solid fit here” and other cases where you say “the old tool still hits the sweet spot better”.

And that’s before you consider massive existing code and infrastructure and and tooling and investment: which is very, very often a far higher order bit than C vs not-C.

A great example would be a JVM-caliber GC? That's just such a win over malloc/free so often, but it doesn’t obsolete malloc and free across the board: it gives a thoughtful and mature team a whole new set of options.

Rust would be a (comparatively) recent example of a language that hits a lot of the sweet spots of e.g. C/C++ and brings some cool new stuff to the party, and might even represent a better default these days, but the idea that it strictly crushes them in full-stop everything is a political-style conversation not a reasoned engineering tradeoff conversation.

Even C++ which has been around forever and is give or take backwards compatible with C with good tools? Hasn’t obsoleted C.

More options is generally a good thing (there are exceptions).


> I used to think "C presents the most honest representation of the low-level mechanisms of the computer", but... even this is shaky. I've been programming for almost 15 years now, and I don't think I've ever seen a computer where memory is actually a continuous array of bits sorted by memory address. The C representation of memory (and all the pointer arithmetic) is not a real representation of your hardware, and this too is an abstraction.

It's true that almost nothing works the way it's presented: the computer doesn't necessarily do the instructions you specify, it runs the machine code they were compiled into. It also doesn't necessarily even do them in the order they are specified. The memory isn't actually a big continuous space, it's mapped as virtual memory. The actual memory isn't used in that way either; there's a hierarchy of NUMAed caches between the CPUs and the actual memory.

But it's a useful abstraction. Partly because a lot of the above things are built so that the abstraction works. But also because we want it to look that way, and it's kinda natural to let programmers imagine a virtual machine that works that way.


More importantly, it's also the abstraction that the CPU itself provides, not C. It'd be neat to be able to control all those things, but that's largely impossible, so I'll take the next best thing.


C presents a fairly honest representation of the low level mechanisms of x86 Assembly. The way Assembly has drifted away from actual CPU instructions is interesting, but not something a programmer will get much benefit from trying to deal with. Itanium was an interesting experiment, but the new set of instructions did not offer large gains in practice.


>>I don't think I've ever seen a computer where memory is actually a continuous array of bits sorted by memory address.

I may be being pedantic or outright wrong (since it's been a while since I used C), but I don't think C can address memory by individual bit.

You have to read one or more bytes from memory, twiddle the bits in them using C's bitwise operators (like ~, &, |, ^ and the shifts), and then write the changed bytes back to memory at the same addresses you read them from. At least for the earlier C versions I used, this was the case, IIRC.

And to read and write those bytes, you do it via scalar variables like ints or longs, or via structs or arrays, or via pointers. Or using library functions like memset().
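
For instance, a minimal sketch of that read-modify-write cycle on a byte buffer (the helper names are just for illustration):

    #include <stddef.h>
    #include <stdint.h>

    void set_bit(uint8_t *buf, size_t byte_index, unsigned bit) {
      buf[byte_index] |= (uint8_t)(1u << bit);    /* read the byte, OR in the bit, write it back */
    }

    void clear_bit(uint8_t *buf, size_t byte_index, unsigned bit) {
      buf[byte_index] &= (uint8_t)~(1u << bit);   /* read the byte, AND with the inverted mask, write it back */
    }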


Indeed, bytes are the smallest addressable unit, which is 8 bits in most architectures. You can't address a bit, so to do anything with it you have to get the byte it's in and twiddle.


Why do programmers in 2023 need to imagine a virtual machine (basically a PDP-11 from 1970-something) at all?

You only need that abstraction if you're doing low level bit/byte bashing and I/O, or there's some chance you may run out of memory and need to handle that manually.

That applies to a tiny slice of all possible applications.

There are far more useful modern abstractions that don't need to make those assumptions.


> basically a PDP-11 from 1970-something

That PDP-11 from the seventies had ADC/SBC (addition/subtraction with carry) in its instruction set, the result of MUL was twice the size of the inputs (i.e., multiplying two ints produced a long), and DIV produced both the quotient and the remainder. None of that is visible from C and yet people keep clamoring that "C is close to the metal". Bah, humbug: while "*p++" and "*--p" idioms translate directly into an addressing mode particular to the PDP-11 — most other architectures don't have autoincrement/decrement — there is no specific support for "*++p" or "*p--" in the machine itself.


Yeah that's true, and that's why people don't use C for stuff that isn't close to the metal. If you're just serving some web page you can just think about the business logic and a higher level language will deal with the rest for you.

But someone's got to write drivers and someone's got to write the thing that connects the higher levels to the metal.


Because when you are writing drivers for MCUs, you are writing into arbitrary pieces of memory at arbitrary addresses specified by the reference manual for your MCU. And when you write 0xABCD into memory address 0xF120, your UART will throw out 0xA, 0xB, 0xC, 0xD on a pin using clocks defined by register 0xF124, which is actually a divider definition from a VCO connected to a XTAL.

No amount of abstraction in any language will isolate you from such a memory model.
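
A sketch of what that looks like in C. The register addresses (0xF120, 0xF124) are the made-up ones from this example, not a real part; a real driver takes them from the reference manual:

    #include <stdint.h>

    #define UART_DATA (*(volatile uint16_t *)0xF120u)  /* data register: pushes the value out the pin */
    #define UART_DIV  (*(volatile uint16_t *)0xF124u)  /* clock divider from the VCO connected to the XTAL */

    void uart_init(uint16_t divider) {
      UART_DIV = divider;     /* volatile: the store must really happen, in program order */
    }

    void uart_send(uint16_t value) {
      UART_DATA = value;      /* e.g. 0xABCD as in the example above */
    }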


Writing C code is fun and enjoyable. C programs are typically fast due to the use of primitives and low overhead. C's set of tools and abstractions typically forces you to think about how best to implement a particular data structure or interface, which is the kind of problem I most enjoy.

>I used to think "C presents the most honest representation of the low-level mechanisms of the computer", but... even this is shaky. I've been programming for almost 15 years now, and I don't think I've ever seen a computer where memory is actually a continuous array of bits sorted by memory address. The C representation of memory (and all the pointer arithmetic) is not a real representation of your hardware, and this too is an abstraction.

Pointers are an abstraction, but they are less abstract than most languages' assumption that there is just one giant sheet of memory to take from.


> cryptic "undefined behaviors"

It's not really that cryptic (aside from like strict aliasing, but -fno-strict-aliasing). There's some UB that might be considered unnecessary/too strict, but it still makes sense in its own right, and, if understood, is quite powerful, and leads to a bunch of neat optimizations.

> the language doesn't feel easy to use/debug

If debugging at the assembly level, stepping by instructions, it's actually quite nice (despite what everyone says about it not mapping well to hardware, in my experience there's still a pretty clear & immediately obvious correspondence between each C thing and assembly subsection, and vice versa)

> CPP macros, a universally recognized bad idea

I don't know, they're quite neat for things I have to do. Sure, a turing-complete compile-time language would be nice (I'm not saying that sarcastically, I even use a DSL for writing SIMD that is exactly that!), but it'd add a ton of complexity to mapping C source to assembly.

> Also, documentation is all over the place. If a function isn't described in `man`, I have no idea where else to actually look for it.

Use of the standard library grows less and less significant as the size of the C project grows. Besides that, cppreference.com has pretty much everything.

And yeah, as others have said, a linear sequence of bytes is still a thing every CPU presents. Yes, there's cache & whatnot, but there's like precisely no way to usefully map that to any user-controllable/visible thing, because it's pretty much not user-controllable and intended to be invisible (and varies across all hardware).


> Sure, a turing-complete compile-time language would be nice

I wrote Metalang99 [1] as a compile-time language that is able to perform loops, recursion, etc. It's not Turing-complete though, as the C preprocessor is not Turing-complete.

[1] https://github.com/Hirrolot/metalang99


> From cryptic "undefined behaviors" to . . . [the] lack of a single reference implementation of the compiler/libC, and you have a language that is harsh to defend.

I think you're confused, because this is internally incoherent.

In single reference implementation languages, all behavior is undefined behavior. Undefined behavior is just behavior for which there are no requirements imposed by the international standard. It's an unbounded form of implementation-defined behavior.

Undefined behavior does not mean that the behavior is completely unpredictable. It does mean you should read your compiler's documentation (including tweaking what happens with certain common UB). For example, if you want signed integer overflow to always wrap, and you read the GCC or Clang documentation, you'll know to use -fwrapv. If overflow could cause catastrophic failure and the program should abort if it happens (e.g., Therac-25), you'll know to use -ftrapv. There's nothing wrong with writing to an arbitrary memory address, either, if you've read your documentation and that's how your environment communicates with a particular I/O port.
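
A minimal illustration of that point: what this function does on signed overflow is decided by how you build it, not by chance (the flags are the GCC/Clang ones mentioned above).

    /* gcc -O2 -fwrapv  -> signed overflow wraps (two's complement)          */
    /* gcc -O2 -ftrapv  -> signed overflow aborts the program                */
    /* gcc -O2          -> signed overflow is UB; the optimizer may assume   */
    /*                     it never happens                                  */
    int next(int x) {
      return x + 1;           /* overflows when x == INT_MAX */
    }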


> People who still write C, honest question: Why?

Because loops are fast.

I do scientific computing, where many people use python nowadays, and a few years ago it was matlab/octave. These languages feel "cramped" because they artificially force you to program in a certain way in order to avoid loops. While such a "vectorial" notation is often useful, many algorithms are better expressed using a loop notation, and C does not impose an artificial distinction between the two notations: both are as fast as they can be. The fact that python is not an appropriate language for low-level numerical computation is evident when you notice that most numeric algorithms in python are just interfaces to code written in other languages (C, C++ and Fortran).

Of course, C is not the right tool for the job either... Modern Fortran is, objectively, the ideal language for low-level numerical computing: it has native multidimensional arrays and a lot of other goodies, which C lacks.

Julia would also be a nice alternative, and I check it regularly. But I find the current interpreter too quirky. I would love to see different interpreters/compilers for this lovely language!


C has no in-built way to deal with SIMD, which is essential for high-performance computing over loads of data. On that count alone it is already out of the game.


What are you talking about? "in-built"? Have you ever written SIMD assembly before? It's comically easy to integrate SIMD optimizations into a C program.
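
For example — not ISO C, granted, but this is typically all it takes. A sketch using x86 SSE intrinsics, adding two float arrays four lanes at a time (assumes n is a multiple of 4):

    #include <immintrin.h>    /* x86 SSE intrinsics: a compiler/vendor header, not standard C */

    void add_f32(float *dst, const float *a, const float *b, int n) {
      for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);              /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));   /* 4 additions in one instruction */
      }
    }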


Through in-built assembly, or some compiler-specific annotation. None of them is vanilla C, which was my point.


Actual "standard C" (along with most of the C stdlib) is pretty much useless for writing real-world applications, any non-trivial C code base will almost certainly use at least a handful non-standard extensions (sometimes even without knowing it) and both compiler- and platform-specific conditional code paths (just try how many libraries would compile with gcc's "-pedantic" flag, I bet it's not all that many).

This pragmatism by compiler vendors to just ignore the C standard where it doesn't make much sense, and to extend the language where it helps to solve real-world problems is actually a pretty powerful argument for C.


If you want truly high performance, architecture-generic SIMD won't get you particularly far though - the set of things that x86-64 does and doesn't support is an utter mess, and doing things well across fixed-width and variable-width SIMD architectures will require compromises on one of those quite often. (not at all to say that it's impossible, it's just quite full of asterisks that I personally think is too much to bother standardizing)


Part of what makes C touted as a 'low level language' is the relative ease of inlining assembly.
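
A sketch of the GCC/Clang flavor (a compiler extension, not part of the standard, as the reply notes; x86 only here):

    static inline unsigned add_u32(unsigned a, unsigned b) {
      unsigned r;
      __asm__ ("addl %2, %0"       /* r starts as a, then b is added in */
               : "=r"(r)           /* output: any general register */
               : "0"(a), "r"(b));  /* inputs: a shares the output register, b in any register */
      return r;
    }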


Which isn't part of the standard, and no compiler is required to support to achieve certification.


gcc has emitted SIMD instructions since the egcs days.


So does JS, Java, whatnot. That’s not the point.


I needed to improve performance of some numerical computations in an existing Python script. The only choices felt like C and Fortran.

I tried Rust at first but went back to C when I realized I was spending more time appeasing Rust than solving the actual problem, which wasn't really complicated enough to gain significant benefit from Rust's features.


I am working on a translation of a game engine from Go to C with another coder. One of our end goals is to make it easily available via WASM in a web browser.

As to why work in C - it’s incredibly fast, it feels very powerful as long as we manage memory correctly. We use fsanitize, which is an amazing library that can find memory leaks, buffer overruns, etc etc and run it on all unit tests. I think fsanitize is essential to have in your tool belt if you’re doing any C programming at all.

A pretty direct translation from Go to C resulted in the C code running at about 125% of the Go version's speed (i.e. 25% faster), and this was already very optimized Go code with no allocations. From Go to WASM the results were disappointing to say the least - WASM was about 32% of the speed of Go and not at all easy to multithread (and a gigantic file). From C to WASM I got a much better 79% of native speed - would have wanted a little bit more, but this is much more doable, and we haven’t begun to optimize some parts of this engine yet. And Emscripten seems to have very good pthread support, which I will try soon.


> So, setting aside the need to maintain 30+ year old code, what would be modern reasons to start a new project in C?

C code written today will still be runnable 30+ years from now, and likely on whatever platform you're using, unlike code written in some flavor of the month language. C is standardized, has been ported to every architecture, and is easy to port in general, and there's so much code that's already been written in it that the inertia behind it is virtually insurmountable. I've invested significant time in other language ecosystems (like Perl, coincidentally also on the front page) only to see them eventually declared "uncool" (however productive) and killed-off by faddish HN types. But I'm confident they won't have similar success against C.

C is the real Hundred Year Language: http://www.paulgraham.com/hundred.html


Extremely minimal runtime, portability, and very low overhead when compared to other languages. I have a tiny statistics daemon that scrapes /proc and sends out multicast packets, and it builds and runs on everything from ARMv5 to Xeons, barely showing up on any kind of resource meter and with an absurdly small binary size.

I considered rewriting it in Go a couple of times but just didn’t see the point.


I like making things (air quality monitors, web NFC login, automated garden, power monitor, etc.) with microcontrollers like the Raspberry Pi Pico; the only real choices are C/C++ or some flavor of Python. I really do not like Python, it rubs me the wrong way for some reason, and also I can find libraries for all the components/sensors in C/C++.

It's not so bad. Manipulating strings is a pain in the ass so everything becomes a char array, and managing types is annoying, especially dealing with functions that could easily take an int or a float: you either have to make a template or different versions of the function for each type. This makes me appreciate dynamically typed languages a lot. Those two issues are the only problems I seem to have, everything else has been easy and breezy

Besides those two things it's pretty nice. My code is a bit verbose because I'm not that great at it but I'm sure I could reduce the lines of code in my projects (the biggest one has 4000+ lines of code, but it does a lot) by using structs and more loops, but that's mostly a skill/experience issue.
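
On the int-or-float point above, C11's _Generic can sometimes stand in for the missing template; a minimal sketch (the clamp helpers are hypothetical):

    static int   clamp_i(int v, int lo, int hi)       { return v < lo ? lo : v > hi ? hi : v; }
    static float clamp_f(float v, float lo, float hi) { return v < lo ? lo : v > hi ? hi : v; }

    /* picks the right function based on the type of the first argument (C11) */
    #define clamp(v, lo, hi) _Generic((v), int: clamp_i, float: clamp_f)((v), (lo), (hi))

It doesn't scale as nicely as real generics, but it keeps the call sites clean.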


You had many answers.

You don't really start a project in C unless you target limited hardware or some low-level library that can be embedded in other things and interact with other languages that can make use of C-style APIs.

C became the "new assembly", meaning it has sort of replaced the role assembly had. The chips being sold are not programmed in assembly, because they are sold with a C compiler that targets them directly.

C is more than a programming language, it's a universal glue, so it often makes sense to use C because it gives access to everything. It's like English: you can't expect everyone to use Esperanto just because it's a superior language. Programming languages are the same.

Disclaimer: I mainly use python and C++.


Honestly because I don't want to learn another language.

And because most of the world uses C for low level stuff. You can say that Esperanto is a much better international language than English but what good does it do if nobody speaks it?


Veering off topic, there is a great rant about why Esperanto is a horrible international language: https://web.archive.org/web/20110515155117/http://www.xibalb...


Justin Rye's site is now at http://jbr.me.uk/ (and the espe-ranto at http://jbr.me.uk/ranto/ )


Quite simply there haven't been any candidates so far which both got the "essence" of C and had the momentum to actually replace C. Zig looks like the most promising so far, if they don't fuck up on their way to 1.0

(disclaimer: I switched back from C++ to C as my language of choice for writing libraries ca 2017, but also continue to write C++ (if necessary to talk to C++ libs) and a lot of Python and Typescript for simple cmdline tools and web stuff, also ObjC on Mac of course for talking to system frameworks, in recent years dabbled with Rust, Odin and Nim, in the long distant past also with C#, Java, Lisp and some Forth, and eventually hope to transition over to Zig for the stuff I currently use C for (maybe in 3..5 years?))

TL;DR: use the language that suits a problem best, and C is a very good tool to have in any language toolbox, because it can usually provide a solution where other languages have to give up or just become too much of a hassle (for various reasons)


> The C representation of memory (and all the pointer arithmetic) is not a real representation of your hardware, and this too is an abstraction.

By and large memory is a contiguous array and the C representation closely matches what is actually happening, so I am curious about which platforms you have worked on.


Tagged memory architectures don't match the C model of linear memory. They're essentially obsolete now but C is still designed to accommodate them.

A lot of the UB that people grouse about can generally be ignored because 99% of the platforms out there have the same behavior in areas where the standard is extra permissive for obsolete exotic hardware. Tagged memory is dead, 1's complement is dead, big-endian is mostly dead. All the UB associated with them is not that relevant most of the time. The downside is that people write code that takes a lot of liberties assuming behavior that the standard doesn't guarantee. A common one is unaligned access, because x86 has always been permissive about it and it took until C11 for the language to get the tools needed to manage it.
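
For the unaligned-access case, the usual portable idiom is to go through memcpy instead of type-punning a pointer; a small sketch (compilers lower this to a single load on targets that allow it, and C11's _Alignof lets you query what alignment a type actually requires):

    #include <stdint.h>
    #include <string.h>

    uint32_t read_u32_unaligned(const unsigned char *p) {
      uint32_t v;
      memcpy(&v, p, sizeof v);   /* no alignment requirement, unlike *(const uint32_t *)p */
      return v;
    }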


The UB problems have no relation to the machine behavior. UB exists only in the compiler.

The fact that many C developers keep confusing it with implementation-defined behavior gives me no confidence in their other opinions about the language.


> UB exists only on the compiler.

Sure, but UB exists so that compilers can generally do whatever's fastest on that particular architecture.

Your signed add instruction traps? Great, do that. Does one on a different architecture overflow? Fine. Just emit it and it's conformant.


Does the C machine model in fact consist of a single linearly addressable memory space? I think the spec mostly talks about "objects" that are linearly addressable -- not about the whole "memory" (there might not even be such a thing). Technically you aren't even allowed to compare two pointers other than for equality (relational comparisons are possible only within the same array). Just making up pointers is probably already a stretch of the spec, although you'll see lots of that in e.g. embedded projects.

(Disclaimer: I really don't know all the details of the C standards, am not a language lawyer but know enough about the language to feel quite productive in it. Please fill me in or correct me where I'm wrong).


People will abuse [u]intptr_t to compare addresses from different objects. There is no guarantee that the integer value stored in such variables is representative of a linear memory space and you're supposed to treat them as opaque data but most platforms permit such comparisons. All you're permitted to do is cast a pointer into those types and cast it back to the original type.


A couple points:

- CPU memory subsystems are very complex these days and represent a lot of shared mutable micro-architectural state, which makes it hard to reason about. That's not linear and the C language does not offer concepts which represent that complexity. Short of some prefetching intrinsics.

- Pretty much all memory will be virtually addressed, pushing you even further from the concept of flat linear memory.

- Pointer provenance [0] binds memory to types and allocations, which doesn't map onto the concept of linear memory where a pointer is just an offset.

[0] https://faultlore.com/blah/fix-rust-pointers/


AFAIK C's machine model is not that linear (see my other comment). On the other hand, what most CPUs offer as an abstraction (through their instruction set) is very much so.

There are a couple of arguments like that floating around and they just don't make a whole lot of sense. The C model is in fact a usable abstraction (and easy enough to peel off when required), otherwise it wouldn't have stuck around for so long. No amount of "network effects" and "free beer" arguments can discuss this away.

There is an argument that instruction sets might have developed a linear address space abstraction because of C, but I doubt it. Binding the IR closer to a specific physical layout would be very bad for portability and longevity of the code.


> the C representation closely matches what is actually happening

It really doesn't, though. Although your CPU might present system RAM as one contiguous array of bytes to your program, the C compiler follows different rules – see strict aliasing and other pointer dereference rules. For example, the following is Undefined Behavior and your C compiler may or may not generate the assembly you expect:

    int x = *(int *)0x12345678;
Your CPU would happily execute the equivalent machine instructions and load from address 0x12345678, while a C compiler is free to replace your entire program with return 0;


Casting an integer to a pointer is implementation defined, not UB.

And every sane implementation does what everyone expects because that's how memory-mapped IO works (but you probably want a volatile in there, and maybe a compiler or memory barrier as well, depending on what the hardware guarantees about the access patterns for that particular range of addresses)


> Casting an integer to a pointer is implementation defined, not UB.

You're right, that was a bad example. Here's a better one:

    int x, y;
    ptrdiff_t diff = &x - &y;
This is Undefined Behavior, because &x and &y don't point to the same object.


The original author was talking about hardware not behaving like linear memory, and other than caches and maybe some thread local tricks, I'm not sure what he meant. However, it seems pretty clear that CPUs do try really hard to make:

    mov rax, qword ptr [0x12345678]
do what you think it would/should.

And as for the C memory model, aliasing, and optimizations, I'm firmly in the camp that thinks the standards originally gave the compiler writers an inch to work on weird platforms and they've taken a mile when they work on reasonable ones. The intent of your integer to pointer cast is very clear, but it's been undefined to insanity. So now there is some variant of the following, which doesn't have UB but does the exact same thing less clearly:

    uintptr_t i = 0x12345678;
    int* p = 0;
    memcpy(&p, &i, sizeof(int*));
    int x = *p;
I'm sure some language lawyer will correct me on some obscure detail of the standard, but it could be fixed with some modification. The point to me is that using memcpy instead of pointer casts is NOT an improvement. The good compilers will generate the same code as the assembly above, so all they've done is made the C source less readable.


> The point to me is that using memcpy instead of pointer casts is NOT an improvement.

The improvement comes when there are multiple accesses that could potentially point to the same memory. Consider a silly function:

    void f(int16_t* a, int32_t* b) {
      for (int32_t i = 0; i < 100; i++) {
        b[i] = a[0] + i;
      }
    }
If type-based alias analysis is enabled, then the compiler can assume that a[0] does not alias b[i] because they are different pointer types. So it can hoist the load of a[0] outside the loop, improving efficiency. If strict aliasing is disabled, it cannot assume this, so it must reload a[0] each time: https://godbolt.org/z/E7jxfYsbx

The memcpy() makes it clear that the memory could alias anything, so it will generate the less efficient code even if strict aliasing is enabled: https://godbolt.org/z/KoPxK9fPj

Memory aliasing is a huge thorn in the side of the optimizer, because the compiler frequently has to allow for the possibility that different pointers will alias each other, even if they never will in practice. The code might end up being slower than necessary for no real reason. Strict aliasing is one of the few tools we have to tell the compiler that aliasing will not occur.

I don't think that C actually forbids this code:

     *(int*)0x12345678
The rule is just: if you access it as an int, you have to consistently access as an int. You can't mix types from one access to the next, eg:

    *(long*)0x12345678
    *(int*)0x12345678


> Strict aliasing is one of the few tools we have to tell the compiler that aliasing will not occur.

I can see the argument, but there's a much better way to indicate what you want with your example:

    void f(int16_t* a, int32_t* b) {
      const int16_t a0 = a[0];
      for (int32_t i = 0; i < 100; i++) {
        b[i] = a0 + i;
      }
    }
Now a clean (well defined) compiler could do what you asked.

I've seen other people suggest that UB is a mechanism to have these magical backdoor conversations with the compiler to express optimization opportunities. I think that's absurd and reckless. Propose adding assertions or "declare" statements instead, and quit thinking of interpretive dance through a minefield as a method of communication.


You are entitled to your opinion. C isn't perfect, but as someone who spends my life trying to optimize the efficiency and code size of critical loops to the max, I like the direction C has gone with UB and optimizations. It's not the right tool for every problem, but for the most size/speed critical code it's hard to beat IMO.


> I don't think that C actually forbids this code:

     *(int*)0x12345678
If not, give it time. It was only a few years ago when you were allowed to use a union for that kind of thing. I really believe they'll eventually make everything except unsigned integers be UB.

"Oh, the code was never correct. You just got lucky before."


If you want to load from address 0x12345678, assign it to a char pointer first. Then the cast is legal and defined.
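
A sketch of that suggestion (the integer-to-pointer cast is implementation-defined, and whether the address is actually readable is up to the platform, as discussed above):

    unsigned char read_byte_at_12345678(void) {
      volatile const unsigned char *p = (volatile const unsigned char *)0x12345678;
      return *p;    /* character-type access is exempt from strict aliasing */
    }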

Your point that C is stricter than asm of course still stands.


> and load from address 0x12345678

and most likely seg fault, or similar


1. If the CPU lacks an MMU and the address falls into an accessible address space, it won't segfault.

2. If the CPU has an MMU, it won't segfault if the address is mapped to an accessible region of memory.

3. This is beside the point, because the CPU will execute the instruction and attempt to load from that address. A C compiler might emit the load instruction, or it might assume that this code branch will never be executed and can therefore be replaced with code that sends an angry email to your mother.


Only if you ignore memory layout and understand the UB on your platform. The language does not make as many guarantees as "C is simple" folks seem to think. Throw a sanitizer at any of their code, and you'll see unaligned memory accesses all over the place.


Most modern hardware makes use of registers and multiple layers of caches, and you need a ton of UB to justify the compiler making use of them.


Registers are orthogonal to memory layout. Cache does not change the general model.


> Add to this CPP macros, a universally recognized bad idea

I don't think it is a bad idea. You can't solve language incompatibilities in the language itself. Textual macro languages solve this nicely.

CPP is what makes C and C++ work for projects aimed at multiple platforms or compiler vendors.


Yes, you can. Two approaches:

1. Multiple implementations providing a unified interface, selected by the build system. Aka the Henry Spencer approach: <https://www.usenix.org/legacy/publications/library/proceedin...>

2. Less-bad macros, e.g. cond-expand: <https://weinholt.se/articles/cond-expand-and-ifdef/>


Option 1 only works if there is a sensible unified interface, and if you feel like spending the time making that for what could be just one line per target. And it just won't work for things that don't really "have an interface", i.e. conditionally adding an __attribute__((optnone)) to a function that a specific compiler version gets stuck in an infinite loop optimizing, or macros that expand to some _Pragma-s that apply to a loop following it for controlling unrolling/vectorization if available, or managing custom inlining configurations for functions based on the optimization/debug levels, or defining a type as either 32-bit or 64-bit depending on requirements, or redefining all printf & fprintf usages to something mingw-friendly.

Many of those could be solved by some other means, but C macros neatly encompass all of those.


Because most of the projects I want to work on are in C. Postgres, the Linux kernel, lots of legacy systems stuff. All the foundations of our field are in C, so that's what I use when I want to contribute or study them.


Longevity. As sure as eggs is eggs, reasonable C that I write today will be compilable in 30 years time. Python? Breakage every couple of minor versions.


It's more that it's the most honest representation of the assembly/machine code. We can't really get closer to the hardware than the interface the CPU offers, and C then sticks pretty close to that (or a subset of it, I suppose).

It's the simplicity and power of C that I find attractive. I don't write it professionally at the moment, but I enjoy it. It's obviously not the right tool for the job most of the time for the reasons you give, but I miss its elegance.

I am a big fan of rust, but it's massive compared to C. I'd like to explore Zig some day.


> People who still write C, honest question: Why?

It was my first programming language and I still think it's a simple and fun language. Also many things have a native C interface so it's a natural choice in those cases. It's certainly not the only language I use, but for many things my default. What's nice is that I don't have to consciously think much about the language when I use it because I know it well.


It's kind of like English. English is in some sense a simple language (grammar), a poor mixture of other languages and its orthography is not good (not an elegant language), but everyone speaks it.


It’s a very simple and explicit language that is easy to write high performance code with and can be used as high level, portable assembly which integrates easily with actual assembly due to a simple and stable ABI. It compiles extremely quickly, its tooling is mature and robust, and you can write it for any platform and do basically everything with it because it is a lingua franca where almost everything has a native API that uses the C ABI.

C’s type system is lacking, I wish it was more strict, and sometimes I wish it had some features from C++ (operator overloading for mathematical types, templates for generic programming) and features of other languages (multiple return values especially), but overall I’m okay with its limitations and have become used to working around them. Sometimes I compile C code with a C++ compiler just to take advantage of stricter typing, templates, etc. but for a lot of projects this isn’t a necessity.


> I don't think I've ever seen a computer where memory is actually a continuous array of bits sorted by memory address

well, that is not the C memory model. C does not allow you to access bits in memory directly. maybe you meant bytes? or words? if so, many cpus have exactly that architecture.


> C does not allow you to access bits in memory directly.

of course it does what are you talking about?


Not the commenter you're replying to, but I suspect what they mean is that the C memory model is byte-addressable not bit-addressable. You can't point/refer to a specific bit in memory, instead you have to first read the byte and then select an individual bit using bitwise operations, much like most modern processors.


That has nothing to do with the C memory model, but with how the CPU is structured. No modern CPU has an interface for bit-addressed access as far as I am aware...

C makes no assumptions about the size of a byte


C doesn't really know about bytes. It has chars, but I believe there are some constraints on char, specifically, they have to be big enough to hold the ASCII charset. (I'm pulling real deep here, someone correct me if I'm wrong)


C11 3.6p1 byte "addressable unit of data storage large enough to hold any member of the basic character set of the execution environment"


If I remember correctly, it assumes the size of a char is greater than or equal to seven bits, and a char is defined to be the smallest addressable unit.

C does not support bit-addressing.


The width is defined as CHAR_BIT >= 8 (C11 5.2.4.2.1p1). The size, sizeof (char), is always 1.


I wouldn't consider accessibility (via masking & shifting or struct bit fields) to be on the same order as the byte-level addressing you get with pointers.


bits are not addressable in C and are thus not directly accessible.


They are also not normally directly addressable by the CPU, you'll have to do some combining and splitting with separate instructions. Some CPUs are better at this than others.


Tangent: some Arm Cortex-M class CPUs had a feature called "bit-banding" where you could do byte accesses to an area of the address map and the CPU would turn these into bit accesses to a different part of memory. So the alias word at 0x23FFFFFC maps to bit [7] of the byte of RAM at 0x200FFFFF, for example, and you can do a word write to 0x23FFFFFC to change just that bit 7, saving having to do it by hand (which is particularly awkward if you need to ensure the atomicity of the bit update).

https://developer.arm.com/documentation/100165/0201/Programm...
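
A sketch of the address math behind that: the SRAM bit-band and alias bases below are the architecturally defined ones for Cortex-M parts that implement bit-banding, but check your part's reference manual.

    #include <stdint.h>

    #define SRAM_BASE      0x20000000u
    #define SRAM_BB_ALIAS  0x22000000u

    /* alias word = alias base + byte offset * 32 + bit number * 4 */
    #define BITBAND_SRAM(addr, bit) \
      ((volatile uint32_t *)(SRAM_BB_ALIAS + (((uint32_t)(addr) - SRAM_BASE) * 32u) + ((bit) * 4u)))

    /* e.g. *BITBAND_SRAM(0x200FFFFFu, 7) = 1; writes to alias word 0x23FFFFFC,
       setting bit 7 of the byte at 0x200FFFFF as described above */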


I wouldn’t quite count it as bit-addressing, but x86, for example, can load bits directly into the carry flag using the BT instruction, which can take a register or memory address as its first argument, with the bit number given as the second.


There have been all kinds of variations on that theme. One of the nicest is 'bit test and set' as an atomic instruction, that one enables a whole raft of nice stuff.


Existing C examples from semiconductor vendors leave no room for other languages. OK, C++ is also used, but that's it. So it's a no-brainer to take the available drivers and build logic around them. That's the current state of embedded development. The client does not pay for the use of modern languages.


I target microcontroller platforms, some of which only have a single compiler, usually some patched 20 year old version of GCC. The only possible alternative to C would be something that transpiles to ANSI C, given that some of these platforms don't quite have full C99 support.


It's the only language supported by basically all platforms, microcontrollers, GPUs, web browsers. Although that is also almost true for C++ nowadays. I'm also curious which memory model would be superior in your opinion?


I use C for microcontrollers. I think Rust is making some inroads, but the libraries/tooling is not there yet.


Neither do the IDE tools, I feel. It's going to take a while, and Rust has been here for 17 years.


IDE tools like what? LSPs for Rust are on par with / better than those for C/C++, partially because the language is stricter, no #include nonsense, etc. Unlike C/C++, a sane build system and dependency management system that are universally agreed upon actually exist. What exactly is "going to take a while"?


I would assume debugger support. Rust is in a tough spot because a lot of code gets compiled away, and debuggers need to understand some Rust-isms for good experience, like enum support. I don't think this is an insurmountable situation, though.


People universally agree that replicating NPM's dependency hell was a good idea?


>it's going to take a while, and Rust has been here for 17 years.

Technically correct, but Rust was changing significantly from version to version prior to the 1.0 release some 8 years ago, notably the green thread runtime was removed.


Because I don't like "magic". I can understand the appeal of one-liners that do the work of 100 (or more) lines of C but that's just not what I like to do. I like to be in control. I don't like side effects. "Undefined behavior" is propaganda. In my 15 years of programming in C, I never had an issue with "undefined behaviors". Things I created a decade ago still run like a champ on a damn coin cell.


History.

You do what your operating system vendor does.

More than a few operating systems have a C interface. The implementation of binaries (see also application binary interfaces) depends on the operating system.

Shared libraries (e.g., DLL) are binaries, too.

C compiler developers have the ability to generate consistent[1] binary outputs.

In simpler terms, vendors of these compilers can reach a consensus on how to convert C code into binary files, known as Application Binary Interfaces (ABI).

It is not uncommon[2] to have a foreign function interface in C.

1. http://yosefk.com/c++fqa/defective.html

2. https://learn.microsoft.com/en-us/cpp/dotnet/calling-native-...


There are a few reasons:

    The Lindy effect.
    You can run C code from anywhere.
    There are places where it is much easier and better to run C code than anything else.
All of these are related to how long C has been around. I think that's also the reason why we use JavaScript extensively.


C just feels good to read and write. Every other language suffers from not being C.


If you're doing anything in embedded systems/hardware, expect to be using C. Yes, Embedded Rust and MicroPython are a thing now, but if I need to work with any partner or customer I'll be in a world of pain, because 99% of that industry uses C. My customers start new projects in C every other week. If you need to be Processor independent, Portable, Performant, have access to Bit manipulation, and need direct control of the Memory management, along with a massive ecosystem, C is almost the only option.


For me, I use C because it's the de facto system programming language for both Linux and Windows. Another reason is that the C grammar is simple (but has a lot of quirks, I do admit).


Professionally I do Python. From my experience the breakages occur due to an over-reliance on libraries to do trivial tasks. Do you find a different case?


Unfortunately I'm not professional enough to answer this question. I use C to learn system programming only and I never had the capacity to look at the kernel.


C can be written much more safely as long as I don't code in 'odd' ways, e.g. trying to be really smart with it. By following common-sense coding rules it seems pretty safe to me.

Like it or not, C might still be the most widely used language after 50 years. It will not go away; instead, future AI code review tools, static analyzers and more powerful compilers will evolve fast to keep C safe and alive. The price of replacing it would be much higher in practice - it might simply be impossible.


Because there's no easier way to access various libraries.

Yes, that library does things with pointers the new language can't prove are safe. It's been used for longer than you've been alive and it isn't changing. If a new language can't express what it's doing, well, the library isn't going to move, the language is. Therefore, I either have odd shims and contortions or I have C.

I await a Buzz Language to eventually have "inline C" the way C has inline assembly.


I don't write C as much as i used to but i still write a lot of it, including new code. The reasons are:

1. C is relatively simple. Sure, not as simple as it could be (e.g. compared to something like Oberon-07) but in the grand scheme of language things, it is far on the simpler side of the spectrum. I can write a C parser relatively easily if i want to, for example (and at some point years ago i did that to transpile a C project to C# to run under Sony's PSM platform, which was based on Mono and allowed only C#).

2. Undefined behavior is annoying, as it can break previously working code with newer versions of the same compiler (though language lawyers playing word games, like claiming the code was already broken, are way more annoying - the code did the thing i wanted previously, so as far as i am concerned it was not broken). Still, aside from "obvious" things (accessing invalid memory), i can probably count on my fingers the times i've encountered it in practice (i write "probably" because right now i can't remember any case, but i've been writing C for more than 20 years). Valgrind and UBSan help with these, so they are not much of a practical concern. (See the sketch after this list for the kind of breakage i mean.)

3. I find CPP macros to actually be very useful and a feature that a) i'd actually like to see expanded instead of being stuck in the 80s (let me store some state or have a loop, FFS) and b) i wish were available in other languages too (Free Pascal, which i also use, does have some C-like macro support, which is more than what you'd find in most other languages, but still not to the same extent as C). D's mixins, being essentially ubermacros, are something i liked about that language, but sadly its stance on breaking things is what kept me away from it.

4. A C compiler is available on pretty much everything that can compute things - or at least on pretty much everything i might think of targeting with C anyway (and chances are there are multiple C compilers instead of just one). If not, i could probably write a compiler myself - it'd be rather simple and not that great, but i'd be more likely to finish it than a compiler for some other language.

4b. Very related, so it gets a "4b" instead of 5 :-P, but there are a bunch of IDEs and editors that "understand" C. I like IDEs, i like syntax completion, i like semantic highlighting, i like being able to easily rename an identifier, etc, and C being easy to parse (see #1) means it has a lot of those. Let me correct that, i don't "like" IDEs, i love IDEs.

5. Most modern computers might not technically work the way C presents them, but they're close enough that any differences only matter if you're trying to perform microoptimizations on your microoptimizations - at which point you'd most likely be using a combination of compiler-specific heuristics and assembly code anyway.

6. On most systems where that'd be a concern, the C ABI is pretty much stable, or at least there is a stable C ABI, allowing any code written in C to be usable by other languages, and allowing shared libraries to expose an ABI that remains backwards compatible and usable by other languages. Of course other languages can do that too, but they pretty much always do it through a C-fication of their APIs.

7. C compilers - even those that perform a dangerous (see #2) number of optimizations - tend to be very fast. I hate waiting for the computer to finish doing things, so i tend to prefer languages with fast compilers.

8. While i don't (always) need to maintain 30+ year old code, i do have existing C code that (seemingly, see #2) works, and i don't see a reason to waste time rewriting that code in some other language. Even if it were broken, chances are it'd be faster to fix than to rewrite.

9. I am comfortable with C. For me, being comfortable with a language is important because it lets me focus on the thing i'm trying to use the language for instead of the language itself.

There might be other stuff i forgot, but the above should give you an idea why i personally write C. Though note that i don't see it as any sort of perfect language - there are a lot of things i'd like it to do better, including the type system you mentioned as well as the compile-time code evaluation i wrote about above, be it via CPP or by some other means - but it is good enough.
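
A minimal sketch (my own illustration, not from the list above) of the kind of breakage #2 describes: this overflow check relies on signed overflow, which is undefined, so a newer optimizing compiler may legally assume it can never be true and delete it.

  #include <limits.h>

  /* UB when x == INT_MAX: at -O2 the expression may be folded to 0,
     silently removing the check that "worked" with older compilers. */
  int will_overflow(int x)
  {
      return x + 1 < x;
  }

  /* The well-defined way is to test before doing the arithmetic. */
  int will_overflow_safe(int x)
  {
      return x == INT_MAX;
  }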


About 3., there is already a preprocessor loop proposal for clang[0], and I think I have also seen something like it in some obscure Qualcomm Bluetooth chip compiler.

[0]: https://discourse.llvm.org/t/rfc-new-preprocessor-macro-dire...


> let me store some state or have a loop

This is arguably already possible, and was possible even before C99 added variadic macros, although the code is a bit cumbersome to write.
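
For instance, here is a minimal sketch of preprocessor "iteration" built on variadic macros (the MAP_* and DECLARE_FLAG names are invented for illustration):

  #define MAP_1(F, x)      F(x)
  #define MAP_2(F, x, ...) F(x) MAP_1(F, __VA_ARGS__)
  #define MAP_3(F, x, ...) F(x) MAP_2(F, __VA_ARGS__)

  #define DECLARE_FLAG(name) int name;

  /* Expands to: int verbose; int debug; int trace; */
  MAP_3(DECLARE_FLAG, verbose, debug, trace)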


It is technically possible in that C macros are supposedly Turing complete, but i mean i want something like being able to add a value to a variable, iterate through values (a proper list would be neat but i'd be ok with a string of space-separated values), etc.


> It is technically possible in that C macros are supposedly Turing complete

It isn't Turing complete, because it will always terminate, but you can make the execution time (number of execution steps) arbitrarily large - exponential with respect to the number of source lines.

There are a few libraries that implement that.

https://github.com/rofl0r/chaos-pp: A quite high-level implementation that supports arbitrary-precision decimal arithmetic.

https://github.com/camel-cdr/boline: Mine implements 8/16/32/64-bit arithmetic and low-level control flow.

You can very often get away with using unary numbers and/or constant expressions to work around the limitations without needing a library.
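
As a tiny sketch of what that can look like without a library (the CAT/INC names are made up), "arithmetic" can be built from token pasting plus a lookup table of results:

  #define CAT(a, b)  CAT_(a, b)
  #define CAT_(a, b) a ## b

  #define INC(n) CAT(INC_, n)
  #define INC_0 1
  #define INC_1 2
  #define INC_2 3
  #define INC_3 4

  /* INC(2) expands to 3; INC(INC(2)) expands to 4, because the CAT
     indirection forces the argument to be expanded first. */
  enum { BUF_COUNT = INC(INC(2)) };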

Got any problem in mind? I've got some time on my hands to problem solve.


> It isn't Turing complete, because it will always terminate

Well, i wrote "supposedly" because i didn't try it myself but found a post[0] that claims it is. The example is even about making loops.

But the point is that these are not only way too hacky, they also slow down compilation. I did use some of my own preprocessor hacks in the past, when i wanted to do some fancy stuff to implement an RTTI system that allowed automatic serialization of structs with nesting and references. While it worked (x-macros FTW), it was cumbersome and slowed down compilation so much that in the end i found it both much simpler and faster (in compilation time) to replace a ton of preprocessor macros with a code generator and a couple of #includes that pulled in the generated code.

[0] https://stackoverflow.com/questions/3136686/is-the-c99-prepr...
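
For anyone unfamiliar, a rough sketch of the x-macro pattern mentioned above (the field list and names here are invented): the same list of fields expands once into a struct definition and once into serialization code.

  #include <stdio.h>

  #define RECORD_FIELDS \
      X(int,   id)      \
      X(float, score)

  struct record {
  #define X(type, name) type name;
      RECORD_FIELDS
  #undef X
  };

  static void print_record(const struct record *r)
  {
  #define X(type, name) printf(#name " = %g\n", (double)r->name);
      RECORD_FIELDS
  #undef X
  }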


Many people still write C because tons of crucial software, probably things you use every day, are written in it and that software needs to be maintained and improved.


It's an old wartime friend: when battle breaks out, we both know how to shoot and be effective at it - at the enemy as well as at our own feet.


Small binary and a toolchain that's small and older than most programmers using it, and known to be bug-free. Top tool if you want to write something that a real human being can "get under the hood of" and understand throughout.

As for low-level, sure, that's no longer the case. It was a low-level language for K&R and their PDP-11, where they could tell precisely what the assembly code would be for each line of their C code and how many CPU cycles it would take. That's no longer the case indeed.


> toolchain [...] known to be bug-free

You cannot be serious. "Well known list of bugs" would be more in line with the state of affairs.


Because FreeRTOS is written in C.


The last time I wrote C was for my OS class


Sometimes safety comes at the cost of performance.


I want to read this. I'm also interested in the story of how SHA-0 was developed internally in the NSA if anyone has any idea.

I'm a little surprised at folks' incredulity that C is used "nowadays". To me, there will always be a place for C. If you look at:

- the bulk of the internet traffic

- the bulk of the base OS systems and libraries it is running on

I'd say 99.9% of that is written in C. Hence my surprise.

I know Linus now embraces Rust in the Kernel or whatever and I'm not disparaging that, just seems obvious that C, old as it may be, is still highly relevant. C is a graybeard. Give it a break, right? :)


Don't forget embedded/IoT/automotive etc… The use of C will even increase as every square meter of the globe is littered with small connected devices.


Exactly! Tho I don't know much about that stuff so I thought, maybe there's a possibility they all be doin trend thangs like bein Rustaceans etc or Gophers even (if that's a thang still?)


This might be missing a (1993) tag.

Interestingly, some might print the same title about JavaScript.


Never underestimate the value of being at the right place at the right time (and having _something_ to offer).


So the conclusion is, Rust will never be a success because everyone loves it?


Sure, that's the critique implied by C++ programmers quoting Bjarne: "There are only two kinds of languages: the ones people complain about and the ones nobody uses"

Except, we do complain about Rust (e.g. I think narrowing conversions should require TryInto or a specific call, not just 'as', and I don't think String impl Add<&str> is a good idea) it's just that we think the other options are far worse.


The intention is that, in 2023, the original quote should be treated responsibly.

"A programming language like C, which is quirky and flawed, should NOT be an enormous success", instead.


When I can achieve my target 3 times faster using Go, why should I waste my time following "functional ways" just to write project A?


Rust is not "functional ways", though, and Go is certainly not devoid of annoyances (subjectively).

Really wish people would stop having language wars and realize that languages are tools for a job. C is like a flathead screwdriver, C++ is like a Phillips, Rust like a Torx and Go like a hex key. You can probably use a flathead or even a Phillips on the Torx or hex screws, but you probably shouldn't, because they're not the right tool for the job.

I love Rust. I use Rust often. I choose it over C or C++ these days for a number of reasons. But I'm not going to write some throwaway little one-off scripts in Rust. I'd probably choose Python if I need to crunch some data, or Node (javascript) to do some quick I/O related tasks.

I'll use a makefile when windows compatibility and graph evaluation speed aren't important, and a shell script when compatibility and graphs aren't important at all.

And so on. The sooner people begin to realize this, collectively, we can stop having these silly "X language is better than Y" discussions.

C has its place. It's simple (quirky, but simple), doesn't take on a philosophy, and has a very, very wide set of compatible toolchains. That is the reality. Perhaps C-like's will take off and replace it (e.g. Drew Devault's Hare[0]) to get us away from most of the quirks, but that probably isn't happening any time soon.

In the same way that "putting ChatGPT in front of a computer-enabled machine gun is irresponsible", so is using unsafe languages in cases where you absolutely cannot afford a security risk without some sort of safeguards, verifiers, etc. and just good ol' fashioned "good engineering". And even in "safe" languages this is often hard to achieve, so it often comes down to the engineers anyway - not the language.

I feel like we have beaten the "C sucks" horse to death so many times that we could extract oil from it at this point. What is the goal with such discussions in 2023?

[0] https://harelang.org


>Rust is not "functional ways", though,

Rust carries a lot of design decisions heavily influenced by ML and OCaml. The type system and exception handling implement things that look a lot like monads, but with a bunch of boilerplate code thrown in due to the imperative execution. That sort of syntax turned me away from the language, and probably would for a bunch of other people who don't care for that kind of language design.

I struggle to understand a use case for Hare compared to the other C-sequel type languages.


>Really wish people would stop having language wars and realize that languages are tools for a job. C is like a flathead screwdriver, C++ is like a Phillips, Rust like a Torx and Go like a hex key. You can probably use a flathead or even a Phillips on the Torx or hex screws, but you probably shouldn't, because they're not the right tool for the job.

There are far more languages than types of jobs though. If it wasn't for language wars, how would you ever decide which one to use?? :)


To add to this, with a flathead screwdriver you can open a beer bottle, kill someone, make holes in a wall, and do all sorts of other useful things.


I'd probably prefer the Phillips to kill someone, but I've never killed anyone, so what do I know :D


I invoke my rights under the 5th…


It will depend on your goal. Do they produce the same output?

Or just comparable results?

Edit: oh, what are the chances - the neighbouring submission (now 1st, was 2nd) is "How small is the smallest .NET Hello World binary". Suggests the outputs don't overlap.


everyone?


Added


everyone should learn c/c++.

you need it when exploring the performance ceiling of a workload. without an understanding of the performance ceiling, you can't design a system well.

if you want to approach that ceiling[1], you need to implement in c/c++. if not, it's better to have chosen not to than to have been forced not to.

between ccls, clangd, and clion, tooling is fantastic now. it’s a great time to start.

1. https://github.com/nathants/bsv


What exactly is "c/c++"? Do you mean C and C++? Though they're obviously closely related, they're two distinct languages.


C for utilities and your OS, C++ for your video game.


These men were/are giants in the field.

But look at how simple the language was in the beginning. A handful of concepts, strung together. Incremental changes; not all were deemed best in hindsight.

You can fit what is going on in 16K in your head.


""As we said in the preface to the first edition, C "wears well as one’s experience with it grows." With a decade more experience, we still feel that way."" -- Brian Kernighan


When the inventor of a language can calmly tell you it is quirky and flawed, it immediately shows how enormously successful it is.


i had read this before, and it has been posted here multiple times, but somehow i had missed:

> Thus the core C language escaped nearly unscathed from the standardization process, and the Standard emerged more as a better, careful codification than a new invention.

which made me grin.


In 1993 the unintended consequences of X3J11 “undefined behavior” had not yet emerged, nor had the reality that future Standard committees would double down rather than fix it.

If anyone had realized what UB would turn out to mean, it would not have been invented that way. Dennis Ritchie's 1988 comments on the `noalias` proposal match exactly: “the committee is planting timebombs that are sure to explode in people's faces”; “a license for the compiler to undertake aggressive optimizations that are completely legal by the committee's rules, but make hash of apparently safe programs”, and consequently: “[It] must go. This is non-negotiable. [...] The concept is wrong from start to finish. It negates every brave promise X3J11 ever made about codifying existing practices, preserving the existing body of code, and keeping (dare I say it?) ‘the spirit of C.’”


I don't think it was even possible to predict what "undefined behavior" would turn out to mean, because the modern interpretation clearly does not come from the standard's text.

The standard describes the old interpretation of "if you do that, you get the consequences". The fact that somebody came up with a reinterpretation that brings a completely unreasonable meaning, which just happens to be compatible with the text, is entirely on the people doing the reinterpreting. Nobody preemptively disavows unreasonable interpretations when writing something.


I wonder how much the pile of undefined behavior contributed to C's success. I can imagine that vendors picked C over alternatives due to C giving them more freedom with their implementation.


What you are thinking of there is perhaps implementation-defined behavior, which is distinct from undefined behavior; and the sequence was the other way round - hardware with different behaviors already existed, and not specifying them in the language allowed C to succeed because it wasn't tied to a particular machine.

The answer is different for different kinds of undefined behavior, but spatial memory safety violations are basically always possible in a language you can write an OS in, since you need to convert from hardware buffers to higher-level types. Temporal memory safety wasn't possible to enforce at the time in a low-level language; it's taken decades for it to be implemented in a mainstream non-garbage-collected language. Integer overflow is still not caught by default even in Rust, for efficiency reasons (it would take all the processor vendors implementing an efficient way of catching it).
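
To illustrate the distinction drawn above, a minimal sketch of my own (not from the comment):

  #include <limits.h>
  #include <stdio.h>

  int main(void)
  {
      int x = -7;
      /* Implementation-defined: right-shifting a negative value has a
         result, but each implementation documents its own choice. */
      printf("%d\n", x >> 1);

      /* Well-defined: unsigned arithmetic wraps modulo 2^N. */
      unsigned u = 0u - 1u;
      printf("%u\n", u);

      /* Undefined: signed overflow; the standard places no requirements
         on what happens, so this stays commented out. */
      /* int y = INT_MAX + 1; */

      return 0;
  }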


> Integer overflow is still not caught by default even in Rust, for efficiency reasons

Honestly, I don't think that was the right default, but it is configurable at the project level for release builds. If I were deploying tools, I would certainly enable the checks, just like the Android team does.


I'd say that's more of an annoyance than a feature. The reason vendors use C is because it is there, comes with compilers, ides, tools, operating systems, etc. that will all work out of the box as soon as they get the compiler going on a new hardware platform. Just a critical mass of stuff that they need that conveniently is right there. And when you need to add just a few tiny things, you are going to stick with what's right in front of you instead of rebuilding all of that from scratch.


Isn't it the other way around? Vendors had already picked C and they wanted to claim compliance with the then brand new C89 standard, so anything they didn't want to give up on was deemed "implementation-defined behavior" and anything they couldn't give up on[0] was deemed "undefined behavior".

[0] because their machines worked very differently from the others' machines


That doesn't really explain how stuff like not ending a source file with a newline is UB.


I just wish C had better compilers

Errors produced by current mainstream compilers are terrible.

"Unresolved symbol" being the best they can do is some joke


Uh? gcc (11) produces, e.g.:

  /usr/bin/ld: main.o: in function `main':
  <path>/cptutils/src/xycpt/main.c:148: undefined reference to `xycpt'
You get the file, the line, the name of the missing symbol ...


Everything looks fine if you use hello world examples

Yet when reality kicks in, you can spend 15 minutes trying to figure out what the hell is going on.


Not a hello world example, I renamed a function in a program which is part of a 13K LoC project to get that error message.


that's the linker, not the c compiler


Is this important? I treat the compiler as the *full* toolchain.

I run a compilation and expect sane error messages - saying "oh, it's because of the linker, yada yada" doesn't solve my problems, nor is it a valid excuse.

Other language (compilers) do better.


> I treat the compiler as the full toolchain

well, you may do that, but it doesn't make it so. the linker has much less information to go on than the compiler - basically (depending on how you compiled) machine code. actually, the GNU linker does quite a good job, given what it has to work with.

> Other language (compilers) do better.

a language is not a compiler


>well, you may do that, but it doesn't make it so. the linker has much less information to go on than the compiler - basically (depending on how you compiled) machine code. actually, the GNU linker does quite a good job, given what it has to work with.

but it was their decision to split the tools like that, wasn't it?

nothing technically prevents you from modelling it in such a way that you have access to the data you need, right?

You probably don't even need to have a linker at all.

>> Other language (compilers) do better.

>a language is not a compiler

I wrote "compilers" in parentheses.

Also, since "language" has two definitions

1. syntax

2. whole ecosystem

Then it is valid either way


all compiled language systems must support a separate link stage, if they are to be of any practical use


But must all of them have low-quality error messages too?


as i said, the linker only has so much information - whatever is supplied by the (possibly various) compilers and/or assemblers. which is minimal. so a linker cannot produce very accurate error messages. i'm afraid you are going to have to live with this, and understand it.
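
To make that concrete, a minimal sketch (hypothetical file and function names): the compiler happily accepts a declaration with no definition, so the mistake only surfaces at link time, by which point all that is left is an object file with an unresolved symbol name.

  /* main.c */
  void helper(void);   /* declared here, never defined in any object file */

  int main(void)
  {
      helper();
      return 0;
  }

  /* cc -c main.c   -> compiles without complaint
     cc main.o      -> ld: undefined reference to `helper'  */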


What error message do you expect the linker to produce when there is a reference to a symbol that it doesn’t find in any of the object files provided?



