The standard POSIX specification describes the vi line editing mode for the shell command line. They tried to add an emacs mode but... [1]
"In early proposals, the KornShell-derived emacs mode of command line editing was included, even though the emacs editor itself was not. The community of emacs proponents was adamant that the full emacs editor not be standardized because they were concerned that an attempt to standardize this very powerful environment would encourage vendors to ship strictly conforming versions lacking the extensibility required by the community. The author of the original emacs program also expressed his desire to omit the program. Furthermore, there were a number of historical systems that did not include emacs, or included it without supporting it, but there were very few that did not include and support vi. The shell emacs command line editing mode was finally omitted because it became apparent that the KornShell version and the editor being distributed with the GNU system had diverged in some respects. The author of emacs requested that the POSIX emacs mode either be deleted or have a significant number of unspecified conditions. Although the KornShell author agreed to consider changes to bring the shell into alignment, the standard developers decided to defer specification at that time. At the time, it was assumed that convergence on an acceptable definition would occur for a subsequent draft, but that has not happened, and there appears to be no impetus to do so. In any case, implementations are free to offer additional command line editing modes based on the exact models of editors their users are most comfortable with."
#embed is a great feature and should have been added long ago.
Knowing how much memory your numbers take up is important for many applications, so I find things like "auto i = 5" to be questionable.
Fancy compound literals seem like a solution in search of a problem.
Removing ancient unused misfeatures is good.
I don't have strong feelings about the rest. But I think people are reacting to the process more than the specific features. There's always a good reason for new features -- that's how feature creep works. Over time, adding a few features here and a few features there is how you go from a nice, simple language like C to a gargantuan beast like C++. C has around 35 or so common keywords, almost all of which have a single meaning. C++ has many more keywords (80+?), many of which have overloaded meanings or subtle variations in behavior, or that express very fine distinctions -- const vs. mutable or static/dynamic/const/reinterpret_cast, for instance. All of this makes C++ a very large language that is difficult to learn.
In a world of constant feature additions and breaking changes in programming languages, C is largely the same language it was in 1989. For some applications, that's a good thing, and there isn't another language that fills that niche.
> Knowing how much memory your numbers take up is important for many applications, so I find things like "auto i = 5" to be questionable.
Automatic variables[0] don't take up any defined amount of memory. They certainly don't take up exactly sizeof(i) memory.
Struct members and global variables are more likely to do what you say; in that case it will either not be allowed or will be sizeof(int). Conversely, `long i = 5` is two different sizes (long vs int), which could be a latent bug.
unreachable()? No. The optimizing compiler can f*#k up my code if you combine it with unintended undefined behaviour. I just use assert(0) in debug builds. Not kidding.
Unreachable is a quite important optimization hint (note how the 'blub()' function removes a range check because of the unreachable in the default branch):
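A minimal sketch of what that looks like (the name 'blub' comes from the comment above; the table and values are made up): the unreachable() in the default branch tells the compiler x can only be 0, 1, or 2, so it may index the table without emitting a range check.

```c
#include <stddef.h>  // C23 defines unreachable() here

#ifndef unreachable
#define unreachable() __builtin_unreachable()  // pre-C23 gcc/clang fallback
#endif

// With the unreachable() in the default branch, the compiler is allowed
// to assume x is always in 0..2 and drop the bounds check on the table.
int blub(int x) {
    static const int table[3] = { 10, 20, 30 };
    switch (x) {
        case 0: case 1: case 2:
            return table[x];
        default:
            unreachable();  // promise: x is never outside 0..2
    }
}
```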
And you can easily do a macro check and define a custom thing that's either assert(0) or unreachable() depending on the build type. But you still need unreachable() to exist to do that. (and under -fsanitize=undefined you get an error for unreachable() too)
I'd rather not have a basic feature be put behind needing to define an arbitrary NDEBUG; having to define your debugging setup around NDEBUG would not fit some things - e.g. in my case I'd end up having to always define NDEBUG, and continue with my own wrapper around things. (with <assert.h> you have the option of writing your own thing if you don't like its NDEBUG check, which is what I do; with unreachable(), if you literally cannot get its functionality other than NDEBUG, you're stuck with NDEBUG).
unreachable() is just the standardized form of __builtin_unreachable() (gcc/clang) and __assume(0) (MSVC).
I often have a macro called UNREACHABLE() that evaluates to assert(0) or __builtin_unreachable() depending on NDEBUG.
It improves the generated code a bit.
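A sketch of such a macro, assuming gcc/clang builtins (MSVC would use __assume(0) instead); the sign() function is just a made-up usage example:

```c
#include <assert.h>

// Trap loudly in debug builds, act as an optimization hint in release builds.
#ifdef NDEBUG
#define UNREACHABLE() __builtin_unreachable()
#else
#define UNREACHABLE() assert(0)
#endif

int sign(int x) {
    if (x > 0) return 1;
    if (x < 0) return -1;
    if (x == 0) return 0;
    UNREACHABLE();  // all cases covered above
    return 0;       // never reached; silences missing-return warnings
}
```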
One trick is to define ASSERT() as a wrapper around assert(), or as something like:
#define ASSERT(x) do { if (!(x)) unreachable(); } while (0)
This is a really nice way to tell the compiler about invariants -- and generate better code (and better warnings!).
There are no fuck ups involved. None.
constexpr is great because it reduces the need for macros. There are three features that make almost all macro use unnecessary. They are enums, inline, and constexpr. Good enough inline support has only really been available for a few years -- by "good enough", I mean "doesn't slow down CPU emulator code".
C doesn't have that version of constexpr. In C2x, constexpr is just a way to define constants, like
constexpr unsigned long long kMyBigNum = 0x1234123412341234ull;
Previously, you had to #define. Using enum causes problems when it doesn't fit in an int. And const doesn't mean the right thing:
const int kArraySize = 5;
void MyFunction(void) {
int array[kArraySize]; // NO!
}
The above function will work if you have VLAs enabled, or if your compiler specifically allows for it. It's nice to have a standardized version that works everywhere (VLAs don't work everywhere).
Type inference only makes the code harder to read. You end up doing mental compiler work when you could just write the damn type.
And to the people who say "I just hover the variable in my IDE": that doesn't work in a terminal with grep, you can't hover over a diff file, and not even GitHub does the hover thing.
Combine that with the implicit type promotion rules of C. Have Fun.
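A small sketch of that combination (the function is made up): the usual arithmetic conversions promote both unsigned shorts to int before the subtraction, so auto deduces int, not an unsigned type.

```c
int promoted_diff(void) {
    unsigned short a = 1, b = 2;
    auto d = a - b;  // a and b promote to int, so d is int with value -1
    return d;        // an unsigned short result would have wrapped to 65535
}
```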
> I think the loop iterators are the biggest user-visible thing, but
> there might be others.
Also interesting:
> Of course, the C standard being the bunch of incompetents they are,
> they in the process apparently made left-shifts undefined (rather than
> implementation-defined). Christ, they keep on making the same mistakes
> over and over. What was the definition of insanity again?
- if it's defined, then people will rely on it, which means UBSan can't report it as an error if it sees it.
- IIRC x86 defines the overflow behavior differently for scalar and vector ints, so x86 compilers that want to do autovectorization would probably leave it undefined.
C's original sin here is that the numeric promotion rules are wrong ('unsigned short' should not promote to 'int'!) but as long as you can't fix that, you can't get rid of UB and still be performant.
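The classic instance of that trap: multiplying two 16-bit unsigned values on a platform with 32-bit int promotes both operands to signed int, so the multiply can overflow, which is undefined behavior despite the unsigned operands. A hedged sketch of the usual workaround:

```c
#include <stdint.h>

// Without the cast, x * x is a signed int multiply on common platforms,
// and 65535 * 65535 overflows int (undefined behavior). Casting one
// operand first keeps the whole multiply unsigned.
uint32_t square_u16(uint16_t x) {
    return (uint32_t)x * x;
}
```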
C syntax is already too rich and complex. What it needs is not more but less and some fixing:
Only sized primitive types (u8/s8...u64/s64, f32/f64); typedef, enum, switch, all but one loop keyword (loop{}) should go away; No more integer promotion, only compile-time/runtime explicit casts (except for literals, and maybe void * pointers); explicit compile-time constants; no anonymous code block; etc.
That said, if risc-v is successful, many components will be written directly in assembly, and trash-able code will be written in very high-level languages with risc-v assembly-written interpreters (python/perl5/lua/javascript/ruby/php/haskell/tintin/milou).
The variable-size int, unfortunately, made a lot of sense in the early days of C. On processors like the x86 and 68000, it made sense for int to be 16-bit, so you don't pay for the bits you don't need. On newer systems, it makes sense for int to be 32-bit, so you don't pay to throw away bits.
The variable-sized word made more sense when writing code to work across machines with 16-bit and 18-bit words or 32-bit and 36-bit words. This is also why you get C99's uint_least32_t and friends, so you're not accidentally forcing a 36-bit machine to have 32-bit overflow behavior everywhere.
Before the mid-late 1990s, programmers rarely needed to worry about the difference in size between 32 and 36 bit words.
Problem is simple—there are still systems out there like that, and people are still buying them and writing C code for them. They're just in the embedded space, where day-to-day programmers don't encounter them.
If those systems are still maintained, then they could correct their legacy C code to "fixed-C" (which should not be that much work in real life anyway).
It would be possible to do a quick-and-dirty job with preprocessor definitions. The real issue is when you want to write "fixed-C" with a legacy compiler: you realize that it does so many things without telling you that you would need a really accurate warning system to catch all of them. I was told gcc can report all integer promotions, true?
You shouldn't fix it by making users choose what bit width their ints are. That's not the right answer for performance (they don't know what's fastest) or correctness (the only choices are incorrect).
If you have a variable whose values are 0-10, then its type is an integer that goes from 0 to 10, not from 0 to 255 or from -127 to 127.
> Only sized primitive types (u8/s8...u64/s64, f32/f64);
C's success came from portability, so that would have killed it. Certainly you need fixed-size types occasionally to match externally-defined structures (hardware, protocols) but if you write u8 loop counters and u32 accumulators you're screwed on a DSP with only u24.
> That said, if risc-v is successful, many components will be written directly in assembly
There are already too many RISC-V extensions for code to be portable between different RISC-V chips without using a higher-level language.
> Certainly you need fixed-size types occasionally to match externally-defined structures (hardware, protocols) but if you write u8 loop counters and u32 accumulators you're screwed on a DSP with only u24.
This "portability" argument always rings hollow. How often are you actually reusing code between a DSP and a desktop? When real sizes don't matter, but just a minimum range, there's `(u)int_least8_t` (could be renamed as `u_min8`). On a DSP with only, say, 16-bit "bytes" (like the TMS320), that would internally be a `u16`.
C is not a "portable assembler" anymore. That's a myth. Almost no one writes C in a portable way. Every library or program has their own version of <stdint.h>. glibc, for example, uses gu8 for `unsigned char`. Put that on your 24-bit DSP, and `gu8` just became the same as `gu16` and the nonexistent `gu24`.
C is a language designed around a PDP-11 and inherits that legacy baggage. The C committee, which refuses to accept this reality for "purity" reasons, holds back the language.
Yep, that's why you would have had a "fixed-C" compiler with an explicit u24 and "portability", if it has any meaning here, would have to be done at the user program level.
The C committee is just adding stuff, to make even a naive C compiler more and more costly, which kills many "real life" alternatives, and in the end, does promote planned obsolescence more than anything else.
We have to remove stuff from C, not add stuff to C, make things more explicit and not more implicit (the abomination of the integer promotion...).
The stdlib has no business being part of the C syntax, even though not having memcpy and co. internal to the compiler feels really meh on modern CPUs.
Most of the C related cost of a new* compiler seems to come from declarations (they are a lot nastier to figure out than they look) and the preprocessor. The rest of the language doesn't seem to be that expensive to implement.
And then there is the other stuff: calling conventions, per-CPU code generator, general IR optimizations. This can be very cheap if we accept poor quality of the generated code.
I wish the C Standard Committee stopped smearing all that C++ bullshit into C, now that many of the C++ people who promoted those features are abandoning the ship.
It's what you get when your C compilers are implemented in C++.
Why "bullshit"? I looked at the article, and everything looks extremely reasonable, and desirable in C.
* nullptr: fixes problems with eg, va_arg
* better enums: Who doesn't want that? C is a systems language, dealing with stuff like file formats, no? So why shouldn't it be comfortable to define an enum of the right type?
* constexpr is good, an improvement on the macro hell some projects have
* unprototyped functions removed: FINALLY! That's a glaring source of security issues.
Really I don't see what's there to complain about, all good stuff.
nullptr is an overkill solution. The ambiguity could have been solved by mandating that NULL be defined as (void*)0 rather than giving implementations the choice of (void*)0 or 0.
Are there any mainstream implementations where NULL is not typed as (void *)? That seems like a choice that would cause so many problems (type warnings, va_arg issues), I wonder why anyone would do that.
That would have been my preference as well. Either force it to be (void*)0 or, maybe, allow it to be 0 iff it has the same size and parameter passing method.
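A sketch of the va_arg hazard being discussed (the counting function is made up): a variadic callee that reads a pointer sentinel breaks if the caller's NULL expands to a bare int 0 on a platform where int and pointers differ in size, whereas nullptr, or a (void *)0 NULL, always passes a pointer.

```c
#include <stdarg.h>
#include <stddef.h>

// Counts the string arguments up to a null-pointer sentinel. If the
// sentinel were passed as a bare 0 (an int), va_arg would read garbage
// on platforms where sizeof(int) != sizeof(void *).
int count_args(const char *first, ...) {
    int n = 0;
    va_list ap;
    va_start(ap, first);
    for (const char *p = first; p != NULL; p = va_arg(ap, const char *))
        n++;
    va_end(ap);
    return n;
}
```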
-constexpr is not anything like constexpr in C++.
-It makes no guarantees about anything being compile time.
-It in no way reflects the ability of the compiler to make something compile time.
-It adds implementation burden by forcing implementations to issue errors that do not reflect any real information. (For instance, you may get an error saying your constexpr isn't a constant expression, but if you remove the constexpr qualifier, the compiler happily solves it as a constant expression.)
-All kinds of floating point issues.
We should not have admitted this in to the standard, please do not use.
nullptr is the third definition of null. One should be enough; two was bad. Why three?
I'm in WG14, so I've been involved in the discussions.
It fails for 2 reasons:
In order to make it easy to implement, it had to be made so limited that it is in no way useful.
The second reason, and the real killer, is the "as if" rule. It states that any implementation can do whatever it wants, as long as the output of the application is the same. This means that how a compiler goes about implementing something is entirely up to the compiler, and that any expression can be evaluated at compile time or at execution time. You can even run the preprocessor at run time if you like! This enables all kinds of optimizations.
In reality, modern compilers like gcc, llvm and MSVC are far better at optimizing than what constexpr permits. However since the specification specifies exactly what can be in a constexpr, the implementations are required to issue an error if a constexpr does something beyond this.
>Why is that a problem? It sounds like a benefit. It means that the optimization can't break anything, which to me is kind of the point
As-if is great! But the problem is that constexpr tricks people into thinking that something is done at compile time, while the as-if rule overrides that and lets the implementation do it whenever it wants. constexpr is a feature overridden by a fundamental rule of the language.
Maybe 'const' can't be fixed without breaking existing source code?
I don't really have a problem with adding a new keyword for 'actually const', maybe calling it 'constexpr' means C++ people have wrong expectations though, dunno.
For me, having constexpr being explicit about only accepting actual constant expressions, and creating an error otherwise (instead of silently falling back to a 'runtime expression') is fine, but the existing keyword 'const' has an entirely different meaning, there's just an unfortunate naming collision with other languages where const means 'compile time constant'.
It has a lot of costs to add to the C language, even if it's just the increased complexity in the documentation, and it doesn't affect C99. Every processor, OS, and programming language used in business needs to fully support a C standard. So adding to C affects every processor and computer architecture, every new OS, and every new language.
If you look at CPPreference you can see how much complexity has been added to the C standard in the last few years.
constexpr is also ridiculously simple to implement -- because the existing compilers already do something similar internally for all enumeration constants.
(Enumeration constants are the identifiers defined inside enum xxx {...})
...and most compilers also already silently treat compile time constant expressions like constexpr, an explicit constexpr just throws an error if the expression isn't actually a compile time constant.
A lot of these ARE relevant and useful improvements to the C language itself; constexpr reduces the need for macro-constants (which is nice), ability to specify the enum storage type is often helpful and clean keywords for static_assert etc. are a good idea too.
And getting rid of the "void" for function arguments is basically the best thing since sliced bread.
const is sufficient to eliminate the use of macro-constants with the exception of the use of such constants by the preprocessor itself (in which case constexpr is also inapplicable).
I don’t understand why anyone would use the “auto” variable type thing. In my experience it makes it impossible to read and understand code you aren’t familiar with.
Well, the obvious (?) reason is to type less, and also reduce the risk of doing the wrong thing and using a type that is (subtly) wrong and having values converted which can lead to precision loss.
Also it can (in my opinion; brains seem to work differently) lower the cognitive load of a piece of code, by simply reducing the clutter.
Sure it can obscure the exact type of things, but I guess that's the trade-off some people are willing to do, at least sometimes.
Something like:
const auto got = ftell(fp);
saves you from having to remember if ftell() returns int, long, long long, size_t, ssize_t, off_t or whatever and in many cases you can still use the value returned by e.g. comparing it to other values and so on without needing to know the exact type.
If you want to do I/O (print the number) then you have to know or convert to a known type of course.
This was just a quick response off the top of my head, I haven't actually used GCC 13/C2x yet although it would be dreamy to get a chance to port some old project over.
> you can still use the value returned by e.g. comparing it to other values and so on without needing to know the exact type.
No no no no nonononono. No!
Loose typing was a mistake. I think any sober analyst of C and C++ knows that. The languages have been trying to rectify it ever since.
But dynamic typing was an even bigger mistake. Perversely, it's one caused by a language not having a compiler that can type check the code, which C does.
I want to actually know what my code is doing, thanks. If you want "expressive" programs that are impossible to reason about, just build the whole thing in Python or JS. (And then pull the classic move of breaking out mypy or TypeScript half way in to development, tee hee.)
The only time `auto` is acceptable is when used for lambdas or things whose type is already deducable from the initializer, like `auto p = make_unique<Foo>()`.
There is nothing "dynamic" about what I suggested. There is a real, concrete, static and compile-time known type at all times.
In this case it would be long. I fail to see the huge risk you're implying by operating upon a long-typed value without repeating the type name in the code.
const auto pos_auto = ftell(fp);
const long pos_long = ftell(fp);
I don't understand what you can do with 'pos_long' that would be dangerous doing with 'pos_auto' (again, disregarding I/O since then you typically have to know or cast).
`const long pos_long = ftell(fp);` contains a potential implicit cast in the future if the return type of `ftell()` changes.
That's one reason type inference is safer than not inferring. Your program doesn't include semantics you didn't actually intend to be part of its meaning.
Also, I think lambdas would be annoying without it.
I'm not, although it apparently came off that way.
I meant that, to a person reading the code, `auto` tells you about as much about the type you're looking at as no type at all (like in a dynamically typed language).
This is where tooling can help.
An IDE could display the type next to all auto symbols if you want. Or allow you to query it with a keyboard shortcut. This gives the best of both worlds, rather than forcing everyone to write out the types all the time. Sometimes we simply don't care what the exact type is, e.g. if it's an object created and consumed by an API. We still want it type-checked, but we don't want to be forced to write out the type.
There is this guideline of "Almost Always Auto" (https://herbsutter.com/2013/08/12/gotw-94-solution-aaa-style...) and I have been following it for yeears both in my job and my personal projects and I have never been very confused by it or had any sort of bug because of it. I felt very reluctant about using it at all for quite a while myself, but in practice it just makes stuff easier and more obvious. A huge reason it's useful in C++ is generic code (stuff that depends on template parameters or has many template parameters) or deeply nested stuff (typing std::unordered_map<std::string, MyGreatObject::SomeValue>::iterator gets annoying), but it's nice almost everywhere.
Most types are repeated and getting rid of those repetitions makes refactorings a lot easier and gets rid of some source of bugs. For example sometimes you forget to change all the relevant types from uint32_t to uint64_t when refactoring some value and stuff breaks weirdly (of course your compiler should warn on narrowing conversions, but just to illustrate the point, because it is very real).
`size_t` may not be helpful. You can argue that some other specific typedef should have been used in this case, but it's kind of water under the bridge already.
As an example of this, a generic `MAX` macro that doesn't evaluate its arguments multiple times, would be (using said GNU extension of statement expressions):
#define MAX(A, B) ({ auto a = (A); auto b = (B); a>b? a : b; })
As-is, for such things I just use __auto_type, as it's already in the GNU-extension-land.
Here, probably not (with proper arguments at least; without the parens something like `MAX(1;2, 3)` would compile without errors though), but I'm just used to doing it everywhere.
You are forgiven. It must be like pushing water up a mountain. At least we got #embed, typeof, auto, constexpr, and sensible enums this time around.
How many of you guys have half of a C compiler lying around in a directory somewhere on your machines?
(And how do you find the time for WG14 AND writing code AND doing research?
My cousin is in roughly the same field as you and you publish more than he does.)
It is Sisyphean, but there are also many synergies (I work on real-time computational imaging systems). And the specific things you mention were mostly not my work (I made typeof in GCC more coherent, which helped getting it standardized). But yes, I have a semi-complete C compiler around and many unfinished GCC branches...
Isn't auto already a keyword for specifying scope? I know it's never used and kind of redundant, but something like `auto x = foo(y);` is a terrible direction for C. Type being abundantly clear at all times is a huge feature.
The accepted proposal does address and preserve `auto`'s use as a storage class, so `auto x = foo(y);` means type deduction based on the return type of `foo`, and `auto int x = 10;` declares an integer with automatic storage duration.
You're getting downvoted but i think it's a fair point that revisiting win32k in any language might come up with some opportunities for optimization along the way. Performance best practices have changed a lot since that component was written.
I suspect the reason the parent is being downvoted is that "So code optimized for 80486 machines in the 90's" isn't something we actually know. Do you really think Microsoft hasn't touched this code since the 90s? It comes across as sour grapes.
My understanding of what the presenter meant here was "this is gnarly code that's been around for a very long time." Not that it literally remains solely optimized for the 486.
I seem to recall hearing from MS affiliated people in my network that someone revisited GDI circa 2015? It also wouldn't shock me if there's more that can be done.
Also worth noting that various parts of MS have declared GDI as legacy at various points in time. Yet it remains a critically important component.
A rewrite of an old code base with today's hardware characteristics in mind should yield better performance, in Rust, C/C++ or any other system language.
Fan-made websites were very popular in the nineties. This one is really well made and recently updated, which is really cool. This particular anime also has a cult following in the west.
https://github.com/mirror/busybox/blob/master/shell/ash.c