OpenBSD has two new C compilers: chibicc and kefir (briancallahan.net)
265 points by hucste on June 30, 2022 | 84 comments



Kefir author here. Quite surprised to see it on HN. A few notes in response to Brian's post and comments:

- The name Kefir is simply a reference to the milk drink; no other connotation is intended. I have updated the project README with this information.

- The compiler is indeed quite primitive, especially in terms of code generation. My main goal was to implement a C compiler that is reasonably compliant with the language standard and platform ABIs, so I decided to simply ignore performance considerations, as I wouldn't be able to compete with well-established compilers anyway.

The use of threaded code has the same motivation -- it is very simple to obtain assembly from the intermediate representation when the assembly is mostly composed of references to runtime routines. Even without considering more sophisticated code generation schemes, the current approach is not optimal -- the threaded code encoding scheme is very wasteful in terms of generated code size (which was also noted in the blog post) and has awful runtime performance.

- I have tried to keep the compiler compliant with the standard as well as compatible with some widespread C extensions (with some exceptions, which I listed in the README). I will try to address the compilation errors found by Brian. Unfortunately, I currently do not have much time to work on the compiler. Identifying and fixing such bugs might also be quite tedious, so I expect that there are plenty of unnoticed compatibility problems left.

- Patches implementing OpenBSD support are appreciated. I plan to integrate those into the main code tree at some point.
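
For readers unfamiliar with the term, the threaded code idea mentioned above can be sketched in plain C: the "program" is a flat sequence of references to runtime routines, and a tiny loop calls them one after another. This is only an illustration under assumed names (`vm_t`, `cell_t`, `op_push`, `run_threaded` and the rest are hypothetical); it is not Kefir's actual runtime, IR, or generated assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not Kefir's runtime or IR. */
struct cell;

typedef struct vm {
    int64_t stack[64];
    size_t sp;               /* number of values currently on the stack */
    const struct cell *ip;   /* next cell to execute; NULL stops the loop */
} vm_t;

typedef void (*routine_fn)(vm_t *vm, int64_t operand);

/* One unit of threaded code: a reference to a runtime routine plus an operand. */
typedef struct cell {
    routine_fn routine;
    int64_t operand;         /* stored even when the routine ignores it */
} cell_t;

static void op_push(vm_t *vm, int64_t operand) {
    assert(vm->sp < 64);
    vm->stack[vm->sp++] = operand;
}

static void op_add(vm_t *vm, int64_t operand) {
    (void)operand;
    assert(vm->sp >= 2);
    int64_t rhs = vm->stack[--vm->sp];
    vm->stack[vm->sp - 1] += rhs;
}

static void op_mul(vm_t *vm, int64_t operand) {
    (void)operand;
    assert(vm->sp >= 2);
    int64_t rhs = vm->stack[--vm->sp];
    vm->stack[vm->sp - 1] *= rhs;
}

static void op_halt(vm_t *vm, int64_t operand) {
    (void)operand;
    vm->ip = NULL;
}

/* Dispatch loop: fetch a cell, advance, call the routine it refers to. */
int64_t run_threaded(const cell_t *code) {
    vm_t vm = { .sp = 0, .ip = code };
    while (vm.ip != NULL) {
        const cell_t *current = vm.ip++;
        current->routine(&vm, current->operand);
    }
    assert(vm.sp >= 1);
    return vm.stack[vm.sp - 1];
}

/* (2 + 3) * 4 expressed as a flat sequence of routine references. */
int64_t demo_expression(void) {
    static const cell_t program[] = {
        { op_push, 2 }, { op_push, 3 }, { op_add, 0 },
        { op_push, 4 }, { op_mul, 0 }, { op_halt, 0 },
    };
    return run_threaded(program);
}
```

Calling `demo_expression()` evaluates (2 + 3) * 4 through the dispatch loop. In this sketch every cell carries an operand slot whether or not the routine uses it, and every operation costs an indirect call, which gives a rough feel for why such an encoding tends to be bulky and slow compared with directly generated machine code.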


bumping the author's chain and bluntly suggesting that others (like myself) without much/any stake in openBSD's C compilers should stfu with the name flaming now, because the fact that it's still continuing in replies at this point is more a matter of narcissism and disrespect for the submission than anything else.


Heh, when I saw a C compiler named "kefir", I immediately thought that it must be made by someone from the former USSR.


Summertime: refrigeration is nowhere, so kefir is everywhere...

Good times.


> Kefir also says in all bold letters in its README.md:

"Usage is strongly discouraged. This is [an] experimental project which is not meant for production purposes."

> That was all the encouragement I needed.

I love this


"Usage is strongly discouraged, not meant for production" is a clue for "I actually put care into this thing, production use is probably fine"

Garbage-quality projects don't bother putting such warnings. They might even not be aware they are garbage.

(/s)


This is sarcasm but there is a lot of truth in it unfortunately.

I'm sure there must be a term for this... known incompetence > unknown incompetence.

Doing your own research on complicated products is hard, that's why we as a community seem to move on to things via trial and error.


Considering that the developer is saying "not for production" but the users are saying it works great in production, I'd say the comparison should be "unknown competence > unknown incompetence"


Isn’t that a variant of Dunning-Kruger?


Ah, yes, love those, cobbled together embedded compilers. Not a word in the Release.Me about the quirks. Just a "STABLE!!!".

Like binary C operators that only work on the first 8 bits; the rest is up there and needs to be shifted in and out. Basically, work it out yourself once it won't work.

Then explain to the manager that his hot project tooling from the megacorp upstream was basically license-broken, copy-pasted garbage from some hobbyist halfway around the world. And get a no when asking to at least post the patches back.

Or the "working feature" which is just some API header going into an inlined binary blob returning some constant. Which is just a flytrap to get you to drive-by develop it for them. Twelve angry part-time devs make up one full working project.

The only professional in some industries is looking at you every morning from the mirror, begging for a mercy killing.


> The only professional in some industries is looking at you every morning from the mirror, begging for a mercy killing.

That got dark quickly. Otherwise, so true


for prod you'll want mushrik or neocon


Wrong root word; you're thinking of kaffir. Kefir is a fermented milk drink.


I think GP meant “ayran” or “doogh.”


not for the joke it isn't; i am not, in fact, thinking of the racial slur; we all have google (but not all of us will idly assume to know the etymology of a project's name based on this); semicolons are awful punctuation


Hey now, wait a minute. We can argue all day about kefir, kafir, and kaffir, but don't you impinge on the dignity of the semicolon just because the previous poster used one incorrectly.


I don’t think it’s an incorrect use of the semicolon. Semicolons are out of fashion these days, but using a semicolon to join two related clauses is perfectly fine, as far as semicolon usage goes.


A semicolon should link two independent clauses; using one to link a clause missing a verb to another clause is arguably incorrect.


But the verb in the second clause is not the elided verb in the first clause, so it is not linking in that fashion. The two clauses are independent of each other.


It's debatable, but they could be linked by [You're thinking of] wrong root word.

I agree that if you think "Wrong root word. You're thinking of Kafir." is acceptable, so too would be the semicolon usage, which is why I said "arguably incorrect" rather than just "incorrect"


At least for me, I read it as

"[That is the] wrong root word; You're thinking of Kafir".

Since in the first clause, it seems like the intention is to point to the previous comment that is the matter at hand, which was actually written not just thought about. However, other readings might be possible.


Could you point out where you see such wrong usage here?


> Wrong root word;...


I don't see the wrong usage there, assuming that "Wrong root word. You're thinking of kaffir." would have been correct. That seems to be the case to me, since both "Wrong root word." and "You're thinking of kaffir." seem to be correct sentences, and the juxtaposition is OK as well.


epistemic crutch token


This.


> because the previous poster used one incorrectly

The previous poster’s usage of a semicolon in this instance was, in fact, correct.


> semicolons are awful punctuation

I'm glad that someone finally admitted that C is crap. Clearly we need fewer C compilers, not more of them! Embrace Smalltalk!


Who said anything about "racial slur"s? In Urdu, one of the languages I speak, "kaffir" means "non-believer". Your suggestions 'mushrik' and 'neocon' seemed to fit pretty perfectly with that interpretation.

----

Aside: semicolons are pretty great once one learns to wield them. Try it; you'll like it. Just think of it as a shorter pause than a period, but longer than a comma.


the typical romanization of arabic كافر is "kafir". the typical spelling of the slur associated with apartheid-era south africa is "kaffir".

> one of the languages I speak

why would i care which languages you speak

you are not me

i am not you

neither of us are everyone else

>try it

perhaps instead of me doing any of that, you could simply stop arrogating reflexively in your internet comments and presuming to know the inner contents of others' minds (you do not)

>semicolons

the purpose of your original reply was to condescend in typical nerdsplaining fashion. the reason my reply to you, in turn, was written like it was, including the side mention of semicolons, was to escalate abruptly and unambiguously, such that there was no ambiguity as to what i thought of the situation. i was expressing contempt. it's merely a circumstantial convenience that semicolons do happen to be dogshit punctuation used as epistemic crutch notation in prose


Perhaps he meant mursik [1]

1. https://en.m.wikipedia.org/wiki/Mursik


Thanks for this. I don't know anything about compilers, but I really enjoyed reading this. Besides the technical insights it provided, I also loved the positive attitude of this article. Two excerpts that stood out for me:

> [...] if the code kefir produces is correct, then it is amazing that one person was able to create a complete C17 compiler and that fact should be celebrated.

> I'll admit this is not something I would have thought of but it appears to work just fine.


chibicc is great and also a very useful tool to do different kinds of C source code analysis and processing, such as https://github.com/rochus-keller/c2obx/


I use chibicc to generate the documentation for cosmopolitan. https://github.com/jart/cosmopolitan/blob/master/third_party... I added many of the GNU extensions too, including an assembler. https://github.com/jart/cosmopolitan/blob/master/third_party... Needless to say, I've already ported it to OpenBSD. I ported it to NetBSD, FreeBSD, Windows, and Mac too.


That looks like a very useful C library, thanks for the hint; as it seems it is much more than just a standard C library in that there are also threads, unicode and a lot (all?) of posix features; I do not have the full grasp yet, but it seems that I could use it as a runtime library for my Oberon+ language (which has a built-in FFI and can also transpile to C).


Is it just me, or is this the first time that "OpenBSD" and "New" were in the same sentence? I love OpenBSD, but in fairness, its biggest virtue is usually its rock solid stability and security, derived from everything in it generally being long battle tested before ever going into OpenBSD. The idea of bleeding edge C compilers seems a bit strange, given that reputation.


The "new" in the headline refers to these compilers being newly ported to OpenBSD. OpenBSD also sometimes gets new features, sees new releases, etc. There's nothing weird about seeing "new" and "OpenBSD" in the same headline.


> OpenBSD also sometimes ... sees new releases

They do two releases per year on a schedule.


This is an experimental port and not about replacing clang.

That said, OpenBSD does introduce new stuff often. Over the years, they've periodically rewritten a handful of crusty old daemons for example.


I would say that for a lot of daemons OpenBSD ends up with more interesting and cleaner versions than some of the ancient BSD ilk or with some of the complexity of the GNU replacements. It's often a lot easier to port part of/some of an OpenBSD daemon with just a little bit of hacking around instead of figuring out a bigger autotools chain.


The default install is stable and secure; the ports tree is much less so. People experimenting with things on their personal workstations is a complete free-for-all.


It got a brand-new filesystem a couple years ago.


UFS2 from FreeBSD? Brand-new??


Not really, no? Plenty of bleeding-edge goes into OpenBSD, most of it never makes it out to the rest of the world before being battle-tested there.


It's probably Debian you're thinking of


Thanks to the author for oksh (portable OpenBSD ksh). It is a pleasant alternative to Bash on Linux. 288KB vs 1.3MB. Wonder if oksh will compile on mips74Kc. I started to do something similar for NetBSD ash, for personal use only. Dash and busybox are OK, but I want both fc and command line history.


I have managed to compile dash with fc and history. I also want autocomplete.


All I want for Christmas is a compliant C++11 compiler/transpiler, written in plain portable ANSI C.


cproc?


> Here are the numbers for chibicc:

    text    data    bss     dec     hex
    753670  40034   29848   823552  c9100

> And here are the numbers for kefir:

    text     data    bss     dec      hex
    2374884  12071   30120   2417075  24e1b3

These are behemoths, especially the latter. Only compared to the grotesquely aberrant code sizes of GCC and its ilk does this look "small".

(The numbers in fact give me a dollop of confidence that these might be substantially more than toys.)
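
As a quick sanity check on how to read those figures: the `dec` column is the sum of `text`, `data` and `bss`, and `hex` is the same total in base 16. A small sketch in C that recomputes both columns from the quoted values; the struct and function names here are made up for illustration and are not part of any tool.

```c
#include <assert.h>
#include <stdio.h>

/* One row of size(1)-style output; the struct and function names are made up. */
typedef struct size_row {
    const char *name;
    unsigned long text, data, bss;
} size_row_t;

/* Figures copied from the quoted output above. */
static const size_row_t chibicc_row = { "chibicc", 753670UL, 40034UL, 29848UL };
static const size_row_t kefir_row   = { "kefir", 2374884UL, 12071UL, 30120UL };

/* The dec column is text + data + bss; hex is the same value in base 16. */
unsigned long size_total(const size_row_t *row) {
    assert(row != NULL);
    return row->text + row->data + row->bss;
}

/* Print one row's total in both decimal and hexadecimal. */
void print_size_row(const size_row_t *row) {
    unsigned long total = size_total(row);
    printf("%-8s dec=%lu hex=%lx\n", row->name, total, total);
}

/* Print the totals for both compilers quoted in the thread. */
void print_quoted_rows(void) {
    print_size_row(&chibicc_row);
    print_size_row(&kefir_row);
}
```

Calling `print_quoted_rows()` reproduces the `dec` and `hex` columns from the three section sizes, and the `text` figures alone show kefir's code segment at roughly three times chibicc's.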


One of my main complaints about the BSDs is that they include the entire source trees for LLVM/clang and GCC. It seems like it's the only way for them to maintain the two golden BSD rules: "You shall have in your codebase everything needed to build a complete BSD system" and "You shall run on at least 30 or 40 different CPU architectures". I wish git submodules were more modular(!), so I could clone OpenBSD and LLVM/gcc would be a replaceable submodule, and instead drop in something like smallc.


Why can't we have the same with C++? I wonder... (very ironically)


Nice


Do they compile the kernel, or even just a fragment .o for safe inclusion in the kernel?


[flagged]


It's the name of a weird fermented milk drink in a lot of Slavic and other languages (I first encountered it in the Czech Republic):

https://en.m.wikipedia.org/wiki/Kefir


There are shelves and shelves of it in UK supermarkets, it's very popular here. I live a walk away from a dairy that produces it (including kefir soap bars) https://www.instagram.com/wildcroftdairy/


Not just Europe. I'm in Canada and it's very common to see it in grocery stores (maybe not the most popular drink, but popular enough that people know what it is).

I fully knew what Kefir is before this post. I had never heard of the slur everyone is talking about until reading this comment section.



I would argue ducks are more mobile than your average glass of kefir.


Oops, sorry, I always confuse the two. The correct link:

https://en.wikipedia.org/wiki/Kefir


Kefir is one of the most popular milk-based drinks in Poland.


it's a CVCVC word that most people can pronounce and the primary author likes for whatever reason. it's completely fine.


Geez guys, all I'm saying is the name is unfortunate in the same way coq https://en.wikipedia.org/wiki/Coq is bound to be awkward when pronounced in other parts of the world.

I'm not implying there is anything nefarious about it.


I didn't downvote you. There are only so many phonologically near-optimal patterns to recycle into things that sound kinda-sorta like words, so you're bound to run into these things. You just pick your poison and hope that you don't end up being shipwrecked by a gale-force meme like "fedora" as a pejorative with time.

fwiw, the concern you're thinking of tends to be sidestepped either by a pseudo-initialism or a vowel shift in actual practice. In this case, the first options / paths of least resistance in an arabic-speaking community would seemingly be:

- to raise the first vowel to more of an [ɪə]¹ or

- to pronounce the /kəf/ - /kɑf/ then "IR" as single letters.

¹ which afaik would turn it into a long vowel that might act like a geminate? dunno


I mean, I've got my feelings about formal methods, so the implications about people going to technical conferences to talk about Coq are spot on.

As a relatively unenlightened person on the topic of nasty slurs, I appreciate the tipoff that using the name of a cultured milk drink sounds uncultured. Although, I tend to name things with overworked puns, so there's that.


I don't get why, can someone explain it to me?

I'm from the US; kefir is a healthy fermented yogurt-like drink that is in every grocery store, pronounced "kee-fur".

Does it mean something else in another language?


> I'm from the US, Kefir is healthy fermented yogurt-like drink that is in every grocery store pronounced "kee-fur".

Pretty unfortunate. Take a look here at a pronunciation in English more closely resembling the original Caucasian / Slavic: https://www.lexico.com/definition/kefir (/kəˈfɪə/ in IPA). There's audio in the link, which should hopefully clear up confusion.

This is also the common way I've heard it pronounced in the UK, where the product appears on most major grocery store shelves.


"Kafir" is an Arabic word meaning unbeliever or infidel, and is used in the Islamic world as a pejorative for non-Muslims. It's also been adopted elsewhere, such as a nasty slur in South Africa for blacks.


Kafir or kaffir is not kefir. Done. Nothing to see here. Move along.


I think they are talking about kEfir not kAfir. I presume they don't mean the same thing.


This must be culturally specific since where I live there's absolutely nothing even just unfortunate about the word "kefír".


I'm sorry but what is wrong with the word 'kefir'


What's unfortunate about it?


It sounds similar to kaffir which is a derogatory term used in the Middle East for people who don’t practice a certain religion. It is also used in South Africa in a manner similar to the n-word in America.


Here in the US, I can go to many grocery stores, and have my choice of kefir.

As an actual Hindu who has lived in a Muslim-majority Middle Eastern country, I'm not the tiniest bit offended because a fermented yogurt drink happens to have a somewhat similar name to an insult based on religious bigotry.


Here in Iran, the slur is pronounced as Kaafar, and the drink as kefir. They sound distinct enough.


In many European languages it definitely does not sound similar. A is a and e is e.


Why is it derogatory? And what is the non-derogatory alternative to that word in Arabic?


I choose not to answer you because you're either a troll or ignorant. Hopefully it's just the latter, in which case feel free to use google to educate yourself.


Not many people in the US or the EU speak Arabic. If I'm ignorant of it and you're a speaker of Arabic, then why don't you just enlighten me? Since when is removing one's ignorance not the point of asking questions?


In South Africa it’s equivalent to using the N-word.


I'm quite obviously asking about Arabic as it is being used in the Middle East. I've even spelled it out, and so did the comment I was responding to. The irrelevant South African fringe usage was also already mentioned in the comment I was responding to (hence your repetition of it bringing zero new information) but of no interest to me (unless you believe that Arabic is a widespread language in South Africa), hence me not referring to it.


If you weren’t allergic to Google you would have learned that the South African word was picked up from Arabic. So yeah, it is relevant.

Maybe people don’t spoon feed you explanations because you’re rude even to the people who try to explain.


No, the fringe usage half a globe away still doesn't make it relevant if I'm asking how some things are called in a Middle-Eastern language. Likewise, the Czech word "Polák" for a citizen of Poland with no alternative to it in that language does not become suddenly offensive or inappropriate just because the cognate "Polack" happens to be offensive in the US. And as one can clearly see, I'm not even being rude to people with inconsequential segues.



