Kirc – A tiny IRC client written in POSIX C99 (github.com/mcpcpc)
213 points by mcpcpc on Sept 16, 2020 | 102 comments



A little sloppy with the error handling, but pretty neat.

E.g. log_append() doesn't check fopen return value, malloc() return isn't checked, and a write() can return a partial write, thus needs a loop. fcntl().

Also write() should check for EINTR&EAGAIN.

And also there's no handling for nonblocking write.
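
The partial-write and EINTR points above can be sketched as a small wrapper. This is only a minimal sketch, not kirc's code: `write_all` is a hypothetical helper name, and it assumes a blocking descriptor. On a non-blocking socket, EAGAIN would call for a poll() wait rather than the plain error return used here.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes to fd.
 * Returns 0 on success, -1 on error (errno is left as set by write()). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;           /* real error, or EAGAIN on a non-blocking fd */
        }
        p += n;                  /* partial write: advance and loop again */
        len -= (size_t)n;
    }
    return 0;
}
```

Every caller still has to check the return value, which is the "few lines of error handling" cost mentioned further down.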

If the user types too much while the network glitches it seems that this could cause the client to exit().

Probably the correct way is to not read from stdin unless poll() returns that writing to the socket is safe, and vice versa.
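
A rough sketch of that idea, covering only the stdin-to-socket direction: input is read only while nothing is queued, and queued bytes are written only once poll() reports the socket writable. `struct relay` and `relay_step` are hypothetical names, and EINTR/EAGAIN are treated as plain errors for brevity.

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

struct relay {
    char   buf[512];   /* bytes read from input but not yet sent */
    size_t len;        /* number of pending bytes in buf */
};

/* Hypothetical helper: one step of an input -> socket relay.
 * Returns 1 if progress was made, 0 on timeout or end of input, -1 on error. */
int relay_step(struct relay *r, int in_fd, int sock_fd, int timeout_ms)
{
    int sending = r->len > 0;
    struct pollfd pfd = {
        .fd     = sending ? sock_fd : in_fd,
        .events = sending ? POLLOUT : POLLIN,
    };
    ssize_t n;
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc <= 0)
        return rc;                          /* 0: timeout, -1: poll() failed */

    if (!sending) {                         /* nothing queued: safe to read */
        n = read(in_fd, r->buf, sizeof r->buf);
        if (n <= 0)
            return (int)n;                  /* 0: end of input, -1: error */
        r->len = (size_t)n;
        return 1;
    }

    n = write(sock_fd, r->buf, r->len);     /* socket reported writable */
    if (n < 0)
        return -1;
    memmove(r->buf, r->buf + n, r->len - (size_t)n);  /* keep the unsent tail */
    r->len -= (size_t)n;
    return 1;
}
```

The reverse direction (socket to terminal) would need the same treatment with its own buffer.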

connect() errors should print where they failed to connect.

And: $ ./kirc Nick not specified: Success
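
The ": Success" suffix is what glibc prints for errno 0, which suggests (an assumption, not checked against the source) that a usage error is being reported through perror(). A minimal sketch of a formatter that only appends the errno text when there is one; `format_error` is a hypothetical name:

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format msg into out, appending ": <strerror text>"
 * only when errno is nonzero. Returns what snprintf() returns. */
int format_error(char *out, size_t outsz, const char *msg)
{
    int err = errno;   /* capture first: later library calls may change errno */

    if (err != 0)
        return snprintf(out, outsz, "%s: %s", msg, strerror(err));
    return snprintf(out, outsz, "%s", msg);
}
```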

And in my first test I got: Write to socket: Resource temporarily unavailable

And I see a lot of NULL pointers being printed when I connect to e.g. freenode.

And so on, and so on…

For these things I recommend re-reading the manpage for every libc and syscall you call, and check how they can fail, and consider how you can handle that failure.

307 lines of pure C is a pretty neat minimal actually usable client, so I'm not saying it's not well done. But that thing about "… and do it well" (from the readme) means doing all of the above.

The problem, of course, is that fixing these problems well is "the other 90% of the work", especially when coding in C.

And this is the main reason I avoid C when I can. You can't just call "write()". You have to write a 5-10 line wrapper function, and then all callers need a few lines of error handling. And so it goes for everything, until your program is no longer the nice 300 line neat thing you can almost write from memory.

So it's a good start. But it's pretty fragile in its current form.


i appreciate the honest feedback! definitely a work in progress and i have learned a lot since i started ~1mo ago on this. will take your suggestions and add them to my ever growing “todo” list. ;)


> malloc() return isn't checked

small digression, but I thought that at least on Linux, malloc() never returns an error because the actual allocation happens lazily, when the memory is first used?


Cases where memory allocations fail on linux include: hitting memory resource limits (ulimit), overcommit behaviour is set to stricter than the default via sysctl, kernel heuristics with the default overcommit settings end up failing allocation, container / cgroups limits, 32-bit virtual address space is exhausted - i'm sure there are more.

The default overcommit-within-reason algorithm is designed to deny allocations that are obviously unrealistic I think.

Using semi conservative ulimit settings is pretty common in interactive use to catch runaway / swapped to death situations.


Take a look at vm.max_map_count. A lot of systems have that set artificially low and that will force a memory allocator to fail and return NULL. I can confirm this is the case for jemalloc.

Also, on Linux, only 48 of 64 bits are available for VM addressing. That leaves you with about 250TB of address space. Seems like a lot until you start using VM for disk mmaps. I have actually seen this limit hit.


> on Linux, only 48 of 64 bits are available for VM

That is on amd64, not linux. And arm does the same thing. (It's also 256tb, not 250tb.)

Newer intel cpus use 5-level paging[1], which gives you 128pb VM.

1. https://en.wikipedia.org/wiki/Intel_5-level_paging
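
As a quick check of the figures in this subthread, the size of an n-bit virtual address space is just 2^n bytes: 48 bits gives 256 TiB, and the 57 bits of 5-level paging give 128 PiB. `va_space_bytes` is a hypothetical helper used only for the arithmetic:

```c
#include <stdint.h>

/* Size in bytes of a virtual address space with the given number of
 * address bits. Only valid for bits < 64. */
uint64_t va_space_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}
```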


Also it's 48-bit just on current amd64 chips, it's not an architectural amd64 property - your binary is compatible with >48bit VA space of the future (unless you yourself misguidedly bake the 48-bit assumption into it). But I guess the point was to point out lower bounds.


Right; hence the mention of intel 5-level paging with 57-bit virtual address space.


It's sloppy coding and not portable. If you really don't want to check for malloc() return value because you consider that it shouldn't fail (or that you won't be able to recover from it anyway) just implement an xmalloc() that just aborts on failure and do that. At least it'll crash "cleanly" this way.
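
A minimal sketch of such a wrapper, assuming that aborting is an acceptable response to running out of memory (the error text is arbitrary):

```c
#include <stdio.h>
#include <stdlib.h>

/* malloc() wrapper that never returns NULL for a nonzero size:
 * on failure it reports the problem and aborts. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL && size != 0) {
        fputs("xmalloc: out of memory\n", stderr);
        abort();
    }
    return p;
}
```

Callers then use `xmalloc()` wherever they would have used an unchecked `malloc()`.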


will try to address the malloc concerns asap


On 64 bits malloc basically always succeeds, because you are very likely not running out of virtual memory. You can totally go ahead and malloc terabytes of RAM.

On 32 bits malloc is going to fail once your virtual memory space is full (~3 GB).

Error checking is somewhat redundant for most applications, since you're going to abort anyway, and if there are no pages available on access - well you're getting SIGSEGV anyway, just like you would accessing a null pointer (except on embedded devices). But beware of the exceptions. In C especially it's common to use null pointers as a flag... one of the reasons why new/delete in C++ is preferable; it throws and thus aborts when it fails, which you don't care to handle, so that's about perfect for a C++ exception.


In addition to what others have said, which basically boils down to "it'll never return an error… unless it does", there's also the aspect where if you start assuming things about the kernel environment you're not coding in "POSIX C99" like the title says, but to "Debian GNU/Linux circa 2020".

And that's part of the "and do it well". A thousand years from now, if the C99 and POSIX specs survive, they will still support correctly written code.

Dereferencing a null pointer can also be a security issue (mostly for kernels, though) so one should assume a malicious general environment.


IRC in 307 lines of C code without dependencies, pretty cool. Of course for that to work they had to sacrifice secure TLS support. Based on the "do one thing well" I was thinking you could set up a TLS tunnel from localhost:6667 to <irc-server>:6697

Not really perfect but my WIP attempt using `socat`:

`socat -v tcp-listen:6667,reuseaddr,fork,bind=127.0.0.1 ssl:<irc-server>:6697`

then connect to it:

`./kirc -s 127.0.0.1 -c 'channel' -n 'name' -r 'realname'`

not sure what TLS version socat uses by default, might be something horrible :)


oohh interesting. will throw this in as an “example”. thanks!


stunnel is a more dedicated SSL proxy as well.


> stunnel is a more dedicated SSL proxy as well.

New to stunnel, could you show an example command line?

EDIT: done with a .conf file but somehow failed with a command line...


FWIW I’ve had great success using ghostunnel[0] instead of stunnel in prod. Big fan.

[0] https://github.com/ghostunnel/ghostunnel


What was the need for an stunnel "replacement"? Looks like it has additional access controls and was written for use at Square, Inc.

stunnel supports TLS1.3; from the client side IME it works really well

ghostunnel appears to have a large number of dependencies on Go libraries written by a variety of third party authors

stunnel is a long-running project dating from 1998, managed by the same single author over the entire period


Any chance you know how to do encrypted dcc sends too?


Thanks, this will work for SIC too.


The IRC protocol is both text-based and simple enough that you can use something like netcat as a client (and I have, many times.) The line-based format fits IM perfectly, and the overhead of the protocol is a tiny fraction of the bloated proprietary ones that have filled this use-case today (except MSNP, which in its earlier versions was also delightfully simple and easy to write a client for, but definitely beyond the threshold of being usable "raw".)


The problem with that is that IRC networks generally send you a ping every 5 minutes or so, and if you don't send a PONG back in time you are disconnected.
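
For a client written in C, answering is only a few lines: the server sends `PING <token>` and expects `PONG <token>` back. A minimal sketch, with `make_pong` as a hypothetical helper name and assuming the line carries no message prefix:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if line is a server PING, write the matching PONG
 * (terminated with CRLF) into out and return its length. Returns 0 if the
 * line is not a PING and -1 if the reply does not fit in out. */
int make_pong(const char *line, char *out, size_t outsz)
{
    size_t arglen;
    int n;

    if (strncmp(line, "PING ", 5) != 0)
        return 0;

    arglen = strcspn(line + 5, "\r\n");          /* token without line ending */
    n = snprintf(out, outsz, "PONG %.*s\r\n", (int)arglen, line + 5);
    return (n > 0 && (size_t)n < outsz) ? n : -1;
}
```

With netcat or telnet, the same reply has to be typed by hand before the timeout.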


It varies by network but the timing on that was pretty generous. Seen some networks where it gave you entire minutes to respond


I've used telnet as the client way back in the distant past. Those were the days.


You could use netcat (or something akin to it). Back in the days I tunneled that via stunnel to get TLS [1] support. One could do the same with POP3 and IMAP.

One problem with this workflow is that in some clients (e.g. IRC client supporting TLS) the user had no way to identify/verify the certificate. If a client just automatically accepts self-signed certificates, it's just snake oil.

[1] Everyone still called it SSL back then. Oh, wait...


> If a client just automatically accepts self-signed certificates, it's just snake oil.

It forces attackers to use an active attack rather than a passive one. Which is the only security most IRC can have anyway, since the attacker could just join the channel and listen in that way, since most IRC networks are public.


> It forces attackers to use an active attack rather than a passive one.

The MITM or eavesdrop can happen on a bridge. If the client doesn't check the certificate and accepts any, it's about as good as plaintext. It could be worse, even, due to the false sense of security.

> Which is the only security most IRC can have anyway, since the attacker could just join the channel and listen in that way, since most IRC networks are public.

IRC network private or public is irrelevant.

There were, for sure, private channels back in the days (90s). Back then you could set a channel secret (hidden) and set a password on it, effectively making it a private channel (would not show up in /whois or /list). Bots could kick people who are unknown based on filters. For example, without an auth to an Eggdrop, you could get insta kickbanned even _with_ the correct password.

Then there's PMs which are one on one (except for server(s)).

If one of the IRC servers is compromised though (or tapped, or whatever), that makes sniffing a channel or PMs child's play.

There's also the problem of data integrity. If you are asking for (or giving) help in #linux and someone can change the data on the fly, [...]

FWIW, UnrealIRCd, even back then, innovated (or invented) a lot of new features on top of IRC. Some of these added security, though I can't name examples off the top of my head.


I never did SSL from telnet for sure, in fact I didn't know IRC supported encryption back then. There was also the ident response that was needed - can't recall the particulars as it's been 20+ years.


I was talking end of 90s. UnrealIRCd (mainly Carsten Munk / stskeeps) was one of the first to support TLS/SSL. At some point in the early 00s the popular IRC networks slowly but surely started to support it. There's also the case for cryptography between servers, something UnrealIRCd also was quick to adopt.

Some servers required ident(d) response which required a server running on privileged port 113.


The one issue for the longest time was that networks would use self signed certs and often different certs per server in the network, so it was hard to have any kind of trust. I did see a few using Let's Encrypt in recent years but moved over to Matrix these days


telnet for MUDs too. Such a simple interface for such complex worlds is something that still fascinates me.


And Nethack/Slashem servers :D.

So you can play RPGs remotely from any crap built from the 80's with serial support and a 80x24 display (WIFI232), on the display, a Spectrum +3 would suffice.


Like books


IRC has the issue that it lacks end to end encryption (and sadly OTR is outdated garbage)

If you use netcat you also do not use tls, try openssl s_client -connect server:port instead.


I generally support end-to-end encryption for everything, but I'm not sure that it makes sense in the context of IRC. IRC networks are usually public, so anyone could join your channel and listen in, even with end-to-end encryption. It seems like E2E would make for a lot of complexity and overhead without tangibly increasing the privacy of the users.


Before E2EE was used in IM clients, IRC already had IRC over TLS, and also OTR (which was also used in Gaim/Pidgin).

On IRC, IRC over TLS doesn't have the same threat model as E2EE. With IRC over TLS, the server(s) can read the data plaintext. With proper E2EE (not the marketing version) that's not the case; only clients can read the data. I'm talking about actual data/content here; not metadata.


> IRC networks are usually public, so anyone could join your channel and listen in, even with end-to-end encryption.

Yep, and all they'd see is encrypted garbage, unless they have encryption keys, if the messages are end-to-end encrypted. That's the whole point.

There are ways to do this on IRC (e.g. libfish), but no idea how that crypto actually stacks up by today's standards.


> and all they'd see is encrypted garbage, unless they have encryption keys, if the messages are end-to-end encrypted.

Yep, and they would have the encryption keys, for most channels, if the channels are to remain public, no?


There are private IRC channels (password protected or invite-only) as well as private messages.


Hence the "usually" public, I presume. While this doesn't invalidate your point that IRC could use E2E encryption, I personally only use IRC for communication on public channels, where it would be largely pointless, unless you're assuming a really paranoid threat model, in which case public group conversation is probably not a good idea anyway.


There's dcc chat, but then you trust the network in general in place of the irc network.

Fortunately, there's OTR, but client support is limited.

I wish the new IRC standardization efforts did work something out about e2e, at least for private messages.


exactly this

if it's public: who cares, and if it isn't: why are people trusting random IRC server admins

especially when there have previously been leaks from places like EFNet where admins have been caught running tcpdump or ircsniff.pl


> why are people trusting random IRC server admins

e2ee means that you do not have to trust anyone

> if it's public: who cares

IRC also lacks end to end authentication, the server owner can pretend to be you.


Cool project. I would relax the requirements from C99 to C89. You get more portability to cool retro systems that way and C99 doesn’t really add that much. Also C++ compilers are generally more able to compile C89 in C++ mode than C99, e.g. msvc



I thought MSVC added support for C99 at long last a few years ago? IIRC because it was effectively needed for a newer C++ standard, but still.

Also it's 2020, I wasn't even coding when C99 came out and I'm now a "senior" developer, whatever that means. You'll have to pry C99 from my cold, dead hands. Whatever compilers don't support it by now, I don't want to support them.


What do you think C99 adds that's so important? The only thing that comes to mind is declaring dynamic arrays in which the length is known at the call of the function on the stack. But using malloc isn't so hard regardless.

I haven't programmed in C in over 10 years though, so I could be missing something.


Designated initialisers for structs and unions are a really nice readability improvement. You can use comments if you have to of course.


stdint.h, inline, real booleans, designated initializers, snprintf etc...


I don't think so. I think they added (or will add?) support for C11 and C17, but not C99.


definitely a good idea. will add that to the “todo” list


To the contrary, I wouldn't.

Using C99 + POSIX is the defining fact about your project.

I would let somebody else write a C89 irc client, instead.


Agreed, also "retro system compiler" doesn't mean it only supports old C standards, e.g. SDCC is C99 and C11 compatible. IMHO strict C89 is a much less enjoyable language to read and write than C99, for instance designated initialization and compound literals are massive improvements.

Also MSVC's C99 support is pretty good since ca. VS2015.


cc65 ( https://github.com/cc65/cc65 ) doesn't do C99 "and never will" according to https://cc65.github.io/doc/cc65.html but it's probably the most developed C compiler for 6502-based systems.


That's a shame.


When I mentioned retro systems, I had in mind obsolete compilers in stock installations of IRIX, AIX, Solaris, BeOS, NextStep, Amiga Unix etc. Not modern compilers that target 8-bit microcontrollers like SDCC.


To the extent required by ISO C++ compliance, although they just announced at CppCon that C11 and C17 are getting full support.

Naturally like everyone else in the C world, with exception of gcc/clang, all C99 features that became optional in C11 aren't planned to be supported.


The MSVC C compiler supported the full C99 designated initialization and compound literal features in VS2015, both are not part of the common C/C++ subset, but exclusive C99 features. So all in all the C99 support in MSVC hasn't been that bad since ca 2015, it just wasn't complete enough to be called "standard C99".

But yeah, those C99 features weren't consistent at all with Herb Sutter's 2012 blog post about MSVC only supporting C features that are needed for the C++ compiler.

C and C++ are more strictly separated in the Microsoft compilers compared to gcc and clang (which both support more modern C features in C++ mode via non-standard extensions), I think that's what's confusing many people. It's not a problem in mixed-language projects though, just put all the C code into .c files and all C++ code into .cpp and you're set, compiling C code with a C++ compiler isn't such a great idea anyway, since it limits you to a ca. 1995 version of C.


I was similarly thinking vbcc.


Why not write it in betterC?

https://dlang.org/spec/betterc.html


Because despite the name this "betterC" is not a C dialect, but a D dialect which is an entirely different language than C99?

Could just as well ask why not write it in Nim, Zig, Rust, Swift, Kotlin etc... This means a different audience, different target platforms, different trade-offs.


suckless did something similar: http://tools.suckless.org/sic/


as well as a file-based tiny IRC client in C: http://tools.suckless.org/ii/


the suckless IRC clients are awesome! in fact, `sic` was my “go to” before writing `kirc`. I’m definitely not trying to compete with those, especially their file-based approach (which is great for users that work across channels) but rather offer a lightweight and “clean-looking” solution for the casual user.


This is probably a repeat of what would have been said a decade ago about c89/c99, but why use c99 over c11?


Not trying to be snarky. Honest question: Why use C11 over C99? Or even why use C99 over C89? What significant advantages do the new standards provide that cannot be done in plain old C89?


> Why use C11 over C99?

C11 gives you noreturn and alignas. Alignas can be pretty useful for low-level development in particular. Just hope you don't need variable-length arrays because those got changed to optional.

> Or even why use C99 over C89?

Several very big things: Native bool, stdint.h (fixed-width int types with known sizes ahead of time), long long, snprintf, not having to declare all variables at the top of the block (and now you can do for (size_t i = 0; i < sizeof(strbuf); ++i) because of it).
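
A short sketch pulling several of those C99 features into one place. The struct, the `default_server` values and the function names are made up for illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct server {
    const char *host;
    uint16_t    port;   /* stdint.h: exactly 16 bits wide */
    bool        tls;    /* stdbool.h: native boolean */
};

/* Designated initializer: members not named (here .tls) are zeroed. */
static const struct server default_server = {
    .host = "irc.example.net",
    .port = 6667,
};

/* Counts TLS-enabled entries; the loop variable is declared in the for. */
size_t count_tls(const struct server *list, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; ++i)
        if (list[i].tls)
            ++count;
    return count;
}

/* Formats "host:port" into out. snprintf() never writes more than outsz
 * bytes; the return value tells whether the whole string fit. */
bool describe(const struct server *s, char *out, size_t outsz)
{
    int n = snprintf(out, outsz, "%s:%u", s->host, (unsigned)s->port);

    return n >= 0 && (size_t)n < outsz;
}
```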


Designated initialisers.


For C99, don't forget // C++ style comments!


snprintf is pretty great for any sort of logging or error messages.


You don't need any "significant" advantage. Even a very small advantage ("an anonymous struct would be handy here") is enough, why would you _not_ use it when it's free? For the fun of the constraint? I'm not a C expert but I don't think there's any downside to using the C11 standard compared to C99


One downside is reduced portability. If you're writing in C that's often a motivation.

The number of platforms with a C11 compiler is lower than the number of platforms with a C99 compiler.


...which is lower than the number of platforms with a C89 compiler. A lot of popular projects known for their high portability are C89 for this reason.

There's also the fact that there are far more compilers for C89, and it is easier to write one than for the newer standards. This becomes important if you are interested in avoiding Ken Thompson attacks.

Personally, I still stick to C89 and the only newer feature that I've found to be useful is mixed declarations and statements, but it's no big loss as it both avoids the "variable proliferation" that some codebases seem to be afflicted with, and blocks can be used to start a new inner scope if you really need a new set of declarations anyway.


I'm so glad to not be the only weirdo out there just sticking to plain old C89. I concur to all your reasoning. C89 is simple, readable, gets the job done.

My only pain point is indeed stdint.h. Though it's often available everywhere even if not standard per se.


What are Ken Thompson attacks? I searched online but did not find anything informative. Can someone explain what these attacks are?


Which search engine? DuckDuckGo had a useful result right at the top https://duckduckgo.com/?q=Ken+Thompson+attacks > https://softwareengineering.stackexchange.com/questions/1848... (although the answer to the linked question https://softwareengineering.stackexchange.com/questions/1947... explained better to me).


They're actually called trusting trust attacks (the original paper on the topic is "Reflections on Trusting Trust" if you want a guaranteed search term); I'm not sure why userbinator used an eponym instead.

I'm also not sure why they would be relevant for a general project, since the source language being easy to write an alternate compiler for only matters for the compiler itself: once you have a non-infected compiler, you can bootstrap gcc or whatever and compile everything else at whatever C standard you like.


> I'm not sure why userbinator used an eponym instead.

It's the first thing that came to mind when I thought of the concept. Perhaps this discussion may yield some additional insight: https://news.ycombinator.com/item?id=24385389


See "Reflections on Trusting Trust"


Is there an equivalent of babel.js for C, which would translate C99 into C89?


When you start using new features you break backwards compilability. For C11 that means distros as new as Ubuntu 10.04 (which I still use as my main desktop) and the like are going to have problems compiling (GCC it ships with only supports parts, as in C1X). This will also apply to older embedded systems where a tiny client would be useful.

In the past a compiler and ecosystem would last a decade before it couldn't compile something. These days changes are coming out, and being used, every 3 years. It's future shock and the major cause of container usage on the desktop and in academia. Sticking with a well established older standard means everyone can avoid the massive increase in complexity and problems that containers bring.


Out of interest, why are you using a 10 year old OS that went out of support _7 years ago_? That must have horrible security implications, surely?


>That must have horrible security implications, surely?

Let's just say it's a matter of taste. I keep my attack surfaces to a minimum, backport what I can "patch and statically compiled deps for userspace"-wise. On the other hand, I browse the web with javascript disabled so my old box probably has less "horrible security implications" than a completely up to date distro with the user blindly executing all code they're sent. Security is behavior more than software.


Older standards typically have a larger pool of people who can contribute because the standard has been around longer.

A programmer might have more experience with an older standard due to the length of time it has been out or because the toolchain they use elsewhere (personal projects, embedded comes to mind, or work) hasn't updated to the new standard.

Coming up to speed with the new standard is not free. The tooling may be free for the most common targets (embedded usually lags), but taking the time to learn isn't free.


It certainly is free. The standards are generally backwards compatible and the changes are simple. You do not even need to be aware about the differences between C89 and C11 to contribute to a C11 project.

> or because the toolchain they use elsewhere (personal projects, embedded comes to mind, or work) hasn't updated to the new standard.

gcc and clang both support it.


The latest gcc and clang might not be available on a particular platform.


I'm currently porting some "C99"-ish code, and I had a few instances for which I would have loved just use C11's `_Generic` to replace a macro-hell with statement expressions and accompanying `typeof()`s all over the place. Fun fact: `typeof` is a GNU extension, and the target compiler doesn't have that.
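
For anyone who hasn't seen `_Generic`, a minimal sketch of the kind of type dispatch it allows without `typeof` or statement expressions. The `max_of` macro and its helpers are hypothetical and not taken from the code being ported:

```c
static inline int    max_int(int a, int b)          { return a > b ? a : b; }
static inline double max_double(double a, double b) { return a > b ? a : b; }

/* The controlling expression (a) + (b) is not evaluated; only its type is
 * used. That type is the promoted common type of both arguments, so a mixed
 * int/double call selects max_double. Types not listed (e.g. long) are a
 * compile-time error. */
#define max_of(a, b) \
    _Generic((a) + (b), int: max_int, double: max_double)((a), (b))
```

Because the selected function is an ordinary call, each argument is evaluated exactly once.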


In addition to others' comments, C11 also gives you:

  - static_assert, for ensuring things at compile-time without ugly macros.
  - atomics, for use in multi-threaded systems.
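
Both items in a minimal sketch; the counter and the function name are hypothetical:

```c
#include <assert.h>      /* static_assert */
#include <stdatomic.h>
#include <stdint.h>

/* Checked at compile time: the build fails if the condition is false. */
static_assert(sizeof(uint32_t) == 4, "uint32_t must be 4 bytes");

/* File-scope atomic counter, zero-initialized like any static object. */
static atomic_uint msg_count;

/* Safe to call from several threads at once; returns the new count. */
unsigned count_message(void)
{
    return atomic_fetch_add(&msg_count, 1) + 1;
}
```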


C99 over C89: designated initialization and compound literals are the biggies, plus all the small accumulated improvements that had been added to C during the 90's (e.g. variable declaration anywhere, for (int...), winged comments...)


good question! no real reason, other than C99 being the standard that i started development in. With that said, I see no reason not to switch ;)


This source code seems tiny - just ~300 lines. What features of c11 do you think would be useful to use there?


I actually feel bad about pointing out issues with this code since the project is very neat. But, I'm still going to do it.

There is a problem where the code uses explicit escape sequences for colour instead of using terminfo. This is a pet peeve of mine, because it prevents things like controlling whether or not to use colour by setting TERM to the appropriate values. Or to completely disable highlighting by setting TERM to "dumb". Or even use a completely different terminal type, like if you have an old vt52 hooked up to your computer.

Terminfo is a really nice library that abstracts away all the terminal codes. It's really what should be used here.


interesting argument. will look into it


Looks cool. I wonder whether users can overflow your buffers inputting commands in sscanf. Also, why malloc/free cmd_str in raw? You're automatically or statically allocating all the other buffers.
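
On the sscanf point, the usual guard is an explicit field width on every `%s` and `%[` conversion, one less than the buffer size. A minimal sketch with a made-up `/msg` command format, not kirc's actual format strings:

```c
#include <stdio.h>

/* Hypothetical parser for input such as "/msg nick some text".
 * The widths 31 and 255 must stay in sync with the buffer sizes 32 and 256;
 * sscanf() stores at most that many characters plus the terminating NUL.
 * Returns 1 if both fields were parsed, 0 otherwise. */
int parse_msg(const char *input, char nick[32], char text[256])
{
    return sscanf(input, "/msg %31s %255[^\n]", nick, text) == 2;
}
```

An over-long nick is then cut off at 31 characters instead of running past the end of the buffer.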


will fix that. thank you!


Great client, I have been using it for about a week. I would suggest adding a channel indicator before the nickname so you can see which channel the message is being sent from, but otherwise great work.


thanks! i should note that channel indication exists already. it will appear, however, only for channels other than the one connected to initially.


144 lines IRC client on PicoLisp: https://picolisp.com/wiki/?ircClient


And most likely CVE free.


Any love for mIRC?


oh! i may be trying this over the weekend :)


"Written in C" should be considered a warning



