Seems a bit painful. HTTPS on the webserver itself is a fair bit more painful to set up and administer than HTTPS in your reverse proxy / load balancer (I prefer nginx). Web servers should support plain HTTP.
It is not a run time option by design, but it is there.
I want Kore to have sane defaults for getting up and running. That means TLS (1.2 only, by default), no RSA-based key exchanges, AEAD ciphers preferred, and the like.
RSA with DHE or ECDHE is a sane handshake. I would avoid DSA- and ECDSA-based key exchanges because they fail catastrophically with bad random number generators. For most APIs, session caching is more important than a faster initial handshake.
The HTTPS-only choice would annoy me a lot because I run most HTTPS services behind a reverse proxy in a FreeBSD jail on the same host. HAProxy and nginx are still superior to most applications when it comes to reliable TLS termination.
Using HTTPS by default is the right choice for a new project, but offering no HTTP support (outside of a benchmark) patronizes the user.
All in all this looks like a nice way to export C APIs through HTTPS.
Every time something like this comes out, I think the following: wouldn't it be great if I could develop and debug this under Windows with VS and deploy my release on Linux? The network libraries are always Linux (and family) only.
I'm not trying to antagonize here, but what do you find so bad? I would call it a little bit more painful; three lines to add to an nginx config, along with generating the cert. Maybe 10 minutes of work? Thirty if you're getting a CA to sign your cert for you. I could see pain if you need to wait for finance to approve, or if you're trying to get domains validated on behalf of a customer. And I suppose it adds another setup step to wireshark (if you need to debug neat bugs), but that's a set-it-up-once-and-forget-about-it thing.
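For reference, those "three lines" look roughly like this (a sketch only; the domain, certificate paths, and upstream port are placeholders, not anything from a real setup):

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    # placeholder paths: wherever you keep your cert and key
    ssl_certificate     /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;

    location / {
        # the plain-HTTP application server sits behind the proxy
        proxy_pass http://127.0.0.1:8080;
    }
}
```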
I'm curious: why C? Strings, scoped objects, and C++11 move operators seem much safer and clearer from an API perspective.
The complaints about C++ seem to mostly be around the ability to abuse the language, not specific issues that C solves. Something like https://github.com/facebook/proxygen seems like a better API.
And I don't quite buy portability: if it's not a modern compiler with decent security checks, then I'm not sure it should be building web-facing code.
I've been building an HTTP/1.1 server in C++11. Along with a C++ wrapper around SQLite, I've been having a lot of fun putting some lightweight forum software together. I definitely enjoy the code structure and compile-time safety over PHP.
Using a threaded model with tiny stacks, and std::lock_guard for atomic operations.
The biggest downside is you have to run the same OS your server uses on your dev box (which is what I do); or you have to upload the source and compile the binaries on your server directly. (or have fun with cross-compilation, I guess.)
To answer the inevitable "why?" -- for fun and learning. Kind of cool to have a fully LAMPless website+forum in 50KB of code. Not planning to displace nginx and vBulletin at a Fortune 500 company or anything.
Still wishing I could do HTTPS without requiring a complex third-party library.
Go would solve the problems you mentioned (cross compilation and HTTPS support), and would also offer first-level support to many web concepts and protocols that you need to implement from scratch in C++ as the ecosystem is not there. Of course if it's just for fun, then anything goes :)
I am so glad that my company chose to write the majority of our web oriented frameworks in Go. It's simple to the point of boring for most of this stuff.
I find this so much more pleasant to use than the alternatives. There's a portable distribution of it which you can put in your project and statically link.
I use OS X for my main machine, and can happily deploy something to a server running 'not OS X' and be confident that all libraries and dependencies are exactly the same in production as what I was using in development.
Sure Docker is running through a stripped down VM on the Mac, and so technically it's "the same OS my server uses", except it's abstracted away so that for all intents and purposes I'm using OS X for development (and email, browsing, and other things) and deploying hassle free to Linux.
That's not true; Docker runs great on OSX natively. You can't run the containers there, but the client itself is fine and largely abstracts away the whole "where am I building" issue.
No, the docker program itself does indeed run natively on osx. You do not need to use boot2docker or any other VM.
You can't run any containers on osx, but docker itself runs fine as a client binary to docker running on a linux server. I use this configuration daily using a native osx docker binary on my workstation speaking TLS to a docker service running on a CoreOS server.
C++ is really horrible; everything feels like an afterthought. The new unique_ptr and shared_ptr end up producing really ugly code. The concept is good, but wow...
Can we move away from this horrible language already?
I don't know about your case, but most of the time I've found that complaints like yours come from people who never actually tried the language, or who just heard a bad story, or who had one bad experience reading bad legacy code.
I think Torvalds and Stallman actually fall into that category. Stallman even mentions generics, which C++ doesn't have and which are very different from the template mechanism C++ does have.
As for the people who actually use the language (especially after C++11), most of them would tell you that there is a great subset of the language that works for them. A valid argument I have heard before is that this subset is different for each person, and the problem arises when maintaining somebody else's code. Well, that happens with every language. (Have you ever had to debug a memory corruption bug in old C code with void* all over the place? Hint: it's not pleasant.)
C++ is a language that allows you to use abstractions with very reasonable performance and very reasonable resource use, and in that niche there is no real alternative.
Yes, you can have a fine-tuned VM running Java or .NET code that might be comparable with C++, but at a cost in memory. I don't think Rust or D are ready yet to be real competitors.
So... no... we can't move away from this "horrible" language yet.
On the contrary, several years ago I always heard how good it was and that it was the language to rule them all. Those years have passed; I have used it extensively on and off, even jumping on the C++11 standard as soon as it was available. Now I know the dismal truth about the state of C++ projects.
Now, the people who defend C++ or push it onto every project (like in this thread) are almost always people who sit exclusively in C++. You know, the people who think they are experts in C just because they know C++ (like in this thread)? There is the real incompetence.
I also fail to see how hunting down a memory corruption bug in C makes C++ look better, and since you mention it is legacy code, it is the same in C++.
It does not matter what language I suggest, the real fact is that C++ user will use C++ for ANY project regardless. So no, we can't move away from this horrible language yet, but not for the right reasons.
> "I also fail to see how hunting down a memory corruption bug in C will make C++ look better, and since you mention it is legacy code it is the same in C++."
It doesn't; I meant it just as an example that every language has its own caveats.
>It does not matter what language I suggest, the real fact is that C++ user will use C++ for ANY project regardless.
That's just not true in my experience. Actually, none of the C++ developers I know use just one programming language for everything. They pick the language depending on the requirements of the project (and that's how it should be, isn't it?)
I use it a lot, and find it to be quite tolerable once you get it. Its verbosity is a wart, but an understandable one once you understand what the language is. ObjC has [[similar syntactical] issues].
The trick to it is not to over-use or abuse the language's many features. I tend to write C++ that is a fairly thin layer on top of C, using the STL for data structures and algorithms but using things like my own templates very sparingly.
I like Go, but it's really only well supported server side. Rust is promising but not ready for prime time.
... easy to distribute HUGE statically linked binaries. Go solved the DLL hell and/or VM platform hell problem by just punting it.
Of course, I have wondered for quite a while if it might not be interesting to do away with DLLs in favor of static linking and then use both disk and memory deduplication to handle the efficiency issues.
Of course the problem here is: what happens when a major bug like Heartbleed is found in a library used by 500 things?
C++ is a swiss army knife that works with C, which I have learned very well and have great tools for. I understand the concern, but moving away is not so simple.
That being said, I prefer well-designed C-style libraries in a lot of cases, since C++ libraries often have opinionated styles (the language is huge, after all).
Actually C with a pool allocator is an excellent language for writing a web server. I wrote one in C 15 years ago, and the code is very elegant and simple:
It seems that people are so intimidated by the infamous complexity of C++ that they don't even want to bother getting more familiar with it.
So, although technically the existence of C doesn't make sense, as it is superseded by C++ (except for a couple of things), C is winning in the branding department.
I don't know if this is the reason why people choose C over C++ for some projects, but if language complexity is the reason, it isn't that people are just "so intimidated by the infamous complexity of C++ that they don't even want to bother getting more familiar with it".
In the 90s, C++ was much more popular than it is now. It was used as the go-to general-purpose language for all kinds of "serious" software (not addressed by VB or Delphi). Early on, almost everyone was very impressed with the power C++ brought, but after a few years, as codebases aged, it became clear that maintaining C++ codebases is a nightmare. The software industry lost a lot of money, developers cried for a simpler, less clever language, and C++ was abandoned en masse -- almost overnight -- in favor of Java (at least on the server side), and on the desktop once MS started pushing .NET and C#. So while today C++ is servicing its smallish niche quite well, as a general-purpose programming language for the masses it actually proved to be an unmitigated disaster. It is most certainly not the case that C++'s infamous complexity is "intimidating"; C++'s complexity is infamous because of the heavy toll it took on the software industry. Which is why, to this day, a whole generation of developers tries to avoid another "C++ disaster", and you see debates on whether or not complex, clever languages like Scala are "the new C++" (meant as a pejorative).
I also feel like a lot of people are biased because of Stallman's and Torvalds's stance on C++. I kind of have an irrational distaste for the language, mostly due to their influence.
I find it funny that, by the industry investing in JITs instead of optimizing compilers for Java and .NET, C++ managed to keep a spot in HPC.
Now both the Java and .NET ecosystems are getting their AOT optimizing compilers: .NET with MDIL and .NET Native, and Java targeting Java 10 (if it still goes to plan).
Also, thanks to Oracle's disregard for mobile platforms, providing neither JIT nor AOT compilers for Java there, C++ became the language to go to for portable code across mobile OSes when performance matters.
> I find it funny that, by investing in JITs instead of optimizing compilers for Java and .NET...
But that's because Java is meant to cover a very wide "middle" -- plus maybe a few corners where possible -- rather than every possible niche.
> Also, thanks to Oracle's disregard for mobile platforms, providing neither JIT nor AOT compilers for Java
Well, it didn't start out that way, did it? But mobile platforms -- because they're rather tightly controlled -- are, and have always been, much more driven by politics than technical merit.
I am (or at least used to be, at some point) fluent in C++ and have built large applications with it. But I'd still rather write C, because its infamous simplicity is of a lot of use to me -- whereas I found that much of C++'s complexity tends to solve a lot of problems UML designers think we have, and very few problems that we actually have.
Not really. I'm sure this happens some of the time, but I suspect that's more common with developers who haven't written either.
In my experience, a lot of the developers who prefer C to C++ are developers who wrote a lot of C++, found that it only improved productivity in solving problems that it created, and went back to C and realized how much easier it is to write software in C.
C++ gets you caught up thinking about problems that don't even matter.
I would recommend first reading "A Tour of C++" or another book that explains the new C++11 features well. Meyers explains some of the edge cases of move semantics very well, but if you don't understand the concept first, you might not have a good experience reading his book.
>"says that 'don't use malloc and free, unless you want to debug 1980s' problems."
The problem with malloc and free is that they leave it to the developer to keep track of the references, and memory corruption bugs are not pleasant to debug. The RAII idiom actually helps a lot in that context, and that is what Stroustrup was referring to.
Also, that example uses a Factory pattern, which actually makes the example more verbose; but it's like comparing apples and oranges, since that's OOP, which you wouldn't actually be doing in C.
Of course that is a stupid example; however, things get more interesting when you have pieces of code sharing the same structure (threads, maybe?) and you don't know exactly which code should be the one in charge of releasing the pointer.
instead of a simple malloc and free, you will invariably end up down the rabbit hole learning about rvalues, perfect forwarding, move semantics, lvalues - the list goes on - these topics arise from the simple concept of RAII.
>i think you misunderstand.
>clearly, it is C++, not C.
I got that; what I said is that you wouldn't have that problem in C, because it arises when you are doing OOP, and most C devs would not use OOP. Also, keep in mind that it is perfectly fine not to use OOP in C++.
> instead of a simple malloc and free, you will invariably end down the rabbit hole learning about rvalues
That's not true. Actually, you can be a very decent C++ developer without knowing what an rvalue is. Move semantics are just an optimization to avoid extra copies of objects, so they are completely optional.
At the moment of writing this, there is an entry on the front page with an example of a modern piece of C++ code. You would notice that there is not a single explicit heap allocation, nor any other crazy stuff.
i think moving data around in memory is not an exclusive feature of OOP, and i hesitate to guess why you think that operations similar to those exposed by 'move semantics' are not something undertaken (regularly!) by c developers using malloc and free.
> instead of a simple malloc and free, you will invariably end down the rabbit hole learning about rvalues
truth is subjective. to me, in my experience, when you program C++ you will end up exploring a myriad of vast expanses of language features. you may feel that someone can be a very decent C++ developer without knowing that; plenty of others would be aghast that someone ignorant of rvalues would describe themselves as 'very decent'.
remember the ostensible creator of the language rates himself at 8/10.
i think that front page example is rather interesting, given the number of #using directives - it gives an indication of the time that developer has spent learning the language. most of them are not trivial to understand to the degree that this guy has. also reading his background, i'm quite sure he knows what an rvalue is.
as you can imagine, i don't really mind explicit allocations - as an awful lot of knowledge is required in C++ to deal with implicit allocations.
in my opinion, there's a lot of 'crazy' stuff there, the short linecount is a product of the author having done his homework.
i get the impression that you and i would use very different subsets of C++. mine would be far smaller! i usually confine myself to whatever idioms the libraries i pull in use and go no further.
The existence of C makes perfect sense, thanks; it's a relatively small and simple language with masses of flexibility.
C++ adds masses of complexity and implicit behaviour. While development in C++ can be quicker and might be 'safer' it can also produce all sorts of unexpected problems.
It also encourages all sorts of nested template types that can make existing codebases incredibly hard to read.
Further, in embedded situations, you may not have space for its standard library.
And a lot of complexity, obfuscation and implicit behaviour.
I have developed and enjoyed developing in both. C has an elegant simplicity about it and you can do literally anything. C++ can be quicker, and it has a bunch of useful standard stuff, but it does have some downsides and quirks. There's room in the world for both.
> So, although technically the existence of C doesn't make sense, as it is superseded by C++ (except couple of things), C is winning in the branding department.
Programming languages are in the domain of UX. Type systems, syntax, RAII -- they all just serve as a means to an end, which is usability.
That is unfortunate, but in those cases, if C or C++ is the only option, I would advocate writing cross-platform code and running the tests under a Valgrind-supported platform.
This is a bit more difficult for kernel-level code running on an RTOS, as there might be a lot more unique APIs to stub out, but it can still be done.
You are no longer testing the same code under the real execution conditions, thus possibly missing leaks on the original code path.
Although I'll grant that that kind of testing is better than nothing, assuming the respective OS vendor doesn't offer tooling similar to Valgrind -- with which I fully agree.
Embedded systems. For example a surveillance camera could have a small web interface for configuring it and allowing remote access. Nowadays, even cameras have enough power to run real web servers with Ruby on Rails, but for smaller embedded systems, like a pacemaker, a web app written in C could make sense.
My impression has mostly been that on weird embedded platforms you'll just have some bad C compiler available, and that's the reason. Is it more than that?
C is ripe for integrating with other higher level languages, and the no-fuss license encourages this. I'm looking forward to checking this out further. Good luck Kore!!
C++ is just one of the many failed attempts to improve on C. So, yes, C has its issues, but C++ is certainly not the solution.
The long story. There are three dominant Turing-complete axiomatizations: numbers (Dedekind-Peano), sets (Zermelo-Fraenkel) and functions (Church). The Curry-Howard correspondence shows that all Turing-complete axiomatizations are mirrors of each other.
If "everything is an object", it means that there must exist a Turing-complete axiomatization based on objects.
Well, such axiomatization does not exist at all. Nobody has phrased one. Therefore, object orientation and languages like C++ are snake oil. They fail after simple mathematical scrutiny. C++ is simply a false belief primarily inspired by ignorance.
You may not only care about portability. "Safety" sometimes also means being able to meet realtime requirements, and often means having the ability to carefully account for resource use in ways that many interpreters do poorly (and no, kernel-enforced system limits are not always an option).
Many interpreters also make assumptions about the host system (e.g. expecting a POSIX-y system) that many embedded platforms don't necessarily meet.
I've more than once looked at interpreters for embedding and found most of the alternatives sorely lacking. Very few interpreters are well suited for embedding on constrained platforms at all (Lua, admittedly is probably one of the more solid exceptions). And once you start having to write lots of support code in C to port or sandbox your interpreter of choice, the reason for considering an interpreter quickly becomes less compelling.
Writing a web application in C sounds like a good trigger for an utterance from the Jargon File: "You could do that, but that'd be like kicking dead whales down the beach."
We've advanced the state of the art quite a bit with dramatically more expressive languages than C that are sufficiently efficient in terms of memory and CPU. This is especially true when communications are occurring over HTTP and not direct socket-to-socket comms.
Why use C instead of D, Rust, Go, C#, Java, Perl, Python, Ruby, Scala, Clojure, Erlang, Elixir, Haskell, Swift, OCaml, Objective-C...?
I didn't miss C++, it just seems a worse alternative than C.
> Why use C instead of D, Rust, Go, C#, Java, Perl, Python, Ruby, Scala, Clojure, Erlang, Elixir, Haskell, Swift, OCaml, Objective-C...?
Because C runs pretty much anywhere? There are plenty of platforms where C is available where I doubt you'd find any of the others above (e.g. the C64; yes, there are C compilers for it; yes, I'm mentioning it tongue in cheek).
Because you can generate small, compact static executables? E.g. I used to write network monitoring software and an accompanying SNMP server for a system with 4MB RAM and 4MB flash, the latter of which had to include the Linux kernel and a shell on top of the application in question. The system was so limited we did not run a normal init, and couldn't fit bash - instead we ended up running ash as the init...
There are plenty of use-cases where "web application" == "user interface for a tiny embedded platform".
I always hear this argument, but as time has progressed, for better or worse 'anywhere' has become a much smaller target. If your language runs on Intel and ARM then it's good enough. There are a lot of reasons I might choose C for a project, but 'run anywhere' is not one of them.
C is a good solution for things like realtime multiplayer with lots of state, lots of side effects, etc. A lot of the modern abstractions actually get in the way, for example:
* List traversal order matters a lot when it's something like a list of monsters getting struck by a spell and the spell has complicated side effects. Brushing it under the rug with abstract iterators or functional Array.map's is a recipe for not knowing how your own game works.
* Realtime is an illusion, it really means "fast turn-based", you don't want players with fast connections to get an advantage by spamming commands and having them executed the instant they're received. You want to queue commands and execute them fairly at regular pulses. So much for all your abstract events infrastructure!!
* Certain object-oriented idioms become eye-rollingly silly when your application actually involves _objects_ (in the in-game sense). Suppose it's a game where players build in-game factories, suddenly the old "FactoryFactory" joke just got a million times worse.
I'm not saying C is the best for those sorts of applications, but it's certainly not bad, and a lot of modern language features just aren't appropriate.
Why use C instead of D, Rust, Go, C#, Java, Perl, Python, Ruby, Scala, Clojure, Erlang, Elixir, Haskell, Swift, OCaml, Objective-C...?
C and Rust are not on the same playing field as Ruby, Python, or PHP. The former are typed, compiled, and MUCH faster.
You'll obviously build 99% of your application in Ruby, but you might need C or Rust for high-volume calculations.
An example that happened to me a few weeks ago: scaling a financial application to make millions of calculations. The core app is written in PHP, and the difference between 0.1 sec and 0.000764 sec gets important here.
This seems to be an adaptation of the earlier SafeStr lib. SafeStr is nice, except that the C community is still too busy arguing about their awful strl* and str*_s functions to take any notice and incorporate something genuinely secure like SafeStr into the standard lib. And the NVD just keeps growing...
So... you guys added something random and silly like <complex.h> to the standard, but still couldn't get around to a working string implementation? OK. Well, good luck with all that.
libsrt, sds, SafeStr... How many times will this problem be solved independently before something like this makes it into the standard lib? Libsrt looks nice, but if you can't use it to interface to libraries that use the same types, you're still futzing around with char buffers.
What does Kore use? libsrt? Let's have a look at how you're supposed to program a web app in its examples:
No, Kore doesn't use libsrt, although you could use it in your Kore user modules (not all services are going to be trivial operations). And I agree, C strings are far from being a solved problem. In the case of libsrt strings, in addition to having an embedded length (fast concatenation, search, etc.), they support UTF-8 operations, including case conversion, without requiring OS support (i.e. no "locale" and no custom hash tables, but efficient hardwired character-range selection in the code), plus most typical string operations.
P.S. Before implementing libsrt strings I did a wide study of many string and generic C libraries, implementing the best of all of them and adding things that were not yet covered (I'll investigate the Kore string/buffer implementation, too):
Kudos for the library design - it looks quite nice. We're looking for a safe, cross-platform, C string library at my work now - I'll do an evaluation of libsrt.
As for ANSI C, maybe someday this will get folded into the standard and we can pass around ss_t* 's rather than char* 's whenever we use third-party libraries.
Thank you for the consideration. Although the library targets safe and cross-platform code, I don't recommend using libsrt in production code yet. Rationale: the API is not yet finished, and it could see changes that would break your build. Suggestions are welcome :-)
P.S. I don't expect any standards committee to adopt it, or even wide usage (I'm glad just to be getting some feedback! :-D).
This looks pretty neat, actually. A ton of effort clearly went into it, and it looks like the code is really well written, with pretty well thought-out interfaces.
"Its main goals are security.."
Is it actually?
I also don't really see an advantage of using something like this over something like the Go net/http package.
Web-type API stuff is usually high enough level that something like C doesn't make sense. Go has nice enough standard packages for system things that even if I was doing a lot of system-y stuff I would be alright. I don't really see the type of work I would be doing where I want to use this.
> I also don't really see an advantage of using something like this over something like the Go net/http package.
C runs "everywhere". Your old C64 from the 80's? Has C compilers.
Go does not, even with gccgo.
I'd bet there are also still likely at least two orders of magnitude more programmers that know C well, and still more programmers with in depth experience of embedded development in C than have even tried Go.
I've still yet to meet anyone outside of the startup devops bubble who has written any Go, and often which languages the developers have experience with matters more.
Yeah, but using them back then was like trying to use Ruby for real-time applications in modern days, given how shitty they were.
Home computers might have had implementations of C and Pascal dialects, but we all used Assembly when stepping out of the built-in Basic and Forth environments.
While you are correct in that picking a higher level language doesn't shield you from writing insecure code, insecure C/C++ failure modes are usually quite a bit worse than other languages used for that purpose. I don't trust myself to write code that handles memory 100% correctly and is also network-facing. If much better developers than I manage to screw that up, what chance do I have?
Mmm, it can be difficult to take advantage of C exploits (in the context of a webserver). In a well-written C system, you might expect most bugs to lead to crashes.
PHP bugs tend to be more exploitable, because you're doing something supported by the language.
But what's stopping someone from writing an app-level vulnerability in C vs any other language? Most of them are because of horrible handling of strings, which is something C is also not that great at. I'm not seeing the security benefit here.
My gut feeling is that if you really want security in C, you have fewer constructs to misuse, so you get less unexpected behaviour. At the same time, you get more protection mechanisms: guard pages, etc. I.e., it's harder to be secure, but if you really want to harden, you can get harder than in higher-level languages.
I'm not putting any kind of weight behind that, though; I just feel that it's a bit odd for people (not specifically meaning you, just the whole thread) to criticise this purely on language choice and not put any substance behind their criticisms that actually relate to the software in question.
None of your standards for what a high-level language should be have any bearing on what a high-level language is, according to the definition that people actually use, and those standards might exclude Python, Ruby, C++ and Java among others.
Actually yes. For security, even PHP is a better choice than C (for certain versions of the idea of "security"). There's entire classes of security problems that are literally impossible in PHP.
Given that this project is implementing a web server from the ground up, that site doesn't really have advice. The milestones for "Are we web yet?" include a web server, someone writing a new web server in Rust would contribute toward the goal, rather than relying on the dependencies being discussed.
"Are we web yet?" is about whether you can effectively build web apps in Rust, not whether you can effectively build core networking infrastructure in Rust.
That's not to say I'm arguing this should have been written in Rust (if it were me, I might have done so, but it's not, so I don't get a say).
Also, you can't possibly argue that C satisfies all, or even most, of the milestones given for "Are we web yet?"
"Also, you can't possibly argue that C satisfies all, or even most, of the milestones given for "Are we web yet?""
C has all the libraries. Like, all of them, ever. I don't like it necessarily, but it's true. It has all the HTTP servers, all the database drivers, all the email and all of the "misc".
(Yes, not literally all. But it's a closer thing than we'd like to admit!)
Except for the fact that it's basically completely unsuitable for use in a network environment due to its design flaws, and one "security-focused" framework can't change that because you've still got all the rest of C's problems and large set of libraries that also really shouldn't be put on the network (and frankly I still trust the "security-focused" framework about as far as I can throw its immaterial self, because it's still written in C), C would be the perfect web programming language, and passes, yes, darned near everything.
Also, I don't know if you got here after the message was deleted. I think my reply makes more sense in the original context it appeared in.
The site and documentation look well done; great job!
Architecture looks pretty interesting too. I wonder why there was a need for an accept lock? An ordinary accept() call already allows multiple threads/processes to wait on a single socket.
The accepting socket is shared between multiple workers, each of which has its own epoll or kqueue fd. Because of this, some form of serialising the accepts between said workers is needed to avoid unnecessary wakeups.
If you are the author, thanks for sharing the project. You did a great job and made the right choice of having per-CPU worker processes, each with its own epoll loop.
Some fantastically quick points from a very cursory glance at the code. Feel free to ignore this.
- The code follows the convention of putting the argument of return inside parentheses, making it look like a function call. This is very strange to me.
- It treats sizeof as a function too (i.e. it always parenthesises the argument).
- It is not C99, which always seems so fantastically defensive these days.
- It's not (in my opinion) sufficiently const-happy.
- I saw at least one instance (in cli.c) of a long string not being written as an auto-concatenated literal, instead leading to multiple fprintf() calls. Very obviously not in a performance-critical place, so perhaps it's not indicative of anything. It just made me take notice.
I see you picked out the few things that I consistently hear about the coding style I adopted, which is based on my time hacking on OpenBSD. I have no real arguments against those, as it comes down to preference in my opinion.
I am curious why you concluded it is not sufficiently constified, however. I'll gladly make sensible changes.
As for the multiple fprintf() calls: to me it just reads better, and as you stated, the place it occurs in is pretty obviously not performance-critical.
Right. I could have guessed these were based on some coding style guide from somewhere.
I still don't see the point, or why any sane guide would prefer to treat return as a function call. It just never seems helpful to me, and always seems wasteful/more complicated. I realize it's just two tokens, so it's probably not "important" in any real sense of the word, but it irks me. I like to point it out, since it might help others avoid cargo-culting this.
It's not sufficiently const if there are places where a variable could be const but isn't. :) To be super-specific, the variable 'r' here: https://github.com/jorisvink/kore/blob/master/src/cli.c#L542 is one such case. It should be declared inside the loop, i.e. as "const ssize_t r = write(...);", since once assigned the return value of write(), it's read-only.
Of course, many ancient-smelling style guides seem to outlaw declaring variables as close as possible to their point of use, too. Note that declaring variables inside scopes other than the "root" one of a function isn't even a C99 feature, but many people seem to think you can't do that.
That's fair. Parenthesising return is a matter of readability and flavour to me. It tickles my spidey sense if it is missing.
I strongly dislike declaring variables anywhere other than the function root, but I agree with the example you provided: those kinds of variables could be constified to be sane.
Was quite excited to try out a little websocket server with Kore till I saw it forks per connection. I don't really want 20k processes for handling 20k connections; I was really hoping for an event loop.
Evented io is great for extremely high concurrency, but that isn't always the right thing to optimize for. A forking web server might be faster for users depending on the application.
Lastly, you can't just have an event loop without also creating an entirely async platform. For an event loop to work well, all operations from file reading to network requests need to be completely async.
Out of curiosity, in what scenarios do you see a forking web server being faster than an evented server that balances requests across cores and can direct a request to the core with the best cache for it?
I completely agree with the need for async. The hard part is that many operations can block without offering an async interface, for example memory allocation, or even touching memory that was not truly allocated by malloc.
What does it mean to balance requests across cores? To run T event loop threads/procs, where T is tied to the number of CPU cores? So like, a pre-forking, multi-proc, evented server?
I actually can't think of a case where a multi-threaded/forking-only web server would be faster than that. Again, assuming complete support for async libraries used throughout the web application.
Are there any web servers that have this architecture? NodeJS obviously doesn't. *
* Actually, for maximum absurdity, it looks like Kore, the web server we are currently discussing, has this architecture
1) Client libraries you might need to use in your web service might not be available in asynchronous versions.
2) Blocking code is much easier to write than asynchronous code.
3) Your server code is CPU bound, so there's no benefit to an asynchronous model.
4) If your web app runs in an asynchronous server and your app crashes, it'll crash the whole server. On the other hand, in a forking model, only the client that the child is serving will be impacted; the other workers will be unaffected.
5) Memory leaks are easier to contain in a forking model, assuming the child can exit or be killed after N requests.
Unless you are optimising for space, there is no real reason to use C in IO-bound processes (for which event loops are ideal); you may as well use Python (or even JS if you must), as your performance will be dominated by IO time.
If you're seriously considering a "native" approach to web development, there's always Vibe.d from the D programming language. Granted, you would have to write in D, but most C libraries are available. Concurrency is definitely accounted for, since D supports it in the language itself.
Writing high-level C applications can be easy if you use a library that frees you from managing dynamic memory for typical data structures (e.g. strings, vectors, sorted binary trees, maps). I'm developing a C library for high-level C code with hard real-time in mind; it is already functional for static linking: https://github.com/faragon/libsrt
When an HTTP API is just an additional feature of a larger project, it may make sense to keep using C: a toolchain available everywhere and well known (including cross-compilation and full bootstrap), a small memory footprint, easy use of any library needed for the project.
I am doing a lot of that and will keep a look at Kore. Unfortunately, HTTPS only and non-evented core is a no-go for me.
I am currently relying on the web server embedded in libevent, as well as wslay for websockets and some additional code for SSE. To easily start a project, I am using a cookiecutter template: https://github.com/vincentbernat/bootstrap.c-web
It would probably be better to also offer a lib or DLL, not just a program that runs C/C++ servlets. It seems that everything Kore executes from the supplied code runs in servlet threads, or is at least started from a servlet thread. It would also be cool to add some "application" framework, not only a "C servlet" framework.
If you think Kore is interesting, then also check out Tntnet <http://www.tntnet.org>. I've checked it out a few years ago and it felt good - stable, complete, easy to use etc.
Process-per-connection does not scale, no matter how lightweight, even with COW. Kore's connection-handling model is the reason apache2 mpm_prefork fell out of favor many iterations ago.
The only valid argument for avoiding a single event-based I/O loop is some sort of hard-blocking I/O, such as disk or a non-queuing chardev.
However, I'm still not biting; this is solved, and as usual the answer is somewhere in between. For example, RIBS2 has two models for connection handling: event loops for connections and "ribbons" for the non-queuing bits [1]. RIBS2 is also written in C, for C.
A clone is a clone. The main implementation difference between a thread and a process on Linux is COW; for all intents and purposes, a worker process and a fork are the same in this use case. Neither has led to scalable web servers.
Except you are basing that on the assumption that it creates a single worker process per connection. It does not.
Workers are spawned when the server is started. Each of them deals with tens of thousands of connections on its own via the listening socket they share.
This is a common technique and it scales incredibly well.
> apache2 mpm_prefork fell out of favor many iterations ago.
By whom, exactly? There are still plenty of reasons to use a forking web server (see my other comment in this discussion). Saying it "does not scale" is misleading; even with an event-driven model, there are only so many CPU resources that can be used to serve responses to clients.
Event-driven webservers are fantastic compared to forking ones, for keeping open many thousands of relatively idle connections (if that is your definition of "scale"). But many web services simply don't do that.
Preforking webservers, like event-driven ones, still have a rightful place in this world. As with all things technology, you have to pick the right tool for the job.
    var kore = require("kore");
    kore.on("request", http_request);
    function http_request(req, resp) {
        var statusCode = 200;
        resp.write("Hello world", statusCode);
    }