I have used this library before; it's a powerful library indeed, but the documentation is a bit lacking. There isn't a straightforward guide to the most common operations, even the simplest ones like connecting a client to a server, and you're left studying the various examples provided, each of which does things a little differently. There are also a lot of undocumented functions/constants/structs.
Check out the "Secure Streams" API of libwebsockets: it hides the protocol-specific details (in the JSON policy) and makes it easier to program network applications (you just deal with the abstracted states and payloads in the callbacks).
We use libwebsockets in Ardour (a cross-platform digital audio workstation) to provide the ability to create control surfaces (GUIs) within the browser. We mostly treat it as a transport layer for OSC messages, which could otherwise be transferred via UDP (if the endpoint wasn't a browser).
Something close to a UDP API for browsers (QuicTransport + a datagram extension) was in development for a while, but the proposal ended up getting rejected/withdrawn.
Like most websocket libraries, uWebSockets works by you calling some sort of send/write function: you give it the data you want to send as a sequence (a string or a vector).
The underlying library takes care of buffering it, and the size of that buffer is unbounded. In most of these libraries there is no way to know how much is buffered.
But libwebsockets works the other way around: it calls a callback written by you every time the connection is ready to send, and inside it you call a write function that does not guarantee it will write the whole buffer you give it.
This can be useful for web-based games: you keep the latest position of each object and send them one by one whenever the client is ready to receive. Memory usage is then bounded by roughly message size * number of objects, plus one in-flight message per player (each player can have a message kept in memory until it is entirely sent, since multiple calls to the callback may be required to send a whole message).
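Roughly, the pattern looks like this (a minimal sketch of the writeable-callback flow; the per_session struct, positions array, and buffer sizes are made up for illustration, only the lws_* names are actual libwebsockets API):

    #include <libwebsockets.h>

    #define NUM_OBJECTS 64
    static float positions[NUM_OBJECTS][2]; /* latest game state, overwritten as it changes */

    struct per_session { int next_obj; };   /* hypothetical per-connection state */

    static int
    callback_game(struct lws *wsi, enum lws_callback_reasons reason,
                  void *user, void *in, size_t len)
    {
        struct per_session *ps = (struct per_session *)user;

        switch (reason) {
        case LWS_CALLBACK_SERVER_WRITEABLE: {
            /* lws wants LWS_PRE bytes of headroom in front of the payload */
            unsigned char buf[LWS_PRE + 64];
            int n = lws_snprintf((char *)&buf[LWS_PRE], 64, "%d %.2f %.2f",
                                 ps->next_obj,
                                 positions[ps->next_obj][0],
                                 positions[ps->next_obj][1]);

            /* lws_write() may not accept the whole buffer; the lws minimal
             * examples treat a short write as fatal and close the connection */
            if (lws_write(wsi, &buf[LWS_PRE], n, LWS_WRITE_TEXT) < n)
                return -1;

            ps->next_obj = (ps->next_obj + 1) % NUM_OBJECTS;
            /* ask to be called again when the socket can take the next message */
            lws_callback_on_writable(wsi);
            break;
        }
        default:
            break;
        }
        return 0;
    }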
The libwebsockets site claims that it is the receiving side that decides when the sender may send. I don't know exactly how that is determined, but I think the sender goes by TCP ACKs, the window size, and so on.
I wrote something similar, except that instead of providing a library (which libwebsockets already does a fine job of), I created a server/framework that accepts shared objects as backend plugins, running as dedicated threads that communicate over lockless SPSC ring buffers. In other words, more or less the inverse of a library: https://github.com/wbudd/ringsocket
I haven't been putting much time into it lately, but I intend to create a bunch of language bindings for it soon so you can write plugins in other languages too, such as Python, C++, Rust, etc. Should be interesting.
I personally prefer QtWebSockets.
I've used libwebsockets before, but I found the API error-prone and crusty, and eventually ended up converting to QWebSockets. That turned out to be a good move.
I remember hearing about it many years ago, and am surprised to find it alive and well now.
I always felt that the web people were never much about performance, since there have been many "pure C" webdev attempts before without much success.
Nginx can very realistically handle 1-2M requests per minute on commodity hardware with no customisation, and that's serving from disk.
Were somebody really serious about web performance, I think going from millions of requests per minute to millions of requests per second is 100% possible.
I worked on this problem around 6 years ago, when I had the task of squeezing HTTP and network performance on some API servers close to hardware limits. The task was mostly about gluing DPDK to popular software: nginx, memcached, Postgres.
I am very enthusiastic to see libwebsockets getting GLib support. GLib is a one-of-a-kind piece of software in the C ecosystem: with it you can adopt modern programming methods and in general approach things much as you would in a big "platform"-like environment such as Node.js. GLib is really undeservedly neglected and overlooked.
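For anyone curious what that looks like in practice: lws can be told to ride on a foreign GLib main loop instead of its own poll loop. A rough sketch along the lines of the lws event-lib examples, with the option and field names (LWS_SERVER_OPTION_GLIB, foreign_loops) written from memory, so verify them against your lws version:

    #include <libwebsockets.h>
    #include <glib.h>
    #include <string.h>

    int main(void)
    {
        GMainLoop *loop = g_main_loop_new(NULL, FALSE);
        void *foreign_loops[1] = { loop };

        struct lws_context_creation_info info;
        memset(&info, 0, sizeof info);
        info.port = 7681;
        info.protocols = NULL;                 /* your protocols array goes here */
        info.options = LWS_SERVER_OPTION_GLIB; /* drive lws from the GLib main loop */
        info.foreign_loops = foreign_loops;

        struct lws_context *context = lws_create_context(&info);
        if (!context)
            return 1;

        g_main_loop_run(loop);                 /* GLib, not lws_service(), runs the show */

        lws_context_destroy(context);
        g_main_loop_unref(loop);
        return 0;
    }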
You can get within 20% of the performance at 1/10th of the cost by using a modern fast language (Rust, D, Nim, Go), and within 50% for even less by using C# or Java.
In my short encounter with "modern" webdev, I found that running out of RAM happened far faster than running out of CPU, even with every possible trick thrown in to increase GC aggressiveness.
RAM is by far the most steeply priced resource on all these new "cloud" hostings, and has the most unpredictable performance scaling with size. Even on real hardware, going for high-RAM servers is quite expensive.
While CPU or I/O saturation naturally throttles itself, RAM exhaustion is rarely pretty, and it's hard to proof your software against it. The most disconcerting part is that your Rust, Go, or TRUE ENTERPRISE JAVA® doesn't really use that RAM at all: most of a "modern language" process's RAM content is just zeroes and empty buffers.
Also, as the TechEmpower benchmarks show, in the "webdev" field C/C++ doesn't have a proper/official Postgres driver with pipelining support, and thus even loses to Java in the fortunes benchmark.
Nowadays they are using a fork of libpq with a batch API that has not been merged for 6 years in order to compete.
So the lack of good library support in the "webdev" field puts C/C++ at an extreme disadvantage compared to other, more "webdev"-friendly languages.
> The system [LMAX] is built on the JVM platform and centers on a Business Logic Processor that can handle 6 million orders per second on a single thread. The Business Logic Processor runs entirely in-memory using event sourcing.
Your GP didn't just mention Rust, they also mentioned C# and Java, so when your parent refers to GC the more charitable interpretation is that they were using C# or Java and responding to that part of the message.
Is there an example of a task like that? Specifically, one that doesn't talk to a database/files or another service, in which case the talker's performance becomes irrelevant unless it is really crawling. Most code "we" write is scheduling queries, gluing datasets together, and JSON-ing the results into a socket through some stream library. IO takes 98% anyway, 2% is rerouting and checks. Personally I'm fluent in a range of languages, but wouldn't ever think of writing networking in C or a similar low-level environment. A mountain of work and skill for something expressible in just a few lines of python/perl/js/ts/lua/sql, zero economy. (Okay, maybe an nginx plugin in a critical case when simply adding instances doesn't help.)
That's the crux of the issue. Most of webdev is just managing a huge number of very simple pieces of code, where I/O from somewhere to somewhere dominates.
One particular issue I remember from talking with Alibaba engineers, when I worked for a subcontractor on a custom DC project, was the "1 second kill".
That's the phenomenon where some super good deal is posted on the front page of Taobao, and they get squashed when people from all over China, smashing F5, click on the deal.
Purchasing looks like a lock-and-write task from the DB side, and in 2016 the whole of Taobao.com was tied to a single-point-of-failure MySQL cluster abused to the maximum.
They went to the computer science people, who only said that there is no way around locking and a single DB write origin.
External contractors wrote a super-duper performant "database gateway" which organised and queued purchase reservations to the stock database at around 50 Hz.
People confuse performance (CPU) and throughput (IO). The reason CPU usage is undervalued is that it's simply not the bottleneck, typically. Very few people actually have the challenge of handling millions of requests per minute. The people that do can scale horizontally by just throwing load balancers and cheap VMs at the problem. That gets you bandwidth, memory, and CPU. Scaling horizontally is very competitive with whatever cleverness engineers can provide to reduce the need for it. Engineers cost a lot more to scale than VMs.
We all love to optimize, but there is a point of diminishing returns. Scaling a cheap VM with CPU credits (i.e. one that isn't even supposed to have double-digit CPU usage other than for short bursts) costs nothing compared to the salary involved in, e.g., making something like that 2x or 3x faster. And the yields are terrible too, even if you succeed.
Say you are running 12 VMs costing about 20/month each. That should get you some CPU and a modest amount of memory, which would be appropriate for a web server. Twelve of these means you have fail-over, multiple availability zones, and possibly even regions. Going from 12 to 6 would be a nice cost saving of about 120/month. Except now you have less bandwidth to go around and maybe a bit less resilience. That 120 is a decent but not amazing freelance rate per hour. If you spend a week implementing this, the return on investment (i.e. your time) would be 40 months (assuming an unreasonable 40-hour work week), or about 3 years and a bit before you earn back the expense of your time. Now say that instead of spending that time, you simply scale to 24 servers: i.e. you double your cost from 240 to 480 per month, or about four hours of your hypothetical hourly rate, or about half a day. So a week of your time still adds up to nearly a year of simply running at 2x the capacity.
If you are any good, your rate might be higher and the tradeoff is even worse. Not even worth having meetings about. Using C for this stuff means hiring more expensive C developers and making a bad deal even worse. The smart way to get performance is to have those C developers work on the OSS infrastructure we all love to use. It will trickle down and get us decent performance elsewhere. Node.js is actually built on a lot of C/C++ libraries and benefits from the cumulative optimizations that have gone into them. That's why it is so competitive in this space. There are a gazillion other languages to consider with similarly good-enough performance and throughput. C is a last resort, for when performance wins over security and stability concerns. Sometimes it does, but mostly it doesn't make sense from a cost point of view.
According to the TechEmpower benchmarks, C# and Java perform better than C/C++ in some tasks [1].
Even in categories where C/C++ are more performant, other languages are not that far behind.
If at all possible, we should not use memory-unsafe languages for anything, especially something exposed as a server. No matter how careful you are, and with all the tooling available, the majority of exploits in popular software still happen due to memory-unsafe languages.
Benchmarks are useless if you don't understand what it is you're measuring. A contrived "Hello world" benchmark tells you nothing because your bottleneck is in the system calls, which have nothing to do with which framework you use. If you run a web service that involves a compute- or memory-intensive task, like a game server physics engine, multimedia streaming, or anything that requires large-scale postprocessing, you're going to rely heavily on C- or C++-based libraries to do the heavy lifting.
Memory safety issues are exaggerated. It's definitely harder to keep big, complex software memory-clean, but it's not something completely insurmountable.
You use C judiciously for the most performance-demanding tasks, while trying to reduce the overall task itself to some simple algorithm at which you can later throw heavy verification: formal verification, Valgrind-ing it to death, fuzzing, etc.
The current wave of "new age" computer languages like Kotlin, Go, and Rust has a very noisy activist user base which tends to extol some very simple, obvious things as ultimate virtues.
Memory safety issues are bugs. Do you know any programmer that does not occasionally create bugs? Don't forget tight schedules, low budgets, ...
Also, Rust is just what you propose: a programming language with heavy safety verification built in. Because people occasionally write C code without using all the available tools to verify it, it is better to have that verification built in.
Memory safety is not an issue if you actually learn to take advantage of the C toolchain. I've caught memory leaks and buffer overflows to great effect just by using Valgrind and ASAN. And for most applications, you can limit the attack surface by only writing C for the performance-sensitive areas and using FFI to call into those routines. As a bonus, it becomes much easier to unit test for logical corner cases.
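For anyone who hasn't tried it, ASan really is a one-flag addition. A trivial, deliberately buggy example to show what it catches:

    /* build: cc -g -fsanitize=address -o overflow overflow.c && ./overflow */
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *buf = malloc(8);
        memcpy(buf, "0123456789", 10); /* heap-buffer-overflow: ASan prints a report and aborts here */
        free(buf);
        return 0;
    }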
This just isn't true in practice. Can you point to a popular C project that's accomplished this? I bet there are a few tiny ones that make such claims but haven't received scrutiny.
I used to roll my own code for websockets since the protocol is so simple. I switched to libwebsockets when I had to support TLS, since I did not want to fiddle with OpenSSL directly.
Overall the experience was good. It's simple and to the point. Configuring the library is a bit like dark magic though; there's not much documentation.
> Has anyone used this in a multithreaded implementation
It's an event-loop-based library, so the idea is more to have a single asynchronous main thread that fetches the messages, which you can then dispatch to worker threads for processing if you wish.
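A rough sketch of that split, assuming some thread-safe queue of your own (queue_push, callback_proto, and service_thread are placeholder names, not lws API; only the lws_* calls are real):

    #include <libwebsockets.h>

    /* Placeholder for whatever thread-safe queue you use */
    extern void queue_push(const void *msg, size_t len);

    static volatile int interrupted;

    /* In your protocol callback (runs on the service thread), hand
     * received messages off to the queue instead of processing them here: */
    static int
    callback_proto(struct lws *wsi, enum lws_callback_reasons reason,
                   void *user, void *in, size_t len)
    {
        if (reason == LWS_CALLBACK_RECEIVE)
            queue_push(in, len);
        return 0;
    }

    /* The one thread allowed to touch lws objects: */
    static void *service_thread(void *arg)
    {
        struct lws_context *context = arg;

        while (!interrupted && lws_service(context, 0) >= 0)
            ;
        return NULL;
    }

    /* Worker threads only consume from the queue; if they need to trigger
     * a send, the usual trick is to set a flag and call
     * lws_cancel_service(context) to wake the service loop, rather than
     * calling lws_* functions from the worker thread. */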
Have you tried libkj? It's the C++ runtime that underpins Cap'n Proto's RPC layer as well as Cloudflare Workers. It has many of the same features; curious how you find the documentation.
It's actually fairly simple to use OpenSSL directly if you have a good handle on sockets in general; it's basically just a case of doing some initialisation and then using SSL_write/SSL_read in place of whatever write/read functions you were using before on the socket.
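Roughly like this (a bare-bones sketch for OpenSSL 1.1+, with fd being a TCP socket you've already connect()ed; note there's no certificate verification shown here, which real code should add):

    #include <openssl/ssl.h>
    #include <openssl/err.h>
    #include <stdio.h>

    int tls_hello(int fd)
    {
        SSL_CTX *ctx = SSL_CTX_new(TLS_client_method());
        SSL *ssl = SSL_new(ctx);
        SSL_set_fd(ssl, fd);

        if (SSL_connect(ssl) != 1) {                 /* TLS handshake */
            ERR_print_errors_fp(stderr);
        } else {
            const char msg[] = "hello\n";
            SSL_write(ssl, msg, sizeof(msg) - 1);    /* instead of write()/send() */

            char buf[256];
            int n = SSL_read(ssl, buf, sizeof(buf)); /* instead of read()/recv() */
            if (n > 0)
                fwrite(buf, 1, n, stdout);
        }

        SSL_shutdown(ssl);
        SSL_free(ssl);
        SSL_CTX_free(ctx);
        return 0;
    }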
Yes, you run it on its own thread and output messages to a thread-safe queue for other threads to consume, as is the usual practice. Or were you asking about something else?
When the entire OS is written in Rust or Go, from the kernel on up, and all applications are also written in the same language.
Oh, and when the silicon itself becomes adapted to the paradigms presented by those programming languages, since C was designed to work on the existing silicon. Forcing entirely new hardware designs to meet an evolving, always-changing software paradigm is an expensive proposition in a commodity market, and it will take either a lot of central control and willpower or a lot of time.
Up to 1972, the computing world managed without C, and even afterwards plenty of systems kept doing quite well without any trace of C code until the early 1990s.
C was created in response to the then-dominant practice of using assembler for all kinds of coding tasks (not just "system programming"). I wouldn't characterize that situation as "doing quite well." On the other hand, C didn't take off on the IBM System/370 until much later, due to the availability of PL/I.
JOVIAL, ESPOL/NEWP, PL/I, PL/S, BLISS, and a couple of others did exist and were in active use outside Bell Labs.
Even Multics' actual history was only a failure from Bell Labs' perspective, as it went on and was even considered more secure in a DoD security assessment.
Even IBM did all their RISC research in PL.8, before deciding to create what would become AIX, as by then it was all about the UNIX workstation market.
Had AT&T been allowed to sell UNIX from day one at the same price as competing OSes, I bet C wouldn't be around.
C is effectively a _predictable_ and portable assembler. You can't do that with Rust, because everything is catastrophically moved around, which makes it harder for humans to predict the emitted code, and Go AFAIK has a runtime.
Rust uses references all over the place and reuses stack addresses that once belonged to another object, because the compiler can guarantee it's safe, but that makes the emitted code much harder to reason about.
If you can reason about your code at -O0, you can take it that it will remain correct at higher optimisation levels.