If you have to handle that many connections, I would suggest looking at Erlang. Its default per-process heap is < 3 KB.
Usually that shouldn't matter much, since both have small process sizes compared to OS threads or processes. But at millions of connections it starts to matter. WhatsApp, for example, was running FreeBSD and Erlang with 2M+ connections.
Go needs optional per-goroutine stack-size arguments and sensible global defaults, because recompiling the runtime to change it is ludicrous.
Erlang is an OK language with a great runtime. Elixir is a much higher-level language and hence more productive for teams that already know Erlang but want to write less code and build faster. Also, static type checking isn't used much in Erlang, which is a big loss unless you use Dialyzer. Advantage Go for compiling and running test cases laser-blindingly fast. There is a native QuickCheck framework for Erlang, and I'm sure someone has written a Go one.
My understanding is that the second page of that 8 KB is allocated in a virtual memory sense, not a physical one. As the stack grows past 4 KB there will be a page fault and the second page will become backed. So the memory situation should be no worse than before this change was made...
The guy having issues in that thread was trying to handle major load on a machine with < 8 GB of memory. All my production boxes have 128-256 GB... I ran some with 16 and 32 GB for a long time with no issues.
All I see are a few perf graphs that show a 20% runtime reduction in a few cases and a > 50% reduction in one case. That gives me no insight whatsoever into what is going on under the hood.
Is it that for these dozen or so benchmarks, we end up using > 4 KB and < 8 KB of stack? So the extra 20% of time was just going into memory allocation?
P.S. Interesting that I got two snarky comments for asking a basic question about Go. Does not bode well.
How much minimum overhead (kernel and userland) per socket is there with a single process watching as many sockets as it reliably can via select/epoll/kqueue? Say a setup in C, for simplicity's sake. Would be an interesting experiment.
While I do opt for better performance, this change in particular was a bit shocking to me, since in reality it broke the compatibility promise. Sure, your stuff will compile fine, but it also meant things that worked fine before now died. I'm not ultimately against the change; I just feel other, more deserving changes were rejected on grounds this one indirectly violated...
No, everything that worked before still does (on 64-bit). A stack segment uses 8 KB of virtual address space, but only a page (4 KB) of physical memory if you don't need that much stack. The operating system maps the second page only when it faults.
If you need more stack, the point is moot because you used more physical memory before as well, it was just split into more stack segments.
I'll refrain from making a smartass comment and just state that I understand the difference perfectly. This causes a jump in physical memory usage. See my reply to your parent.
Interesting. Definitely important when I get around to finishing a mosquitto (MQTT) client/server framework. For anyone who isn't familiar, MQTT is used on a massive scale for telemetry data and log shipping. This makes it great for IM (Facebook Messenger uses it, IIRC).
How is any of that relevant? The discussion is about carrier grade messaging maximizing connection density per server, not connecting a watch to a pair of sunglasses.