If you have to handle that many connections, I would suggest looking at Erlang. Its default per-process heap is < 3 KB.
Usually that shouldn't matter much, since both have small process sizes compared to OS threads or processes. But at millions of connections it starts to matter. WhatsApp, for example, was running FreeBSD and Erlang with 2M+ connections.
Go needs optional per-goroutine stack-size arguments and sensible global defaults, because recompiling the runtime to change it is ludicrous.
Erlang is an OK language with a great runtime. Elixir is a much higher-level language and hence more productive for teams that already know Erlang but want to write less code and build faster. Also, static type checking isn't used much in Erlang, which is a big loss unless you use Dialyzer. Advantage Go for compiling and running test cases laser-blindingly fast. There is a native QuickCheck framework for Erlang, and I'm sure someone has written a Go one.
My understanding is that the second page of that 8 KB is allocated in a virtual memory sense, not a physical one. As the stack grows past 4 KB there will be a page fault and the second page will become backed. So the memory situation should be no worse than before this change was made...
The guy having issues in that thread was trying to handle major load on a machine with < 8 GB of memory. All my production boxes have 128-256 GB... I ran some with 16 and 32 GB for a long time with no issues.
All I see are a few perf graphs that show a 20% runtime reduction in a few cases and a > 50% reduction in one case. That gives me no insight whatsoever into what is going on under the hood.
Is it that for these dozen or so benchmarks, we end up using > 4 KB and < 8 KB of stack? So the extra 20% of time was just going into memory allocation?
P.S. Interesting that I got two snarky comments for asking a basic question about Go. Does not bode well.
How much minimum overhead (kernel and userland) per socket is there with a single process watching as many sockets as it reliably can via select/epoll/kqueue? Say a setup in C, for simplicity's sake. Would be an interesting experiment.
While I do opt for better performance, this change in particular was a bit shocking to me, since in reality it broke the compatibility promise. Sure, your stuff will compile fine, but it also meant things that worked fine before now died. I'm not ultimately against the change; I just feel other, more deserving changes were rejected on grounds this one indirectly violated...
No, everything that worked before still does (on 64-bit). A stack segment uses 8 KB of virtual address space, but only a page (4 KB) of physical memory if you don't need that much stack. The operating system maps the second page only when it faults.
If you need more stack, the point is moot because you used more physical memory before as well, it was just split into more stack segments.
I'll refrain from making a smartass comment and just state that I understand the difference perfectly. This causes a jump in physical memory usage. See my reply to your parent.
Interesting. Definitely important when I get around to finishing a mosquitto (MQTT) client/server framework. For anyone who isn't familiar, MQTT is used on a massive scale for telemetry data and log shipping. This makes it great for IM (Facebook Messenger uses it, IIRC).
How is any of that relevant? The discussion is about carrier grade messaging maximizing connection density per server, not connecting a watch to a pair of sunglasses.