So I read through this, and the conclusion was to not use fibers. However, most ...

pcwalton · on Oct 12, 2019

Goroutines and fibers are the same thing: M:N threading.

It is true that many of the fiber issues presented here are C++-specific. However, what a lot of the comments here are missing is that C++ issues have a way of becoming your issues whenever you use an FFI, even if you aren't using C++. Go's solution is generally to try to avoid using cgo as much as possible, because of these performance issues. That can work for the areas Go is generally used in today. But, as the article points out, that does not work for all applications. For example, I would not want to write graphics code in any system with M:N threading due to FFI cost, including Go.

jchw · on Oct 12, 2019

Please give an example from the document that applies to Goroutines. (The best I can see is the bits about issues with split stacks, but it was resolved.) I think my reading of the document holds up.

sseth · on Oct 12, 2019

Not only that, but the context-switching overhead numbers provided are not appropriate for Go, because Go's function calling convention assumes fewer registers are saved, which reduces context-switching overhead.

From: https://codeburst.io/why-goroutines-are-not-lightweight-thre...

"In Go, this means only 3 registers i.e. PC, SP and DX (Data Registers) being updated during context switch rather than all registers (e.g. AVX, Floating Point, MMX)"

Perhaps a better title would be "Fibers require compiler / language support to be viable"
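For anyone who wants to check the goroutine switch cost on their own machine instead of relying on quoted figures, here is a minimal sketch. It bounces a token between two goroutines over unbuffered channels, so each handoff forces the scheduler to switch goroutines. The function name `pingPong`, the round count, and pinning to one OS thread with `runtime.GOMAXPROCS(1)` are my own assumptions, and the measured time includes channel overhead, not just the register save and restore.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// pingPong (hypothetical helper) bounces a token between two goroutines
// over unbuffered channels and returns the number of handoffs completed.
// Every send blocks until the other goroutine receives, so each handoff
// makes the Go scheduler switch from one goroutine to the other.
func pingPong(rounds int) int {
	ping := make(chan struct{})
	pong := make(chan struct{})
	done := make(chan int)

	go func() {
		replies := 0
		for range ping { // exits when ping is closed
			replies++
			pong <- struct{}{}
		}
		done <- replies
	}()

	for i := 0; i < rounds; i++ {
		ping <- struct{}{} // handoff to the worker
		<-pong             // handoff back to us
	}
	close(ping)
	replies := <-done
	return rounds + replies // one handoff each way per round
}

func main() {
	// Assumption: a single P keeps both goroutines on one OS thread, so the
	// number reflects goroutine switches rather than cross-thread wakeups.
	runtime.GOMAXPROCS(1)

	const rounds = 1000000
	start := time.Now()
	handoffs := pingPong(rounds)
	elapsed := time.Since(start)
	fmt.Printf("%d handoffs, about %v each\n", handoffs, elapsed/time.Duration(handoffs))
}
```

The result varies by CPU and Go version, so treat it as a rough upper bound on a goroutine switch rather than a figure comparable to the paper's numbers.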

pcwalton · on Oct 12, 2019

Section 3.6, page 8, talks about the FFI overhead of Go.

jchw · on Oct 12, 2019

Sure, but if that is truly the only part of the document that contains reasoning to not use goroutines, I can’t imagine how one could read the conclusion as suggesting goroutines are unsuitable for scalable software. In fact, I’ve now worked at multiple companies doing exactly this in Go. With Docker it was often preferable to explicitly disable CGo. It would be abnormal in say, C#, to dock points because of C interop.

It’s also worth noting that FFI is not the only way to have Go and C++ interop. For many use cases a lightweight RPC layer between two apps will give better throughput, something that also is done in production to great effect.

pcwalton · on Oct 12, 2019

> It would be abnormal in say, C#, to dock points because of C interop.

Not really. WinForms is a lot of the reason for C#'s existence, and WinForms is just a wrapper around pinvoke'd Win32. You're crossing the boundary a lot.

> For many use cases a lightweight RPC layer between two apps will give better throughput, something that also is done in production to great effect.

I have a hard time believing that RPC can possibly be faster than cgo. You have the overhead of message serialization and deserialization, two message copies (into the kernel and out of the kernel), two context switches, and a trip through the OS scheduler.
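The costs listed above (serialization, copies into and out of the kernel) can be made concrete with a rough sketch. This is not a cgo benchmark: it compares a direct in-process call with the same call made as a gob-encoded request and response over two `os.Pipe` pairs. Everything here (`add`, `request`, `response`, `serve`, `newClient`) is hypothetical, and the "server" is a goroutine in the same process, so it understates the process context switch and OS scheduler costs a real RPC between two apps would pay.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"os"
	"time"
)

// add stands in for the function on the far side of the boundary (hypothetical).
func add(a, b int) int { return a + b }

type request struct{ A, B int }
type response struct{ Sum int }

// serve decodes requests from r, calls add, and encodes responses to w
// until the request pipe is closed.
func serve(r, w *os.File) {
	dec, enc := gob.NewDecoder(r), gob.NewEncoder(w)
	for {
		var req request
		if err := dec.Decode(&req); err != nil {
			return // request pipe closed or broken: stop serving
		}
		if err := enc.Encode(response{Sum: add(req.A, req.B)}); err != nil {
			return
		}
	}
}

// newClient wires up two kernel pipes and returns a call function that does
// one serialized round trip, plus a shutdown function.
func newClient() (func(a, b int) (int, error), func(), error) {
	reqR, reqW, err := os.Pipe()
	if err != nil {
		return nil, nil, err
	}
	respR, respW, err := os.Pipe()
	if err != nil {
		reqR.Close()
		reqW.Close()
		return nil, nil, err
	}
	go func() {
		serve(reqR, respW)
		reqR.Close()
		respW.Close()
	}()
	enc, dec := gob.NewEncoder(reqW), gob.NewDecoder(respR)
	call := func(a, b int) (int, error) {
		if err := enc.Encode(request{A: a, B: b}); err != nil {
			return 0, err
		}
		var resp response
		if err := dec.Decode(&resp); err != nil {
			return 0, err
		}
		return resp.Sum, nil
	}
	shutdown := func() {
		reqW.Close() // makes serve see EOF and exit
		respR.Close()
	}
	return call, shutdown, nil
}

func main() {
	call, shutdown, err := newClient()
	if err != nil {
		fmt.Println("pipe setup failed:", err)
		return
	}
	defer shutdown()

	const n = 100000 // assumed iteration count
	sink := 0

	start := time.Now()
	for i := 0; i < n; i++ {
		sink += add(i, 1)
	}
	direct := time.Since(start)

	start = time.Now()
	for i := 0; i < n; i++ {
		sum, err := call(i, 1)
		if err != nil {
			fmt.Println("round trip failed:", err)
			return
		}
		sink += sum
	}
	viaPipe := time.Since(start)

	fmt.Printf("direct: %v/call, via pipe: %v/call (sink=%d)\n",
		direct/n, viaPipe/n, sink)
}
```

Whether the per-call gap matters depends on how much work each call does, which is the crux of the disagreement in this thread.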

jchw · on Oct 12, 2019

> Not really. WinForms is a lot of the reason for C#'s existence, and WinForms is just a wrapper around pinvoke'd Win32. You're crossing the boundary a lot.

That is an implementation detail. I also don’t know many who consider WinForms to be particularly high performance.

(Additional note: though I have not explicitly said it prior, I believe that PInvoke actually was quite slow for a long time, at least certainly during the WinForms era. For all I know, it might still be.)

> I have a hard time believing that RPC can possibly be faster than cgo.

I can’t find a solid reference, but the issue is that Cgo is simply not ideal for heavy applications. It makes scheduling slower. If you are doing expensive work in C++, such as phoning out to the network or decently heavy computation to the point where Cgo overhead is not the concern, then you are unlikely to have much issue with the cost of RPCs. If you are doing tiny amounts of work with no IO, one must wonder why you would not just port those bits to Go.

(Example of scheduler issue: https://github.com/golang/go/issues/19574)

I continue to contend that considering this a show stopper is unfair, or at least not very honest.

mwcampbell · on Oct 12, 2019

> Not really. WinForms is a lot of the reason for C#'s existence, and WinForms is just a wrapper around pinvoke'd Win32. You're crossing the boundary a lot.

I wonder if that's why WPF does so very much on the managed side. And I wonder if using UWP XAML from C# is less efficient than WPF in some scenarios because of this FFI overhead.

pjmlp · on Oct 12, 2019

According to the React Native for Windows team, it is hardly noticeable.

In their benchmarks comparing XAML/C++, XAML/C#, RN and Electron, it is only a few percent more than C++.

It is Electron that goes sky high in performance loss.

hinkley · on Oct 12, 2019

We have already forgotten that Sun spent a lot of time on M:N threading and abandoned it.

Linux seems to have gone the other direction, supporting thousands to tens of thousands of threads.

gpderetta · on Oct 12, 2019

For a small window of time, after everybody agreed that LinuxThreads needed rewriting, one of the most promising candidates was an M:N library (from IBM I think; I forgot the name). In the end NPTL won and the rest is history.

pcwalton · on Oct 12, 2019

It was called NGPT.

zzzcpan · on Oct 12, 2019

"Fibers (sometimes called stackful coroutines or user mode cooperatively scheduled threads) and stackless coroutines (compiler synthesized state machines) represent two distinct programming facilities with vast performance and functionality differences."

It's pretty clear that fibers cover goroutines, green threads, M:N threads, etc.; this is just the Microsoft-specific name for them.

jchw · on Oct 12, 2019

It does, yes. However, they still do not seem to be making the claim that goroutines are a bad idea. The name conflation is unfortunate, but almost all of the problems are C++-specific, and their conclusion fails to be precise about this.

pcwalton · on Oct 12, 2019

The paper states that fibers, including goroutines, are a bad idea. The part that is not specific to C++ is the FFI cost (cited as 160 ns in the paper).
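To put the 160 ns figure in the context of the graphics use case mentioned earlier in the thread, here is a back-of-the-envelope sketch. Only the 160 ns per-call cost comes from the comment above; the 60 frames per second target, the call counts, and the helper name `ffiBudgetShare` are assumptions for illustration. At 10,000 calls per frame, the overhead alone is 10,000 × 160 ns = 1.6 ms, or about 9.6% of a 16.67 ms frame.

```go
package main

import "fmt"

// ffiBudgetShare (hypothetical helper) returns the fraction of one frame
// consumed by per-call FFI overhead alone, ignoring the work done in each call.
func ffiBudgetShare(callsPerFrame int, overheadNs, frameNs float64) float64 {
	return float64(callsPerFrame) * overheadNs / frameNs
}

func main() {
	const overheadNs = 160.0 // per-call FFI cost cited from the paper
	const frameNs = 1e9 / 60 // assumed target: 60 frames per second

	// Assumed call counts; real graphics workloads vary widely.
	for _, calls := range []int{100, 1000, 10000} {
		share := ffiBudgetShare(calls, overheadNs, frameNs)
		fmt.Printf("%6d calls/frame: %.2f%% of the frame budget\n", calls, 100*share)
	}
}
```

At a hundred calls per frame the overhead is negligible; it only becomes a real cost when the boundary is crossed thousands of times per frame.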

jchw · on Oct 12, 2019

We already have a thread discussing this. If this is truly a problem, then I shall ask why C’s Go interop is so bad - if it were better, we’d have access to a much nicer TLS library!