
So I read through this, and the conclusion was to not use fibers. However, most of the reasoning seems to surround issues with things like TLS, allocators, and stack memory usage in C++. There is no explicit recommendation here to not use goroutines for scalable, concurrent software as far as I can tell, just to not use fibers.



Goroutines and fibers are the same thing: M:N threading.

It is true that many of the fiber issues presented here are C++-specific. However, what a lot of the comments here are missing is that C++ issues have a way of becoming your issues whenever you use an FFI, even if you aren't using C++. Go's solution is generally to try to avoid using cgo as much as possible, because of these performance issues. That can work for the areas Go is generally used in today. But, as the article points out, that does not work for all applications. For example, I would not want to write graphics code in any system with M:N threading due to FFI cost, including Go.


Please give an example from the document that applies to Goroutines. (The best I can see is the bits about issues with split stacks, but it was resolved.) I think my reading of the document holds up.


Not only that, but the context-switching overhead numbers provided do not apply to Go: Go's function calling convention assumes fewer registers need to be saved, which reduces context-switching overhead.

From: https://codeburst.io/why-goroutines-are-not-lightweight-thre...

"In Go, this means only 3 registers i.e. PC, SP and DX (Data Registers) being updated during context switch rather than all registers (e.g. AVX, Floating Point, MMX)"

Perhaps a better title would be "Fibers require compiler / language support to be viable"


Section 3.6, page 8, talks about the FFI overhead of Go.


Sure, but if that is truly the only part of the document that contains reasoning to not use goroutines, I can’t imagine how one could read the conclusion as suggesting goroutines are unsuitable for scalable software. In fact, I’ve now worked at multiple companies doing exactly this in Go. With Docker it was often preferable to explicitly disable Cgo. It would be abnormal in, say, C# to dock points because of C interop.

It’s also worth noting that FFI is not the only way to have Go and C++ interop. For many use cases a lightweight RPC layer between two apps will give better throughput, something that also is done in production to great effect.


> It would be abnormal in say, C#, to dock points because of C interop.

Not really. WinForms is a lot of the reason for C#'s existence, and WinForms is just a wrapper around pinvoke'd Win32. You're crossing the boundary a lot.

> For many use cases a lightweight RPC layer between two apps will give better throughput, something that also is done in production to great effect.

I have a hard time believing that RPC can possibly be faster than cgo. You have the overhead of message serialization and deserialization, two message copies (into the kernel and out of the kernel), two context switches, and a trip through the OS scheduler.


> Not really. WinForms is a lot of the reason for C#'s existence, and WinForms is just a wrapper around pinvoke'd Win32. You're crossing the boundary a lot.

That is an implementation detail. I also don’t know many who consider WinForms to be particularly high performance.

(Additional note: though I have not explicitly said it prior, I believe that PInvoke actually was quite slow for a long time, at least certainly during the WinForms era. For all I know, it might still be.)

> I have a hard time believing that RPC can possibly be faster than cgo.

I can’t find a solid reference, but the issue is that Cgo is simply not ideal for heavy applications. It makes scheduling slower. If you are doing expensive work in C++, such as phoning out to the network or decently heavy computation to the point where Cgo overhead is not the concern, then you are unlikely to have much issue with the cost of RPCs. If you are doing tiny amounts of work with no IO one must wonder why you would not just port those bits to Go.

(Example of scheduler issue: https://github.com/golang/go/issues/19574)

I continue to contend that considering this to be a show stopper is unfair, or at least not very honest.


> Not really. WinForms is a lot of the reason for C#'s existence, and WinForms is just a wrapper around pinvoke'd Win32. You're crossing the boundary a lot.

I wonder if that's why WPF does so very much on the managed side. And I wonder if using UWP XAML from C# is less efficient than WPF in some scenarios because of this FFI overhead.


According to the React Native for Windows team, it is hardly noticeable.

In their benchmarks comparing XAML/C++, XAML/C#, RN and Electron, the C# overhead is barely a few percent over C++.

It is Electron where the performance loss goes sky high.


We have already forgotten that Sun spent a lot of time on M:N threading and abandoned it.

Linux seems to have gone the other direction, supporting thousands to tens of thousands of threads.


For a small window of time, after everybody agreed that LinuxThreads needed rewriting, one of the most promising candidates was an M:N library (from IBM, I think; I forgot the name). In the end NPTL won and the rest is history.


It was called NGPT.


"Fibers (sometimes called stackful coroutines or user mode cooperatively scheduled threads) and stackless coroutines (compiler synthesized state machines) represent two distinct programming facilities with vast performance and functionality differences."

It's pretty clear that fibers cover goroutines, green threads, M:N threads, etc.; this is just a Microsoft-specific name for them.


It does, yes. However, they still do not seem to be making the claim that goroutines are a bad idea. The name conflation is unfortunate, but almost all of the problems are C++-specific, and their conclusion fails to be precise about this.


The paper states that fibers, including goroutines, are a bad idea. The part that is not specific to C++ is the FFI cost (cited as 160 ns in the paper).


We already have a thread discussing this. If this is truly a problem, then I shall ask why C’s Go interop is so bad - if it were better, we’d have access to a much nicer TLS library!



