The Go Compiler Needs to Be Smarter (lemire.me)
159 points by hactually on June 5, 2020 | 117 comments



Sure, if your goal is to build a compiler that outputs the fastest possible code.

If your goal is to apply the "principle of least surprise" to the generated machine code, then that eliminates or constrains most interesting optimizations, including procedure inlining, dead code elision, and compile-time evaluation.

Go is a "principle of least surprise" language, based on a vision of simplicity for C that hasn't existed among actual C implementations (outside of maybe Plan 9) for decades now. It's yet more suckless idiocy: sacrifice everything, including utility and friendliness to end users, to buy "simplicity" for hackers. No wonder the suckless folks love Go so much.


> Go is a "principle of least surprise" language, based on a vision of simplicity for C that hasn't existed among actual C implementations (outside of maybe Plan 9) for decades now. It's yet more suckless idiocy

Have you considered that this Go philosophy might not be “idiocy” but a desirable quality in some use cases?

It seems quite inappropriate to bash Go for realising a vision that serves such a use case, assuming in turn that no such use case exists, just as it is inappropriate to vilify Ruby or Python for having a very dynamic object model and method calling/message passing, which results in non-optimisable call paths because they have to be resolved every single time.

Go’s consistency, reliability, and ease of debugging and exploring are definitely really good features that I gladly trade a few microseconds for.


> Ruby or Python for having a very dynamic object model and method calling/message passing, which results in non-optimisable call paths because they have to be resolved every single time.

The Ruby implementations don't really do that. They use "inline caches" so this lookup normally only happens once per call-site.
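
Roughly, the mechanism looks like this (a conceptual sketch in Go of a monomorphic inline cache, for illustration only; real VMs patch machine code rather than using maps and reflection):

    package main

    import (
        "fmt"
        "reflect"
    )

    // Stand-in for the VM's slow method lookup (class hierarchy walk).
    var methods = map[reflect.Type]func(interface{}) string{
        reflect.TypeOf(0):  func(v interface{}) string { return fmt.Sprintf("int %d", v) },
        reflect.TypeOf(""): func(v interface{}) string { return fmt.Sprintf("string %q", v) },
    }

    // One cache per call site: remembers the last receiver type
    // and the method it resolved to.
    type inlineCache struct {
        typ reflect.Type
        fn  func(interface{}) string
    }

    func (c *inlineCache) call(recv interface{}) string {
        t := reflect.TypeOf(recv)
        if t != c.typ { // miss: do the slow lookup, then memoize
            c.typ, c.fn = t, methods[t]
        }
        return c.fn(recv) // hit: dispatch without any lookup
    }

    func main() {
        var site inlineCache
        fmt.Println(site.call(42))   // miss: resolves and caches for int
        fmt.Println(site.call(7))    // hit: same type, no lookup
        fmt.Println(site.call("hi")) // a second type evicts the cache (thrashing risk)
    }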


If there's only one backing type being looked up. If there's more than one, the inline cache thrashes, like they're saying.


JRuby has had polymorphic inline caches for like a decade and CRuby will actually cache multiple types in the method cache as long as they cache the same method since https://github.com/ruby/ruby/pull/2583


JRuby is JVM and not really what I'm talking about.

And still, on CRuby, if you have any dynamism, you thrash your cache.


Well, but that's not what's at play here.

Inlining basic functions shouldn't be surprising to anyone since the 80's? 90's? Unless you compile with -O0 or have a good reason not to inline. Compilers are usually good at it.

Same with the runtime checking every time whether a given processor flag exists. What I suppose most runtimes do is replace a function pointer, or have a function shim that dispatches dynamically to the correct implementation, rather than repeatedly calling a function to check whether a processor feature exists.
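
In Go, that shim could look something like this (a sketch assuming the golang.org/x/sys/cpu module; the fallback implementation is a stand-in):

    package main

    import (
        "fmt"
        "math/bits"

        "golang.org/x/sys/cpu"
    )

    // Chosen once at package init; callers pay one indirect call,
    // not a repeated feature-flag test.
    var popcount = choosePopcount()

    func choosePopcount() func(uint64) int {
        if cpu.X86.HasPOPCNT {
            return bits.OnesCount64 // standard library popcount
        }
        return softwarePopcount
    }

    // Portable fallback: clear the lowest set bit until zero.
    func softwarePopcount(x uint64) int {
        n := 0
        for x != 0 {
            x &= x - 1
            n++
        }
        return n
    }

    func main() {
        fmt.Println(popcount(0xFF)) // 8
    }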


> Go is more about software engineering than programming language research. Or to rephrase, it is about language design in the service of software engineering. [1]

This isn't suckless philosophy applied for the sole benefit of compiler writers.

It's a compiler whose primary customers explicitly include tool developers.

[1] https://talks.golang.org/2012/splash.article


Suckless promotes C99 and disregards anything with a GC.

https://suckless.org/coding_style/


> All variable declarations at top of block.

And they say they want their C code to suck less?

Why voluntarily open the door to read-before-write undefined behaviour? The lifetime of a local should be as short as it reasonably can be.


My take on this:

pro:

- Storage concerns are separated and grouped at the top.

- Logic code is more compact, helping readability.

- Stack use can be determined at a glance.

- Looking up a variable's type is faster in cases where a variable is used throughout the code block. This is good to optimize, as you constantly need to look up variables when reading C code.

- Reflects that the lifetime of any stack-storage object is the same regardless of where it is declared in the code block.

cons:

- For long functions, the distance between use and declaration can impair looking up a var's type (having to scroll to get to the top of your function is a sign you need to break it up, though).

- In many cases, a variable is only used in one location in the block and has a meaningful static initialization. Declaring it locally in this case improves readability, as it does not have to be looked up.

- "-Wuninitialized" exists, but passing pointer and defining 'default' value style can still cause read-before-write.

I tend toward an in-between style, where code is grouped where it makes sense with group-local variables declared in front. Variables that are used throughout the entire code block body are declared at the very top.


> Storage concerns are separated and grouped at the top.

That's rather circular. That's what defines the style, it's not in itself an advantage or a disadvantage.

> Logic code is more compact, helping readability

It's slightly more compact, but it harms readability, as I describe at [0]. Using the new style, especially with the const keyword, can make code much more readable.

> Stack use can be determined at a glance.

It can't, that's up to the compiler. Perhaps all locals will be enregistered. Even if it were true, it's extremely rare to need to do this.

> Looking up a variable's type is faster in cases where a variable is used throughout the code block.

In certain circumstances this may be true, but there may still be multiple levels you may have to check. For short-lived locals, it will be quicker to look up the type using the new style, as that keeps the declaration close to where it is used.

Using a modern IDE, it's a draw - just hover over a local and it will tell you its type.

> Reflects that the lifetime of any stack-storage object is the same regardless of where it is declared in the code block.

That doesn't sound right. That will be platform-dependent. I see no reason a compiler couldn't generate incremental pushes, rather than an upfront block allocation. Also, depending on what the compiler generates, different locals may end up occupying the same place in the stack (at different times), and a given local may be held in different places in the stack at different times. And that's if we're ignoring registers. No matter what we do, we can't rely on the generated assembly looking anything like the source code.

Beyond that, it's 'dishonest' about locals' real lifetimes, in a way that can be harmful. Compare:

    { // Old style
        int i;
        int j;

        i = j; // Undefined behaviour
        j = 7;  
    }
 
    { // New style
        int i = j; // Compile-time error. 'j' is not in scope.
        int j = 7;
    }
> "-Wuninitialized" exists, but passing pointer and defining 'default' value style can still cause read-before-write.

I'm afraid I don't get what you're saying here. Are you referring to passing a 'default' value to a function, say, in a way that might be dangerous?


Doesn't C99 warn you if you make such mistakes?


On some compilers (gcc and clang at least) only when the warning is enabled (e.g. with -Wall).

But I agree that it's a bit silly to recommend C99 on the one hand, and on the other hand not take advantage of C99 features. It seems like the rules were written against C89, and the "use unextended C99" only slipped in later.


But can't the compiler work that bit out? Source code is for humans. Put all the vars at the top to tell the humans "here's all the scratch space I'll be needing in this block" and then let the compiler observe "var x is only used twice, so let's optimize that lifetime..."


> Source code is for humans

The reason why making the scope as local as possible is important is for humans, not compilers.

> Put all the vars at the top to tell the humans "here's all the scratch space I'll be needing in this block"

Why would humans care about how much scratch space is needed? That's for the compiler to know.


We're not talking about scope. The scope is already decided. "Vars at the top of the block" doesn't mean "take those extra vars out of the while{} scope and elevate them to the next enclosing scope."

You've taken my "scratch space" too literally. Very few people need to count bytes for local vars. I'm talking about future maintainers reading and understanding code. Grouping the current block's variables at the top says nothing about how the compiler might organize the resulting code and storage. But it does inform future readers of the code.


The scope of a variable is from where it is declared to the end of the block. Moving a variable to the top of the enclosing block means that it can be referenced from more places in the code, which increases its scope.

Warnings about uninitialized variables help, but don't catch everything. For example, you don't usually get a warning for passing the address of an uninitialized variable to an external function (since it might be an output parameter), but that would be undefined behavior if the function expects the variable to be initialized. Initializing variables at the point where they are declared ensures that they can't be referenced at all in an uninitialized state.

Rust has a slightly more nuanced (and IMHO superior) system: non-mutable ("const") variables can be assigned exactly once, possibly but not necessarily at the point where they are declared, and all variables must be initialized before use, including passing references to other functions. This permits more flexibility in how the code is arranged while simultaneously offering stronger guarantees against undefined or otherwise erroneous behavior.


> Why would humans care about how much scratch space is needed?

In some contexts, it's important. For instance, each thread within the Linux kernel has a very limited fixed-size stack space (used to be 4K bytes, IIRC it's been increased to 8K and then 16K), which resides in physical memory (cannot be swapped out or lazily allocated). Avoiding large stack frames is necessary.


1. That's a rare use case. 2. There are ways of measuring that without hampering readability. I mean, are programmers supposed to add up all the variables' sizes in their head??


That wouldn't work anyway. You could rewrite code to use fewer variables without changing the resulting assembly.

    i = get_thing();
    j = do_stuff(i);
vs

    j = do_stuff(get_thing());


Also, unless your thread has only a single function that has local variables or arguments, and that function doesn’t contain sub-blocks that declare variables (say inside a while block), “all variable declarations at top of block” doesn’t help much in gauging the stack space usage of a thread.

Actually, not even that helps: you would also have to know how much stack a function call takes (might be non-trivial in the presence of stack alignment rules), and which functions get inlined. Edit: if you declare all your locals at the start of a function, chances are the compiler will check whether it can make some of them share memory, so you’d have to take that into account, too.

If you’re concerned about stack overflows in your threading code, tooling is what you need, not manually counting stack usage.


> if you declare all your locals at the start of a function, chances are the compiler will check whether it can make some of them share memory, so you’d have to take that into account, too.

And that's ignoring registers. Not every local ever needs to reside in memory.


> can't the compiler work that bit out

Technically no, not in all cases, due to the halting problem. In practical terms, read-before-write issues do happen in real C code, so it makes sense to take steps to avoid it. (Languages like Java force the programmer to write code where the compiler can guarantee the absence of read-before-write errors, sometimes just synthesising an assignment of zero, but it's still possible the programmer will assign a dummy value and accidentally end up using it.)

> Source code is for humans.

Yes, that's precisely my point. It's about making the code readable and easy for a programmer to reason about. It's unlikely there will be any performance impact either way; decent compilers should be good at lifetime-analysis and register-allocation.

It's more readable to declare a short-lived local on its first use. This makes its precise type more apparent, as you don't need to scroll up to its declaration. This is particularly important in C, where using the wrong type can have especially nasty consequences.

The new style also makes it immediately clear over what scope the variable is relevant, as the local does not exist in scope until it is declared and assigned. That is to say, it only exists when it should. I expand on this in my other comment in this thread.

Related to this, the new style helps prevent undefined behaviour by making it less likely you'll accidentally introduce a read-before-write. Again, those errors do happen in real production code. It's the kind of error static analysers pick up in long-trusted codebases.

The old style makes your code less dense, artificially increasing the number of lines in a function.

The new style also enables you to use const, which of course requires assignment at the point of declaration. If you use const with your locals, you do not have to scan the code to determine if the local is modified later on, you know at a glance that it will not be. This lets you reason about values, rather than the current state of a local. If you can access the local, you know it holds the right value. [0]

If it turns out the lifetime of a local needs to be broadened, you can move the declaration up to a broader scope, but in my experience this is surprisingly rare.

It's not exactly relevant, but in C++, with RAII, you don't really have a choice, and you pretty much must use the new style rather than the old-school C style. But that doesn't tell us much here. In a similar vein, Java and C# programmers could use the old-school declare-at-the-top style, but none of them ever do.

It's just a style that used to be necessary in old versions of the C standard, which people got accustomed to. For what it's worth, the Linux kernel seems to use both styles. [1] [2]

> here's all the scratch space I'll be needing in this block

For the reasons I've given above, I don't think this is a good way to approach locals. It makes sense to leverage scope and constness to improve readability, not to just introduce a free-form set of uninitialised locals with overly broad lifetimes. That approach opens the door to avoidable bugs, and needlessly burdens the reader with having to scan the code to determine basic properties of the locals (which they may then get wrong).

[0] https://www.infoq.com/presentations/Value-Values/

[1] https://github.com/torvalds/linux/blob/master/init/do_mounts...

[2] https://github.com/torvalds/linux/blob/master/kernel/sched/c...


I hope they at least use C11 atomics, or only write single-threaded single-process systems.


Indeed. My impression was that Go was meant not to "produce the quickest code" but to "produce the code the quickest". Compile speed is one huge use case.


For the popcnt example, I think that, if a compiler takes it upon itself to detect, at runtime, whether the CPU supports it, it also should make sure it does so reasonably efficiently. After all, the reason this was added must have been because it’s faster than a hand-coded version.

The alternative of doing neither would be fine too, and could better fit that “principle of least surprise”, but, of course, it would also leave more low-level work for Go programmers.


That seems like a rather unwarranted attempt at vilifying the language, and it's worded in an odd way: if something sucks, it's bad, so "suckless" would actually be positive?

But why? What's Go done to you? A few if err != nil { ... } statements too many?


Look up suckless. It has a specific meaning and history.


Indeed. suckless refers to https://suckless.org/philosophy/


Thanks. I had never heard of it.


Right, but OC still called it "suckless idiocy"...

Also not sure how Go sacrifices utility and friendliness to end-users...


Does it really, though? Go really isn't meant for number crunching, or any CPU-bound code, really.

I think it's just fine for a language to not strive to emit the fastest possible code. I don't mind having an ecosystem where build times are fast in return for less optimized code. We already have plenty of languages that do the opposite (Rust, C++, Haskell, ...), and I would personally hate to see Go's compilation/link times creep up that level.


Why am I using a language with pointers if it doesn't care about being fast?

Another problem with not having a smart compiler is the gigantic binaries it produces. For wasm, an officially supported target, this makes it a non-starter.


> Why am I using a language with pointers if it doesn't care about being fast?

Because, I assume, you want to differentiate between pass-by-value and pass-by-reference - and pointers are a familiar, simple (if not simplistic) way of expressing that.


> Why am I using a language with pointers if it doesn't care about being fast?

I think there's a big difference between 'I dont care about speed' and 'fast enough'. Go sits pretty firmly in the 'fast enough' category for a lot of people, but of course if you lose sight of speed concerns entirely it'll quickly slow down and then won't be fast enough for a lot of people.


I think Go does care about being fast, but only to a degree. Having the compiled binary being as fast as possible isn't actually one of the listed pain points that Go was designed to solve [0].

[0]: https://talks.golang.org/2012/splash.article#TOC_4.


Hmm, I disagree. Go is a language that should handle CPU-bound workloads; Go is not Python or Ruby, it's closer to C++ in that respect. It can do heavy computation; I think it's just a balance of compiler optimization vs compile time.


My take on this has long been that Go created a lot of confusion by initially selling itself as a better C++ when it is actually much closer to a better Python (outside of the data science use cases).

Edit to add: Or a better / easier to deploy / less reviled / hipper Java.

Go is a good and valuable language, but I think perhaps more than any language I know of, a lot of people go into it expecting it to be something that it is not.


Go is a Google project. You may disagree all you like but it's their project and hence their priorities.

I would have loved for Go not to copy in the "null mistake", to have generics available to all and to have proper sum types built in. But it does not and Google is not keen on fixing this. So I look elsewhere: Rust, Kotlin, Reason/OCaml, ...


I write Go and a bit of Rust. I don't think compile time would be a huge problem, because:

1. There are ways to reduce it, for example incremental compilation: only recompile what's needed.

2. Many programmers already use CI to test/deploy/release their projects. And CI itself takes a long time to initialize and execute, which basically throws away almost all the benefit of a fast compiler.

3. Let's not forget the compiler could have different compile modes (Debug/Release), if Go wants to introduce them, of course.
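
For what it's worth, the rough equivalents already exist as flags (the flags are real; the Debug/Release naming is hypothetical):

    go build -gcflags="all=-N -l" .   # "debug": -N disables optimizations, -l disables inlining
    go build .                        # "release": the default optimizing build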


I find it hard to fathom that many programming languages still don't use partial evaluation. That a popular language avoids inlining is even beyond that.

Inlining is hard to do in a way that ensures you always produce optimal code, but simple conservative inlining is probably among the simplest and most effective optimisations you can do.

Edit: one of my favourite quotes is about the hardships of the heuristics of inliners: "I don't know how many of you have tried to write a good inliner. It is like barely being able to restrain a rabid dog on a leash".

I probably mangled that, but originally it is a quote by Andy Wingo when talking about the improved optimisations of guile 2.2.


Go does inlining and their inlining heuristic has been tweaked over time.


The example in the article is a typical function where I would expect inlining. Overly shy seems almost like an understatement.


You can see the reason actually:

  go build -gcflags="-m -m" .
  # _/tmp/inline
  ./main.go:7:6: cannot inline sum: unhandled op RANGE
  ./main.go:15:6: can inline fun as: func() uint64 { x := []uint64 literal; return sum(x) }
  ./main.go:3:6: can inline main as: func() { _ = fun() }
  ./main.go:4:9: inlining call to fun func() uint64 { x := []uint64 literal; return sum(x) }
  ./main.go:7:10: sum x does not escape
  ./main.go:4:9: main []uint64 literal does not escape
  ./main.go:16:15: fun []uint64 literal does not escape
  package main

  func main() {
      _ = fun()
  }

  func sum(x []uint64) uint64 {
      var sum = uint64(0)
      for _, v := range x {
          sum += v
      }
      return sum
  }

  func fun() uint64 {
      x := []uint64{10, 20, 30}
      return sum(x)
  }

Go does not inline functions that contain a range statement.


> Go does not inline functions that contain a range statement.

do you know why?

[not super familiar with Go, just curious]


I'm not sure; I think it's just a current limitation: https://github.com/golang/go/wiki/CompilerOptimizations#func...

They do improve the compiler every release, so we might see some of those operations get inlined in the future.

More information there: https://github.com/golang/go/blob/master/src/cmd/compile/int...


I may have missed it, but this discussion seems to be lacking a key motivation of Go. That is, the Go devs have preferred a fast compiler over fast code output.

I have a hard time imagining them adding a bunch of optimizations around code generation when those optimizations will almost certainly:

* Slow down compile times

* Increase compiler complexity

* Be worse than what something like LLVM would produce.

Maybe it makes sense to integrate a "release" build which targets LLVM? IDK. But I do know that so long as compilation speed is a major goal for the language, you simply won't see such optimizations being seriously considered.


Did you catch the note at the end about how gccgo is slower? If so how does pivoting to LLVM solve the issues seen there?


You can read about the why here.

https://stackoverflow.com/questions/25811445/what-are-the-pr...

The short answer is that there are optimizations that gc (the default Go compiler) is doing which gccgo is not doing.

The long answer is that gc will do escape analysis, which can avoid allocating on the heap; gccgo doesn't do that.

Why might LLVM be better suited? Primarily because the framework also supports JIT and has efforts from the likes of Azul to make their JVM faster. Most optimizations that would benefit Java will benefit Go.

https://llvm.org/devmtg/2017-10/slides/Reames-FalconKeynote....


Go has mostly been picking up easy wins so far. I haven't seen any attempt to go further than that. Which is a fair strategy for the first few years, since everything comes with a cost. But people now obviously expect to see more, especially the more it's compared to more sophisticated languages and compilers.


> But people now obviously expect to see more [..]

I don't. I mean, yay for improvements, but it's already working great for my usecases.


What are your use cases, if you don't mind me asking? I am considering replacing our HTTP-heavy processes currently written in C++ with Go. However, after reading this thread I'm not so sure. Compilation time isn't that big of an issue for us, but having a simpler way of doing networking would be a win. I can't tell if that ease of use would be trumped by poor performance.


It isn't clear to me you're looking at a clear win there. If you do want to try it out, I recommend the strangler pattern: https://docs.microsoft.com/en-us/azure/architecture/patterns... With HTTP it's really easy to rewrite one URL at a time. You just need some way to proxy things around, nginx if nothing else, and then you can swap things as you go along, so you can pick and choose which URLs to test the Go implementation on.

My rule of thumb for Go performance is that it's roughly 2-3x slower than C/C++. While human loss aversion is probably kicking in and making that sound horrible, from an engineering perspective it's likely you'll not notice it, speaking broadly from my position of ignorance. However, if you do have your code deployed to places that are routinely running the CPU at 50%+ all the time in your C++ code (as opposed to DB wait or whatever), and you are not interested in investing in more hardware, I wouldn't even consider switching to Go.
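
Going back to the strangler setup, a sketch of that proxy shim in Go itself (the ports and the migrated URL are made up):

    package main

    import (
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        oldCpp, _ := url.Parse("http://127.0.0.1:8081") // legacy C++ service
        newGo, _ := url.Parse("http://127.0.0.1:8082")  // new Go service

        toOld := httputil.NewSingleHostReverseProxy(oldCpp)
        toNew := httputil.NewSingleHostReverseProxy(newGo)

        mux := http.NewServeMux()
        mux.Handle("/", toOld)           // everything still goes to C++...
        mux.Handle("/api/things", toNew) // ...except the URL being migrated
        http.ListenAndServe(":8080", mux)
    }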


It's mainly binary to JSON conversion, and firing that out over HTTP a few thousand times a second. That's the only IO. Goroutines look very interesting, and as I said, ASIO (C++ async networking) is a real pain to work with. But there are latency requirements here. Something that previously took 1ms cannot now take 10ms.
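
For concreteness, the Go shape of that hot path would be something like this sketch (the Record type is hypothetical):

    package main

    import (
        "encoding/json"
        "net/http"
    )

    // Record is a stand-in for whatever the binary format decodes to.
    type Record struct {
        ID    uint64  `json:"id"`
        Value float64 `json:"value"`
    }

    func handler(w http.ResponseWriter, r *http.Request) {
        rec := Record{ID: 1, Value: 4.2} // decode the binary input here
        w.Header().Set("Content-Type", "application/json")
        json.NewEncoder(w).Encode(rec)
    }

    func main() {
        http.HandleFunc("/records", handler)
        http.ListenAndServe(":8080", nil)
    }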


Here's the classic story about replacing a C++ HTTP server with a "slower" Go one: https://talks.golang.org/2013/oscon-dl.slide

Mind you, in 2013 the Go compiler was at go1.1, and had practically no optimizers at all.


Pretty new to the language myself, but it might be worth it to just dive in yourself and see. I imagine getting a proof of concept wouldn't take too long given how easy Go is to learn, the great built-in benchmarking, and the (from what I've heard) exceptional net/http library.


Same here. Go is now my daily driver and main language, and I'm perfectly satisfied with it. Quite the opposite of expecting more, I'm very fearful of what may change or be added in go-2 (which is actually a reason why I'm currently learning C, just in case).


Personally, if Go takes a bad turn, I'm likely to just go back to JVM languages.

Not Scala though, they went overboard with that one.


Disclaimer: I'm a Go fanboy so I'm biased.

> if Go takes a bad turn

The conservative nature of the Go team seems to recognize this worry. It's been bashed from the beginning for no generics but they are working on it. They just haven't found something that "feels right".

Of course there's just no pleasing all the people all the time.


The language should remain unsophisticated; it's meant for coding at scale, and for reading at an even larger scale. Sophistication is not a target, and expressing logic in a more compact form will mainly make it harder to read and understand. For the kinds of things they write with it, code that still has to be understood 30+ years down the line, that readability is super important.


A frustrating thing about inlining in Go is that there's a fairly arbitrary cost model and you sometimes end up fighting it (look up George Tankersley's talks on YouTube). There have been tons of discussions about whether or not it should be user-configurable, whether inlining hints should happen at the call sites or the function definitions, etc.

It's quite a nice discussion that helps people understand the toolchain. It also goes to show that while a self-hosted toolchain is nice, you forgo decades of gcc/llvm optimisations. https://github.com/golang/go/issues/17566
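
For illustration, the only user-facing inlining control today points the other way; decisions can be inspected but not forced (a minimal sketch; the compiler output in the comments is approximate):

    package main

    func add(a, b int) int { return a + b } // cheap enough to be inlined

    //go:noinline
    func addOpaque(a, b int) int { return a + b } // kept out-of-line by the directive

    func main() {
        println(add(1, 2), addOpaque(3, 4))
    }

    // $ go build -gcflags="-m" .
    // Expect output roughly like:
    //   ./main.go:3:6: can inline add
    //   ./main.go:6:6: cannot inline addOpaque: marked go:noinline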


Those optimizations could be taken advantage of via gccgo.

Also many other languages keep having backend issues, because those optimizations are too focused on C and C++ code semantics.


> Also many other languages keep having backend issues, because those optimizations are too focused on C and C++ code semantics.

Attribute it to LLVM monoculture.


Attribute it to benchmarks people care about being written in C/C++ (with maybe a dash of Fortran). e.g., SPEC CPU.


"I do not mind that Go lacks exceptions or generics. These features are often overrated."

I agree 100%. I've used them both in C++. I hope Go never gets them.


C++ doesn't have generics. It has templates, which are essentially a macro system: there is no type checking of the template definition itself, and a new instance of the code is generated at the source/AST level for every combination of template parameters. If that is your only experience with "generics" then it's no wonder that you have a negative opinion of them, but that is no reason to oppose the implementation of proper generics in other languages. Java has true generics with interfaces and such, but they were retrofitted into a language with a less expressive type system, so there are gaps in the implementation. A better example would be traits in Rust, or typeclasses in Haskell.


This is going to sound a bit like "no true generics", but if you think of C++ templates as macros rather than generics, you might be more open to seeing how other languages do it in a way that encourages better engineering.


Generics I would certainly be ecstatic about, if they were done in a way that is novel and feels Go-like, so not Java's or C#'s or C++'s implementations. This has certainly been the stance of the Go team since the beginning of time. It has never been "anti-generics"; it's always been "we've studied all the generics implementations out there, and didn't feel like any one of them was good enough, so rather than shoehorn them in, we are being patient." This approach needs to be celebrated more.

Exceptions, OTOH, I hope never see the light of day in Go. Curse them.


What's wrong with the way that Java, C#, and C++ handle generics? I see this complaint fairly regularly, but I was never sure why. Is it how the compiler handles it, or how the language defines it, that is the reason for this dislike? Genuine curiosity.


This is the lazy answer, I know, but I wanted to at least answer your genuine curiosity: there is ample research available online on the subject that explains it in depth and far better than I ever could. However, I will leave you with a few quotes that help at least paint the picture.

From the Golang FAQ:

> Generics are convenient but they come at a cost in complexity in the type system and run-time. We haven’t yet found a design that gives value proportionate to the complexity

Quote from Russ Cox:

> The generic dilemma is this: do you want slow programmers, slow compilers and bloated binaries, or slow execution times?

Finally, one of the reasons the C++ or Java approach has never been palatable to Go core devs is summarized from this Rob Pike quote from his famous "Less is Exponentially More" blog post:

> If C++ and Java are about type hierarchies and the taxonomy of types, Go is about composition

So it really is also a matter of finding a generics approach that lives up to that spirit. Contracts, I think, are approaching that.

All that said, user-defined generics (since technically speaking, Go has many generic capabilities today already) are coming to Go; they just aren't being rushed. I think we will be happy with the generics implementation in Go within the next year.


The optimization I care about most is vectorization. A lot of ML and data science basically boils down to numerical linear algebra and optimization; vectorization is the key optimization for these kinds of problems. I bet if Go got a better autovectorizer it would compare more favorably against languages like Rust and C++ for numerical tasks. The trade-off, of course, is longer compile times, which I suspect will be unacceptable to the core Go team.
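
For the curious, the canonical case is a plain loop like this dot product; as far as I know gc compiles it to scalar code today, while gcc/LLVM-based compilers will typically emit SIMD for it (a minimal sketch):

    package main

    import "fmt"

    // dot is the textbook autovectorization target: independent
    // multiply-adds over contiguous slices.
    func dot(a, b []float64) float64 {
        var s float64
        for i := range a {
            s += a[i] * b[i]
        }
        return s
    }

    func main() {
        a := []float64{1, 2, 3}
        b := []float64{4, 5, 6}
        fmt.Println(dot(a, b)) // 32
    }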


I like Go. I use it a lot. I've been called a Go shill on HN.

And I seriously do not understand the appeal of trying to use Go for heavily-vectorized code. The ASM support is sufficient for encryption kernels and some lightweight additions like bigints, but if you're writing the kind of code where you want the compiler to autovectorize a lot of math operations, why would you want to write that in Go at all? Why start from scratch in a language with designers that may not actively hate your use case, but certainly don't care much about it, when you can work in a language+runtime+library environment that deeply cares about that use case?

If I for some reason had a task like this handed to me at work, I'd drop Go in a heartbeat and go get something reasonable for the task.

Go will never be good for numeric or mathematical tasks. Those things involve richer type systems than Go has, or weaker type systems (dynamic types); Go's type system is arguably pessimal for this use case, being both too strong and too weak at the same time. Even if Go grows generics, they are very likely to still not be what a numeric system is looking for. Rich numeric types put a lot of strain on type systems (rather than being the easiest case which I think some people's intuitions might suggest, they are the worst case), and Go's type system is simply never going to be complexified enough for what you'd want in a numeric/mathematical system because it would be directly at odds with their stated use cases.

It isn't even that Go is kind of weak on this one point, but if you fixed it, you'd have a great numerical language... the whole language is shot through with little things here and there that work against numeric code, the explicitly-stated desire for a fast compiler that doesn't do much optimization probably being the most notable other problem. Making Go an even decent numerical language would pretty much involve a full fork of it.


Even Java and .NET do better on auto vectorization.

Intel has recently published an article on using Go, their advice?

Manually use cgo or Go Assembly for the AVX.


Apples-to-oranges comparison. Java and .NET have different goals, around two decades of optimization, and a lot more money thrown at them.

Also, hi again pjmlp! Another Go thread, another nonconstructive Go bashing coming from you. From your other comments here I see you got pretty combative this time.


Please don't ever start or feed flamewars like this on HN again. We ban accounts that do that, and what you did here was egregious.

https://news.ycombinator.com/newsguidelines.html


I assume full responsibility. I started it.

Regardless of pjmlp's behavior in Go threads, it doesn't justify me lowering the bar like this.

You won't see this kind of content coming from me in the future.


Stating facts is hardly bashing.


Vague statements without sources are not facts.

And sources aside, comparing Go to Java and .NET doesn't even make sense in this context.



Intel doesn't directly advocate for any of the three methods. It simply states that handcrafted assembly code is faster, which is always the case anyway.

So everybody can see how dishonest pjmlp is towards Go, here is their interpretation of Intel's conclusion:

> Intel has recently published an article on using Go, their advice? Manually use cgo or Go Assembly for the AVX.

And here is the actual Intel conclusion from the link:

> The examples show how to use Intel AVX-512 with Go to improve application performance using three different methods: direct access with Go assembly, using the Go cgo interface with intrinsics for Intel AVX-512, and via indirect access with 3rd party libraries such as Gonum.

> Each method showed that using Intel AVX-512 improved Go application performance, with shorter execution times and improved data throughput overall. Using Go assembly with direct access to the CPU instruction set was faster than indirect access using cgo or Gonum.

> Clear Linux OS makes it easy to use Intel AVX-512 in Go because Clear Linux OS provides a deeply optimized software stack, including Intel AVX-512 enabled software.

To say pjmlp's interpretation was uncharitable is an understatement.


That's understandable. There's a large cost to doing cgo calls. Not to mention you get into a weird situation where the cgo calls take too long to execute and the Go scheduler/pacer can make things even worse by suspending execution.


True. These are valid concerns one should keep in mind when using cgo.


Please enlighten us as to where Intel describes how to do auto-vectorization with Go, because writing Go assembly, using cgo to write C intrinsics by hand, or linking to a hand-optimized third-party library doesn't look very automatic to me.

Like I said, play with facts if that makes you feel good.


Who said it needs to support vectorization natively? It's not what Go was created for, so it's fine for it to exist as a library or assembly code.

Go has stricter goals than Java or .NET. So comparing them favourably against Go is like saying a Corolla sucks because it lacks features present in a heavy-duty truck.

Again, Intel's article doesn't advocate for any method. It just presents results. Anyone can read the article and conclude for themselves that you're extrapolating to the worst interpretation possible, as usual.


Wait until a webasm thread comes around again and you'll see a real rabbit hole of extreme conflation, false claims, bad faith arguments and willful ignoring of facts and evidence.


Glad not to disappoint the accolades of WebAssembly knighthood.


Please don't do any more flamewars like this on HN ever again. What a train wreck.

https://news.ycombinator.com/newsguidelines.html


Well, it sucks for now, but Gorgonia and Gonum have a bunch of vectorized libraries for deep-learning-related use cases; they are "hand written" kernels (in actual fact generated by gcc -O3).

This kernels approach makes it easy to upgrade the code. Check it out: https://github.com/gorgonia/vecf64


Given how brittle autovectorizers can be (small changes to the code preventing the heuristic from firing), that is pretty much the only reliably fast way to write such a library currently, and perhaps indefinitely.


This [0] PR should fix some of it, is my understanding, but we'll have to wait and see -- I'm not playing with /tip!

[0] https://github.com/golang/go/commit/fff7509d472778cae5e652db...


Another pain point in Go I vaguely recall is that functions written in assembly are never inlined, so you always pay some penalty for using such functions (this may have changed in a more recent version of Go).


No benchmarks or any other numbers. I wonder how much faster some typical Go code (some CLIs, Hugo or webservices) will run? Around 1%, 5% or 10%?

The Go compiler doesn't need to be smarter for Go's use case. If it can be made smarter, fine. For the best possible performance, look at C/C++ or Rust.
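
That said, anyone curious can measure a concrete case themselves with the built-in tooling, e.g. for a sum loop like the article's (a sketch; goes in a _test.go file, and the sink guards against the call being optimized away):

    package main

    import "testing"

    var sink uint64 // keeps the result live

    func sum(x []uint64) uint64 {
        var s uint64
        for _, v := range x {
            s += v
        }
        return s
    }

    func BenchmarkSum(b *testing.B) {
        x := make([]uint64, 4096)
        for i := range x {
            x[i] = uint64(i)
        }
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            sink = sum(x)
        }
    }

    // Run with: go test -bench=Sum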


"No benchmarks or any other numbers."

And yet plenty of logical argument. The author assumes a level of optimization experience on the part of the reader, for example that the reader appreciates the physical cost of branch mispredictions.


It's too bad Go has poor performance calling C functions; if it didn't, I think most issues with the language could be worked around.
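
For scale, even a trivial C call has to go through the cgo runtime machinery; a minimal sketch of what's being paid for:

    package main

    /*
    static int add(int a, int b) { return a + b; }
    */
    import "C"

    import "fmt"

    func main() {
        // Every crossing of the Go/C boundary pays a fixed cgo toll,
        // so batch as much work per call as possible.
        fmt.Println(C.add(1, 2))
    }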


With its well-defined AST, I think Go can "easily" accommodate optimizations as described, but obviously at the expense of compilation time. And I gotta tell you that keeping compile times down is the more important optimization for me. However, I happily add automated scans for safety and correctness to my release candidates, and I would gladly pay for optimizations at that point. Being able to test rapidly with the knowledge that I can add an "-O3" flag just before release would be a perfect compromise for me.


Correct me if I'm wrong but comparing Go against Swift, C, C++, and Rust isn't really fair since one of Go's goals is compilation speed, in which it seems to shine against these other languages. From what I understand, you're going to trade performance against compilation speed, and Go is on the opposite side of these choices compared to the other languages.


Realistically, compilation speed is something every major language is going to try to improve upon to improve adoption.

I haven't looked at the internals of the Go compiler, but it seems a bit simplistic, perhaps in the interest of improving speed. For example, it stops outputting errors after a point when it could clearly go further, having demonstrated that its parser can recover. The error messages have lots of room for improvement, particularly compared to Rust.


I think one of the pain points that Go addresses is the compilation speed of large C++ projects at Google, and it's one of the reasons it was made in the first place. From what I know C++ and Rust are in the same ballpark in terms of compilation speed, while Go gives a noticeable improvement.


We could compare it with D, Ada, Object Pascal or Delphi, and it would lose on compilation speed, language features and quality of generated code.


Claims without proof are just that: claims.


Claims?!?

Anyone can get the compilers and try them for themselves (except for Ada, where the only free compiler is GNAT).

Language features are quite obvious; one just needs to look at a language reference manual.

The only "claim" here is being too lazy to accept facts and acknowledge that even Turbo Pascal for MS-DOS did this before Go.


> In less fanciful languages like C or C++, the programmer needs to check what the processor supports themselves.

I think by now most C++ compilers can target multiple combinations of feature flags and select a supported code path during program startup. The Intel compiler could do it years ago, but the resulting code was crippled on AMD CPUs.


It will be a whole lot smarter on June 8th at 7am Pacific: https://www.youtube.com/watch?v=Dq0WFigax_c


I'll believe it when I see it on https://golang.org/doc/devel/release.html

Until then, it is just yet another "we are working on it" like so many that have come up during the last 10 years.


It has Lambda Man and Griesemer on the case. I think this one will stick, because it was requested by the Go team, and there is precedent with Featherweight Java and the fact that Wadler designed the generics for Java. [1]

There are prototypes you can play with now.

https://blog.tempus-ex.com/generics-in-go-how-they-work-and-...

[1] http://homepages.inf.ed.ac.uk/wadler/


I skimmed over the paper once. I could not get all of it, but whatever I understood seemed nice. Hopefully it can be implemented for full (not a subset of) golang, and I hope it can be implemented soon. I started learning golang recently, and I already have a few places with `interface{}`. Having done a lot of Java and a little Rust, I appreciate the simplicity and quick compilation of golang, but I do find it much more verbose than both. If generics can be implemented nicely, it would be a huge boon to golang's usability.
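
The kind of place I mean (a minimal sketch): without generics, a reusable container falls back to interface{} plus type assertions at every use site.

    package main

    import "fmt"

    // Stack of anything: type safety is lost at the boundary.
    type Stack struct{ items []interface{} }

    func (s *Stack) Push(v interface{}) { s.items = append(s.items, v) }

    func (s *Stack) Pop() interface{} {
        v := s.items[len(s.items)-1]
        s.items = s.items[:len(s.items)-1]
        return v
    }

    func main() {
        var s Stack
        s.Push(42)
        n := s.Pop().(int) // caller must assert the type back
        fmt.Println(n + 1)
    }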


More verbose than Java? Is that even possible?


I honestly don't understand how anyone thinks Go is LESS verbose than Java. Go is full of manual iteration loops and doesn't have any form of pattern matching/switch expressions.


Well, I agree and disagree. Agree in the sense that many individual expressions are the same or more verbose in Go. Disagree in the sense that, at a project level, for roughly the same amount of functionality, idiomatic Java takes at least ~5 times the code and directories of idiomatic Go code.


Nice try Pike.


Not for end users of the language for a good while, right?


Scroll down to "run in webassembly" and you can try out the prototype.

https://blog.tempus-ex.com/generics-in-go-how-they-work-and-...


This is very interesting at a time when we're seeing what we can replace C++ with.

(yes, rusticles, I know rust exists)


I believe the correct term is 'rustaceans'.


I think Go is mostly used to replace Python and Java & co.


And yet, it's worse than both.



