I compiled to assembly in an experimental barebones compiler; it is easy to get started. If I could host GCC or LLVM in my process and skip the ASCII serialization and parsing, that would be an awesome API. This has given me an idea to blog about...
I'm surprised I couldn't find a library that stores opcodes for each architecture.
I have tried to decipher the Intel x86 manuals but I ended up relying on GCC to see what assembly is produced.
There was actually a very old and venerable tutorial on writing compilers in Haskell; it did not use ASCII serialization, but instead used Haskell's FFI to directly invoke LLVM functions to do everything. That's very neat!
> One potential solution would be to allow attaching a pragma to the `IsLine` and `IsDoc` classes to request that GHC aggressively specialize all definitions that use them.
I'd love to see this as an actual feature request. As the article says, C++ and Rust both already do this as part of the language definition, and I can see this being a straightforward change that makes abstractions more "zero-cost", so to speak (not including compile-time costs).
Not sure what those classes are, but that is called superspecialization and JHC can do it with a compiler flag. I thought GHC also could, in at least some situations, or if you use `{-# INLINE #-}`.
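For anyone unfamiliar with what per-definition specialization looks like in GHC today, here is a minimal sketch using the existing `SPECIALIZE` pragma. The `Emit` class, the `Plain` type and the `instr` function are made up for illustration and are not GHC's actual `IsLine`/`IsDoc` definitions; the quoted proposal is about requesting this for every definition that uses a class, rather than annotating each one by hand as below.

```haskell
module Main where

-- Hypothetical class standing in for something like IsLine;
-- this is NOT GHC's actual definition.
class Emit a where
  text  :: String -> a
  (<+>) :: a -> a -> a

-- A concrete instance that simply builds a plain string.
newtype Plain = Plain { unPlain :: String }

instance Emit Plain where
  text = Plain
  Plain a <+> Plain b = Plain (a ++ " " ++ b)

-- Class-polymorphic code: without specialization, every call to
-- text and (<+>) goes through a dictionary passed at runtime.
instr :: Emit a => String -> [String] -> a
instr op args = foldl (<+>) (text op) (map text args)
-- Ask GHC to also compile a copy of instr fixed at the Plain type.
{-# SPECIALIZE instr :: String -> [String] -> Plain #-}

main :: IO ()
main = putStrLn (unPlain (instr "mov" ["%rax,", "%rbx"]))
```

The pragma has to be repeated for each function and each type you care about, which is presumably why a class-level request looks attractive.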
I know this is maybe reductive, but I wonder if just skipping SDocs on compile and just recompiling with SDocs enabled on failure, to get a nice error message, would be a viable improvement.
SDoc is an abstraction for generating formatted text, including the assembly the compiler emits. The problem was not that the code generator was generating extra output, but that the text-formatting functionality it used was doing extra bookkeeping that made sense for human-readable text but was not necessary for machine-readable assembly.
Using the same abstraction for both seems a bit silly at first, but it helped the codebase stay more consistent and was already pretty well-optimized. The changes in the article optimized it further by splitting out some of the more expensive functionality while keeping the same general engineering advantages.
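To make the idea concrete, here is a rough sketch of one interface with a bookkeeping-heavy instance and a cheap one. All names here (`Line`, `Pretty`, `Raw`, `txt`, `cat`, `movq`) are invented for illustration, and the column tracking is only a stand-in for the kind of layout work a pretty-printer does; GHC's real classes and types are more involved.

```haskell
module Main where

-- Hypothetical, simplified interface; GHC's real IsLine/IsDoc
-- classes differ from this.
class Line a where
  txt :: String -> a
  cat :: a -> a -> a

-- Human-readable output: threads the current column through every
-- operation, standing in for pretty-printer layout bookkeeping.
newtype Pretty = Pretty (Int -> (String, Int))

instance Line Pretty where
  txt s = Pretty (\col -> (s, col + length s))
  cat (Pretty f) (Pretty g) = Pretty $ \col ->
    let (a, col')  = f col
        (b, col'') = g col'
    in (a ++ b, col'')

-- Machine-readable output: a difference list, no bookkeeping at all.
newtype Raw = Raw (String -> String)

instance Line Raw where
  txt s = Raw (s ++)
  cat (Raw f) (Raw g) = Raw (f . g)

renderPretty :: Pretty -> String
renderPretty (Pretty f) = fst (f 0)

renderRaw :: Raw -> String
renderRaw (Raw f) = f ""

-- Code-generator logic is written once against the class.
movq :: Line a => String -> String -> a
movq src dst = txt "\tmovq " `cat` txt src `cat` txt ", " `cat` txt dst

main :: IO ()
main = do
  putStrLn (renderPretty (movq "%rax" "%rbx"))
  putStrLn (renderRaw (movq "%rax" "%rbx"))
```

Both renderers produce the same text; the difference is only in how much work each instance does along the way, which is the part that benefits from specialization.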
Would any reader who did not already know the name of the standard Haskell compiler have been any more enlightened if it had said "Glasgow Haskell Compiler" in the first paragraph? As it stands, the first paragraph of the article merely strongly implies it's an optimising compiler for Haskell, moderately strongly implies that it's the compiler for Haskell, and links to the GHC GitLab. I assert that the only reason one might want the words "Glasgow Haskell Compiler" here is if you wanted to search for it unambiguously, but "GHC haskell" is already entirely unambiguous. It's not like the name is descriptive; you're not going to do anything differently because it was born at the University of Glasgow.
I had to open the article to understand what it was about. The initialism is ambiguous; I thought someone might be trying to shorten GitHub Copilot and was curious to see how the team behind it was optimizing its speed. I would have been much more enlightened.
If you’re attempting to share something with an audience not steeped in your jargon, it’s usually better practice to lead with the full name and shorten it later.
Haskell is not necessarily popular enough to consider its jargon common knowledge. As the headline doesn't mention Haskell, it's understandable that people might not have enough context to make a guess at this TLA.
One could use the less accurate title "Making Haskell faster at emitting code", I guess!
However, don't we sometimes have to accept that the title does not tell the full story? It's just half a sentence after all; it's not even a summary, it's the heading.
You make assumptions that your future interlocutors would associate the contraction "Glasgow Haskell Compiler" with the "Glorious Glasgow Haskell Compilation System" without any awkwardness on their behalf.