I know the point of this article is to demonstrate how to solve a problem but the premise is a bad one and must not be confused with sound software architecture at all. In fact it's slightly painful reading it and I really wish people would stop writing articles like this.
The functional requirement here is to take some HTML, parse it and emit slack flavoured markdown.
Involving WASM / Rust / cross compiling stuff and fucking around with a build tool chain should not even be being discussed anywhere as a solution for this. I mean it's fine as a toy but the problem is that half the industry doesn't have any idea what's a good idea and what isn't from an engineering perspective when it comes to solving problems and this will be taken as gospel on how to solve the thing.
What we have here is a Rube Goldberg machine, not a cleanly solved engineering problem.
The article is tagged cursed for a reason. This is not intended to be a good solution, it is intended to be _a solution_ that works better than you'd expect while also explaining the Unix philosophy and going through my entire thought process into making it.
The best kinds of hacks are the ones that look ludicrous but are actually somewhat fine in practice.
It's very clearly a toy example to demonstrate the idea, and it does so well.
> What we have here is a Rube Goldberg machine, not a cleanly solved engineering problem.
And what _we_ have here is unfounded indignation over a perfectly fine way to solve a problem. People have been using linked libraries to re-use code across languages since forever, it's fine. This solution isn't very different, it just makes shipping easier.
As a parting comment, "sound software architecture" is not a decided upon principle which can be empirically determined, and if it was we'd all be out of a job.
> And what _we_ have here is unfounded indignation over a perfectly fine way to solve a problem
Absolutely no way is this a fine way to solve the problem. That is crazy talk.
1. It introduces additional toolchains into the solution when it is unnecessary.
2. It now means you need multiple language specialists to maintain it and associated communications and context switching.
3. More interfaces and integration means more fragility when it comes to debugging, problem solving as well as increasing the problem surface.
4. It massively increases the dependency stack which means there are now multiple sets of supply chain issues and patching required.
This makes no problems easier at all! It's even a bad last resort if that's all you have left to try.
Sound software architecture is very very well defined and this is definitely not it. I have seen entire companies burn by taking this approach to problem solving.
I'm really getting tired of solutions before problems and this is a fundamental example of it. Give us a real use case; don't manufacture a problem for it.
> 2. It now means you need multiple language specialists to maintain it and associated communications and context switching.
Sometimes you have that and you need to accept it. In this case the function was really trivial. What if it was a mega secret algorithm that needs Rust speed which you want to really write in Rust because XYZ? Or C++ I don't care. But not Go, which you want to use exclusively for microservices, for example.
> 3. More interfaces and integration means more fragility when it comes to debugging, problem solving as well as increasing the problem surface.
This is inevitable when you want to go that low level. Some companies use C++ for everything, some mix low/high level: what do you do in that case?
This article is about Go, but I could imagine something similar for Python: use something superfast under the hood, python for the high level part. It's really a common solution, which requires a bit of bindings left and right but it's worth it.
> 4. It massively increases the dependency stack which means there are now multiple sets of supply chain issues and patching required.
This is how software is done today. Maybe 30 years ago you would write everything in C and be happy with it. Today companies use at least 3 programming languages - at least that's my experience.
> This article is about Go, but I could imagine something similar for Python: use something superfast under the hood, python for the high level part. It's really a common solution, which requires a bit of bindings left and right but it's worth it.
That's not a bad idea really, the wasm/wasi option could be a nice intermediate step between "we can't optimize the python any further" and "write a native extension" (or god forbid use cffi or the cursed horror that is ctypes).
> Absolutely no way is this a fine way to solve the problem. That is crazy talk.
I disagree. I see this in a similar way I see Electron and actually you can make the same arguments against using Electron.
But guess what, Electron wins on practicality. It makes creating GUI apps much easier. That wins over any problems associated with the extra baggage of shipping a whole browser. It doesn't win it for everyone, but it does for the majority of people who just want to get shit done.
> People have been using linked libraries to re-use code across languages since forever, it's a fine way to solve problems.
People have been doing it forever, but people have also hated it forever. Twenty years ago you saw SWIG in a build, you'd know you were probably in for a bad time in an exciting new way.
I think SWIG had a different purpose. The author here generates and embeds WebAssembly in Go, to avoid building the same lib for multiple platforms (+ the bindings to call the low level c code). Maybe the tool wasn't good enough? Right now this is just WebAssembly which is proven to work on multiple platforms.
If the API is clear and documented, I don't see why this would be an issue, except for the fact it might be a little bit clunky.
It's not the first solution I would come up with, but the question would be: why not? Just because we're used to older and more traditional patterns, why not just embed WebAssembly for low level stuff in your code?
This appears to be a straw man. Nobody is trying to tell you to rewrite your software stack using this technique. The OP demonstrates a cool hack. The site we are currently occupying is a place for cool hacks. I don't see the problem. As far as hacks go, it's far from the most egregious that I've seen, and even suggests a few thoughtful lessons about the future of FFI beyond the C ABI.
> The functional requirement here is to take some HTML, parse it and emit slack flavoured markdown.
That is solved already and not what the article is about
The nonfunctional requirement is a self-contained binary that is not dependent on the machine's own libraries or any extra files. That's not just a mental exercise but a feature.
That is what the article is about.
The "proper" engineering solution might be very well "just rewrite that small part in Go", but this approach is nonetheless interesting.
> The nonfunctional requirement is a self-contained binary that is not dependent on the machine's own libraries or any extra files. That's not just a mental exercise but a feature.
Let me add more to this: speed.
You have Rust/WebAssembly in the mix and automatically you gain in speed as shown in the article.
I honestly don't understand the negative comments.
If you build a lib in C and then you use standard binding mechanisms, oh it's ok. In this case you leverage a great tool (cargo) to do the heavy lifting for you and bam you get a safe binary that gives you better performance and overall better tooling -> it's bad.... why?
I read the original version of the poster's comment and it was way more aggressive and more about "sound architecture". Maybe he/she can explain what exactly is wrong with this approach and what other approach should be taken instead?
> The "proper" engineering solution might be very well "just rewrite that small part in Go", but this approach is nonetheless interesting.
Why? In this case it's a trivial function. What if it's a strong algorithm that needs speed? Rust helps there, and it's faster than Go.
For a system architect or senior developer it's very interesting to know I might glue a (not so trivial) Rust codebase into my Go program so easily. For people actually working on Go, Rust, WASM, etc., these are also experiments to evaluate the ergonomics, performance, etc. of all the tooling. For someone who wants to learn how FFIs work, this is a great tutorial.
But I'm certain I will now get at least one mid-level dev or interview candidate actually try to do this, in a similarly trivial case. And I will have to explain yes you already wrote the whole build pipeline but no we're not going to maintain it, and yes the blog post says it's fast but it's not really that fast, and etc. etc. It's bad enough every time I have to hear "it's just one Python script, what could it cost?", but the more complex it gets the more energy it takes to genuinely convince (rather than merely authoritatively declare) people not to do it, and the limit there seems unbounded.
This is a toy example to demonstrate a framework to build with. But I don't agree that it's not a cleanly solved engineering problem. Instead of reinventing xer Rust code in Go, Xe has simply taken the functional Rust code and made Go run it in a cross-platform and portable manner. Minimal effort from the engineer has been expended to solve this and C was not involved for FFI definitions either.
I don't care for the eccentricities of the Rust community, but this is a good article that demonstrates an effective approach to invoke code from one language to another. The problem is a toy one that's not the focus here. The focus is a better means of doing interop
> Minimal effort from the engineer has been expended
Is the definition of "how do I get promoted in a company where I don't care if it survives the next five years", not "a cleanly solved engineering problem".
The alternative is rewriting the Go code in Rust or vice versa.
And if the rewrite goes the Rust->Go way, there is also the continual effort of porting any bugfixes from the upstream lib.
I dare say the more complex toolchain is the easier and less time-consuming part.
I wouldn't do it in this particular case (converting with mastodon powered html is probably simple enough) but it wouldn't be a terrible solution in some cases.
What's more interesting is if you use the same method for app plugins: now you can compile anything to WASM, and as long as it has the right hooks it can be used in your app as a plugin.
Depends on how well supported those stacks are. WASM is very well supported and likely to be getting tested/used/improved extensively as the years go.
I'd rather work on software that depends on three different tech stacks that are well understood and used by many, than software that depends on a single niche tech stack.
I'm not a "stick to a single stack for everything" kind of person, but here we're comparing Go or Rust, to Go + Rust + WASM. The first option is strictly and substantially less risky in this dimension.
I've dealt with pretty much everything from steaming nightmare creeping Cthulhu desktop applications right into back end fintech stuff written in the dark ages over the last 30 years. At no point have I found this solution being applied where it solved a problem. I have seen it applied many times where it created problems!
Author here. My article is not supposed to talk about a good idea. It's meant to bring a bad idea to the table and explain why it works. I designed the function in question with the understanding that it would fire once or twice per 10 minutes. This means that paying a cost like 3.1 megabytes per invocation is okay. If this was intended to run _constantly_ (such as if it was a core part of a run loop), that's different.
With the version of wazero I'm using right now, it's about 0.3 milliseconds to do a single call from Go to Rust and back in the best case on my hardware. I am told that a recent patch to wazero will improve this by a lot, so I'm going to try upgrading to that patch and see what difference it makes. I still think that it will be a bit slower than cgo, BUT the platform-independence and strict sandboxing make up for it in my book.
I think the article is absolutely clear about it but also I think people will (are! in this very thread) ignore that part and make enormous messes that will hurt real users and someone else will have to clean up.
I’m not sure how to fix this but I’m also tired of pretending it’s not a chronic problem.
There is constructive criticism and then there is this. Once you use phrases like the above and follow it up by pontificating about what you think the author's house looks like, you have crossed a line, by far.
I know that the use of cat isn't technically required, but I still build bash oneliners step-by-step and find starting with `cat foo` to be a helpful reminder of the format of the file.
"Carcinization" is not a reference to creating cancer as you might first think, it is instead "an example of convergent evolution in which a crustacean evolves into a crab-like form from a non-crab-like form"[0] - in this case because the Rust mascot is a crab.
Though in this case I think it should be called "the gopherisation of Rust programs" as it is going to extraordinary lengths to impose a Go norm (single binary) onto Rust.
Odd. When I read the article, I thought "creating cancer" is a pretty good description of the idea that one embeds a whole f*ing JavaScript runtime and accept a 10x performance decrease on runtime just so that you can avoid copying one .so file around.
> I thought "creating cancer" is a pretty good description of the idea that one embeds a whole f*ing JavaScript runtime
wasm is not a javascript runtime.
> and accept a 10x performance decrease on runtime
Unless SIMD is involved (which is indeed a major issue around wasm), the overhead of compiled wasm is generally observed below 50%, often closer to 15%.
Obviously if you're using a wasm interpreter all bets are off.
It might involve a performance penalty today, but in the years to come vendors will be investing in making it fast. And Wasm has a much smaller surface area to optimise than JavaScript does.
The overhead I’m talking about is compared to native code, not to javascript. I’m not sure why you’re bringing up javascript, it’s nowhere in the discussion except in ggp’s misconceptions.
That's why I mentioned js. There's nothing in principle to stop wasm being implemented in hardware. Then it would be native code. Maybe this can be done easily with an FPGA?
Nice article! I think this is an exciting approach to cross-platform support. I took a similar approach for Trealla Prolog's Go library: https://github.com/trealla-prolog/go
The biggest roadblock at the moment is that Go WASM libraries are not in a very good state, except wazero. My assumption is that wazero will be slower than the cranelift-optimized wasmtime, etc. but I have not seen any benchmarks anywhere.
My impressions of the Go WASM libraries:
wazero: great Go-friendly API, not sure about performance, works on all OS (what the author chose)
wasmer: provides bare-minimum input and output for WASI, fast (it's what I chose for trealla-go). Doesn't work on Windows without significant pain, doesn't static compile the wasmer libraries so distribution is a pain. Seems essentially abandoned as its main contributor left the company.
wasmtime: basically impossible to get input and output in any reasonable way (unable to set stdin or read stdout; it can only inherit the FDs from the host), but might finally get buffers for I/O soon.
wasmedge: haven't investigated this yet, but it seems like it solves many of the problems above, promising
If anyone knows of a benchmark between these, I'd love to see it. I'm especially interested in wazero's execution speed. I think I'll try and run Trealla against it.
edit: I should also mention that wazero avoids the cgo overhead which could be a significant optimization for small functions as in this article, even if the runtime is slower. My use case is a long lived interpreter which is quite different.
(wazero contributor) thanks for the analysis. certainly dependency pain is something we hear about, as well as lack of maintenance on alternatives. Some don't release at all or rarely, and many break api constantly. We also break api, but then again we aren't 1.0 yet.
wrt performance, we do watch it closely, but we've not done benchmarks vs non-default settings in other runtimes. It is the case that the tradeoff of avoiding dependencies means any optimizing JIT would have to be written to make long running things faster over time... no one has contributed this, yet, and more folks are looking at short lived despite it being the case interpreters are a great example of long lived.
Concretely, we do keep track of relative performance (literally each commit in fact), and it isn't uncommon to find PRs including perf before/after.
here are some pointers if interested in digging in! there are certainly some things we'd lose at.
Thank you Syrus, appreciate your work with Wasmer. Congrats on the 3.0 release and Windows support! I just fixed guregu/trealla on WAPM to work with the latest changes. I think WAPM is very cool and I hope more people start doing releases on it.
I feel bad that I can't get over the fact that a Mastodon message is called a "toot".
It just keeps reminding me of the old rhyme about beans, beans, the magical fruit, the more you eat, the more you toot. (Although in the English of most former colonies that didn't violently rebel, it's pronounced "fart").
I suppose that's the point, it's a nice meta joke punning on tweet, but implying it's all just farting in the wind.
> (Although in the English of most former colonies that didn't violently rebel, it's pronounced "fart")
Also true in American English. You have two things wrong:
1. The rhyme calls beans "the musical fruit".
2. "Toot" is relevant in that context, being a way you can describe blowing a horn.[1] But obviously, the primary reason for using the word "toot" there is that it rhymes with fruit, not that it's the normal word for farting.
[1] Or sometimes another instrument; there's a tongue twister that starts "A tutor who tooted the flute tried to tutor two tooters to toot."
That's all nice and well, but while "toot" can be both a verb and a noun describing the message that is published, similar to "tweet", "publish" is just a verb - so what do you call a Mastodon message now?
As you've already done, you can call it a "message", I call it a "post". We don't need every social media site to strictly require their own bespoke terminology merely for the sake of branding.
That’s only a modern day theory and feels like a somewhat retrofitted interpretation of the facts (from what I’ve seen when researching this myself). There are various spellings of the surname, since names often didn’t have a formal spelling in the early days of bookkeeping. And some of the spellings began with a ‘T’.
So the more likely scenario is the name was only picked because it sounded similar to their original German family name while being an English-styled spelling.
Given we are talking 30+ years ago though, the best we can do at this stage is speculate.
However even if the card game theory were true, it still doesn’t make Trump a powerful name in the U.K. because the fart connotation is still more prevalent.
If I read this correctly the whole point of the rigamarole was to be able to use the rust crate 'lol_html'. Is there nothing comparable in the go world? If it's that critical, why are you using go at all?
And couldn't you have also saved a bunch of effort at the end by tweaking the Rust program so that it wasn't just a one-and-done and only needed one call to exec?
Sorry, I should have said "one call to InstantiateModule" instead of exec. I don't know how go processes work, but surely you can do that in a different context so that the runtime stays alive and you just pass around a way to connect to it. Then on the rust side, you change it to work on lines of input, one line in means one line out.
The current way, your go code starts a new system process to process every message, the way I'm suggesting it'll only start one for the whole run.
This is really cool and shows WASM's power as a glue for different programming languages (that support building to WASM). What I wonder about is how well different Rust libraries support WASM, since this seems to be a requirement for this to work, do libraries need to be careful about ensuring WASM support or is it mostly free?
I think it depends on what your WASM runtime supports. The OP here is using a runtime called Wazero, which looks like it supports the first prototype of the WASI standard. You can think of WASI like libc for WASM, where it defines interfaces for operating with the underlying system. But WASI is still young and changing, and doesn't support everything that a program might want to do. And certain contexts, like web browsers, probably have no interest in providing runtimes that have full WASI support.
If you're interested, I'd suggest compiling a crate for WASI and seeing if you get compilation errors. You can do so by cross-compiling for the `wasm32-wasi` target (`rustup target add wasm32-wasi`, then `cargo build --target wasm32-wasi`). If your program tries to use any stdlib facilities that don't exist on that target, it will tell you.
offtopic but I was looking at Xe's "salary transparency" page and they are paid quite a bit less than I was expecting. Hopefully they get RSUs too because their skillset is worth more than that.
I do get stock options at my current job. The price and amount are not listed because I don't know if revealing that breaks NDA or not. Startups that are pre liquid are weird.
If you think I'm worth more, contact me with a job opportunity :)
It is a binary, a WASM binary. However, your CPU can't execute WASM binaries. That's why you would either have to use an interpreter to execute them (which would be slow), or compile the WASM code into native machine code for your CPU first. Then you can run that machine code directly.
The wazero library supports both of those options AFAIK.
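To make the "interpreter" option concrete, here is a toy stack-machine interpreter. It is not real WASM and has nothing to do with wazero's internals; the three opcodes and the `interpret` function are invented for illustration. WASM is also a stack machine, though, so the shape is similar: the host walks the instructions one at a time and dispatches on each, and that per-instruction dispatch is the overhead compiling to machine code avoids.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Op is an opcode for the toy machine (hypothetical, not WASM opcodes).
type Op int

const (
	Push Op = iota // push Arg onto the stack
	Add            // pop two values, push their sum
	Mul            // pop two values, push their product
)

// Instr is one instruction; Arg is only used by Push.
type Instr struct {
	Op  Op
	Arg int64
}

// interpret executes prog one instruction at a time and returns the single
// value left on the stack.
func interpret(prog []Instr) (int64, error) {
	var stack []int64
	for _, in := range prog {
		switch in.Op { // this dispatch runs for every instruction executed
		case Push:
			stack = append(stack, in.Arg)
		case Add, Mul:
			if len(stack) < 2 {
				return 0, errors.New("stack underflow")
			}
			a, b := stack[len(stack)-2], stack[len(stack)-1]
			stack = stack[:len(stack)-2]
			if in.Op == Add {
				stack = append(stack, a+b)
			} else {
				stack = append(stack, a*b)
			}
		default:
			return 0, fmt.Errorf("unknown opcode %d", in.Op)
		}
	}
	if len(stack) != 1 {
		return 0, errors.New("program must leave exactly one value on the stack")
	}
	return stack[0], nil
}

func main() {
	// Computes (2 + 3) * 4.
	prog := []Instr{{Push, 2}, {Push, 3}, {Add, 0}, {Push, 4}, {Mul, 0}}
	result, err := interpret(prog)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(result)
}
```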
If I recall this is actually because of how ActivityPub works. It's a W3C standard. There's usually some cursed somewhere. ActivityPub just does the cursed in the message interchange format.
> How? Why? Who ever thought this was a good idea?
I guess it avoids needing to invent your own message format, but it means that every client needs an HTML parser in order to process and clean up statuses? That sounds... great.
My guess is they're using the same public API in the actual app.
Of course, I'm really hoping the HTML is added at runtime, and not stored in the DB. Imagine doing a data migration to modify HTML stored in the DB.
> Of course, I'm really hoping the HTML is added at runtime, and not stored in the DB.
Looks to me like HTML statuses are part of the federation protocol.
In fact, there's an issue[0] indicating that the sanitizer of the reference implementation doesn't allow Markdown to federate. There is also an old status indicating that even if more elements are added, the reference CSS basically strips everything (I assume visually)[1].
TL;DR: Instead of building a Rust shared library for every OS/arch combination to use from a Go program, produce a WASM module from Rust and load/execute it in Go using wazero.
The person is a salaried employee. They are getting paid by their employer. The web site is a personal portfolio / blog /resume site. Traditionally you're paid in attention on that sort of thing and use it to bolster salary via opportunities.
Getting a few dollars here and there from a personal site's ads feels cheap and detracts from the article. Tip jar, fine. But ads no. It just feels dirty. Even if they are "ethical".
From my perspective (as an employer) I see this stuff and think holy hell they'll want to stick advertising on everything. Turns me right off.
Author here. I'm sorry you feel this way. The ads are an experiment to see how much money I'm leaving on the table by not doing them. As of late it is currently just enough to pay for half of my server costs per month. This combined with Patreon means that my blog is cashflow positive. It's nowhere near enough to make a living off of, but it helps me get a small passive income and pays for all of my video games.
From your perspective as an employer, this should signal that I know what I'm worth and I am more than willing to negotiate for it. If you want the ads gone, please feel free to email me a job opportunity. I'll be more than willing to seriously consider it should you meet my requirements.
I'm seriously not trolling you, please take this as a question in good faith if you can... (I understand that any message prefaced as such is immediately suspect)
How much does it cost to host that website? It can't be much more than 10-15 usd/month, right? If so, is it really worth it to recover such relatively small amounts through serving ads?
Maybe I'm wildly off in my estimation; personally I see adding ads as a pretty heavy thing to do, so the monetary benefits would have to clearly outweigh that...
My website is hosted off a Hetzner dedicated server (Ryzen 5 3600, 64 GB of ram) in Helsinki. That costs about 70 EUR per month (though I use it for other things like my IRC bouncer, been considering moving that to my homelab, and innumerable side projects I've picked up and abandoned over the years) and also stores all of my random files I've accumulated over the years. It's on another continent so I have an easy to access live backup of some very important files in case my house gets destroyed.
The other expensive parts are AWS Route 53 for DNS after I dropped Cloudflare (cost varies based on popularity, but ranges between USD$7.50 and USD$12.50, this is still cheaper than Cloudflare because I paid them USD$20 per month) and the CDN I host on fly.io which costs about USD$5-7 per month including bandwidth overages.
My website gets more traffic than you are probably comfortable with imagining. I have to take performance and hardware efficiency seriously. I can easily do a few hundred gigabytes per month of egress bandwidth. More if I end up posting bangers. This is after doing things like hilariously optimizing images (the stickers can get down to about 10 KB each with AVIF files) and literally having my site load everything into ram on application boot.
I am told that these performance requirements are not needed for most websites, but I don't seem to be lucky enough to be able to do things the stupid way. I have to actually think about the performance impacts of everything I do because getting on the front page of Hacker News so often means that I need to focus on making things super efficient.
For a while my tower was also a Ryzen 5 3600, so I was doing the moral equivalent of compiling my website with -march=native to unlock _even more performance gains_, but now the machine I deploy from is a homelab node with an Intel Core i5 10600 so I can't do that trick anymore.
Subscribing to me on Patreon gives me infinitely more money than ads ever will, but having multiple income streams means that there's redundancy.
Of course I can afford to pay all of this out of pocket, but everything paying for itself is super fucking nice.
The actual site binary ends up using about 256 MB of ram worst case, the main problem is that in order to get more cores on Hetzner, you need to pay for more ram. I run some virtual machines on that dedi but they don't really add up to much in the RAM/active CPU cores department.
I go into much more detail about this on my blog here:
Those files are backed up in other safe places as added redundancy. Those files aren't visible from the public_html folder of my website. Please have faith that I know what I'm doing.
Theoretically I have a tracker blocker. I have absolutely no problem seeing ads -- it's being tracked around the internet that I don't like. And no, I won't make an exception for your site, even if I do like your content.
It's possible that their ads don't actually do any tracking, in which case they're receiving collateral damage. They seem technical enough, they should be able to figure out why the tracker blockers are blocking their ads and get things fixed (either in the ad code or in the upstream tracker blocker software).
Problem is there are so many sites that it's hopeless to turn off Firefox anti-tracking and uBlock Origin based on a promise from one site that I've seen once because it was on HN...
This gets incredibly annoying using Firefox Focus on mobile. Apparently, the blocking and filtering built into the browser by default is enough to make sites think I'm running an extension and they politely ask me to disable it, but I can't even if I wanted to. There is nothing to disable.