I was tricked into writing a VM while doing Advent of Code last year, and it really deepened my understanding of programming (for example, how to implement simple coroutines).
Advent of Code is billed as 25 daily programming puzzles leading up to Christmas. I went in expecting algorithm/data-structure tasks, but half of them were actually about implementing and using a VM for a made-up assembly language! The difficulty ramp was gradual enough that I did it with no background in compilers, and it was lots of fun.
If you want to experience it, in last year's version https://adventofcode.com/2019 you write the VM in Days 2, 5, 7, 9 then apply it to solving problems in Days 11, 13, 15, 17, 19, 21, 23, 25.
I ended up loving the VM puzzles and went on to do the AoC author's other challenges, which have a similar theme: https://challenge.synacor.com/
A mathematician, physicist, and engineer are taking a math test. One question asks "Are all odd numbers prime?"
The mathematician thinks, "3 is prime, 5 is prime, 7 is prime, 9 is not prime -- nope, not all odd numbers are prime."
The physicist thinks, "3 is prime, 5 is prime, 7 is prime, 9 is not prime -- that could be experimental error -- 11 is prime, 13 is prime; yes, they're all prime."
The engineer thinks, "3 is prime, 5 is prime, 7 is prime, 9 is prime, 11 is prime, ..."
Haha, right. I skimmed the list without really reading it and jumped to a conclusion before actually paying attention to all the days. It was before my morning coffee, if that's any excuse.
I have written a virtual CPU from very simple building blocks and highly recommend doing it.
I started with a Wire class (which just wraps a bool) and a Transistor class (which accepts references to two input Wires and has an output Wire). Transistor has an Update() function which sets the output state.
From those I built up gates, then flip-flops, registers, muxes/demuxes, counters, an ALU, and memory banks. Then I designed a machine language, connected the CPU together, wrote a loader to load machine-language files into ROM, and then an assembler so I could write assembly code.
I then added VRAM, async buses, and wrote a working implementation of Game of Life.
It has 32-bit instructions and runs at about 10 kHz with 8 KB of RAM.
If you want a further challenge, write compiler optimization passes that recognize bit arrays as being bytes, figure out which parts are buses, binary adders, etc., and see how fast you can get it to run.
It would probably be fairer to do this on somebody else's version, but doing it on code you already know is hard enough, especially if you want compiler passes that would likely work on other simulated hardware too.
I once wrote my own virtual machine in college, complete with compiler and assembler, and I cannot recommend doing this enough. The virtual machine part especially is not nearly as difficult as you would imagine, and to this day (15 years later) I still rely on the things I learned there.
The knowledge you gain from implementing a virtual machine translates reasonably well to the inner workings of a CPU, and you'll have a much better understanding of things like stacks, frame pointers, and the overhead of calling a function. It will be completely obvious to you why "i++" is slower than "++i".
> It will be completely obvious to you why “i++” is slower than “++i”
...but it's not!
All you have to do is perform the most basic of optimizations: before generating code for an expression, check whether its result is used. If not, don't bother generating an intermediate for the result. Source: I'm writing a C compiler and implemented exactly this optimization today. (In an industrial-grade compiler, you would probably skip this trick and instead do complex control-flow analysis on the resulting SSA to see that the intermediate is dead code. But for a toy compiler/VM, little tricks like this buy you a lot of codegen quality for little effort.)
> inner workings of a CPU
...if only. There's a lot of crazy stuff going on in a CPU that doesn't even begin to come up in most VMs.
Technically, that's a bit off topic because this is about compilers and code generation, while a VM defines the basic semantics compilers work with.
For VMs, what's relevant is whether you implement an "inc (reg)" instruction or not: that is, whether you take the RISC path (a small set of mostly orthogonal instructions) or the CISC path (lots of microcode, things like "repnz scasb" in x86 assembly, or elaborate addressing schemes).
This is actually somewhat of a false dichotomy, as you can have a RISC-style instruction set plus a few "high-level" instructions for things you expect to do a lot, such as array operators à la APL/J.
> All you have to do is perform the most basic of optimizations
I mean, of course this is possible, and hopefully every compiler everywhere optimizes it, but you still need to understand why one is faster than the other in order to understand why it makes sense to optimize. In the end, you acquire a more in-depth understanding of the semantics.
Regarding the inner workings of a CPU, I wasn't claiming you will completely understand a CPU; my god, I hope you don't, because it's a crazy rabbit hole.
Again, I was thinking about the higher level: touching the tip of the iceberg and seeing what's there. Many programmers have no idea how function calls work, how functions return values using the stack, and so on.
Bottom line: just because you don't learn everything doesn't mean you won't learn something.
x++ was slower than ++x in the original C compiler, maybe (and occasionally still is when used inside another expression, but quite rarely these days with modern architectures and optimizing compilers).
Regardless, I much prefer to write ++x, because (a) I pronounce it "increase x" and I haven't found a concise way to say "x's current value, but it is increased afterwards", and (b) all other unary operators are prefix, which makes it easier to reason uniformly about things like order of evaluation.
In some contexts postfix is simpler - e.g. a memcpy-like implementation
while (n--) *t++ = *s++;
However, they are not very common in my experience.
I thought the i++/++i difference was related to the internals of C++ iterators: with the former you need to keep two states (the old and the new) instead of just one for that fraction of the execution.
You can use ++ on an iterator; indeed you often do, e.g. in loops. ++iterator is preferable, and for consistency (so you don't have to think about it), it's often recommended in C++ environments to just use pre-increment generally.
(Copying an iterator might be a lot more expensive than copying an int, and might not be effectively optimized out by the compiler, e.g. if someone builds an iterator that's reference-counted to an owning object or something.)
If you want to go really deep on this I highly recommend NAND To Tetris, a book which starts from the basics of combining NAND gates to build basic logic. You’ll gradually work through building a CPU from those components, building an assembler to program it, and finally putting a virtual machine on top of that.
If you want a really in-depth intro into the topic (with video lectures and guest appearances from legendary VM implementers!), check out CS294 from UC Berkeley that has been made available under a Creative Commons license: http://www.wolczko.com/CS294/
It's great for following along self-paced and has hands-on exercises for each topic. You should already be a bit familiar with compilers, though.
I've just about finished writing a VM for the Z-machine (the VM that Zork was written for, to make it portable). It's been tremendously enjoyable and I've learnt a lot.
I'm planning to turn the project into a series of how-to videos and will definitely be referencing this article to ensure my presentation of the concepts is accurate. Thanks for sharing this.
VMs have so much utility, so +1 to the author for the nerd nugget of compartmentalizing the use case as a game cartridge.
VMs and VEs (virtual environments) have become so portable that IoT and 5G will have to compete with raw storage (terabytes) for the best user experience in a post-filtered world.
Possibly OT (or a use case), but I've demoed LTSP [0] to run multiple environments on a single server (PXE menu) and clustered multiple LTSP servers (bouncing to other LTSP menus by manually adding a server list to the boot menus), treating them as books with chapters (each LTSP being a book).
Customizing each chapter by OS needs and media/content/apps has so many uses (immersive news, learning, entertainment) and paves the way for offnet productivity/entertainment that currently doesn't exist in the market (à la Home Library/Encyclopedia Server).
LTSP was not as PnP as it needs to be for consumer use (and external PCIe form factors were pricey), but anyone writing a VM (or decorating the interior) is already riding the next wave of physical data subscriptions/refills that bypass metered networks (SDCs or SSDs in TB capacities with OS agnostic data) for 8k video, VR and compilation/mixedtape datasets.
Portable VMs are a trend waiting to happen, so anything you can learn about it (inside-out) is not a waste.
For those who haven't read the post yet, I would definitely recommend doing so if only to understand the striped instruction set implementation using C++ generics at the end.
I rarely run into solutions that fundamentally expand my perspective these days but this one did.
Virtual machines are definitely fun, and can be useful things to know about if you ever design/implement a scripting language or an emulator.
I wrote a toy system a few years ago: a simple interpreter (C) along with a matching compiler/decompiler (Perl). Unfortunately my system didn't have a terribly well-designed instruction set. If I were to start over, I'd probably implement the 8086 instruction set, or something similar.
That said, even a toy system is fun. The biggest issue with designing your own instruction set is that you have to write the actual programs too, which is often less fun! I recently rewrote my interpreter in Go, keeping the same instruction set but adding a better trap system. Of course, reimplementing it meant I still have the problem of no real programs being written for the machine!
Slight tangent: to learn a bit of Zig, I've been doing the usual PDP-8 simulator task I do for systems languages. Zig allows arbitrary fixed-width integers, signed and unsigned, of nearly any bit width. It is /delightful/ to have those data types when writing an emulator. I'm convinced more languages should have them.
I never wrote virtual machines, because I know why Sun (now Oracle) and Microsoft each spent a billion to create theirs. You can write something that works over a weekend, but the performance won't be good without a JIT, a generational GC, and many other extremely complicated optimizations.
If you think you need to develop a VM, I recommend reconsidering and thinking about how you can reuse something that's already there. For instance, the modern .NET VM is open source under the MIT license, the code quality is more or less OK, and it's relatively easy to generate .NET bytecode from something else: Reflection.Emit from the inside, or Mono.Cecil from the outside.
The point of writing your own VM isn't to come up with something on par with Sun's or Microsoft's, but rather to have a hands-on learning experience of the inner workings of a VM.
> Real-life VMs don’t interpret, they JIT compile.
Pretty much every real-life JavaScript VM is tiered and has an interpreter, which gathers data about expected usage that the JIT then uses to inform its optimizations when it goes to generate machine code.
Still, you'd be surprised about the performance you can get out of a basic interpreter. Games have used Lua for years. I've written and reverse engineered plenty of custom bytecode for various reasons in the games space. It's a useful tool to have, and there are a lot of situations where performance either isn't the goal, or the large amount of tricks used by JITting VMs isn't helpful.
The dispatch loop you point to is just about using a C extension (computed gotos) to gain a few extra performance points. You can learn it in about 30 minutes once you know what a VM is.
I agree about the learning experience. But when I need a tiny language, I usually implement it so it compiles into something already implemented and supported, not necessarily .NET bytecode. Here's an example where I compiled a tiny language into HLSL source for Microsoft's shader compiler: https://github.com/Const-me/vis_avs_dx/tree/master/avs_dx/Dx...
Yeah, sometimes that works. I recently wanted to write a "compiler" for Brainfuck, so my first step was converting a BF program to C, then compiling that with gcc.
I guess my actual compiler wasn't so different: it just generated an assembly source file and passed it through an assembler. But that kind of transpiling is pretty simple to get working, and if your destination language is fast enough, or optimized enough, it'll work just fine.
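This works for BF because each of its eight opcodes maps to a single C statement. A sketch of that first transpilation step (the 30,000-cell tape follows the classic BF convention; the function name is mine):

```cpp
#include <cassert>
#include <string>

// Translate Brainfuck source into a compilable C program, one statement
// per opcode. Any non-opcode character is a comment in BF and is skipped.
std::string bf_to_c(const std::string &src) {
    std::string out = "#include <stdio.h>\n"
                      "int main(void){static char t[30000];char*p=t;\n";
    for (char c : src) {
        switch (c) {
            case '>': out += "++p;\n"; break;
            case '<': out += "--p;\n"; break;
            case '+': out += "++*p;\n"; break;
            case '-': out += "--*p;\n"; break;
            case '.': out += "putchar(*p);\n"; break;
            case ',': out += "*p=getchar();\n"; break;
            case '[': out += "while(*p){\n"; break;
            case ']': out += "}\n"; break;
            default: break;
        }
    }
    out += "return 0;}\n";
    return out;  // feed this to gcc for the "compile" step
}
```

Because `[`/`]` become `while(*p){`/`}`, BF's loop nesting falls out of C's own block structure for free.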
In my experience, writing your own gives you a level of control and leverage not achievable through a third-party VM, which might be worth it once you get some practice. As long as you haven't tried, you have no idea about the trade-offs; that goes for anything.
The same goes for not compiling all the way down to bytecode (or, even worse, C): it makes some things easier to implement and allows more flexibility because of the lack of separation between compile time and run time.
You can only run the already compiled binary files for it.
If you want to write something new, you will need to do it in binary. If you want to write in assembly, you will need to write an assembler or find one online.