Wow. Impressive project, especially as a university assignment (I am assuming a bachelor's degree), considering I see a lot of people graduating university without being able to even write the most basic CRUD app.
I myself can't even manage to finish a simple 2d game I started months ago.
Eagerness is the most important thing in learning.
I'm sure we all know SE major students who aim to get a job as a "project manager" right off the bat, not because they actually love/know anything about managing a project, but simply because it is a managerial position where you don't have to code.
I love that the author decided (if my interpretation is correct) "Writing a software rasterizer is annoying...fuck it, let's use a raytracer." Very nice!
It really depends on how much functionality you want for both methods.
Rasterization really doesn't require that much work, but you need to be able to do a world/view transform, clip triangles/quads, and do perspective-correct texturing. The most sophisticated bit of that is the clipping algorithm, which isn't exactly trivial to implement.
Ray tracing requires you to generate a ray per pixel (essentially the inverse of the world/view transform), determine ray/box intersections, and derive a texture coordinate from the point where the ray hits the box.
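For illustration, here's a minimal slab-method sketch in C of that ray/box step; the names and structure are mine, not taken from the project:

```c
#include <math.h>

typedef struct { float x, y, z; } vec3;

/* Slab method: returns 1 and the entry distance *t if the ray with origin o
 * and direction d (all components assumed non-zero) hits the axis-aligned
 * box [lo, hi]. */
static int ray_box(vec3 o, vec3 d, vec3 lo, vec3 hi, float *t)
{
    float t1 = (lo.x - o.x) / d.x, t2 = (hi.x - o.x) / d.x;
    float tmin = fminf(t1, t2), tmax = fmaxf(t1, t2);

    t1 = (lo.y - o.y) / d.y; t2 = (hi.y - o.y) / d.y;
    tmin = fmaxf(tmin, fminf(t1, t2)); tmax = fminf(tmax, fmaxf(t1, t2));

    t1 = (lo.z - o.z) / d.z; t2 = (hi.z - o.z) / d.z;
    tmin = fmaxf(tmin, fminf(t1, t2)); tmax = fminf(tmax, fmaxf(t1, t2));

    if (tmax < fmaxf(tmin, 0.0f)) return 0;  /* miss, or box behind origin */
    *t = fmaxf(tmin, 0.0f);
    return 1;
}
```

From the hit point o + t*d, the two coordinates tangent to the face that was hit give you the texture coordinate directly.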
As someone who has done both, I would say the two procedures are at pretty much the same level of complexity (if you stay away from bilinear texture interpolation), but I admit that raytracing feels easier to implement, as you avoid the clipping and perspective-correct texturing.
However, the Minecraft world is a uniform grid of "boxes", so it contains a lot of quads, leading to potentially huge amounts of overdraw, which quickly becomes infeasible for a software rasterizer. So if you wish to rasterize in software, you'll need to do a bit of additional work to avoid drawing a lot of hidden box sides (ignore shared sides), and you'll never get overdraw down to zero unless you use additional screen-based data structures.
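A sketch of the "ignore shared sides" idea in C; is_solid and emit_face are hypothetical helpers standing in for the block-grid query and the rasterizer's quad list:

```c
/* Hypothetical helpers, not from the project: */
extern int  is_solid(int x, int y, int z);             /* block grid query  */
extern void emit_face(int x, int y, int z, int face);  /* push quad to list */
enum { FACE_NEG_X, FACE_POS_X, FACE_NEG_Y, FACE_POS_Y, FACE_NEG_Z, FACE_POS_Z };

void emit_visible_faces(int x, int y, int z)
{
    if (!is_solid(x, y, z)) return;
    /* A face shared with a solid neighbor can never be seen, so skip it. */
    if (!is_solid(x - 1, y, z)) emit_face(x, y, z, FACE_NEG_X);
    if (!is_solid(x + 1, y, z)) emit_face(x, y, z, FACE_POS_X);
    if (!is_solid(x, y - 1, z)) emit_face(x, y, z, FACE_NEG_Y);
    if (!is_solid(x, y + 1, z)) emit_face(x, y, z, FACE_POS_Y);
    if (!is_solid(x, y, z - 1)) emit_face(x, y, z, FACE_NEG_Z);
    if (!is_solid(x, y, z + 1)) emit_face(x, y, z, FACE_POS_Z);
}
```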
On the other hand, the author had to implement a raycasting algorithm on the uniform grid for the raytracer to be efficient. This is actually also a little bit painful.
So for that reason, the ray-tracer is definitely the right decision here.
On a related note, I tried to implement raycasting on a uniform 3D grid on a 486 at 66 MHz in 1996... I got around 2-5 FPS at 320x200. So it was completely infeasible back then.
This isn't a typical raytracer. Instead, it's a very clever ray marcher over a uniform grid. It resembles a Wolfenstein/Doom-style raycaster more than what you usually think of as a raytracer.
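Roughly, it steps each ray cell by cell through the block grid until something solid is hit. A sketch of that classic grid traversal (Amanatides & Woo style) in C, assuming unit-sized cells, non-zero direction components, and a hypothetical is_solid query; this is not the author's actual code:

```c
#include <math.h>

extern int is_solid(int x, int y, int z);  /* hypothetical block query */

/* March a ray (origin o*, direction d*) from cell to cell of a unit grid
 * until it hits a solid block or exceeds max_t.  Returns 1 on a hit. */
int march(float ox, float oy, float oz,
          float dx, float dy, float dz, float max_t)
{
    int x = (int)floorf(ox), y = (int)floorf(oy), z = (int)floorf(oz);
    int sx = dx > 0 ? 1 : -1, sy = dy > 0 ? 1 : -1, sz = dz > 0 ? 1 : -1;
    /* Distance along the ray to the next grid plane in each axis. */
    float tx = ((sx > 0 ? x + 1 : x) - ox) / dx;
    float ty = ((sy > 0 ? y + 1 : y) - oy) / dy;
    float tz = ((sz > 0 ? z + 1 : z) - oz) / dz;
    /* Distance along the ray between successive planes of the same axis. */
    float ddx = sx / dx, ddy = sy / dy, ddz = sz / dz;

    float t = 0.0f;
    while (t < max_t) {
        if (is_solid(x, y, z)) return 1;               /* hit */
        if (tx < ty && tx < tz) { t = tx; tx += ddx; x += sx; }
        else if (ty < tz)       { t = ty; ty += ddy; y += sy; }
        else                    { t = tz; tz += ddz; z += sz; }
    }
    return 0;                                          /* no hit within max_t */
}
```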
When I tried to build the project I got a linker error, but running the prebuilt iso (linked in the readme) with qemu worked just fine. A very impressive project!
Edit: The link error is probably just a stupid mistake on my part; I was trying to build on a 64-bit machine. It doesn't seem to have any problems on 32-bit.
For others who might be interested, compilation works fine in Ubuntu 12.04 after `apt-get install build-essential nasm qemu`. If you install libsdl and libsdl-dev you can run the reference version written in C which runs significantly faster.
It's indeed an impressive project, doubly so as it was initially only a university project. Instead of just making a game in assembly he made the project his own by extending it -- bootable, raytraced, raytraced shadows, textures and so on! =]
Not to discount what he did, but he didn't write the game in Assembly. He wrote it in C then decompiled it to Assembly and used that as a starting point for the Assembly code.
From the article:
>Starting in assembly right away would be a bit too insane, so I first wrote a reference implementation in C
As someone who started in Assembly right away creating several games with a much simpler 6510 CPU, I can vouch for the fact that starting in Assembly would be a bit too insane especially on modern CPUs.
>Starting in assembly right away would be a bit too insane, so I first wrote a reference implementation in C
How did he make the reference code? I'm reading that as he wrote an implementation in C, then decompiled it and then cleaned up the decompiled Assembly language code.
Maybe I'm wrong. How do you take his statement above?
Author here, I did look at some of the assembly output from C code, especially for frequently called functions like get_block/set_block, but other than that I used no reference but the original C code to write the assembly.
You can also read it as: write C code, test, repeat until you have the functionality you want. Then pick a block of C code, replace it with the assembly equivalent, and repeat.
What was your linker error? If you haven't already, you should either file a bug or solve it yourself and send a pull request with your solution. As someone who hacks around with small open source projects a lot, nothing is more annoying than the kind of small error that everyone who knows how to fix it finds too obvious to report (the fix is probably installing an undocumented dependency), and everyone who doesn't gives up on the project. Admittedly, I have been guilty of solving these problems only for myself many times.
Also, it's interesting to think about how badly we can estimate the size of code. When I saw the headline, I thought it was going to be pretty large, with 50K lines of code. It turned out to be much smaller - ~3K lines of code.
Impressive. I also like his approach of prototyping it in C first (I chuckled at the thought of C as a prototyping language). That reference is also in the repo and I hope to at least read that :)
Everybody I know who writes assembly language in practice now, myself included, prototypes most of it in C first -- why not take the huge headstart from the compiler-generated ASM?
In those cases, are you writing in asm for fun? If not, what advantages do you see in taking this approach over just writing most of the project in C and using inline asm where necessary?
It usually is inline assembly, but you write it in C first and copy the disassembly as a starting point. We do this in the embedded world a lot because we're constantly switching architectures, and nobody memorizes the instruction set of every architecture :)
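As a tiny illustration of that workflow (not from this project): the scalar C version you prototype with, next to a GCC extended-asm version you might paste back in once the disassembly looks right. The asm here is 32-bit-ARM-specific (rbit = bit reverse); the function names are mine:

```c
#include <stdint.h>

/* Portable C reference: reverse the bits of a 32-bit word. */
static inline uint32_t bitrev_c(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) { r = (r << 1) | (x & 1); x >>= 1; }
    return r;
}

#if defined(__arm__)
/* Same operation as a single ARMv6T2+ instruction via GCC extended asm. */
static inline uint32_t bitrev_asm(uint32_t x)
{
    uint32_t r;
    __asm__("rbit %0, %1" : "=r"(r) : "r"(x));
    return r;
}
#endif
```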
The most common reason is using architecture-dependent instructions that the compiler doesn't generate well, or doesn't generate at all. Examples are SIMD (auto-vectorization is nice, but far from perfect) and DSPs that have specific multiply-and-accumulate instructions or flags that change the behavior of the accumulate register.
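For example, here's the kind of loop you might hand-vectorize with intrinsics when the auto-vectorizer misses it, sketched with SSE; the function name and the alignment/size assumptions are mine:

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* y[i] += a * x[i] over n floats; assumes n is a multiple of 4 and both
 * pointers are 16-byte aligned, purely for brevity. */
void saxpy_sse(float *y, const float *x, float a, int n)
{
    __m128 va = _mm_set1_ps(a);
    for (int i = 0; i < n; i += 4) {
        __m128 vx = _mm_load_ps(x + i);
        __m128 vy = _mm_load_ps(y + i);
        _mm_store_ps(y + i, _mm_add_ps(vy, _mm_mul_ps(va, vx)));
    }
}
```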
In a project I'm currently working on, inlining was still inferior to fully native ASM. LLVM generated unnecessary stack loading in the prologue, and those completely unused memory accesses cost something like a 4% speed penalty.
Would you please file a bug report [0] with a reproduction of the missed optimizations that you see? I'm interested in taking a look at it. Today LLVM trunk enabled the vectorizer for -O2 and -Os.
A lot of missed optimization opportunities come from the impracticability of communicating to the compiler information known to the developer. Basic examples include “the two lvalues here do not alias” and “the int variable x here is always positive, x/8 can be compiled as straightforward right shift instead of a more complex sequence of instructions”. There are various source-level workarounds including using restrict when applicable, copying lvalues to local variables whose address is never taken to make the lack of aliasing clearer, and casting to unsigned int before dividing. In the worst cases, you have to make the program less readable in order to improve the generated code, with no guarantees that the change makes the generated code better for all compilers (some compilers might still miss the optimization opportunity and generate additional instructions for the copies/casts).
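Two of those source-level workarounds, sketched in C (the function names are mine):

```c
#include <stddef.h>

/* With restrict the compiler may keep *out in a register across the loop,
 * because it is told in and out cannot alias. */
void accumulate(int *restrict out, const int *restrict in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        *out += in[i];          /* no reload of *out needed */
}

int div8(int x)
{
    /* Caller guarantees x >= 0; the cast makes that usable by the compiler,
     * so x/8 can become a plain right shift (signed division by 8 needs an
     * extra fixup to round toward zero for negative x). */
    return (unsigned)x / 8;
}
```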
On a related note, I have a dream of one day discovering a real example where undefined behavior can be used constructively as a license for the compiler to optimize: the following post alludes to this idea, but the assembly dumps at the bottom show that the compiler is not taking advantage of the information encoded in the undefined behavior:
More seriously, an annotation language for expressing properties that are supposed to be true at various points of the program can be useful to transmit information from the programmer to the compiler and enable optimizations that would otherwise require difficult whole-program analysis. And these annotations can be used to analyze the program too!
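In the absence of such a language, GCC and Clang's __builtin_unreachable can serve as a crude stand-in; a sketch (the ASSUME macro and function name are my own):

```c
/* Tell the compiler a condition holds; reaching the branch with it false
 * would be undefined, so the optimizer may assume it never happens. */
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)

int div8_hinted(int x)
{
    ASSUME(x >= 0);   /* the compiler may now drop the negative-x fixup */
    return x / 8;
}
```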
Yes, although I make fun of LLVM for the non-optimal code, I really doubt anyone would consider them 'bugs' -- and I did not. They aren't bugs, they are just optimizations that we haven't found a good way to automatically identify yet.
Though, it does seem to always store the old stack pointer in r7, even though it doesn't restore from r7, and even though my inline assembly block specifies r7 on the clobber list. That might be a bug, but it's a single 'add', so who cares.
When you're writing on an architecture with only poor C compilers, it makes a lot of sense to prototype in C and hand compile it. For example, as a hobby, I write on a platform where program memory is limited to 16k and the only available C compiler is sdcc, which outputs relatively large binaries.
So writing in C is fine for smaller projects, but those are easy enough in assembly anyway, especially on old architectures designed for human-written asm.
Back when I started programming (1986), Basic, C, Pascal and Forth were seen as prototyping languages, before coding the real applications in Z80 or 6502 assembly.
My major introduction to C was hand-optimizing a large C library compiled to M68000 assembly, while complaining incessantly about how shitty the compiler was, as I could at the time easily strip the code down to <40% of its original size almost solely by removing unnecessary use of temporary registers... But these days C compilers generate tolerable code in most cases.
I'm always amazed by the strange assembly generated by modern compilers. LLVM will auto-vectorize a routine into this awesome NEON SIMD masterpiece... and then throw in a bunch of completely redundant stack loads/stores and refuse to use any registers except r0 and r1! 0_o
You can tell where the devs with PhDs in compiler design are putting all of their effort :)
That's true. I guess my perspective has also changed in that I don't need every cycle to be used carefully any more - on that 7.16 MHz Amiga, deleting those instructions mattered a lot more... :)
I think that's also affecting where the optimization effort goes, to a great extent - it's more likely to be invested in the kind of code people are more likely to use in critical inner loops, running in places that might saturate large numbers of cores...
Kudos to the author.