41ms to draw a single quad seems incredibly slow, even on ancient hardware - I recall being able to draw quite a lot faster than that to a CRT, even on an old 4MHz Z80 CPU in the 1980s.
Is the slowness maybe from the IO to the display or maybe something else?
> You may wonder why I didn't just write a routine to plot filled triangles, and then draw quadrilaterals as two adjacent triangles. The answer is that with the libraries that support exclusive-OR plotting this results in a visible line between the two triangles, because that line is drawn twice; having a dedicated routine to plot quadrilaterals avoids this. In addition, this routine is faster.
I guess the Sega Saturn was on to something, then.
Amusingly, old graphics hardware was often quad-based because its predecessors were 2D sprite hardware, and that's what the designers knew best.
The "correct" thing to do is follow the top-left rule for rasterizing triangles. This causes the pixels of a mesh to only be rendered once and you won't get seams.
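For anyone who hasn't seen the top-left rule written down, here is a minimal sketch of it using integer edge functions. It is not the article's routine. The function names, the choice to sample at integer pixel coordinates, and the set-based "framebuffer" are all assumptions made for illustration. A pixel lying exactly on an edge is kept only if that edge is a top or a left edge, so a diagonal shared by two triangles belongs to exactly one of them.

```python
def edge(a, b, p):
    """Signed area term: positive when p is on the inner side of a->b
    for a triangle wound clockwise on screen (y pointing down)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def is_top_left(a, b):
    """With clockwise-on-screen winding, a top edge is horizontal and
    runs left to right; a left edge runs upwards (y decreasing)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return (dy == 0 and dx > 0) or dy < 0


def covered_pixels(v0, v1, v2, top_left=True):
    """Return the set of integer pixels covered by the triangle.
    top_left=False keeps every pixel on every edge (the naive rule)."""
    area = edge(v0, v1, v2)
    if area == 0:
        return set()          # degenerate triangle
    if area < 0:
        v1, v2 = v2, v1       # normalise to clockwise-on-screen winding
    xs = (v0[0], v1[0], v2[0])
    ys = (v0[1], v1[1], v2[1])
    pixels = set()
    for y in range(min(ys), max(ys) + 1):
        for x in range(min(xs), max(xs) + 1):
            inside = True
            for a, b in ((v0, v1), (v1, v2), (v2, v0)):
                e = edge(a, b, (x, y))
                on_excluded_edge = e == 0 and top_left and not is_top_left(a, b)
                if e < 0 or on_excluded_edge:
                    inside = False
                    break
            if inside:
                pixels.add((x, y))
    return pixels


def xor_plot(framebuffer, pixels):
    """Toggle pixels in a set-based framebuffer (stand-in for XOR plotting)."""
    framebuffer.symmetric_difference_update(pixels)


# A quad split along its diagonal into two triangles.
tri_a = ((0, 0), (4, 0), (4, 4))
tri_b = ((0, 0), (4, 4), (0, 4))

clean = set()
xor_plot(clean, covered_pixels(*tri_a))
xor_plot(clean, covered_pixels(*tri_b))

seamed = set()
xor_plot(seamed, covered_pixels(*tri_a, top_left=False))
xor_plot(seamed, covered_pixels(*tri_b, top_left=False))
```

With the rule on, the two triangles tile the 4x4 block with no overlap. With it off, the diagonal pixels are toggled twice and end up cleared, which is the visible line the article describes for exclusive-OR plotting.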
PowerVR was especially funky. They used planes to ‘carve’ out polygons. It meant that they could genuinely draw convex polygons with arbitrary numbers of sides by using ~N+1 planes. I found it kind of cool to try and read through the simulator code in their open source release [0]. I imagine there would be an interesting write-up if someone sat down and really wanted to understand it.
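To make the "carving" idea concrete, here is a rough 2D sketch of the general technique: a convex N-gon is the intersection of N half-planes, so a point is inside if it passes every plane test. This is not taken from the PowerVR simulator code; the function names and the restriction to 2D are assumptions for illustration, and it says nothing about what the extra plane in "~N+1" is used for.

```python
def half_plane(a, b):
    """Return (nx, ny, d) with nx*x + ny*y + d >= 0 on the inner side of
    edge a->b, for a polygon wound clockwise on screen (y pointing down)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    return (-dy, dx, dy * a[0] - dx * a[1])


def carve(vertices):
    """Build one half-plane per edge of a convex polygon."""
    n = len(vertices)
    return [half_plane(vertices[i], vertices[(i + 1) % n]) for i in range(n)]


def inside(planes, p):
    """A point is inside the polygon if no plane rejects it."""
    return all(nx * p[0] + ny * p[1] + d >= 0 for nx, ny, d in planes)


# A convex pentagon, clockwise on screen: five sides, five plane tests.
pentagon = [(2, 0), (4, 2), (3, 4), (1, 4), (0, 2)]
planes = carve(pentagon)
```

The nice property is that adding a side only adds one more test per sample, rather than forcing the shape to be split into triangles first.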
All of this blog's posts are elegant, well-researched, and fun — I'm a big fan. It's amazing what sorts of functionality he squeezes out of 8-bit and other small microcontrollers.