I would be very surprised if even a fourth of the list you just posted could fit inside the design constraints of Tiny Tapeout. You have about 1000 digital logic gates max, per their FAQ.
I don't think it's fair to knock it like that. This isn't going inside some industrial project. This is a hobbyist who went from zero(!) to ASIC.
It's a completely ridiculous list. The blog post is about designing a 2-tap filter, not a CPU, so even if there weren't a gate limit, the design wouldn't have those things.
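For a sense of scale, here is a rough behavioral sketch of what a 2-tap filter computes. This assumes a plain FIR form, y[n] = b0*x[n] + b1*x[n-1]; the function name, the coefficients, and the zero initial state are all illustrative assumptions, not details taken from the blog post.

```python
def two_tap_fir(samples, b0=1, b1=1):
    """Assumed form: y[n] = b0*x[n] + b1*x[n-1], with x[-1] taken as 0."""
    prev = 0  # the single delay register a 2-tap filter needs
    out = []
    for x in samples:
        out.append(b0 * x + b1 * prev)  # two multiplies and one add per sample
        prev = x  # shift the current sample into the delay register
    return out


if __name__ == "__main__":
    print(two_tap_fir([1, 2, 3, 4]))         # sum of adjacent samples
    print(two_tap_fir([1, 2, 3, 4], 1, -1))  # first difference
```

One register, two multiplies, and one add per sample is the whole datapath in this form, which is why none of the CPU features on the list would apply.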
Low latency is achieved with a carry look-ahead adder, which can also handle subtraction.
That's a naïve O(n^2) multiplier design, whereas O(n^1.58) (Karatsuba-style) is easily achieved and extends to MAC (multiply-accumulate) capabilities.
Then there's a divider unit.
Microcoding.
Pipelining.
Pipelining with branch prediction.
Floating point.
Superscalar with reservation stations.
Register aliasing.
Hyperthreading (slack time virtual cores).
VLIW.
L1 caching.
L2 caching.
L0 caching.
SMP.
L3 caching.
And add your mask easter egg here.
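For reference, the first two items on the list (the carry look-ahead adder with subtraction and the sub-quadratic multiplier) can be sketched behaviorally in Python. This is a software model only, not gate-level logic and not anything from the blog post; the names `cla_add`, `karatsuba`, and `mac`, the 8-bit default width, and the base-case threshold of 16 are all assumptions made for illustration.

```python
def cla_add(a, b, width=8, subtract=False):
    """Model of a carry look-ahead adder; returns (sum, carry_out).

    Subtraction reuses the adder: invert b and set carry-in to 1
    (two's complement).
    """
    mask = (1 << width) - 1
    a &= mask
    b = (~b if subtract else b) & mask
    c0 = 1 if subtract else 0

    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # propagate

    def carry_into(i):
        # Expanded look-ahead form: each carry depends only on g, p and c0,
        # not on the previous carry, so hardware can compute them in parallel.
        carry = c0 & int(all(p[0:i]))
        for j in range(i):
            carry |= g[j] & int(all(p[j + 1:i]))
        return carry

    total = 0
    for i in range(width):
        total |= (p[i] ^ carry_into(i)) << i
    return total, carry_into(width)


def karatsuba(x, y):
    """Multiply non-negative ints with three recursive products instead of four."""
    if x < 16 or y < 16:  # small operands: fall back to a direct multiply
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    low_mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & low_mask
    y_hi, y_lo = y >> half, y & low_mask
    z0 = karatsuba(x_lo, y_lo)
    z2 = karatsuba(x_hi, y_hi)
    z1 = karatsuba(x_lo + x_hi, y_lo + y_hi) - z0 - z2  # cross terms
    return (z2 << (2 * half)) + (z1 << half) + z0


def mac(acc, x, y):
    """Multiply-accumulate built on the multiplier above."""
    return acc + karatsuba(x, y)


if __name__ == "__main__":
    print(cla_add(100, 27))
    print(cla_add(5, 3, subtract=True))
    print(karatsuba(123456789, 987654321) == 123456789 * 987654321)
```

The three-product recursion is where the O(n^1.58) figure comes from (the exponent is log2 of 3). Whether any of it fits in roughly 1000 gates is a separate question.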