No discussion of CPU design is complete without trying to run your design as fast as possible. That's how people discovered that pipelining is good, and that a flat implementation (such as the one in that code) is slow.
Don't forget simplicity - their goal is learning by creating a "super basic CPU" for which you can write "some assembly". Including pipelining in this design would distract from the main purpose.
It would make a nice topic for a follow-up tutorial though.
Very good point. This is even more critical in an FPGA, as you are working with fairly constrained resources.
There were a number of people in my first hardware design course who had a great idea right up until they found it didn't fit on the Xilinx chips we had in the lab.
Starting from the proposed flat design, adding pipelining would probably cost 10-20% more area and double or triple the speed. Definitely worth exploring (and would indeed make for a great follow-up post).
Only if it's a 3-stage pipeline without hazard detection. Otherwise the area would at least double. But, yes, I'd also like to see a pipelined core in this new HDL.
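To put the two estimates side by side, here is a rough back-of-the-envelope sketch in Python. It only divides the speedup by the area factor, both taken relative to the flat design. The numbers are the guesses quoted in this thread, not measurements; the helper name `perf_per_area`, the scenario labels, and the assumed 2x speedup for the hazard-detection case are made up for illustration.

```python
def perf_per_area(speedup: float, area_factor: float) -> float:
    """Throughput per unit of area, relative to the flat design (which is 1.0)."""
    return speedup / area_factor


# (speedup, area factor) pairs. These are the thread's estimates, not measured data.
# The 2x speedup in the last row is an assumption made only for this comparison.
scenarios = {
    "simple pipeline, pessimistic (2x speed, +20% area)": (2.0, 1.2),
    "simple pipeline, optimistic (3x speed, +10% area)": (3.0, 1.1),
    "with hazard detection (2x speed, 2x area)": (2.0, 2.0),
}

for name, (speedup, area) in scenarios.items():
    ratio = perf_per_area(speedup, area)
    print(f"{name}: {ratio:.2f}x throughput per unit area")
```

Under these assumptions the simple pipeline comes out well ahead per unit of area, while a core whose area doubles for a 2x speedup only breaks even. On a small FPGA the absolute area may matter more than this ratio, since a design that doesn't fit can't run at all.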