So I guess the premise is that there are two extremes: a minimal instruction set (RISC) and a complex one (a CPU designed to run Python, for example).
IMO, as is often the case, the answer lies in the middle. Look at the tremendous impact that adding AES acceleration instructions (AES-NI) to x86 processors has had on applications that require encryption.
That seems to be what he's arguing, actually: that you can get hardware speedups by carefully targeting very specific things with a parsimonious addition of hardware features.
Never noticed that tremendous impact. In fact, one of the reasons Rijndael became AES was that it was already very fast on contemporary 32-bit computers.