Well, the problems with FPGAs have more to do with the synthesizability of high-level HDL and having to fit that generic representation into a predefined LUT model. That's alleviated quite a bit when you design an 'application-specific' FPGA with a different substrate. IIRC MathStar tried to do this some time ago, and so did Ambric. At the time it suffered from solution-looking-for-a-problem syndrome and failed.
The big idea here is that IF your base element (the memristor crossbar here) is suitable for such a rapidly reconfigurable bus architecture (which it seems like it is), then you can use it to synthesize a single neuron directly. That is a huge leap over the next-best GPU/TPU-based architectures, which are built on the instruction fetch-decode-execute model. Based on what I read a few years ago, you can have about 20M neurons simulated with memristors on a die of about 1 cm². That is human-level integration density, even if you totally ignore the vast difference in switching rate (100 Hz vs 1+ GHz).
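To make the "neuron directly in the crossbar" point a bit more concrete, here is a rough toy sketch in Python. It assumes an idealized crossbar where each column is one neuron: every memristor passes a current equal to input voltage times conductance (Ohm's law), the column wire sums those currents (Kirchhoff's current law), and a simple threshold stands in for the firing decision. The function names, the threshold activation, and all the numbers are my own assumptions for illustration, not taken from any real device or paper.

```python
# Toy model of an idealized memristor crossbar (illustrative assumptions only).
# Each column of the crossbar plays the role of one neuron; the memristor
# conductances in that column play the role of its synaptic weights.

def column_current(voltages, conductances):
    """Current collected on one column wire.

    Each device contributes V_i * G_i (Ohm's law); the shared column wire
    adds those contributions together (Kirchhoff's current law).
    """
    return sum(v * g for v, g in zip(voltages, conductances))


def crossbar_layer(voltages, conductance_columns, threshold):
    """Evaluate every column (neuron) against the same input voltages.

    conductance_columns[j] holds the conductances of column j.
    A column "fires" (outputs 1) when its summed current exceeds the
    threshold. The hard threshold is a hypothetical stand-in for whatever
    activation circuit a real design would use.
    """
    return [
        1 if column_current(voltages, column) > threshold else 0
        for column in conductance_columns
    ]


# Three input lines, two neurons (columns). Values are arbitrary.
inputs = [1.0, 0.0, 1.0]
weights = [
    [0.5, 0.5, 0.5],  # neuron 0
    [0.1, 0.9, 0.1],  # neuron 1
]
outputs = crossbar_layer(inputs, weights, threshold=0.6)
print(outputs)
```

The point of the sketch is that the weighted sum is not a loop of fetched instructions in the hardware version: it happens in one step as physics in the wires, and "reprogramming" the neuron just means changing the conductances.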