I've only skimmed it, but the gist seems to be that, on MNIST, the FF algorithm comes within about 30% of the effectiveness of classic backpropagation. I didn't quite follow in my quick reading how the network can generate its own negative data, but that seems to be where the future research would head. Section 8 seemed to come out of the blue; it discusses alternative hardware models for machine learning.
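For anyone curious about the core idea, my rough understanding of the layer-local objective is something like the sketch below. This is my own simplification in PyTorch, not the paper's code; the threshold value, optimizer, and the way negatives are fed in are assumptions on my part. Each layer just tries to push its "goodness" (sum of squared activities) above a threshold for positive data and below it for negative data, with no gradients flowing between layers.

```python
# Rough sketch of a layer-local "goodness" objective as I understood it from
# the paper -- NOT the authors' code. The threshold and optimizer choices here
# are my own guesses for illustration.
import torch
import torch.nn.functional as F

class FFLayer(torch.nn.Module):
    def __init__(self, d_in, d_out, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = torch.nn.Linear(d_in, d_out)
        self.threshold = threshold
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # Normalize the input length so goodness can't simply be copied
        # forward from the previous layer.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return F.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # Goodness = sum of squared activities; push it above the threshold
        # for positive data and below it for negative data.
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)
        loss = F.softplus(torch.cat([self.threshold - g_pos,
                                     g_neg - self.threshold])).mean()
        self.opt.zero_grad()
        loss.backward()  # gradients stay local to this layer
        self.opt.step()
        # Detach outputs so no gradient flows between layers
        # (i.e. no backpropagation through depth).
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```

The part I still don't follow is where the negative data comes from when the network generates it itself; in the supervised MNIST setup it's apparently just images paired with wrong labels, but the self-generated case seems to be the open question.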
I was at the NeurIPS talk yesterday where he introduced this paper, and it made a bit more sense in the context of the entire work.
His argument is basically that this method is more amenable to analog implementation, and that eventually, he believes, we will throw away the separation of hardware and software that has defined digital computation to date. Currently, we expect hardware to run the same software the same way every time, but to make things more efficient he wants to "grow" the hardware and software together, to the point where you cannot take a piece of software and run it on a different device. The program is instead learned specifically for that hardware, and that hardware alone.
He also makes the point that this won't displace digital computers; instead, it will be another class of computing hardware.