Those interested in optimal neural network compression might consider the paper "Bitwise Neural Networks" by Kim and Smaragdis (http://paris.cs.illinois.edu/pubs/minje-icmlw2015.pdf), which enables much better compression than simple quantization and pruning.
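For anyone who hasn't read it: the core trick is constraining weights and activations to +/-1, so each multiply-accumulate collapses to an XNOR plus a popcount in hardware. A toy NumPy sketch of that kind of layer (layer sizes and names are mine, not from the paper):

    import numpy as np

    def binarize(x):
        # Map real values to {-1, +1} by sign (0 is sent to +1).
        return np.where(x >= 0, 1, -1).astype(np.int32)

    rng = np.random.default_rng(0)
    W = binarize(rng.standard_normal((4, 8)))  # hypothetical 8-in, 4-out layer
    x = binarize(rng.standard_normal(8))

    # With +/-1 entries, the dot product is equivalent to XNOR + popcount
    # on packed bits; a plain integer matmul emulates that here.
    y = binarize(W @ x)
    print(y)  # binary activations in {-1, +1}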
What do you mean by "much better compression"? Won't replacing 32-bit weights with single-bit ones save at most 32x the memory[1]? Han et al. show not only 35-49x compression, but on much harder benchmarks (AlexNet/VGG, versus MNIST for the bitwise paper).
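Quick back-of-envelope numbers (the parameter count is a rough AlexNet-scale figure I'm assuming, not from either paper):

    # 32-bit float weights vs. 1-bit binarized weights, packed 8 per byte
    n_params = 60_000_000              # ~AlexNet-scale, for illustration
    fp32_bytes = n_params * 4
    binary_bytes = n_params / 8
    print(fp32_bytes / binary_bytes)   # 32.0 -- the ceiling for pure binarization

So pure binarization tops out at 32x, while Han et al.'s pruning + quantization + Huffman coding stack is what pushes past that to 35-49x.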
Combining these two techniques would be really cool, and if bitwise networks can scale to larger, more complex models like VGG, it would be a massive game-changer, allowing these nets to fit on almost any device.