
As others have mentioned, there are approaches like regularisation and dropout which try to do similar things. What I find interesting is that there are two reasons to do this: to generalise/avoid overfitting, and to reduce resource usage.

It seems like almost all effort is spent on the former, since everyone's aiming for higher accuracy numbers. Are there any widely-used methods to tackle the latter?

For example, I'm imagining a system which is either given measurements of its resource usage (time, memory, etc.) or uses some simple predictive model (e.g. time ~ number of layers * some constant), and works within some resource bound (a rough sketch follows the list below):

- If we're below the bound, expand the model (add neurons, etc.) to allow accuracy increases (note "allow": it's ok to ignore/regularise-to-zero the extra parameters to avoid overfitting)

- If we're above the bound, prune the model (in a way which tries to preserve accuracy)

- Allocate resources to optimise some objective, e.g. reduce variance by pruning the parameters of the best-performing class/predictor/etc. and using those resources to expand the worst performer.
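To make that concrete, here's a minimal sketch of the kind of control loop I mean, in PyTorch. The time budget, growth/shrink factors, and layer sizes are all made up, and it rebuilds the model from scratch; a real version would transfer the existing weights and prune specific units instead.

    # Hypothetical sketch: keep an MLP's forward-pass time under a budget by
    # growing the hidden layer when there's headroom and shrinking it when not.
    import time
    import torch
    import torch.nn as nn

    TIME_BUDGET = 0.005  # seconds per forward pass (made-up bound)

    def make_mlp(hidden):
        return nn.Sequential(nn.Linear(64, hidden), nn.ReLU(), nn.Linear(hidden, 10))

    def avg_forward_time(model, x, reps=20):
        with torch.no_grad():
            start = time.perf_counter()
            for _ in range(reps):
                model(x)
        return (time.perf_counter() - start) / reps

    hidden = 32
    model = make_mlp(hidden)
    x = torch.randn(128, 64)

    for step in range(10):
        t = avg_forward_time(model, x)
        if t < TIME_BUDGET:
            hidden = int(hidden * 1.5)           # below the bound: allow growth
        else:
            hidden = max(8, int(hidden * 0.75))  # above the bound: prune back
        model = make_mlp(hidden)                 # real code would keep/transfer weights
        print(f"step {step}: {t * 1e3:.2f} ms/forward, hidden -> {hidden}")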

The closest thing I know of are artificial economies, but they seem to be more like a selection mechanism (akin to genetic programming) than a direct optimisation procedure (like gradient descent on an ANN).




There are many ways to compress networks: by pruning neurons, by enforcing sparsity, by representing activations and gradients with one bit (or a few bits), and by distillation, where a large network's knowledge is transferred into a smaller one.
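For instance, magnitude pruning fits in a few lines of PyTorch; the layer size and sparsity level here are purely illustrative:

    import torch
    import torch.nn as nn

    layer = nn.Linear(256, 256)
    sparsity = 0.8  # fraction of weights to zero out (illustrative)

    with torch.no_grad():
        w = layer.weight
        k = int(sparsity * w.numel())
        threshold = w.abs().flatten().kthvalue(k).values  # k-th smallest magnitude
        mask = (w.abs() > threshold).float()
        w.mul_(mask)  # keep the mask around to re-apply after each update

    print(f"non-zero weights: {int(mask.sum())} / {mask.numel()}")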


Yes, my question was more about meta-level algorithms for balancing size against performance. Especially adaptive methods, so that we're not just growing up to a limit and stopping, but selectively allocating resources to the parts that need them. Adapting over time would be nice too: "thinking harder" when there are idle resources, but shrinking the results back down under load.


This paper http://dl.acm.org/citation.cfm?id=2830854 offers one approach to efficiency. It uses two networks, running the smaller (more efficient) one first for inference. If that prediction is confident (the probability of one class is much larger than the probability of any other class), there is no need to run the big (expensive) network.
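A rough sketch of that cascade idea (the models and the confidence gap are placeholders, not taken from the paper):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    small = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))    # cheap
    big = nn.Sequential(nn.Linear(64, 512), nn.ReLU(), nn.Linear(512, 10))    # expensive

    def cascade_predict(x, gap=0.5):
        """Return (predicted class, whether the big net was needed) for one input."""
        with torch.no_grad():
            probs = F.softmax(small(x), dim=-1)
            top2 = probs.topk(2)
            if top2.values[0] - top2.values[1] > gap:  # small net is confident enough
                return int(top2.indices[0]), False
            probs = F.softmax(big(x), dim=-1)          # otherwise pay for the big net
            return int(probs.argmax()), True

    print(cascade_predict(torch.randn(64)))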



