For a state-of-the-art CV framework that fits with the free energy principle, check out the Recursive Cortical Network from George et al. (2017), "A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs": http://science.sciencemag.org/content/358/6368/eaag2612
Max-product BP inference propagates local uncertainties in the model to arrive at a globally coherent solution. An example of where this is particularly useful is resolving border-ownership amongst neurons representing a common contour.
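If it helps to make the max-product idea concrete, here's a tiny generic sketch in Python: max-product message passing on a three-node chain, where each node combines its local evidence with maximised messages from its neighbours to pick a globally coherent assignment. This is just an illustration of the algorithm, not the RCN code; the potentials and sizes are made-up numbers.

```python
# Minimal max-product belief propagation on a 3-node chain MRF.
# Generic illustration only -- not the RCN implementation; the
# potentials and variable names here are invented for the example.
import numpy as np

K = 3  # number of states per node
# Unary potentials (local evidence) for nodes 0, 1, 2.
unary = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.2, 0.6]])
# Shared pairwise potential favouring agreement between neighbours.
pairwise = np.full((K, K), 0.1) + 0.8 * np.eye(K)

# Forward pass: message from node i to node i+1, taking a max
# (rather than a sum) over the sending node's states.
msg_fwd = [np.ones(K)]
for i in range(2):
    belief = unary[i] * msg_fwd[-1]
    msg_fwd.append((pairwise * belief[:, None]).max(axis=0))

# Backward pass, same idea in the other direction.
msg_bwd = [np.ones(K)]
for i in range(2, 0, -1):
    belief = unary[i] * msg_bwd[-1]
    msg_bwd.append((pairwise * belief[None, :]).max(axis=1))
msg_bwd = msg_bwd[::-1]

# Max-marginals: each node's local evidence combined with the
# maximised messages from both sides -> globally consistent MAP states.
for i in range(3):
    max_marginal = unary[i] * msg_fwd[i] * msg_bwd[i]
    print(i, max_marginal.argmax(), max_marginal)
```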
Disclaimer: I am not the author of the page. I've read the tutorial paper they refer to, and work with related material. In our lab, we've had a very hard time finding accessible material on predictive coding that actually gets down to equations and code, so this is something of a godsend.
Anybody familiar with this method? It looks very intriguing, as it seems to combine Bayesian posteriors with the neural activation function. Haven't had time to dig into it myself, and it would be nice to hear an outside perspective on it.
A few years back I spent some time reading and following the equations in Friston's papers, maybe understanding like 90% of it. You have to know dynamical systems, differential equations, multivariate matrix stuff, etc. Basically stuff physicists are good at.
It seemed that the theory wasn't detailed enough to inspire the next deep NN revolution; basically, the form of the generative model Friston used is very general. My impression was that the coded MATLAB examples required you to specify known quantities, like velocity or position. But that requires a human to input them, unlike a NN where you can just point it at some data and it learns. Would love to be shown otherwise...
Nice! Finally I could put my physicist training to use... I skimmed the cited papers but they seemed very generalized, as you mentioned. I'll have to find the MATLAB code examples to understand how a complete system model would look. The linked post seems to deal with just a single perceptron/activation function AFAICT. The embedded constants don't bother me too much, as physics models have embedded constants too, and I presume evolved neurobiological systems would have been tuned over time to incorporate the appropriate constants for dealing with useful quantities (force, mass, etc.). Could be fun to port the MATLAB code to Julia!
I'd be interested to hear any advice you might have on understanding the papers. Particularly, how do you understand the "priors" that specify goals/preferences/set-points in active inference?
The proposed method is wasteful in terms of the energy spent to get an answer, specifically in this step: "sum the whole range of possible sizes", even with approximations and clever algorithms.
Perception is much more economical, as it's done via memorized heuristics that restrict the search space very quickly.
As a rule of thumb, if your method requires many iterations to converge on some minimum, it's the wrong method to model perception. The brain doesn't solve a mathematical optimization problem.
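For context, the "sum the whole range of possible sizes" step is the exact Bayesian posterior for the post's toy problem (inferring a size v from a noisy intensity u). A rough Python sketch of what that brute-force sum looks like, assuming the usual g(v) = v^2 mapping and illustrative prior/noise values rather than the post's exact numbers:

```python
# Exact Bayesian posterior for the toy problem, obtained by literally
# summing over a grid of candidate sizes v -- the step the parent
# comment calls wasteful. All parameter values are illustrative
# assumptions, not quoted from the post.
import numpy as np

def gaussian(x, mean, var):
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

g = lambda v: v ** 2        # assumed generative mapping: size -> intensity
v_p, sigma_p = 3.0, 1.0     # prior mean and variance of the size (assumed)
sigma_u = 1.0               # observation noise variance (assumed)
u = 2.0                     # observed light intensity (assumed)

v_grid = np.linspace(0.01, 5.0, 500)  # "the whole range of possible sizes"
joint = gaussian(v_grid, v_p, sigma_p) * gaussian(u, g(v_grid), sigma_u)
posterior = joint / (joint.sum() * (v_grid[1] - v_grid[0]))  # normalise

print("posterior mode:", v_grid[posterior.argmax()])
```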
"We assume that this signal [i.e. u] is normally distributed with mean g(v) and variance Σ_v."
That means that there's no deterministic function that fixes u for a given v, but only a distribution of possible values. (It's closer to u = v^2 than the opposite, though.) The precise relationship is expressed symbolically in the likelihood function given in the next part.
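Concretely, the iterative scheme such tutorials build from that likelihood can be sketched as below: a minimal sketch, assuming g(v) = v^2 and picking illustrative values for the prior mean/variance, noise variance, observation, and step size (none of these are taken from the post). Two prediction errors, one against the prior and one against the sensory signal, nudge the estimate of v until they balance.

```python
# Minimal predictive-coding style inference for the quoted model:
# u ~ N(g(v), Sigma_u) with g(v) = v^2 and a Gaussian prior on v.
# A sketch of the standard gradient scheme from tutorials on the topic,
# not the linked post's exact code; all constants are assumed.
import numpy as np

g = lambda v: v ** 2
dg = lambda v: 2 * v          # derivative of g

v_p, sigma_p = 3.0, 1.0       # prior mean and variance on the size v (assumed)
sigma_u = 1.0                 # variance of the sensory signal u (assumed)
u = 2.0                       # observed signal (assumed)

phi = v_p                     # current estimate of v, initialised at the prior
lr = 0.01                     # step size for the gradient updates
for _ in range(500):
    eps_p = (phi - v_p) / sigma_p        # prior prediction error
    eps_u = (u - g(phi)) / sigma_u       # sensory prediction error
    phi += lr * (eps_u * dg(phi) - eps_p)  # gradient ascent on the log joint

print("most likely v:", phi)
```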
For the neural corollaries of predictive coding, check out Shipp (2016) "Neural Elements for Predictive Coding": https://www.frontiersin.org/articles/10.3389/fpsyg.2016.0179...