Shameless plug: I wrote some high-quality (I hope) notes on HMC, because I couldn't find an explanation that (a) was rigorous, (b) explained the different points of view on MCMC, and (c) was visual! Here are the notes: https://github.com/bollu/notes/blob/master/mcmc/report.pdf
Looks cool. Also, I wish that when I started university somebody had forced me to start a "notes" repo like yours. Looks great and very smart to have.
Your code contains a small bug: in the leapfrog HMC function, the line pnext = -p should be pnext = -pnext. That said, the line is only of theoretical importance; it has no effect on the final result (unless you use a weird asymmetric kinetic energy).
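For reference, here's a minimal sketch of what a corrected leapfrog integrator could look like (my own illustration, not the code from the notes), with the negation applied to the updated momentum:

```python
import numpy as np

def leapfrog(q, p, grad_U, eps, L):
    """Leapfrog-integrate Hamilton's equations for L steps of size eps.

    U is the potential (negative log density); kinetic energy p.p/2 is assumed.
    """
    q, p = q.copy(), p.copy()
    p -= 0.5 * eps * grad_U(q)        # initial half step for momentum
    for _ in range(L - 1):
        q += eps * p                  # full step for position
        p -= eps * grad_U(q)          # full step for momentum
    q += eps * p                      # last full step for position
    p -= 0.5 * eps * grad_U(q)        # final half step for momentum
    # Negate the *updated* momentum (the fix discussed above). This makes the
    # proposal an involution: applying it twice returns you to the start.
    return q, -p
```

With a symmetric kinetic energy the negation indeed cancels out of the M-H acceptance probability, which is why the bug is invisible in practice.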
Oh man, these are great notes, very succinct! I'm not much into proofs, but finally picked up the essence of Hamiltonian M-H thanks to this. (Also, I can't imagine what kind of genius it takes to rigorously derive such a method...)
Well, my method so far has been to at least be familiar with some of the theoretical stuff and then try to figure out how to apply it in depth when the opportunity to do so arises. Not sure if there are references where you can directly jump to the applied bit.
Most of those things are stuff you'll encounter if you read up on Hamiltonian MC and information field theory, but it might take some additional reading to get all the required background knowledge.
The first two are just tricks really. For the weighted estimates you just go from:
log(P) = \sum_i log(P_i)
to
log(P) = \sum_i w_i log(P_i)
and for the sufficient statistics you get derivations like:
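(The derivation is cut off above; here's a representative example of the kind of thing meant, not necessarily the one the author had in mind. For iid Gaussian observations x_i with mean mu:)

```latex
\log P = \sum_i \log P(x_i \mid \mu)
       = -\frac{1}{2\sigma^2} \sum_i (x_i - \mu)^2 + \mathrm{const}
       = -\frac{1}{2\sigma^2} \Big( \sum_i x_i^2 - 2\mu \sum_i x_i + n\mu^2 \Big) + \mathrm{const}
```

The parameter \mu only ever meets the data through \sum_i x_i and \sum_i x_i^2 (plus n), so those are the sufficient statistics: precompute them once and you never loop over the data again.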
I wonder how much he knows; interpreting it as a thermodynamic ensemble is just going back to thinking about probabilities, really. Information field theory is something very different. The cheesy explanation for it is that it is just quantum field theory in imaginary time.
>it is just quantum field theory in imaginary time.
Quantum field theory is just thermodynamics in imaginary time; not sure why you object to calling it a thermodynamic ensemble. Besides, I could hardly introduce a subject as 'quantum field theory in imaginary time', now could I?
- Easy weighting for observations (just put a weight in front of the corresponding term in the sum)
- Identify and isolate sufficient statistics (by breaking up the summation until you've got terms that only contain data, no parameters)
- Just add a quadratic term and pretend it's kinda Gaussian now (esp. fun if you pretend one of your parameters is generated with a Gaussian process).
- Take the expected value of it w.r.t. some other distribution (Cross entropy / Variational Bayes)
- Pretend the negative log probability is a Hamiltonian and add some momentum + kinetic energy terms (Hamiltonian MC)
- Same but skip the kinetic bit and pretend it's a thermodynamic ensemble (information field theory)
- Same but perform a Legendre transformation to get the Effective action / Free energy (ends up being the same as the KL-divergence)
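The first trick in the list is literally a couple of lines. A sketch with a hypothetical Gaussian likelihood and made-up numbers:

```python
import numpy as np

# Per-observation Gaussian log-likelihood (a made-up example model)
def log_p(x, mu, sigma=1.0):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

x = np.array([0.1, -0.3, 2.5])   # 2.5 looks like an outlier
w = np.array([1.0, 1.0, 0.2])    # so downweight its term in the sum

logL_unweighted = log_p(x, mu=0.0).sum()       # \sum_i log(P_i)
logL_weighted = (w * log_p(x, mu=0.0)).sum()   # \sum_i w_i log(P_i)
```

Since the weights only scale terms in a sum, whatever gradient machinery you use for HMC is unchanged.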