Thanks. About the second part of my question - are you doing much the same stuff as JAGS/Stan? Like, they do a lot of work to make sure that their MCMC is validly converging to the posterior - does Bayadera make similar guarantees?

Is the speedup coming from a better implementation, or because GPUs are just way faster, or because it cuts statistical corners? If it's cutting corners, are they sensible?




It uses a different MCMC algorithm: affine invariant ensemble MCMC. The difference comes from the fact that this algorithm is parallelizable, while JAGS/Stan's isn't. So, many GPU cores are the main factor. But the algorithm is also a factor, in the sense that the parallel chains always mutually inform each other.
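
To make the "parallelizable" part concrete, here is a minimal Python sketch of one "stretch move" from the Goodman & Weare affine invariant ensemble sampler, the same family of algorithm. This is not Bayadera's code; the toy log-density and all names are illustrative. The point to notice is that each half of the walker ensemble is updated using only partners from the other half, so every walker within a half can be moved independently, which is what maps onto many GPU cores.

    import numpy as np

    def log_density(x):
        # Toy target: a standard multivariate normal, standing in for a real posterior.
        return -0.5 * np.sum(x * x)

    def stretch_move(walkers, log_p, a=2.0, rng=None):
        # One Goodman & Weare "stretch move" over the whole ensemble.
        # Each half is updated using partners drawn from the other half, so
        # all walkers within a half can be moved independently (in parallel).
        rng = rng or np.random.default_rng()
        n, dim = walkers.shape
        half = n // 2
        new = walkers.copy()
        logp = np.array([log_p(w) for w in new])
        for first, second in [(slice(0, half), slice(half, n)),
                              (slice(half, n), slice(0, half))]:
            active, partners = new[first], new[second]
            m = active.shape[0]
            # Stretch factor z ~ g(z) proportional to 1/sqrt(z) on [1/a, a].
            z = ((a - 1.0) * rng.random(m) + 1.0) ** 2 / a
            picks = partners[rng.integers(0, partners.shape[0], size=m)]
            proposal = picks + z[:, None] * (active - picks)
            logp_prop = np.array([log_p(p) for p in proposal])
            # Accept with probability min(1, z^(dim-1) * p(proposal) / p(current)).
            cur = logp[first]
            accept = np.log(rng.random(m)) < (dim - 1) * np.log(z) + logp_prop - cur
            active[accept] = proposal[accept]
            cur[accept] = logp_prop[accept]
        return new

    # Usage: 64 walkers in 3 dimensions, a few hundred ensemble moves.
    walkers = np.random.default_rng(0).normal(size=(64, 3))
    for _ in range(500):
        walkers = stretch_move(walkers, log_density)

On a GPU, the loop over walkers within a half becomes one thread per walker, which is where the bulk of the speedup comes from.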

They may do a lot of work to make sure that MCMC is validly converging, and Bayadera also does its stuff on that front, but the truth is, and you'll find this in any book on MCMC (Gelman's included), that you can never guarantee MCMC convergence.
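
For a concrete example of the kind of check that can flag non-convergence without ever proving convergence, here is a minimal sketch of the split R-hat diagnostic (Gelman et al.). This is not Bayadera's implementation; the function name and array shapes are illustrative.

    import numpy as np

    def split_r_hat(chains):
        # Split R-hat (Gelman-Rubin style) for one scalar parameter.
        # `chains` has shape (m, n): m independent chains of n draws each.
        # Values close to 1.0 are consistent with convergence; large values
        # flag a problem, but a value near 1.0 is never a guarantee.
        m, n = chains.shape
        half = n // 2
        split = chains[:, :2 * half].reshape(2 * m, half)  # split each chain in two
        within = split.var(axis=1, ddof=1).mean()
        between = half * split.mean(axis=1).var(ddof=1)
        var_hat = (half - 1) / half * within + between / half
        return np.sqrt(var_hat / within)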


Looks very nice. I wonder if the upcoming Xeon Phi will make the task of parallel sampling simpler, or at least make it easier to compile and optimise automatic parallel samplers on the fly. Macros might be great for this. That's the ultimate probabilistic programming goal: write the model and get efficient sampling for free.


Thanks. I doubt that Xeon Phi would be any faster than my old AMD R9 290X, and the 10x price tag is also not inviting.


Can you point me to any good documentation on parallel MCMC algorithms and any info you might have written down on how you parallelized it? This sounds extremely worth porting over to some other probabilistic programming languages.


I'll be glad to send you the paper once it gets accepted.


Thanks!



