Hacker News
Interactive Numerical Optimization Tutorial (benfrederickson.com)
142 points by reachtarunhere on Nov 26, 2016 | 9 comments



Author here! This post is really just me messing around with JavaScript and trying to make some pretty graphs with D3. All the code is up on my GitHub account, http://github.com/benfred/fmin, if you want to check it out.

While these algorithms maybe aren't the most useful thing to have in JS, I am using both the Nelder-Mead and CG code to calculate area-proportional Venn diagrams in my venn.js project (which is why I originally wrote all this code in the first place).


> CG code to calculate area proportional venn diagrams

Is the matrix in this case guaranteed to be positive definite? I would expect a non-trivial nullspace, and hence a semi-definite matrix.


For the venn code, I wrote up the approach here: http://www.benfrederickson.com/better-venn-diagrams/

Basically though, I'm using the non-linear CG method, so it doesn't require a positive definite matrix. The loss function is a little funky in how it handles the disjoint-set/subset relationships in the Euler diagrams (it defines the loss/gradient to be 0 if these constraints are satisfied), but this approach still works pretty well.
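To make that "loss is 0 once the constraint is satisfied" idea concrete, here's a toy sketch (hypothetical code, not the actual venn.js implementation) for a single subset constraint, where circle A is supposed to sit entirely inside circle B:

```javascript
// Hypothetical sketch of a clamped loss for a subset constraint:
// circle A should be fully inside circle B, i.e. the overlap area
// should equal all of A's area. Once it does, the loss (and hence
// the gradient) is defined to be 0, so satisfied constraints stop
// pushing the layout around.
function subsetLoss(overlapArea, areaA) {
  const shortfall = areaA - overlapArea; // uncovered part of A
  return shortfall <= 0 ? 0 : shortfall * shortfall;
}

function subsetLossGradient(overlapArea, areaA) {
  const shortfall = areaA - overlapArea;
  // derivative of shortfall^2 w.r.t. overlapArea, clamped to 0
  return shortfall <= 0 ? 0 : -2 * shortfall;
}
```

The clamp is what makes the objective "funky": it's only piecewise smooth, which is part of why a method like non-linear CG (which doesn't assume a fixed positive definite Hessian) is a reasonable fit.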

That venn diagram post has a couple of interactive demos of how this works, and also a randomized test showing overall performance.

I actually believe it's the best known algorithm for laying out area-proportional Venn diagrams. I benchmarked against the code from the venneuler paper here: http://benfred.github.io/venn.js/tests/venneuler_comparison/


This is an excellent post and I love the interactive demos.

> gradient descent ... ball rolling down a hill

Gradient descent is similar to a ball with no mass falling down a hill. Other methods (heavy-ball and Nesterov acceleration) are similar to a heavier ball rolling down a hill. I've written a post [0] on this with a visualization.
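A quick toy sketch (my own code, not from either post) contrasting the two updates on the quadratic bowl f(x, y) = x^2 + 10y^2; the only difference is the velocity term that heavy-ball carries between steps:

```javascript
// Gradient of f(x, y) = x^2 + 10*y^2
function grad(p) { return [2 * p[0], 20 * p[1]]; }

// Plain gradient descent: step directly down the gradient.
function gradientDescent(p, lr, steps) {
  for (let i = 0; i < steps; i++) {
    const g = grad(p);
    p = [p[0] - lr * g[0], p[1] - lr * g[1]];
  }
  return p;
}

// Heavy-ball momentum: the velocity keeps a fraction of the
// previous step -- this is the "mass" of the ball.
function heavyBall(p, lr, momentum, steps) {
  let v = [0, 0];
  for (let i = 0; i < steps; i++) {
    const g = grad(p);
    v = [momentum * v[0] - lr * g[0], momentum * v[1] - lr * g[1]];
    p = [p[0] + v[0], p[1] + v[1]];
  }
  return p;
}
```

Both converge to the minimum at the origin here; the interesting differences show up in the path taken and in the convergence rate on badly conditioned problems.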

Given a quadratic model (as in most examples in the post), there is some strong theory on convergence rates. Nesterov acceleration has at least linear convergence (which really means exponential). More detail can be found in Nocedal and Wright.

(I'm currently in Steve Wright's Nonlinear Optimization I)

[0]: http://scottsievert.com/blog/2016/01/30/inverse-3/


> Gradient descent is similar to a ball with no mass falling down a hill. Other methods (heavy-ball and Nesterov acceleration) are similar to a heavier ball rolling down a hill. I've written a post [0] on this with a visualization.

I don't think mass is the right analogy -- if you non-dimensionalize a ball on a hill, the mass shouldn't even appear. Maybe critically vs. under-damped (gradient descent isn't allowed to overshoot, Nesterov methods are)?

Edit: OK, with a viscous drag force the analogy sort of works: a heavier sliding particle will overshoot. A massless particle will not move at all, however...
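The overshoot distinction is easy to check numerically. A toy 1-D experiment (my own code; the step size and momentum values are arbitrary): on f(x) = x^2, plain gradient descent with a stable step decays monotonically toward the minimum, while the heavy-ball update oscillates past it.

```javascript
// Does the iterate ever cross the minimum at x = 0?
// f(x) = x^2, gradient 2x; start at x = 1 with step size 0.1.
function iterates(momentum, steps) {
  let x = 1, v = 0;
  const xs = [];
  for (let i = 0; i < steps; i++) {
    v = momentum * v - 0.1 * (2 * x); // momentum = 0 is plain gradient descent
    x += v;
    xs.push(x);
  }
  return xs;
}

const plainCrossed = iterates(0.0, 50).some(x => x < 0); // false: decays monotonically
const heavyCrossed = iterates(0.9, 50).some(x => x < 0); // true: overshoots within a few steps
```

With momentum 0.9 the iterate goes negative by the fourth step, which matches the damped-oscillator picture: the characteristic roots become complex, so the trajectory rings around the minimum instead of sliding into it.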


I think the best way to think of it is a heavy, rolling ball.


Your visualization of using momentum with gradient descent in that post is really great - nice work!


I've generated some other interactive visualizations. Some try to give intuition for regularization [0], and another [1] explores colorspaces.

I use ipywidgets-static to generate these and am looking forward to the upcoming release of ipywidgets that supports static widgets.

[0]: http://scottsievert.com/blog/2015/12/09/inverse-part-2/
[1]: http://scottsievert.com/blog/2015/04/23/image-sqrt/


Nice visualization!

You should add L-BFGS, in my experience a very efficient and versatile algorithm. It is a jewel of optimization.



