Traveling Salesman Problem Visualization [video] (youtube.com)
89 points by CarolineW on May 9, 2016 | hide | past | favorite | 17 comments



A much better method than those that are shown in the video is to optimize over the subtour elimination polytope by using a cutting plane algorithm. The necessary separation problem for the set of the subtour elimination inequalities can be solved by computing a minimum cut.

This usually gives a very good lower bound for the value of the minimum tour. Unfortunately the optimum solution over the subtour elimination polytope can be fractional. This problem can for example be solved by branch & bound (or even better branch & cut) algorithms or by randomized rounding (a more heuristic method for which, in general, one can nevertheless often prove rigorous mathematical results).

Full Disclosure: I am currently writing a PhD thesis in mathematics on some geometric questions in the theory of cutting planes.


Could you either 1) simplify the explanation using simpler English or 2) provide a paper or thesis which illustrates your point?


> 2) provide a paper or thesis which illustrates your point?

The typical recommendations about this topic are

Schrijver: Combinatorial Optimization - Polyhedra and Efficiency; Volume B, Part V, chapter 58

(often called "the bible of combinatorial optimization") and

Korte, Vygen: Combinatorial Optimization - Theory and Algorithms (5th edition, not that this should make a difference); chapter 21


Remember linear programs? E.g.

min 10x + 5y
s.t. x <= 10
     y >= 5
     x + y = 8

(Not a very useful example.)
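A toy LP like this can be solved mechanically; here is a sketch using scipy.optimize.linprog (adding the assumption x, y >= 0, which linprog imposes by default and without which this LP would be unbounded below):

```python
from scipy.optimize import linprog

# min 10x + 5y  s.t.  x <= 10,  y >= 5,  x + y = 8,  x, y >= 0
res = linprog(
    c=[10, 5],                # objective coefficients
    A_ub=[[1, 0], [0, -1]],   # x <= 10  and  -y <= -5 (i.e. y >= 5)
    b_ub=[10, -5],
    A_eq=[[1, 1]],            # x + y = 8
    b_eq=[8],
)
print(res.x, res.fun)  # optimum at x = 0, y = 8 with objective value 40
```

With x + y = 8 fixed, the objective reduces to 5x + 40, so the solver pushes x to its lower bound.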

Now we can extend normal linear programs to what is known as Mixed-integer linear programming (MILP). This allows us to specify the domain of variables, i.e. we can say variable x must be a positive integer or variable y must be binary.

(This extension instantly makes solving generic MILPs NP-hard, because we can obviously transform, e.g., any 3-SAT instance into an equivalent MILP.)

MILPs can also be used to solve the Traveling Salesman Problem (TSP). You use one binary variable per edge (i.e. per pair of cities the salesman could travel between) that specifies whether the edge is used or not. Since you want a loop, you also add a constraint for each city that it is entered and exited exactly once, expressed in terms of the edge variables.
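In the usual notation (with δ(v) denoting the set of edges incident to city v), these degree constraints read:

```latex
\sum_{e \in \delta(v)} x_e = 2 \qquad \text{for every city } v
```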

At this point, you could think we are done: we enter and exit each city once, so we have a loop and since we minimise the sum of edge costs, we also have the optimal loop.

Unfortunately, our formulation is incomplete because it allows a loophole for the solver: doing multiple tours or subtours.

If we have a TSP with 5 cities, the solver could choose the edges such that you have two loops or subtours, e.g. one with 3 and another with 2 cities. That would be perfectly consistent with our formulation, since each city is still entered and exited exactly once. It's not a valid solution for the TSP, however, since we require one single loop, not two unconnected ones.
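This loophole is easy to demonstrate in a few lines (plain Python; six made-up cities and two triangles for simplicity, since a 2-city loop would need an edge twice): every degree constraint holds, yet the chosen edges do not form a single connected tour.

```python
# Hypothetical 'solution': two disjoint triangles instead of one tour.
cities = range(6)
chosen_edges = [(0, 1), (1, 2), (2, 0),   # subtour A
                (3, 4), (4, 5), (5, 3)]   # subtour B

# Every degree constraint is satisfied: each city touches exactly 2 edges...
degree = {c: sum(c in e for e in chosen_edges) for c in cities}
assert all(d == 2 for d in degree.values())

# ...yet the edge set is not connected, so it is not a single tour.
reachable, frontier = {0}, [0]
while frontier:
    u = frontier.pop()
    for a, b in chosen_edges:
        for v in ((b,) if a == u else (a,) if b == u else ()):
            if v not in reachable:
                reachable.add(v)
                frontier.append(v)
print(sorted(reachable))  # [0, 1, 2] -- city 3 is never reached
```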

To fix this problem, we need to add what are known as "subtour elimination constraints". These require that for every partition of our city set into two non-empty sets A and B, the cities in A and B are connected by the tour, i.e. at least two chosen edges cross between A and B (one leaving A for B and another returning from B to A).
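In the usual notation (with δ(A) denoting the set of edges with exactly one endpoint in A), these constraints can be written as:

```latex
\sum_{e \in \delta(A)} x_e \ge 2 \qquad \text{for every } \emptyset \ne A \subsetneq V
```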

Unfortunately, since we need to do this for every possible combination of sets A and B, we would have to add exponentially many of these subtour elimination constraints, and of course for very large TSPs even enumerating all sets A and B is already infeasible.

So what is done in practice is to first leave out all subtour elimination constraints, since this is what causes the exponential explosion (so far, our problem was polynomial in size). You then solve the linear relaxation of the MILP, i.e. you pretend that the binary constraints on the edge variables do not exist and use Simplex to solve the resulting linear problem.

Then you create a weighted graph: the cities become vertices, and the edge weights are given by the (fractional) values of the edge variables. For this graph, compute the minimum cut using your favourite polynomial time algorithm. This gives you two things: a partition of the cities into two sets A and B, and the value of the minimum cut between them. If that value is less than 2, you have found a violated subtour elimination constraint for the sets A and B. You add that one constraint to the solver (cutting off the invalid solution) and start anew.
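A rough sketch of this separation step (pure Python, no LP solver: the fractional edge values are taken as given, and the global minimum cut is found by running Edmonds-Karp max-flow from a fixed city to every other city; all names here are made up for illustration):

```python
from collections import deque

def max_flow(capacity, nodes, s, t):
    """Edmonds-Karp max-flow; capacity is a dict {(u, v): value}."""
    cap = dict(capacity)
    flow = 0.0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and cap.get((u, v), 0.0) > 1e-9:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Augment along the path found
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[(u, v)] for u, v in path)
        for u, v in path:
            cap[(u, v)] -= bottleneck
            cap[(v, u)] = cap.get((v, u), 0.0) + bottleneck
        flow += bottleneck

def find_violated_subtour_cut(nodes, edge_values):
    """Min-cut value over all cuts separating nodes[0] from another
    city; a value < 2 means a subtour constraint is violated."""
    cap = {}
    for (u, v), x in edge_values.items():  # undirected -> both directions
        cap[(u, v)] = cap.get((u, v), 0.0) + x
        cap[(v, u)] = cap.get((v, u), 0.0) + x
    s = nodes[0]
    return min(max_flow(cap, nodes, s, t) for t in nodes if t != s)

# Example: an LP 'solution' made of two disjoint triangles
xs = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
      (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0}
print(find_violated_subtour_cut(list(range(6)), xs))  # 0.0 -> cut violated
```

Fixing one endpoint s suffices for a global minimum cut, since s lies on one side of any cut; real solvers use faster dedicated min-cut algorithms.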

This process can be fully integrated into a branch & bound or branch & cut solver for MILPs.

(I'm sure wolfgke can give a much better explanation, though)


revelation already explained the idea pretty well, so I will give a different explanation that outlines some subtle points that are rarely mentioned in the literature (IMHO):

In discrete optimization one often has the job of minimizing a linear function over some set V of 0/1 vectors (for now this could also be an arbitrary set of points in R^d). The good news is: this is equivalent to optimizing over the convex hull of V, and thus optimizing over some polytope P. The bad news is: this polytope is often very complicated to describe in terms of inequalities (which exist by Minkowski's theorem). So give up the dream? No - we just approach it a little differently.

Let A, b be a rational (!) matrix/vector. Then a theorem by Meyer says that

conv {x \in Z^m \times R^n : Ax <= b}

again forms a rational polyhedron. Indeed lots of problems can be written in this form (for the TSP: the subtour elimination polytope). Again the problem is finding the inequalities that describe it in reasonable time. One way to handle this is to add inequalities iteratively until we reach the (mixed-)integer hull. Indeed one can show that if n = 0 (ILP, integer linear programming) or if the integral variables are 0/1-valued, such algorithms exist and can in principle be implemented (though they are rather slow; for the pure integer case, something to google for is "Gomory's cutting plane algorithm").

Note that in the general MILP setting, finding such an algorithm is still an unsolved problem, at least for practical purposes (yes, I know about the famous paper A. Del Pia, R. Weismantel - On convergence in mixed integer programming; I know about t-branch split disjunctions (Dash, Dobbs, Günlük, Nowicki, Swirszcz - Lattice-free sets, multi-branch split disjunctions, and mixed-integer programming); I know about how Cook-Kannan-Schrijver circumvented this problem (W. Cook, R. Kannan, A. Schrijver - Chvátal closures for mixed integer programming problems); so you don't have to correct me ;-) ). The theoretical side here is different, see the paper references - but I don't want to go into these very subtle details. This idea of iteratively adding inequalities as necessary is called a "cutting plane algorithm".

For the subtour elimination polytope, the number of inequalities is exponential in the number of vertices. Nevertheless, by LP duality and Carathéodory's theorem we need at most as many inequalities as variables to certify the optimality of a solution. So we "theoretically" just have to be clever and add the right inequalities ( ;-) ). For a given point, finding a violated inequality (or certifying that none is violated) is called the "separation problem". This problem can be solved for the subtour elimination polytope. The good news is: when it can be solved for an arbitrary point (not only a vertex), we can already optimize over the polyhedron in polynomial time (just apply the ellipsoid algorithm). The bad news is: the ellipsoid algorithm is painfully slow. So we have to be more clever (but here my knowledge ends - let's say: from here, practice begins ;-) ).

I remark that a modern approach to avoiding an exponential number of inequalities is to use so-called "extended formulations" (the basic idea is to formulate the polyhedron as a projection of a simpler polyhedron). This can be applied to the subtour elimination polytope (the best reference I can give is the PhD thesis of Kanstantin Pashkovich (full disclosure: I know him personally): http://edoc2.bibliothek.uni-halle.de/hs/urn/urn:nbn:de:gbv:m...). But in general this approach has strong limitations. For example, one can show that finding a maximum matching is possible in polynomial time (Edmonds' matching algorithm), but there exists no extended formulation of the matching polytope of polynomial size (http://arxiv.org/abs/1311.2369). If you are interested in the theory of extended formulations, besides the PhD thesis of Kanstantin Pashkovich I can recommend the famous Yannakakis paper that laid out the whole theory (Yannakakis - Expressing combinatorial optimization problems by Linear Programs, http://www.sciencedirect.com/science/article/pii/00220000919...) and the survey paper by Conforti, Cornuéjols, Zambelli (Extended formulations in combinatorial optimization).


No. All methods in the video are heuristics. Cutting-plane methods are only good for exact results (for example "Concorde", the fastest TSP solver, uses such methods, and even in the past (~1950s) such methods were used). But heuristics are for most problems much better (much faster). For example "LKH" (a program with a k-opt heuristic) can easily solve a TSP instance with >1000 nodes in a few seconds (and finds the optimum). Concorde (as I wrote, the fastest exact solver) needs hours for such problems.


How good are heuristic methods compared to exact solvers? Which kind of metric is used to compare these algorithms?


They are fast at giving you an answer but cannot tell you when you are at the optimum solution. You need some termination rule to stop the algorithm, such as time spent running, number of iterations, etc.

We [1] have used a heuristic [2] method to solve vehicle routing for a client where the optimum isn't important but certain features of the solution, such as the speed of the answer, are.

[1] http://www.biarri.com

[2] https://en.wikipedia.org/wiki/Guided_Local_Search


It depends on your problem and on what you need. There are heuristics that try to find the best results, and some are optimized for speed. I have never seen a better heuristic than the k-opt heuristic (implemented in LKH), neither for speed nor for the results. There are some genetic algorithms, but they are much slower, though they can sometimes find results that are a little better. (See the 100,000-node Mona Lisa TSP problem: LKH found a tour of length 5757199, a genetic algorithm found a tour of length 5757191, and the best exact solver, Concorde, found the lower bound of 5757084 (so there is a gap of 107) after calculating for "11.5 CPU years".)

Heuristics always have the problem that you cannot say whether the found result is the optimum or not. So if you have to find the optimum, you have to use an exact solver (for bigger problems; for problems with <10000 nodes there are heuristics that find the optimum 99% of the time). To get an idea whether you are close to the optimum or not, there is a lower bound called the "Held-Karp lower bound": you calculate an MST, a one-tree, and some extra stuff to get it, then you can calculate the difference between both values.
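A minimal sketch of the one-tree part of that bound (without the Lagrangian node-penalty updates that the full Held-Karp bound iterates, so this is a weaker bound; the city coordinates below are made up):

```python
import itertools
import math

def mst_cost(nodes, dist):
    # Prim's algorithm restricted to the given node subset
    nodes = list(nodes)
    in_tree, cost = {nodes[0]}, 0.0
    while len(in_tree) < len(nodes):
        w, v = min((dist[u][x], x) for u in in_tree
                   for x in nodes if x not in in_tree)
        in_tree.add(v)
        cost += w
    return cost

def one_tree_bound(dist):
    # MST on cities 1..n-1, plus the two cheapest edges at city 0:
    # every tour is such a one-tree, so this is a valid lower bound.
    n = len(dist)
    return mst_cost(range(1, n), dist) + sum(sorted(dist[0][1:])[:2])

def brute_force_tour(dist):
    n = len(dist)
    return min(sum(dist[t[i]][t[(i + 1) % n]] for i in range(n))
               for t in itertools.permutations(range(n)))

pts = [(0, 0), (3, 0), (3, 2), (1, 3), (0, 2)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
print(one_tree_bound(dist) <= brute_force_tour(dist))  # True
```

The gap between the one-tree value and the best tour a heuristic finds is exactly the kind of quality estimate described above.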


> Heuristics always have the problem that you cannot say whether the found result is the optimum or not.

Even worse: heuristics have the problem that you usually can't give any bound on how good you even are. If you use an ILP (Integer Linear Programming) based approach and have found an integral solution, the optimal solution of the LP relaxation gives you a lower bound (if you minimize) on the optimal length of the tour - so you can tell exactly that you can only get, say, 1% better, and cancel any further computation if you consider this good enough.

> To get an idea whether you are close to the optimum or not, there is a lower bound called the "Held-Karp lower bound": you calculate an MST, a one-tree, and some extra stuff to get it, then you can calculate the difference between both values.

The Held-Karp bound is the same as the optimal value over the subtour elimination polytope (Theorem 21.34 in Korte, Vygen: Combinatorial Optimization - Theory and Algorithms (5th edition)). This is rather where one starts when one uses an ILP based approach.


> Which kind of metric is it used to compare these algorithms?

Usually (in particular if the edge weights in the graph are non-negative or even positive) the ratio (or sometimes, but more rarely, also the difference) between your attained solution and the optimal solution.

> How good are heuristics methods compared to exact solvers?

First: For mathematically minded people "heuristics" (in the sense of the video) are to be considered worthless since you rarely can prove anything about the solution attained by it. Even if by definition some result about the solution can be shown (say, that it is optimal in some k-neighbourhood of solutions) this will hardly tell you anything about how good you really are.

But one very good heuristic algorithm for which you can actually prove something (and which is thus really helpful) is the Christofides algorithm (https://en.wikipedia.org/wiki/Christofides_algorithm): one can show that if the edge weights in the graph form a metric, you get a solution that is at most 3/2 times as long as the optimal tour.


If you are looking for a simple, powerful and fast open source library that solves not only simple but also real-world TSPs (with time windows etc.) and the more complex vehicle routing problems, then have a look at jsprit: https://github.com/graphhopper/jsprit

(note: I'm one of the GraphHopper founders)


This is really nice. I find this kind of visualization makes understanding algorithms way easier.


This is awesome


Why is this linked to someone's G+ page?

https://www.youtube.com/watch?v=SC5CX8drAtU


I assumed it was someone involved with the project, but since there's no obvious sign of that, we've changed the url from https://plus.google.com/+TimHarfordEcon/posts/bWijuaPkzfR.


Tim Harford is a UK economist who also presents "More or Less" on BBC Radio Four. He draws attention to a lot of other stuff on his social media profiles.

http://www.bbc.co.uk/programmes/b006qshd

So although it's not the original source (and the original source is better) he's not just some Internet random.



