Suggestion: TSP is a combinatorial optimization problem and isn't well suited to GAs. You would be much better off using a method meant for combinatorial optimization: notably ant colony optimization.
I agree on this -- I wrote my thesis on the VRP (Vehicle Routing Problem) which is an even more difficult problem. I've studied GAs extensively (they are pretty awesome, hence my interest), but other methods are much better. I've heard great things about ACOs, but eventually I settled for Tabu Search, because:
- It is deterministic, hence not a black box
- It is a local search, easy to explain
- Does not require as many parameters to tweak (a GA's performance depends on how you set these: mutation rate, population size, crossover mechanism (dozens of possibilities), stopping procedure, selection procedure, etc.)
- Considered in the literature to be one of the strongest classes of algorithms for TSP-type problems, in both solution quality and computational effort
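To give a flavour of why Tabu Search is so easy to explain: the core is "take the best neighbouring tour whose move isn't on the tabu list". Here's a minimal sketch in Python (not my thesis code; 2-opt neighbourhood, and names like `tabu_search_tsp` and the `tenure` value are illustrative choices):

```python
import itertools
import math

def tour_length(tour, dist):
    """Total length of a closed tour under a symmetric distance matrix."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def tabu_search_tsp(dist, iters=200, tenure=10):
    """Deterministic tabu search over 2-opt moves: each step takes the
    best neighbouring tour whose move is not tabu, and a used move
    stays forbidden for `tenure` iterations."""
    n = len(dist)
    tour = list(range(n))                       # fixed starting tour
    best, best_len = tour[:], tour_length(tour, dist)
    tabu = {}                                   # move -> iteration it expires
    for it in range(iters):
        candidates = []
        for i, j in itertools.combinations(range(1, n), 2):
            if tabu.get((i, j), -1) > it:       # move is still tabu
                continue
            new = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]  # 2-opt
            candidates.append((tour_length(new, dist), (i, j), new))
        if not candidates:
            break
        length, move, tour = min(candidates)    # best non-tabu neighbour
        tabu[move] = it + tenure                # forbid undoing it soon
        if length < best_len:
            best, best_len = tour[:], length
    return best, best_len

# Four corners of a unit square, listed out of perimeter order, so the
# identity tour is suboptimal; the optimal tour has length 4.
pts = [(0, 0), (1, 1), (0, 1), (1, 0)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
best, best_len = tabu_search_tsp(dist, iters=50)
```

Note there is no randomness anywhere: given the same starting tour and tenure, you get the same run every time, which is exactly why it isn't a black box.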
If you don't mind expanding a little, I'd be interested to know why they're not suited to combinatorial optimization, and what they are in fact more suited to?
I have searched for a problem that they're more suited to, and I've come to the conclusion that GAs are in fact not known to work on any problem. They do "work" in the sense that sometimes they find an answer, but there are other algorithms that are much simpler and consistently outperform them (notably randomized hill climbing). Here is a paper that despite trying to prove the opposite, clearly shows that GAs are NOT a good way to solve any problem: http://web.cecs.pdx.edu/~mm/nips93.pdf
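For concreteness, "randomized hill climbing" here means something like the RMHC from that paper: flip one random bit, keep the change if fitness doesn't drop. A minimal sketch (Python; OneMax as a stand-in fitness, and the names and parameter values are illustrative):

```python
import random

def rmhc(fitness, n_bits, max_evals=10000, rng=random):
    """Randomized mutation hill climbing: flip one random bit,
    keep the mutant if it is at least as fit, otherwise revert."""
    current = [rng.randint(0, 1) for _ in range(n_bits)]
    best_f = fitness(current)
    for _ in range(max_evals):
        i = rng.randrange(n_bits)
        current[i] ^= 1                 # flip one random bit
        f = fitness(current)
        if f >= best_f:                 # accept ties: allows neutral drift
            best_f = f
        else:
            current[i] ^= 1             # revert the flip
    return current, best_f

random.seed(1)                          # for reproducibility
# Simplest possible fitness: count of 1-bits ("OneMax")
bits, score = rmhc(sum, 32, max_evals=5000)
```

That's the whole algorithm: no population, no crossover, one parameter. That simplicity is the point of the comparison.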
That's not quite my understanding of the paper's results: "As can be seen, the time to reach level one is comparable for the two algorithms, but the GA is much faster at reaching levels 2 and 3. Further, the GA discovers level 3 approximately twice as often as RMHC [randomized hill climbing]"
I don't dispute that there are generally better optimization algorithms than GAs, but this paper does present an artificial case in which a GA outperforms hill climbing.
I never cited the paper's conclusion as my position. What the paper shows is not what the paper concludes. The authors clearly didn't want to show that GAs are inferior, though inadvertently they certainly did. BTW, the authors of the paper conclude that further research is necessary to find cases where GAs outperform hill climbing.
Most GAs are not a very faithful adaptation of actual biological evolution, either. The real evolutionary process is a bit different. Most notably, the "agent" actually implements the algorithm in the sense that it performs its own self-replication and implements the "operators" (crossover, etc.) within its own embodiment. This has certain implications, most notably that the algorithm itself is subject to evolution.
There are other big holes too. For example, few GAs implement anything resembling ontogeny or lifetime learning, and generally have a poor genotype/phenotype divide. That also has big implications. It's a big reason most GAs are far too "greedy" and get stuck at local maxima easily.
"We then analyze an "idealized" genetic algorithm (IGA) that is significantly faster than RMHC and that gives a lower bound for GA speed. We identify the features of the IGA that give rise to this speedup, and discuss how these features can be incorporated into a real GA."
"As can be seen, the time to reach level one is comparable for the two algorithms, but the GA is much faster at reaching levels 2 and 3. Further, the GA discovers level 3 approximately twice as often as RMHC."
"We have presented analyses of two algorithms, RMHC and the IGA, and have used the analyses to identify some general principles of when and how a genetic algorithm will outperform hill climbing."
I'm not sure this paper says what you're claiming that it says.
Note that the "IGA" is not a real algorithm. Knowledge of the solution is encoded into the algorithm. It's no surprise that it outperforms an algorithm that does not have the privilege of knowing the solution before it starts.
Second, to get a result where a GA outperformed hill climbing they did the following:
1. They started with a problem that was DESIGNED to be very well suited to GAs and not so well suited to hill climbing. It turned out that when you do hill climbing in a non-ridiculous way, the GA loses big time (by a factor of 10).
2. Through several steps they further modified the artificial problem to give a disadvantage to hill climbing and an advantage to GAs.
3. They tuned the GA's parameters and did not tune the hill climber's parameters.
4. They compared the performance by number of fitness function evaluations. This is unfair to hill climbing because GAs have bigger overheads elsewhere.
After these steps the GA outperformed hill climbing by about a factor of 2. So it is not clear that the GA would still win if you tuned the hill climber. Even if it did, this is a problem explicitly designed to give GAs an advantage. The fact that they had to go through so much effort to design such a problem doesn't instill much confidence that there exists a real world problem where GAs work.
I have tried to replicate their results and tune the hill climber myself, but the paper is so vague about the exact problem the algorithms are supposed to solve that I was not able to. If anybody knows a study of a problem (preferably real-world) where GAs are shown to outperform reasonable forms of hill climbing, I'd be very happy to hear it.
"If anybody knows a study of a problem (preferably real world) where GAs are shown to outperform reasonable forms of hill climbing I'd be very happy to hear it."
A GA (especially a pseudo-Boolean GA) is not a "golden algorithm" that solves every problem, as most people think. Rather, it is an idea pioneered by Holland in the sixties; nowadays its main purpose is to explain the theoretical aspects of other GA-offshoot evolutionary optimization techniques like GP (Genetic Programming), EA (Evolutionary Algorithms), DE (Differential Evolution), ES (Evolution Strategies), etc.
I think GP has much more impact in attacking real-world problems; the GECCO Humies award list has many interesting results where RMHC may not be suitable -- http://www.genetic-programming.org/combined.html
"Even if it did, this is a problem explicitly designed to give GAs an advantage."
Well, yeah, you'd want to use GAs on problems for which they were well-suited. That's not at all the same as "clearly shows that GAs are NOT a good way to solve any problem".
Sure. The point is the effort they had to expend to find such a problem. That indicates that it is very unlikely that any given practical problem is suitable for GAs. Again, if you are aware of any real problem where GAs work better than hill climbing, please do share. Note that this is not a very high bar. For example, the same test applied to quicksort vs insertion sort is "find any real example where quicksort outperforms insertion sort". If you couldn't find such an example and had to go to great lengths to artificially construct one, then I'd call quicksort a failure, given that it's more complicated than insertion sort. Why GAs were/are as hyped as they are is a mystery to me.
But I agree, that they are not a silver bullet for any kind of problem and for most problems there is almost always a way smarter and more efficient algorithm.
I did not give this presentation to show that GAs are the best solution for any problem. My intention was to teach something that I myself liked in university. In particular, I thought that the way we were taught GAs in university was too complicated.
A domain in which GAs are slowly finding resurgence is algorithmic portfolio management - I work in this area a bit (ie portfolio management - but not directly involved with the GA guys - plus their research is proprietary) - but their premise is that evolution has done pretty well surviving against nature and products of evolution itself.
The financial markets are a bit like that - stochastic exogenous factors (headline risk like the Euro crisis, the FOMC meeting decisions etc.) and evolved responses based on the principle of no-arbitrage (ie other traders) determine portfolio performance and trading decisions along with transaction costs.
But GAs as they exist are unsuitable for this, from what I've gathered from conversations with those quants. The modifications required appear to be in the direction of structured intra-generation genome modification (or adaptation) rather than pure randomness and crossover to produce agents with a specific trading behaviour (much in the same way as evolution allows a sensible change in the DNA to produce a particular protein suited for a task).
> I have searched for a problem that they're more suited to, and I've come to the conclusion that GAs are in fact not known to work on any problem.
Yes, that's my opinion too. GAs model natural selection, which is extremely slow. The only reason it works in nature is the huge timescales and the lack of anything better (since it has to start from essentially nothing - there is no designer).
I would reserve GAs (at least in their naive form) as a last-resort option. They are extremely inefficient because they don't exploit any structure in the problem.
Essentially, a GA solution consists of:
1. Select random points in the search space, hoping that you have arrived at a sufficiently good solution.
2. If none of the solutions are good enough, generate a new set of points by combining the best available points and adding a bit of random error.
3. Repeat the above until time has run out or you've hit a good enough solution.
However, having said that, GAs can be a good option if 1) not much is known about the function you are trying to optimize, 2) the crossover/mutation functions are designed to reflect some problem structure, or 3) the search space is small enough.
Thanks for the hint :) The task originated from an assignment I had to do in an Algorithms class at university as an introduction to GAs, and I liked its simplicity.
The coffeescript rocks/sucks crowd does not belong here. This person wrote a blog post explaining something using a language he is familiar with.
If articles are boiled down to a shouting match because the java-haters or the coffeescript-haters or the ruby-haters found the article first we all may as well go back to /.
I'm not saying that it being your main language is bad; it's just that the syntax is terrible to read and understand for those of us on C-style languages. If the point was to teach others, the vast majority of your audience is not going to be reading CoffeeScript, and they won't learn much.
- There are GA tutorials available in several other languages, if this doesn't work for you. Btw, if a tutorial were offered in Lisp/Haskell, would we see the same suggestion?
- For Ruby/Python programmers, this isn't hard at all to read. Familiarity with non C-style languages might be something to pick up.
- Having moved to Coffee from C-style languages, I prefer coffee. But that is just my opinion.
I think HNers should be more tolerant of language choices. That argument can't really be won.
That's funny, because CoffeeScript is meant to be all about readability. I wonder if you really tried to read this or just commented based on the title. If you did really read it, I wonder if you found it hard because your opinion is tainted by those C-style languages. It's not scientific, of course, but I just got my non-coder, non-technical girlfriend to read through both the CoffeeScript and JavaScript versions of this, and I definitely had to explain a lot less with the CoffeeScript version; she says it's easier to read because it's structured more like plain English.
Just ran through this quickly. I don't know too much about algorithms, so I can't judge GAs' usefulness compared to other methods, but it certainly seems like a cool/fun thing to implement.
Just for clarification, you mean machine learning rather than ML the language, right?
(I ask because this discussion is talking about both the genetic algorithms and Coffeescript).
Hmm... maybe some sort of GPS application that took into account current road conditions -- length of time stopped at traffic lights, performance characteristics of the current vehicle (velocity and acceleration away from lights, etc).
At one time there was talk of making the air traffic control system distributed rather than centralized. The win there is that 1) each aircraft only has to worry about the other aircraft that are near enough to interfere, and 2) if the control system goes out in one aircraft, the systems in the other aircraft can compensate. By contrast, the centralized system has to watch all the aircraft in a large chunk of airspace, and if it goes out, well...
This is sort of what Google is doing with their self-driving cars, I think (that's just my impression -- I haven't looked into it in detail).
https://github.com/janmonschke/Genetic-Algorithms/tree/gh-pa...