Hacker News new | past | comments | ask | show | jobs | submit login
Beyond A*: Better Planning with Transformers (arxiv.org)
313 points by jonbaer 9 months ago | hide | past | favorite | 120 comments



There has been more interesting work on using transformers for robot motion planning[0]. Getting a robot arm from point a to b without hitting stuff is a very difficult problem, the problem is both high dimensional and continuous. Previous planning methods are both computationally intensive and not very good. This is one reason why robot motion appears 'unnatural' and robots generally being bad at many tasks we'd like them to do. This approach appears pretty competitive with other planning methods, planning near optimal paths faster.

[0]https://sites.google.com/ucsd.edu/vq-mpt/home


Informative comments like this is the reason I go to the comment section before reading the article.


I wonder if they tried the modified J* algorithm (an optimization of A* for games graph/path-finding) before going down this research route.

It's in Game AI Pro 2 [0] if anyone is curious.

[0] https://www.amazon.ca/Game-AI-Pro-Collected-Professionals/dp...



To be fair, they said at the end that their path-finder is not yet competitive with the SOTA. The paper tests how well a transformer does at predicting an execution trace (like in a JIT compiler) and whether this can help improve heuristics in things like path-finding. I'm weary because transformers are slow.


I love these books (and nice to see Steve Rabin still doing them), but $120 for an ebook? That's unexpected.


Worse yet… the game ai pro books are entirely free for digital copies. You’d basically be paying $120 for it to be stitched together into a single pdf, instead of one per chapter

http://www.gameaipro.com/


That’s one expensive imagemagick command!


If you consider it like a textbook (very small niche audience for very specialized knowledge) it does kind of match up due to the small expected volume to sell.

Authors still gotta eat.


Planning is already well taken care of by established techniques ranging from graph search, SAT-solvers, OR, Prolog, etc. The point usually is optimization over multiple feasible alternatives, which I have a hard time seeing transformers being up to. I see the role of LLM techniques more in translating natural language descriptions to executable programs - but then Prolog is already very close, it being designed for classic NLP after all.


It would be interesting to see the comparison between Prolog and a similar purposes LLM


> Searchformer [...] solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than standard A∗ search ... [and] significantly outperforms baselines that predict the optimal plan directly with a 5-10× smaller model size and a 10× smaller training dataset.

So this can be interpreted as a generic statement to say there might be progress during learning iterations (compared to its own arbitrary first iteration baseline). I think it's important to not getting carried away and assume the recent immense progress in generative AI is just easily repeated for any other AI task. There's also quantum computing having been advertised for over a decade now for a break through in planning and optimization efficiency.

> We also demonstrate how Searchformer scales to larger and more complex decision making tasks like Sokoban with improved percentage of solved tasks and shortened search dynamics.

Worth mentioning that Sokoban is nowhere near a complex decision making task let alone state of the art in an academic or commercial planning/optimization context.

> A∗'s search dynamics are expressed as a token sequence outlining when task states are added and removed [...] during symbolic planning.

Go ahead and compare against eg. Quantum Prolog's readily available container planning problem and solutions [1] then to generate exactly that in the browser - a plan with symbolic actions to perform - or alternatively, a classic Graphplan or CLIPS planning competition problem.

[1]: https://quantumprolog.sgml.io/\*


Machine Translation used to involve complicated grammatical decoding, using search. Now we just use transformers for MT, with much simpler decoding that doesn't really need search.

Perhaps we can reach full inception. Let's use our current best-of-breed predictive models to learn heuristics for neural architecture search (NAS) and search for new neural blocks (better than transformer and mamba).


"Every time I fire a linguist, the performance of the speech recognizer goes up." --Frederick Jelinek


...and finally enter a world where literally no one understands anymore how the technology works, not even the people developing them. Singularity, here we come...


We could then go on to create something akin to the mechanicum, create the position of tech priests and have them pray to the machine god on on our behalf


If you think that still needs to be created you haven’t worked at tech support


The priest class is there, what we need is to level up the machine god itself.


The tech priests will build electric monks to do that job.


Barring the singularity, just because you find something with search doesn’t mean you can’t understand why it’s better.


Shameless plug, but if you are interested in Sokoban type games check out https://thinky.gg . Features a fun Sokoban variant (called Sokopath) as well as another NP hard variant called Pathology (goal is to get from point A to point B in the shortest amount of steps).

Many in the community have tried to create various solvers and things get very hard once the grid gets to be over 5x5. However, some interesting levels with very high maximum step counts have been discovered by the thinky communnity (via simulated annealing)


Thank you! Great games! Super nice execution! I love it!


"26.8% fewer search steps than standard A∗ search"

So, slightly better than A* which is far from SOTA on Sokoban (https://festival-solver.site/).

What is impressive in this paper? Why is this on Hacker News?


So A* is the most optimal search algorithm under the specific constraints it specifies. You can't do better.

However, sometimes the specific domain you are searching has other constraints that can be exploited to do better than A*. An example of this being Jump Point Search that exploits certain properties of grid searches if you can only move in certain ways.

If you were able to write a general searching algorithm that can effectively exploit the whatever the specific properties of the underlying domain "automatically" without you having to actually work out what they are, that would be useful right?


Paper authors choose to compare against A* and Sokoban.

A* can't solve even the first level of original Sokoban 90 levels.


Because they used a transformer to reach a nice solution, better than a typical A* search, which is a "naive" or go to solution. And they didn't think about designing an algorithm. I find it quite impressive that a simple encoder-decoder transformer can achieve so much.


It is in the abstract, very first line. "While Transformers have enabled tremendous progress in various application settings, such architectures still lag behind traditional symbolic planners for solving complex decision making tasks. In this work, we demonstrate how to train Transformers to solve complex planning tasks ..."

This paper is interesting (to me) as an example of how to use transformers for decision making. I don't particularly care if it is up to A* standards at the moment.


Abstract doesn't answer my question.

What is the scientific contribution of the paper?

They trained transformer on pairs of <sokoban_level, A*_optimal_solution>.


> What is the scientific contribution of the paper?

That's not a question you ask other people, that's a bullet point at the top of the outline you created while reading the paper for yourself. You should see the creation of said outline as a measure of your actual interest in the subject.


Having now read the paper, the research area is interesting because optimizations in optimal path finding have applications in robotics, gaming, and reasoning (what the ultimate intention of this paper is).

The research team identified ways to tokenized path finding algorithms for two tasks maze solving and sokoban (a game where a crate has to be pushed to goal) and then trained a model on the execution traces of these algorithms.

The insight this provided was that the "searchformer" model was about 26% faster than the traditional methods. If that is applied to Route-planning, Robotics, and Game Development, it could have tangible performance benefits.

IMHO, it is not a wild breakthrough but an interesting solution to a real-world problem.


> Why is this on Hacker News?

Anything that is on HN is on HN because the community likes it


Because it's further evidence for the "unreasonable effectiveness of transformers", i.e. that transformers are a fully-general approach for all kinds of learning tasks, not just next-token prediction. Obviously there is a strong and a weak version of that hypothesis and the strong version is probably not true, but to the extent that we appear to be honing in Nature's "one true way" to learn how to accomplish a task, that seems like important news.


In certain computer science problems, a suboptimal action at time t may give rise to an optimal outcome at time >t.

Why wouldn't this be the case for research generally? Has our community really devolved to the point where things should only be noteworthy insofar as they optimize for SOTA for a given problem?

What a sad thought.


Has anyone compared planning algorithms by complexity against optimality (error)?


It's on hn because it sounds similar to the so called Q* algorithm, the supposed secret openai algo that has seen a huge amount of speculation.


What? Someone somewhere tried to do something and wasn't the most optimal possible solution? We should just ban their account honestly.

Comments like these are antithetical to a strong technical sharing community.


I agree. OP's comment could be quickly rewritten into something useful and just by changing the tone, for example:

"26.8% fewer search steps than standard A∗ search" For reference of prior art, it's slightly better than A*, which is far from SOTA on Sokoban (https://festival-solver.site/).


Especially considering the amount of trivial mainstream tech articles nowadays. It's cool to see more algorithmic topics.


Some people find things interesting regardless of if they break records.


If transformers can plan, then AGI only requires better education...


Approximating exhaustive search is not logic or causality.


There’s a lot more pieces, agency being a big one but online learning is required too and many other layers beyond that.


That's probably the foreseeable future, ingesting ever larger amounts of data hoping it prevents hallucinations.


For the auditory learners, here is this paper in a summarized audiobook format :

https://player.oration.app/09fefe41-f2a7-4257-a25e-30e479b30...


Thanks for sharing, I've been looking for an app like this for ages.


Thank you so much for your comment! Apologies if my comment came off like an ad - I have mild dyslexia and struggle to keep up with ML papers; I hope others also find this useful! If you have any feedback or issues, feel free to email us at support [at] trurecord.com


I am extremely optimistic about using learned heuristics in discrete algorithms like A* or Focal search or the various families of ILP.

In most modern discrete optimization libraries, e.g. CPLEX it's the heuristics and tuning that explain the performance.

I'm less understanding of using a end to end learning approach to replace a well understood optimal search routine, but that might be pearl clutching.

It just seems to me the authors missed that opportunity.


I think it is just the bubble/hype effect around transformers and AI. I might try to use transformers to solve tic-tac-toe and apply for some VC moneys.

In a few years we'll all be writing about how much more efficient actual code is compared to AI perhaps ;)


Everyone on Transformer News will be talking about the latest web framework for GPT-7 and how to horizontally scale their system to handle dozens of requests a second, and how when you need to scale beyond that it will "be a good problem to have".

Meanwhile, you and I will be talking about how our low-level, handwritten Ruby on Rails code is so much faster and more efficient, and closer to the metal so you can understand what's _really_ going on.


I agree, learning admissible heuristics will retain worst case performance, which has always been the measuring stick for these algorithms. It's not at all uncommon to find faster solutions for the average or even p99 cases that cannot provide guarantees on the worst case.


How would one go about proving that a learned heuristic (something from an AI model) is in fact admissible?


For something like focal search, it doesn't even need admissibility, you just apply it as a second selection heuristic among the choices your admissable heuristic returns as 'top k' choices.


So a tiebreaker then?


Is someone keeping a list of classical algorithms or NP complete problems that are now better performed using deep learning?


For your convenience, here is the list of NP-complete problems where "AI" works better than the state of the art in the worst case:.


AI + classical algorithms is my sweet daydream. Trained heuristics (even better domain specific ones), deployed for classical A*, ILP families, focal search, etc etc.

That is going to be really amazing.


It is happening. There are papers on deep learning to improve the variable choice in branch-and-bound, etc


Yeah my first exposure was from Marco Pavone's lab, on ML heuristics for MICP in the context of a cubical tumbling robot, IIRC. Really cool stuff.


Even solving large linear systems would already be amazing. But a SAT solver would be nice too :)


I misunderstand your comment. We have those solvers. I'm suggesting AI would plug into existing solvers. This is a ripe area for research.


Thank you for that. I needed the laugh :-)


For those of you who didn't get the joke here.

(S)ETH is a bit of a bummer if you only consider the dominate term in big-O, for the general case.

https://web.stanford.edu/class/cs354/scribe/lecture17.pdf


Probably soon AI will be able to find improvements to most of them? Like using AI to do search in algorithm space


For all we know, some of the current best (still exponential) algorthms were guided by AI. If a mathematician solves a problem using mathematica, they don't usually write in the paper what tools they used.


From my understanding this is very much still active research, without any clear wins deployed in production settings.


This paper reminds me of the Neural Network Diffusion paper which was on the front page of HN yesterday in the sense that we are training another model to bypass a number of iterative steps (in the previous paper, those were SGD steps, in this one, it is A* exploration steps).

On a different note, they choose such a bad heuristic for the A* for Sokoban. The heuristic they choose is "A∗ first matches every box to the closest dock and then computes the sum of all Manhattan distances between each box and dock pair". I played Sokoban for 20 minutes while reading the paper and I feel like this is a very poor exploration heuristic (you often need to move boxes away from goal state to make progress).


I have a hunch they made their decision to train off that particular type of A* traces to avoid an exponential number of embeddings.


To the authors, please rename this new algorithm from the boring Searchformer to more interesting name for example Optimus Prime.


also because googling for "Searchformer" brings up a different paper and model: https://www.sciencedirect.com/science/article/abs/pii/S01722...

"Planformer" seems unused and would reduce confusion

(also since "Search" implies RAG etc which is not what this model is about)


"Search" has been the name for the A* space. So it makes absolute sense. "Planning" means in the symbolic AI space often something quite in the direction of PDDL/STRIPS/ICAPS planning.


T*former or T-star


T* (T-star) is a great name.


It is the copyrighted name of a lens coating by Zeiss Optical, however.


T1* and T2* are common terms in magnetic nuclear resonance imaging, suggesting Tn* for any integer n should be free of bullshit IP protection. May I thus suggest T10*, following the pattern of a11y and i18n?


Funny that you mention RAG since the RAG authors had expressed their regret of not using a much better name and the name stucked once it becomes very popular.

On second thought, in the spirit of A* perhaps they can rename it to O' short for Optimus Prime but this probably will break most of the databases in the world [1],[2].

[1] Exploits of a Mom:

https://xkcd.com/327/

[2] Prime (symbol):

https://en.m.wikipedia.org/wiki/Prime_(symbol)


I can’t think of any database that doesn’t support the ' character in their varchar.


I think the concern was all the buggy apps that connect to databases will accidentally be getting SQL injected.


If they’re susceptible to injections this is the absolute least of their problems. In fact their DB is already dropped.


At this point we need a website cataloging all transformer related names.


Megatron would be better IMO.


Nvidia: "We'll pretend you didn't say that."


https://github.com/NVIDIA/Megatron-LM

> Megatron is a large, powerful transformer [...]


Clearly Megatron because LLMs are decepticons.


Yes that was my reasoning :)


just deceptive


I think Searchformer is kind of catchy. :shrug:


Sacagawea is my proposed name.


Wouldn't that be a trademark violation?


Trademarks are specific to classes of goods. A trademark for toys named Optimus Prime doesn't prevent you from making software called Optimus Prime.


While true, for something as famous as Optimus Prime, there could conceivably by liability for dilution by blurring.


Wouldn't that be an overkill to train a model specifically for that?

I wish someone could come up with an algorithm that would outperform a trained model. This way we would actually understand how it works.


Shortest path search is a very well understood problem, and for "simple" cases A* is the gold standard.

You could probably tweak the heuristic used by A* to the problem to get similarly good results (at much lower compute cost than running what amounts to an LLM with huge context length). But this is a toy problem. The interesting thing about the paper is that you might be able to get good-performing search algorithms for any problem with a cookie-cutter process, instead of spending many man-hours tweaking a hand-written search algorithm. It might also be possible to easily adapt it to much more complex search problems, like cooperatively evading other agents navigating in the same space.


Well in this case, A* outperforms in terms of funding a solution when it exists. It's just the transformer produces a solution faster, but misses some.


Note that the paper doesn't measure execution time of the A* or Transformer-based solutions; it compares length of A* algorithm traces with length of traces generated by a model trained on A* execution traces.

I suspect that the actual execution time would be far faster with the A* implementation than any of the transformers.


Was the pun intended ?


Path planning is highly dynamic and requires real time adjustment. In mature software, the heuristics are already highly optimized for local road/traffic data and users. So, I don't think AI can actually outperform them yet.


do you want the solution faster or more precise? Depending on the answer you would pick either A* or this model


They don't report execution time in the paper. It's likely that the A* implementation would run faster in terms of CPU or wall clock.


Transformer model that optimally solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than standard A∗ search. Doesn't fewer search steps imply faster?


No, each step could take significantly more time (and resources).


Can transformers prove absence of the solution? E.g., did they tried to train them on the unsolvable problems?

I once tried my hands on the PCB routing and here's a simple problem of simultaneous multipath search that is unsolvable:

  A.....b
  B.....a
A and B are starting points and a and b are goal points, respectively for A and for B. It is a 2xN maze, the orders of stating and goal points are reversed.

The search algorithm sometimes required to prove absence of the solution. A* can do that, transformers? I do not think so.


In this case, A* is in a finite, bounded 3030 grid. As A is just a type of branch and bound algorithm, which the degenerate form is exhaustive search. Which should be able to 'prove' the absence of a path by simply failing to find one before exhausting all paths.

PCB routing is not as simple as this problem for real world problems.

Digging into why TSP with a positive integer Euclidean metric is NP-C but with a real valued Euclidean metric is NP-hard may be one way to think about it.

But if you choose the right metric and huristic, in what approximates a finite space, A* can run an exhaustive search in practical time.

It is probably a good idea to separate search space exhaustion from no-instance proving which in the case of decision problems can cause confusion.

If you think of NP as relating to yes-instances, co-NP is the no-instances.


You are mathematician, am I right? Your answer is correct (sorta, as far as I am concerned) and does not add to the discussion I tried to open.

Your answer does not show anything related to the ability of transformers to prove absence of solution, to be precise.


No, just a programmer that has to apply math.

Transformers can only do what is called weak negation.

if (not (goal X)), then (assert not x)

One example I have seen is: "you can cross if you have no information on a train coming”

VS:

"you can cross if you have evidence that no train is coming"

See the difference?

As to what 'transformers' can encode is ambiguous and depends on many factors but here is some light reading.

https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00493...

https://arxiv.org/abs/2204.06618


I find it hard to believe that all the heavy-weight data processing and GPU computation really make a constant factor reduction in search steps worth it.


It's also not clear to me how one would determine that an ML model-generated plan is indeed optimal, or how far from optimal it is. A*-based approaches give you these things.


While A* could be used to solve Sokoban, wouldn't some sort of Monte-Carlo Tree Search be better at it? I wonder how would it compare to this model.


The part of MCTS that matters for solving it is the Bellman equation, which is a component of the Bellman-Ford shortest-path algorithm, so all of those methods are part of a continuum.

Which one works best is pretty sensitive to the exploratory decisionmaking algorithm: do I search this cell first, or that one? That heuristic is the differentiator, whether it is put on top of a Dijkstra-style algorithm (A*) or a Bellman-Ford-style algorithm (MCTS).


> 26.8% fewer search steps than standard A∗ search

So you're saying I shouldn't read this paper?

A* is not particularly good. If all the algorithm can do is mirror it by training on its output what's the point?


Yeah, at first I read that as it using 26.8% of the original steps, but reducing the number of steps by 26.8% is not that impressive. I wonder whether it actually reduces total search time as there is added overhead of running the neural network.


There's absolutely no way it reduces search time. A* is trivial to run per timestep. The time must be thousands to maybe hundreds of thousands of times slower.

I publish in this area and this is a common thing for reviewers to bring up when authors don't report wall clock time. And then for papers to be rejected. What's the value in making an algorithm that's drastically slower? Not much.


> What's the value in making an algorithm that's drastically slower?

Perhaps as an important stepping stone? Deferred optimization and all that.


Sounds like half baked research that shouldn't have been published then. Papers are supposed to be an actual advance not a tech preview of something that might be useful maybe if someone else figures it out.

Some people call the flag planting. You publish something that doesn't work with keywords you suspect will be important in the future. Then hope others will cite you.


Amazing. Now do that to sorting algorithms.


Both ml and A* are integral transforms.


What do you mean by that?


Pretty impressive


what's with Cornell University??! They produce constant amazing papers and projects on AI


What makes you say Cornell? Isn't this from Meta?


I think it's either a joke or misunderstanding about arxiv




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: