
As far as I know, in Magic: the Gathering, the best bots are far worse than most players. Part of the difficulty is that the rules are so complicated that there are only a couple of complete rules implementations. Beyond that, it's an imperfect information game with far more actions per game than poker, so optimal-solver techniques haven't seen success.



I think this is purely a resource issue; e.g., if Google Brain decided to make an MtG bot, I would be fairly confident it would be superhuman. Even real-time strategy games like StarCraft look like they're on the cusp of superhuman bots (Alphastar was competitive as Protoss against elite players, but did not consistently beat them).


I highly doubt it would be able to sit at a game of Commander. Four players with 100-card singleton decks would be an absolutely enormous space to operate in.


Multiple players sounds like it's still a hard nut to crack for AI approaches (although, as poker demonstrates, it's getting easier), but the deck size doesn't sound like it'd be the main issue.


Alphastar also didn't play with the same limitations that a human has. Even after removing its ability to see the entire map and finally forcing it to scroll around, Alphastar never misclicks (so its APM==EPM), and it can still blast nearly unlimited APM in short bursts as long as its "average APM" over an x-second period matches a human's APM.

I believe Alphastar would generate more interesting strategies if we limited it to a bit below human APM, forced it to emulate USB K+M to click (instead of using an API, which it currently does), and added a progressively increasing random fuzzing layer to its inputs so that the faster it clicks, the more its precision/accuracy degrades.
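To sketch the fuzzing idea (purely hypothetical C++ against Blizzard's s2client-api; FuzzClick and the scaling constants are made up, and GetRandomScalar is the library's own [-1, 1] random helper):

  // Hypothetical: jitter an intended click point more aggressively the more
  // actions the bot has issued recently, so burst APM costs accuracy.
  Point2D FuzzClick(const Point2D& intended, float recent_apm) {
    float sigma = 0.05f + 0.002f * recent_apm;  // made-up scaling constants
    return Point2D(intended.x + GetRandomScalar() * sigma,
                   intended.y + GetRandomScalar() * sigma);
  }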

By "interesting strategies" I mean strategies that humans could learn to adopt. Currently its main strategy is "perfectly juggle stalkers" which is a neat party trick, but that particular strategy is about as interesting to me as 2011-era SC AI[0]. Obviously how it arrived at that strategy is quite interesting, but the style of play is not relevant to humans, and may in fact even get beaten by hardcoded AI's.

I'm also very curious what Alphastar could come up with if it were truly unsupervised learning. AIUI, the first many rounds of training were supervised, based on high-level human replays -- so it would have gotten stuck in a local minimum near what has already been invented by humans.

This may be relevant if Microsoft reboots Blizzard's IP. I would love to have an Alphastar in SC3 to play against offline, or to have as a teammate, archon mode, etc. I think all RTSes are kind of "archon mode with an AI teammate" already: the AI currently handles unit pathing, selecting which units to attack, etc. With an Alphastar powering the internal AI instead, more tactics/micro could be offloaded to the AI, letting humans focus more on strategy. That seems like it would be super cool.

Examples: "Here AI, I made two drop ships of marines. Take these to the main base and find an optimal place to drop them. If you encounter strong resistance or lots of static defense, just leave and come back home"

"Here AI, use these two drop ships of marines to distract while I use the main army to push the left flank. Take them into the main, natural, or 4th base -- goal is to keep them alive for as long as possible. Focus on critical infrastructure/workers where possible but mostly just keep them alive and moving around to distract the opponent."

0: Automaton 2000 AI perfectly controls 50-supply zerglings (2.5k mineral) vs. 60-supply (3k mineral, 2.5k gas) siege tanks: https://www.youtube.com/watch?v=IKVFZ28ybQs


> Alphastar also didn't play with the same limitations that a human has. Even after removing its ability to see the entire map and finally forcing it to scroll around, Alphastar never misclicks (so its APM==EPM), and it can still blast nearly unlimited APM in short bursts as long as its "average APM" over an x-second period matches a human's APM.

> I believe Alphastar would generate more interesting strategies if we limited it to a bit below human APM...

No, Alphastar definitely had misclicks, and it had a maximum cap on APM, regardless of average, that was far lower than the max APM (or even EPM) bursts of top players. When I have the time I can go dig up some games where Alphastar definitely misclicks, and I believe the DeepMind team has said before that it will misclick. Its APM limits are already lower than those of pros, both on average and in bursts, and this is reflected in its play: Alphastar will often mis-micro units in larger, more frantic battles, such as allowing disruptor shots to destroy its own units, but it will never make the same mistake with much smaller numbers of units.

> Currently its main strategy is "perfectly juggle stalkers"

Definitely not. That was its strategy in its early iterations against MaNa, and it is no longer feasible with the stricter limitations in place. Its Protoss strategy is significantly more advanced than that now (see its impressive series of games against Serral, with an amazing comeback here: https://www.youtube.com/watch?v=jELuQ6XEtEc and a powerful defense against multi-pronged aggression here: https://www.youtube.com/watch?v=C6qmPNyKRGw) (and of course by "now" I mean when DeepMind took it off the ladder). Both of these involve an eclectic mix of units, with Alphastar effectively using each type of unit and varying the mix in response to what Serral puts out and its own resource constraints.

A lot of commentators have difficulty distinguishing Alphastar from humans when the former plays as Protoss (its Terran and Zerg play is weaker and often more mechanical).

> I mean strategies that humans could learn to adopt.

My main takeaways from watching Alphastar were "pros undervalue static defense and often have a less-than-optimal number of workers" (where Alphastar's seeming overproduction of workers lets it shrug off aggressive harassment), but I don't know if those lessons have been picked up in the meta.


Why is it far worse with Terran and Zerg?


I don't know enough to answer "what mechanisms of how the AI works would cause it to play worse as Terran and Zerg."

If the question is rather "what characteristics of Alphastar's Terran and Zerg play style make me say that its Terran and Zerg play is worse than its Protoss play," the simplest answer is that Alphastar just feels a lot more like a bot. Unlike when playing Protoss, it seems to get into certain "ruts" of unit composition and tactics that are a bad match for the opponent it's facing and can't seem to reactively change based on how the game is going, whereas with Protoss it seems more than happy to change its play style over the course of the game based on what the opponent is doing.


Edit 2: Reading through the "supplementary data" of the 2019 paper, it definitely appears that the AlphaStar which reached grandmaster was not limited in the same ways as the 2017 paper would suggest. The x/y positions of units are not determined visually but fed directly from the API. So AlphaStar absolutely can just run Attack(Position: carrier_of_interest->pos.x) and not mis-click. Its "map"/"vision" is really just a bounding box of an array of every entity/unit on the map, plus all the things that a human would have to spend APM to manually check (precise energy level, precise hit points remaining, exact location of invisible units, etc.). See [7]. DeepMind showed they have some fixed time delays to emulate human experience; if they had a position fuzzer, they would have mentioned it. I'm reasonably convinced they gave AlphaStar huge advantages even in the 2019 version that was 'nerfed' from the 2018 version. The 2017 paper was a more ambitious project IMO that didn't quite get fully developed.
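To make that concrete, here is a rough fragment against the raw s2client-api (assumed to run inside an Agent callback, as in Blizzard's examples): every visible enemy's exact position, hit points, and energy is one call away, with no camera scrolling or clicking.

  // Every visible enemy unit, with exact stats, in a single call; no APM spent.
  for (const Unit* enemy : Observation()->GetUnits(Unit::Alliance::Enemy)) {
    std::cout << "enemy at (" << enemy->pos.x << ", " << enemy->pos.y << ")"
              << " hp=" << enemy->health << " energy=" << enemy->energy << "\n";
  }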

Edit: 7 minutes after writing this I re-read the original paper[-1]. Pages 6 and 7 of https://arxiv.org/pdf/1708.04782.pdf make it clear that DeepMind limited themselves to SpatialActions, so they cannot tell units "Attack Carrier" but have to say "Attack point x,y" (and x,y also has to be determined visually, not through carrier_of_interest->pos.x). It's still not clear from the paper whether any randomness is added to Attack(x,y).

Additionally, I have some serious concerns about assuming that the design decisions made in this 2017 paper were actually used in the implementation of the 2019 Alphastar demo vs TLO and MaNa. The paper claims "In all our RL experiments, we act every 8 game frames, equivalent to about 180 APM, which is a reasonable choice for intermediate players." I would agree with this choice! But [5][6] indicate that Alphastar's APM spiked to over 1500 in 2019! And even in the moments when a human reaches that APM, their EPM would be an order of magnitude lower, whereas Alphastar's EPM matches its APM.
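(For what it's worth, if I have the tick rate right, SC2 on "Faster" runs about 22.4 game frames per second, so acting every 8 frames works out to roughly 22.4 / 8 * 60 ≈ 170 actions per minute, consistent with the paper's "about 180 APM" and nowhere near 1500.)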

Original post:

Thank you so, so much for adding to the discussion! Would love to chat more about this if you see my reply and feel like it.

Regarding "mis-clicking", my understanding was that AlphaStar used Deepmind's PySC2[0][1], which in turn exposes Blizzard's SC2 API[2][3].

Here is the example for how to tell an SCV to build a supply depot:

  Actions()->UnitCommand(
    unit_to_build,
    ability_type_for_structure,
    Point2D(
      unit_to_build->pos.x + rx * 15.0f,
      unit_to_build->pos.y + ry * 15.0f
    )
  );
  
where unit_to_build->pos.x and unit_to_build->pos.y are the current position of the SCV and rx and ry are offsets. It's possible to fuzz this with some randomness, and indeed in the example, rx and ry are actually random (because the toy example just wants to create a supply depot in a truly random nearby spot; it doesn't care where). But the API doesn't attempt to "click" on an SCV, then use a hotkey, then "click" somewhere else. The API will never fail to select the correct SCV. It will also build precisely at the coordinates provided.

Point 1: Even if DeepMind added a fuzz to this method to make it so AlphaStar can "misclick" where the depot gets built, it cannot accidentally select the wrong SCV to build that depot. (Possibly wrong, as they could be using SpatialActions, see below)

Point 2: Most bot-makers wouldn't add a random fuzz to the depot placement coordinates to make their AI worse and I'd be super surprised if there was hard evidence somewhere that Alphastar had such a fuzz. (This is my main concern.)

My personal conclusion was that anything which looks like a "misclick" is, in fact, a "mis-decision". A human can decide "I want my marines to attack that carrier" but accidentally click the attack onto a nearby interceptor. I didn't think Alphastar could do that because I assumed it would use the Attack(Target: Unit) method instead of Attack(Target: Point) in that scenario -- and even if they used Attack(Target: Point) it would be used as Attack(Target: carrier->pos.x).
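If I'm reading the s2client-api right, that distinction maps onto two UnitCommand overloads, roughly like this (a sketch only; attacker, carrier_of_interest, dx, and dy are placeholders):

  // Unit-targeted: the API resolves the target directly, so the attack can
  // never land on a nearby interceptor by accident.
  Actions()->UnitCommand(attacker, ABILITY_ID::ATTACK, carrier_of_interest);

  // Point-targeted: attack-move toward the carrier's current coordinates; only
  // a deliberately added offset (dx, dy) could emulate a human misclick here.
  Actions()->UnitCommand(attacker, ABILITY_ID::ATTACK,
                         Point2D(carrier_of_interest->pos.x + dx,
                                 carrier_of_interest->pos.y + dy));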

However, I realize now that they could be doing everything with SpatialActions (edit: it does, see paper[-1] pp. 6-7) (select point, select rect)[4], and that they could have implemented a randomness layer to make AlphaStar literally mis-click.

I suppose I would need to test this API and dive into the replay files to first see if it's possible to discern the difference between Attack(Target: carrier_of_interest) and Attack(Target: carrier_of_interest->pos.x). Then, even if Alphastar is using the latter, it's still not clear that there's an additional element of randomness outside of the AI/ML's control.

Has anyone already done an analysis of the replay files on this level, or has DeepMind released hard info on how they're controlling the bot?

-1: https://arxiv.org/pdf/1708.04782.pdf

0: https://www.youtube.com/watch?v=-fKUyT14G-8

1: https://github.com/deepmind/pysc2

2: https://github.com/Blizzard/s2client-proto

3: https://blizzard.github.io/s2client-api/index.html

4: https://blizzard.github.io/s2client-api/structsc2_1_1_spatia...

5: https://www.alexirpan.com/2019/02/22/alphastar.html

6: https://deepmind.com/blog/article/alphastar-mastering-real-t...

7: https://ychai.uk/notes/2019/07/21/RL/DRL/Decipher-AlphaStar-...


I unfortunately don't have the time to look at the papers in detail (I could totally see how a lot of what I observed could happen even without intentional misclicks so I do take that back), but I want to point out that January 2019 Alphastar (in exhibition matches against TLO and MaNa) is significantly worse than Fall 2019 Alphastar. Alphastar changed very markedly between those time periods.

If you look at an Alphastar Protoss game from the latter half of 2019, it's not relying on cheap tricks to win (such as the impossible stalker micro). Nothing it's doing leaps out as superhuman. Instead it just grinds down its opponent through a superior sense of timing and macro strategy. In the two games I linked against Serral, it wins by punishing Serral when he overextends his reach or by altering its unit composition to better fit what Serral throws at it, rather than through some ungodly micro. Nothing it's doing there couldn't be done by a human. In fact, I would say that in most of the battles, Serral's micro was better than Alphastar's.

Now it's also worth pointing out that Serral was playing on an unfamiliar computer rather than his own, so there's a bit of a handicap going on, and even Alphastar Protoss will still lose to humans, so it's not superhuman. But it is definitely an elite player, and its play style is very difficult to distinguish from that of an elite human.


The search tree is huge in MtG. It has to be among the largest of any game. You can take actions all the time, there are triggers all the time, and you can stack your actions on top of your opponent's actions. Huge space, really.

And then of course it's also imperfect information, both in the sense of your opponent's hand and also their deck. The cardpool is also very large for some formats.

I actually don't think it's solvable just by throwing MCTS at it with today's hardware, but I would love to know more about this; if someone else has more insight, please reply.

EDIT: Oh and there is also the meta-game / deck building aspect. If you are going to win a tournament you have to have favorable matchups against most players in the room.


Search space size is no longer a great heuristic for how difficult a game is for the latest AI approaches. For example, an RTS game has an absolutely enormous search space as well (effectively, every one of several hundred units can move in any direction on every single tick of the game clock, many units have spells, and many spells are meant to stack with other spells), and Alphastar is a convincing demonstration that this is not out of the reach of current AIs. And you similarly have imperfect information, where you don't know what your opponent is doing unless they are sufficiently close to your current units.

Even the meta-game/deck-building aspect doesn't seem all that insurmountable, as it doesn't seem fundamentally different from, say, a build order, other than that it cannot change dynamically on the fly.


The search tree size isn't an insurmountable issue. For example, OpenAI Five managed to play DotA (though admittedly it relied heavily on perfect timing and the ability to look at every enemy on the minimap at the same time, and it lost a lot when people learned to play around the only strategy it knew). I think it'd be feasible to get an AI to play a modern deck.

I think it'd be much harder for the AI to do deck building in a vacuum. You could model it as every game starting with you building your deck, but I can't see that converging stably.


MCTS is usually paired up with Deep Learning. This doesn't appear to have any problems with games with even larger branching factors. Look up AlphaZero and AlphaStar.


An MtG limited player named Ryan Saxe created an AI that drafts and builds decks to a highly successful degree: https://github.com/RyanSaxe/mtg

He was able to reach Mythic, the highest ranking tier on Magic Arena. Of course, this is a different problem from actually playing the game (and probably significantly easier). That being said, this is one guy doing it as a side project with limited resources.

I imagine that MtG could be played quite successfully by an AI if someone were to dedicate the resources. IMO, much of the difficulty is in laying the groundwork: large amounts of data don't exist publicly, and laying the framework for a bot to play against itself would be quite difficult (and then the computation costs would be extremely high).


Wouldn’t the fact that the game is constantly changing, with new cards being added and old ones being removed, also make it harder to solve?


That's harder on humans than on computers. Computers have perfect information on what the legal cardpool is for the game they are playing.


Have people tried using GPT-3 for this?



