- The 1v1 bot played at The International used a special creep block reward (and a big if statement separating that part of the agent from the self-play trained part). It trained for two weeks.
- A 2v2 bot discovered creep blocking on its own, no special reward. It trained for four weeks.
- OpenAI Five does not have a creep blocking reward, but neither (to our knowledge) does it creep block currently. Trained for 19 days!
I see. Thanks! So it manages to win lanes without even creep blocking? That's quite good. Any chance you could share the last hits @ 10 mins for the games it has played (for both bots and humans)? I think that's a crucial number to judge how OpenAI Five is winning its games.
I believe the article said that Blitz rated the bot's last-hitting at about average for humans, although he might be overrating how well an average human player last-hits.
Yeah, he might be overestimating 2.5k MMR players, and there's also something to be said about the consistency with which the bot last-hits. A human player would have a high variance in last-hit performance, while the bot will probably guarantee a minimum amount, thus ensuring a minimum set of items needed for the mid-game transition.
But my larger point is that the early game doesn't have a lot of strategic elements in it. You have to last-hit, not die, harass your opponent, and get items. You can pretty much play it by the book. The challenge in the early game is handling five different things at the same time. So there's never really a question of what to do, but doing it does require mechanical prowess, which is exactly where we know bots can easily outperform humans.
The chosen team composition is heavily oriented toward early-game snowballing. So is the bot winning simply due to mechanical superiority and an early-game advantage? Access to last hits @ 10 mins, plus gold and net worth graphs, would allow us to answer that question.