Ask HN: Building a game for AI Research
121 points by keerthiko on May 4, 2018 | 36 comments
I'm working on a PC game as a personal project. I'm primarily a game designer and developer, but I'm interested in making my game friendly to AI research. I tried reading the TF/PyTorch docs but couldn't find a guide on how to make my game a friendly environment that exposes the necessary surface area for AI research.

I would love pointers from anyone who has experience building AI-facing API layers for video games (BWAPI, OpenAI's Dota team, DeepMind's SC2 AI, etc.).




From my viewpoint, as both a researcher and someone who has built frameworks around environments/games:

- Each step within the game has to be extremely fast, i.e. the game should run as fast as the machine allows while keeping physics etc. consistent.

- Runnable via library import such that there is no drawing to the screen.

- Should be easy to reset the environment to an initial state.

- RNG state should be seedable.

- I highly recommend supporting an interface identical to OpenAI's Gym; check their docs out (see the sketch after this list). Even better would be to have your game importable as an environment in Gym.

- Configurable screen resolution would be great (e.g. output 120x100)

- The environment is "hackable", e.g. the maps or levels can be modified or loaded, say via some ASCII map.

- Should support multiple copies of the game running at once.

- A nice-to-have would be if the current environment state could be exported and loaded later.

- Expose some information/signals such that a reward signal can be created. Or better yet, define one yourself as the game creator.
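A minimal sketch of what that Gym-style surface looks like, with everything game-specific stubbed out (the class name, observation shape, and action meanings are all made up for illustration):

    import gym
    import numpy as np
    from gym import spaces

    class MyGameEnv(gym.Env):
        """Hypothetical wrapper exposing a game through the Gym interface."""

        def __init__(self):
            self.action_space = spaces.Discrete(3)  # e.g. noop / left / right
            self.observation_space = spaces.Box(
                low=0, high=255, shape=(100, 120, 3), dtype=np.uint8)
            self._rng = np.random.RandomState()

        def seed(self, seed=None):
            self._rng = np.random.RandomState(seed)  # seedable RNG state
            return [seed]

        def reset(self):
            # Restore the initial state and return the first observation
            self._obs = np.zeros(self.observation_space.shape, dtype=np.uint8)
            return self._obs

        def step(self, action):
            # Advance the simulation one fixed tick; no drawing required
            reward, done, info = 0.0, False, {}
            return self._obs, reward, done, info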


Excellent list.

> - Should be easy to reset the environment to an initial state.

Adding on to that, the ability to rewind the game state is a pretty big deal.

The biggest deal for AI researchers, though, is that you implement a replay function and format, and publish lots of tooling around them (to read and parse them, etc.; at least in Python).

Also, if it's an online game, save the replays server-side and publish them somewhere. Kaggle will be happy to take them, I'm sure.
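If the engine is deterministic given a seed, a replay can be as small as the seed plus the per-tick action log. A sketch of such a format (all field names invented):

    import json

    replay = {
        "version": 1,
        "seed": 42,                    # restores the PRNG state
        "actions": [0, 2, 2, 1, 0],    # one action id per fixed game tick
        "result": {"laps": 3, "time_ms": 81250},
    }

    with open("episode_0001.replay.json", "w") as f:
        json.dump(replay, f)

    # The published tooling can then start out as a one-liner:
    with open("episode_0001.replay.json") as f:
        assert json.load(f)["actions"] == replay["actions"]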


> - I highly recommend supporting an interface identical to OpenAI's Gym; check their docs out. Even better would be to have your game importable as an environment in Gym.

> - Configurable screen resolution would be great (e.g. output 120x100)

I think both of these things assume you are going to be doing RL from pixels. To support a wider variety of RL/control research, you should be able to get the game state in a structured form, not just a flat vector the way Gym does it.
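For what it's worth, Gym can express structured state via spaces.Dict; something like this sketch (the fields are invented):

    import numpy as np
    from gym import spaces

    # Structured observation alongside (or instead of) raw pixels
    observation_space = spaces.Dict({
        "position": spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32),
        "velocity": spaces.Box(low=-10.0, high=10.0, shape=(2,), dtype=np.float32),
        "lap":      spaces.Discrete(10),
        "pixels":   spaces.Box(low=0, high=255, shape=(100, 120, 3), dtype=np.uint8),
    })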

But even then, that's still just one branch of AI research. I've seen people tune how games behave to maximize engagement, and in that setting just controlling the player is not enough. The work I saw looked at controlling level progression to increase engagement, but you could imagine controlling other bits of the game; this is particularly relevant if your game is not symmetric and the metric you care about is not just making the best AI.

Maybe not AI, but people also do research on how to replace components of games with ML components and the results can be pretty cool, e.g. https://www.youtube.com/watch?v=Ul0Gilv5wvY

Which is just to say that there is no one-size-fits-all approach here.


May I ask what you mean by "RNG state should be seedable"?


If the game depends on random events (e.g. an attack does random damage between 3 and 8), it would be useful to be able to get exactly the same randomness every run, at least when you want it.


Same randomness? I can't get the gist of the term.


In addition to the other explanation, check out today's NYT article on how one guy cracked the lottery because of pseudo-random behaviour in the lottery code.

https://www.nytimes.com/interactive/2018/05/03/magazine/mone...


Most random sources are PRNG rather than 'true' random sources, and sometimes it's useful (for debugging, for analysis or just for interest) to be able to use a predictable pattern of otherwise random numbers.

One way is to allow some way of 'seeding' the PRNG such that the order of the numbers it produces is the same each time, as we return the random function back to a known state.

Or, by example: I make 5 calls to the PRNG with seed value '0' and see the following: [5, 2, 9, 18, 4, ...], and that causes the agent I'm testing to do something utterly weird. I want to re-run my agent to observe the effect in detail and debug it, and for that to happen I need the same [5, 2, 9, 18, 4, ...] sequence; otherwise I'd be forced to run repeatedly until I observe the same glitch. By re-seeding the PRNG to '0', it will predictably return that sequence rather than a new, random one.


It's because most of the randomness used by software is actually pseudorandom. What that means is that you are actually using a defined sequence, whose behaviour is close enough, for the desired application, to what you'd get if you were picking random samples from a distribution.

The key difference is that it's reproducible and that if you have insight into the parameters of the sequence (e.g. the seed and the current position in the sequence), you can predict the results. That's why people often get upset when people use these pseudorandom number generators for security purposes.

The seed is a value that is used to generate the sequence. If you use the same seed, you get the same sequence.


Typically when you init a random generator, it'll let you pass a number in if you want to. That will set the sequence of "random" output from the generator; different seeds will be random with respect to each other. If you re-use the same seed you'll get the same sequence of "random" numbers as before. This is useful to test or re-try sequences involving "random" in a reproducible way.
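Concretely, in Python (using the damage-roll range from the example above):

    import random

    rng = random.Random(0)                         # seed the PRNG with 0
    first = [rng.randint(3, 8) for _ in range(5)]

    rng.seed(0)                                    # return it to the known state
    second = [rng.randint(3, 8) for _ in range(5)]

    assert first == second  # same seed, same "random" rolls, reproducible runs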


It has to be fast. Reinforcement learning takes 1000s of hours of game play for each experiment, and you want to do multiple experiments per day on a small cluster. The Atari games in OpenAI Gym run about 200x real-time per core, and the SNES games maybe 20x. Aim for something in that range. 1x is useless for research.

OpenAI Gym has a stable API for several video games. If you copy the API, researchers will be able to compare algorithms directly across games, which is valuable.


What does Dota 2 get? That's probably the metric to aim for, since OpenAI uses it themselves, and it's a modern 3D game.

They had to do a few hacks to get it working: https://blog.openai.com/more-on-dota-2/

> The first step in the project was figuring out how to run Dota 2 in the cloud on a physical GPU. The game gave an obscure error message on GPU cloud instances. But when starting it on Greg’s personal GPU desktop (which is the desktop brought onstage during the show), we noticed that Dota booted when the monitor was plugged in, but gave the same error message when unplugged. So we configured our cloud GPU instances to pretend there was a physical monitor attached.

> Dota didn’t support custom dedicated servers at the time, meaning that running scalably and without a GPU was possible only with very slow software rendering. We then created a shim to stub out most OpenGL calls, except the ones needed to boot.

Still, Dota uses simplified collision detection. It would be interesting to know how fast an actual physics simulation could run in headless mode.


I wish more people would use (or expose) the 0 A.D. engine (an open source Age of Empires-like game) in AI research, even though I know StarCraft 2 is where most of the focus is these days.

https://github.com/0ad/0ad/tree/master/binaries/data/mods/pu...

https://github.com/agentx-cgn/Hannibal

Nearly everything bullet-pointed in that list is there.

Edit: Should also mention there is the Halite AI challenge, https://halite.io


I have been working on a simple project for evolving 5v5 ship battles similar to Asteroids. My advice is to make it easy to serialize the game state (see the sketch below). 99% of the simulations will do the same thing or waste 1000s of frames getting to the "important part" of the match, and the ability to save the game state and re-play it from a specific position can be very beneficial.
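A sketch of what that can look like if the state lives in plain Python objects (the class here is a stand-in for whatever your engine actually tracks):

    import pickle

    class GameState:
        """Stand-in for the real thing: positions, hp, timers, PRNG state..."""
        def __init__(self):
            self.tick = 0
            self.ships = []

    state = GameState()

    # Snapshot just before the "important part" of the match...
    snapshot = pickle.dumps(state)

    # ...then restore and branch from there as many times as needed
    restored = pickle.loads(snapshot)
    assert restored.tick == state.tick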

However, from what I understand, this assumes your game is a simple Markov process, and not all games can be expressed like that. (In my case I can only use this approach if I don't allow the neural networks to keep memory between frames.)

Another benefit is being able to debug in real time how decisions are being calculated by the model, e.g. a neural network.


Sounds like Netrek, which is 8v8 and in space (but 2D).


Lots of great suggestions in here!

The General Video Game AI (GVGAI) environment has also just released its own Gym for training, with the competition set to run through CIG'18 in July 2018:

https://github.com/rubenrtorrado/GVGAI_GYM/

IEEE Conference on Computational Intelligence and Games Competitions

https://project.dke.maastrichtuniversity.nl/cig2018/?page_id...

It's okay to start out small. It's not necessary to clone StarCraft to investigate high-dimensional game state spaces. Remember that a simple 2D board game like Go was until recently considered unbeatable ;)

Take a look at games like Generals or Tron Light Cycles. Best of luck!

http://dev.generals.io/

https://web.stanford.edu/~jbboin/doc/ai_lightcycle.pdf


>"I tried reading TF/PyTorch docs but couldn't find a guide on how to make my game a friendly environment"

Are you aware of 'ELF' and 'TorchCraft' by Facebook (creators of PyTorch)?

From https://facebook.ai/developers/tools/elf :

"ELF provides an end-to-end solution for game research. It includes miniature real-time strategy game environments, concurrent simulation, distributed training over thousands of machines, intuitive APIs, web-based visualizations, and a reinforcement learning framework powered by PyTorch."

From https://facebook.ai/developers/tools/torchcraft :

"TorchCraft is a Brood War API (BWAPI) module that sends StarCraft data out over a ZMQ connection, allowing researchers to parse StarCraft data and interact with BWAPI from C++, Python (PyTorch-friendly), and Lua (Torch-friendly)."


Thanks for mentioning these!

I started exploring this after seeing TorchCraft on HN yesterday (https://news.ycombinator.com/item?id=16979136) and I worked on a BWAPI AI back in college myself (2009).

I took a look at ELF hoping it was a library or module I could integrate into my project to make it work with PyTorch, but it turned out to be its own gym, complete with full game examples.


Something perfect here is OpenAI's Universe [1], which enables you to strap a uniform AI interface onto your project. Here [2] is their systems page for further information/support.

[1] https://github.com/openai/universe

[2] https://openai.com/systems/


Nice, I didn't know about Universe; I was only familiar with the gym and their DOTA 2 scaffolding (which was a gigantic engineering process, as I learned from https://blog.ycombinator.com/building-dota-bots-that-beat-pr... )


If you're using Unity, I would recommend that you check out Unity Machine-Learning Agents!

https://github.com/Unity-Technologies/ml-agents

It makes it really easy to make games for reinforcement learning. I worked on it a bit over the summer, so if you have any questions, feel free to reach out to me.


We do indeed use Unity! Somehow I was aware of ml-agents, but didn't realize it was a "gymification" library. I thought it was a framework for running stuff within the Unity editor, but seeing that it allows things to be run from a normal Python environment, that's awesome. Thanks for making me look at the README and documentation more closely.

And here I was thinking I would need to write a native DLL wrapper, or wait till my next game project, to include the necessary bits from the outside.


- Complete game state should be compactly represented; ultimately it will need to be in a form that can be fed into an NN (see the sketch after this list)

- Player actions should be enumerable (if there are discrete actions)

- Score is the reward; it should be easy to access directly

- The game will be much easier to learn if there are shorter goals with more frequent score feedback; long puzzles with very sparse reward are still challenging for RL
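For the first two points, a sketch of what "compact and enumerable" might mean in practice (all fields and action names invented):

    import numpy as np

    ACTIONS = ["noop", "thrust", "turn_left", "turn_right", "fire"]  # enumerable

    def encode_state(ship, opponents):
        """Flatten hypothetical game objects into an NN-ready feature vector."""
        feats = [ship.x, ship.y, ship.vx, ship.vy, ship.hp]
        for o in opponents:
            feats.extend([o.x - ship.x, o.y - ship.y])  # relative positions
        return np.asarray(feats, dtype=np.float32)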


Funnily enough, our game has these properties as part of its core game design, independent of the AI-friendly aspirations. It's a puzzle-based anti-grav racing game where you try to complete N laps in the fastest time possible. When I was playtesting it I realized: "huh, there are only 2 buttons involved, the game state exists as a 2D bitmap + a few vectors we can serialize to disk, there are predefined deterministic victory parameters (times for medals), and the randomness is 0 outside of physics. This should be an ideal game for AI training."

While I'm sure I'm missing some knowledge on optimizing the design, what I'm more interested in learning about are pointers to the technical frameworks/API implementation details needed to be friendly to tools like TF/PyTorch, so developers can train with the game.


I'm working on an RL framework for turn-based games; my aim was to learn about self-play and RL.

I found that if you structure the human controller in the exact same way as the AI controllers, you can swap them easily. So I have Agent (abstract), HumanAgent (takes keyboard input), and DqnAgent (DQN learning agent) as the different controllers, and the rest of the code is agnostic to the controller. With this setup you can also do things like record your own gameplay.
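Roughly this shape, if sketched out (the model and env calls are placeholders):

    from abc import ABC, abstractmethod

    class Agent(ABC):
        @abstractmethod
        def act(self, observation):
            """Return an action for the current observation."""

    class HumanAgent(Agent):
        def act(self, observation):
            return int(input("action id: "))       # keyboard input goes here

    class DqnAgent(Agent):
        def __init__(self, model):
            self.model = model                     # hypothetical trained network
        def act(self, observation):
            return self.model.predict(observation)

    def play(env, agent):
        # The game loop never needs to know which kind of agent it holds
        obs, done = env.reset(), False
        while not done:
            obs, reward, done, info = env.step(agent.act(obs))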

If your goal is running the track in minimal time, you could reward it at the end (reward = -1 * elapsed_time), or as you go (reward = current_speed), or once each lap, etc. These sound similar but may have different training properties. So maybe plan to explore your reward-shaping space a bit.
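The three schemes just mentioned, written out (the names are illustrative):

    def terminal_reward(elapsed_time, done):
        return -elapsed_time if done else 0.0        # only at the finish line

    def dense_reward(current_speed):
        return current_speed                         # every tick

    def per_lap_reward(lap_completed, lap_time):
        return -lap_time if lap_completed else 0.0   # once per lap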


There are plenty of great answers here already, but what I think would help improve the state of the art would be configurable reward sparsity and configurable priors.

- Reward sparsity means that an easier version of the game gives a score after very few moves, but a more difficult version might take many, many moves (see the sketch after this list). It would be useful to see how agents compare against humans as the rewards get sparser.

- Configurable priors are vaguer; the idea is to make it easier for the game to match expectations from prior experience in other games. E.g. researchers could overlay a custom texture on enemies so they look like enemies from other games, to test and develop better transfer-learning algorithms.
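Reward sparsity in particular is easy to make configurable if you already have a Gym-style step(); one way, sketched as a wrapper (the class and its `period` knob are hypothetical):

    import gym

    class SparseRewardWrapper(gym.Wrapper):
        """Withhold reward, emitting the accumulated total every `period` steps."""

        def __init__(self, env, period=10):
            super().__init__(env)
            self.period = period
            self._acc, self._t = 0.0, 0

        def reset(self, **kwargs):
            self._acc, self._t = 0.0, 0
            return self.env.reset(**kwargs)

        def step(self, action):
            obs, reward, done, info = self.env.step(action)
            self._acc += reward
            self._t += 1
            out = 0.0
            if done or self._t % self.period == 0:
                out, self._acc = self._acc, 0.0      # release the stored reward
            return obs, out, done, info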


As an alternative to OpenAI's tools, Facebook open-sourced their AI gameplay interface just days ago.

https://code.facebook.com/posts/132985767285406/introducing-...


I would love to see a game that needs cooperation between x agents to win.

I feel like playing as a team in an imperfect-information environment is one of the most interesting AI challenges out there.

Imagine a 2D CS:GO with no need for manual skill, no aim or reflexes, just player placement, crosshair (view) placement, and teamplay.

If you know a game like this, please let me know; if there isn't one, I will build it one day.


Do you know the game Netrek? It's an 8v8 game that relies heavily on teamplay. It was one of the first real-time games to be played over the internet. It also has this imperfect-information aspect (fog of war, inaccurate display of cloaked ships, the need to scout planets). Free and open source. http://netrek.org


Sounds like the RoboCup Simulation League comes very close.


Could this Civilization clone have useful ideas for you?

https://en.wikipedia.org/wiki/C-evo


Look at Unity ML agents if you haven't already.


Not exactly what you asked for, but if you're looking for a gentle academic introduction to the intersection of AI and games, I recommend [AI Researchers, Video Games Are Your Friends!](https://arxiv.org/abs/1612.01608)


By fun coincidence, one of my personal projects has been the opposite: trying to build AIs that can experiment with generalized games.

The needs of a project like mine are going to deviate from other suggestions on here because I'm explicitly not working with neural networks or genetic algorithms. The lists other people have put up (esp nrmm) are really good for those, which is not to say mine won't have any overlap.

Given the choice between supporting something on my list or someone else's, you should definitely prioritize someone else's. If you want a lot of people to do research with your game, optimize for ML.

That being said, because I'm greedy, stuff that matters specifically to me, in order of importance:

- Encapsulated State: I want to be able to get raw access to the game's internal memory, especially through an API (see the sketch after this list). A byte array is fine; labeled is also good. Basically, any variables or memory that are used in game logic should be exposed. If possible, this state should be separated from any state information you're using for graphics, sound, etc...

Note that this does not mean you should encapsulate your entire game into 20 or 30 basic variables and just expose that. That's cheating. Classification of a game state is as important as building a strategy, so the point is to provide your state at a (relatively) low level with only a few abstractions, or at least abstractions that can easily be stripped out for training.

- Headless Mode: I want to be able to run the game on a server or in a low-resource environment by disabling as much of the graphics stack as possible. Linux support is also obviously useful, since the majority of the server world is using it.

- API-driven input: I want to be able to control the game without mocking button presses on a keyboard (see the sketch after this list). It's an extra plus if your game is controllable via a scripting language like JavaScript; it's trivial for someone to build their own adapter if they have something to call into.

- Serializing/loading state: Emulators are great for AI research because you can export the entire state of the game and reload it later for testing/comparing results. This isn't essential, just really, really useful.

- Easily recordable output: This is a tiny thing that just makes life way easier. If you can pipe your game's output to a video stream without recording it off a desktop (i.e., can I stream your game from a headless server?), then it just makes it a lot more attractive for building demos or toys.

Loads of bonus points if your graphical display can be run as a separate process from your game logic. More bonus points if they can communicate over a network. All of the bonus points if your graphics engine is embeddable in a web browser or is as scriptable as your game logic.

And some stuff I don't really care about:

- Synchronous game logic: Most games try to keep their logic divided into synchronous chunks that are executed every frame. This is not particularly important for the types of research that I'm doing; async code is fine. Your game will probably be running on a remote server anyway, so someone in my position is hardly ever going to be looking at every single frame in series.

- Seedable RNG: It's nice, I guess. But unless someone is running your game locally and interacting with it via synchronous steps, they're going to be dropping random frames anyway. So it doesn't really matter.

- Extreme optimizations: I'm overstating this a bit - obviously your game needs to be optimized. If your game requires a GPU to run at a decent clip, that's a problem. BUT, if your game is fast enough to run reasonably well on limited hardware (especially in a streaming/headless mode), that's good enough. Aim to support single-core hardware and small dedicated devices like the Raspberry Pi.

This is terrible advice if you want to support reinforcement learning. In all likelihood you should optimize the crap out of your game for that. But, for the type of research I'm doing, I will practically never be running a game at more than 1-2x speed.

Portability and elegance > raw speed, unless you care about training a genetic algorithm or something.

- A fitness function: Honestly, probably not even the traditional ML crowd needs one. If you expose your game's state in a way that they can consume, then anyone can write their own fitness function on top of it. You don't hurt anything by adding one though, I guess.
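To make "Encapsulated State" and "API-driven input" concrete, here's a sketch of a minimal state/input API spoken over a local socket as line-delimited JSON. Every name and message shape here is invented, and socket.create_server needs Python 3.8+; the point is the shape of the thing, not the specifics:

    import json
    import socket

    def get_state(game):
        """Labeled low-level state: the variables game logic actually uses."""
        return {
            "tick": game.tick,
            "entities": [{"id": e.id, "x": e.x, "y": e.y, "hp": e.hp}
                         for e in game.entities],
        }

    def serve(game, host="127.0.0.1", port=5555):
        """Accept commands like {"cmd": "press", "button": "left"} or {"cmd": "state"}."""
        srv = socket.create_server((host, port))
        conn, _ = srv.accept()
        with conn, conn.makefile("rw") as stream:
            for line in stream:
                msg = json.loads(line)
                if msg["cmd"] == "state":
                    stream.write(json.dumps(get_state(game)) + "\n")
                    stream.flush()
                elif msg["cmd"] == "press":
                    game.apply_input(msg["button"])   # hypothetical engine hook
                elif msg["cmd"] == "quit":
                    break
        srv.close()

Anything that can open a socket (a Python training loop, a JS script, a shell one-liner) can then drive the game without touching the keyboard.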


Your project sounds exactly like what I envisioned myself dedicating my life to if I had gone to grad school (ah, another life) instead of starting a company after college. These are all the requirements I try to set for my game designs and gameplay implementations, although I appreciate the details from someone in the weeds actually building the project.

What I'm trying to gain more information on right now, though, are the technical details of how to modify an existing game project to provide all this, without building the game from scratch entirely inside a proprietary AI game-gym framework.

A note on recordable output: it actually runs slower than just playing the game, because you usually have to do video encoding and file I/O in real time, which is slower than the RAM -> graphics card -> display pipe.


You should look at Halite by Two Sigma.



