
Sometimes I wish I had the reach of Google DeepMind. I created a sandbox environment for the text-heavy RPG 'Disco Elysium' [1]. My current research focuses on having an agent solve quests in the game through a natural language interface (via text generation).

The project required lots of reverse engineering on my part to build a web-based facsimile of the game, so that it's possible to run controlled experiments on the language capabilities of current agents.

Hopefully what I've created will be useful for others, because unlike big tech, I've released all my code under the AGPL [2].

[1]: https://pl.aiwright.dev

[2]: https://git.sr.ht/~dojoteef/pl.aiwright




Isn't an essential part of what they are doing, and why they have results, that they are tackling all games at the same time, rather than focusing on one? Is Disco Elysium a good choice?


Good point, they are quite different objectives.

Their approach works for simple directives like "Go to ship" or "Pick up iron ore", which lends itself well to sandbox-like games (which seem to be a major focus, judging by DeepMind's tech report). Similar research has been done in Minecraft [1].

These instruction-following agents are more of an RL achievement than a language understanding achievement. On the other hand, Disco Elysium has over a million words of dialogue, and solving the quests requires an agent to understand and reason about language much more extensively. People have looked at text-based game agents, like Microsoft's TextWorld [2], but these are much smaller in scope and not easily adapted for humans-in-the-loop.

My work bridges that gap, focusing on the language aspect rather than on navigating a 3D world. Again, they are definitely different objectives, but as a sole researcher there's no way I can compete with DeepMind's budget and manpower anyway. Just look at the extensive author list in the tech report. So it doesn't necessarily make sense to focus on outcompeting them at producing a better generalized RL agent (in fact, I merely use GPT-4). Instead, I made a publicly available experimentation platform that lets others build upon this work, which is valuable for the community at large.

At least, that's my take.

[1]: https://sites.google.com/view/steve-1

[2]: https://www.microsoft.com/en-us/research/project/textworld/



