Twine seems interesting, but it looks like these are mostly for helping writing out the branching bits of dialogue which would be mostly the LLM's work anyway. Guess some amount of reinventing wheels is gonna be necessary when adapting experiences for AI. Thanks anyways!
You might want to investigate MUDs ('Multiple User Dungeons') more closely. The rules of the game define the locations and items and such, but the character dialogue is between real people. By substituting LLMs for real players within the game, you may be able to enforce a greater level of consistency (the LLMs can't break the rules) and context (the MUD can usually describe one's entire state, which would allow you to prompt your LLM at the beginning of each turn with all the important facts).
I don't really have enough patience for MUDs myself, but they are a continually popular form of role-playing game since they were invented over 50 years ago.
I used to play MUD's as a kid! I've got an LLM powered CLI MUD game slow brewing in my noggin but haven't started on it yet. I did build a multi-player chatgpt powered discord text-adventure bot which I think I'll eventually try to convert into a shared-universe game. I think all you really need is a little bit of state (like, if you were to walk up to an auction house and ask to see the items just pull it from a db and inject into context).