This is fun. It would be interesting to build a single graph of concepts that all users contribute to. Then you wouldn't have to run LLM inference on every request, only on novel combinations, and you could publish the complete graph, which would be something like an embedding space.
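A minimal sketch of what that shared graph might look like, assuming a hypothetical `combine_with_llm()` inference call and a plain dict standing in for whatever persistent store all users would actually write to:

```python
from typing import Dict, Tuple

# Shared graph: (concept_a, concept_b) -> resulting concept.
# Pairs are sorted so "Fire + Water" and "Water + Fire" share one entry.
concept_graph: Dict[Tuple[str, str], str] = {}

def combine_with_llm(a: str, b: str) -> str:
    """Placeholder for the actual LLM inference call (assumption)."""
    raise NotImplementedError

def combine(a: str, b: str) -> str:
    key = tuple(sorted((a, b)))
    if key not in concept_graph:
        # Only novel combinations hit the LLM; everything
        # already contributed is served from the shared graph.
        concept_graph[key] = combine_with_llm(a, b)
    return concept_graph[key]
```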
I just went back and tried some new combinations with early ones, and I'm still getting intermittent delays even though all of the early combinations must already have been computed. So I assume part of this is just the server itself being a little overloaded, and that even responses cached remotely but not locally can experience delays.
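That would be consistent with a two-tier lookup like the sketch below, where a remote cache hit still pays the network round-trip (and any server load on top of it); `fetch_remote()` is a hypothetical stand-in for the server call:

```python
local_cache: Dict[Tuple[str, str], str] = {}

def fetch_remote(key: Tuple[str, str]) -> str:
    """Placeholder for the server round-trip (assumption)."""
    raise NotImplementedError

def lookup(a: str, b: str) -> str:
    key = tuple(sorted((a, b)))
    if key in local_cache:       # fast path: no network at all
        return local_cache[key]
    result = fetch_remote(key)   # remote hit still costs a round-trip,
    local_cache[key] = result    # which an overloaded server slows further
    return result
```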