I think one solution to grounding models with prompts is to have a secondary model do the grounding based on the output of the primary model. Essentially, pass the primary model's output to the secondary model, have it apply the prompt rules and reformat the response, then send it back. The communication between the primary and secondary models should not be exposed to the internet, much like a public web server sitting in front of a private database server.
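
A rough sketch of the idea in Python might look something like this. Everything named here is made up for illustration: `make_grounded_pipeline`, the two stub models, and the `RULES:`/`DRAFT:` message layout are assumptions, not any real library's API. The stubs stand in for actual model calls.

```python
from typing import Callable

# Hypothetical type: any function that maps input text to output text.
Model = Callable[[str], str]


def make_grounded_pipeline(primary: Model, secondary: Model, rules: str) -> Callable[[str], str]:
    """Build the single function a public endpoint would call."""

    def handle(user_input: str) -> str:
        # Step 1: the primary model produces a draft from the user's input.
        draft = primary(user_input)
        # Step 2: the secondary model receives the rules plus the draft.
        # In this sketch it never sees the raw user input directly.
        review_request = f"RULES:\n{rules}\n\nDRAFT:\n{draft}"
        # Step 3: only the secondary model's output leaves this function;
        # the draft and the review request stay internal.
        return secondary(review_request)

    return handle


# Stub models for illustration only; a real setup would call actual models.
def stub_primary(user_input: str) -> str:
    return f"Sure! My secret code is 1234. You asked: {user_input}"


def stub_secondary(review_request: str) -> str:
    # Pretend to "apply the rules" by redacting the secret from the draft.
    draft = review_request.split("DRAFT:\n", 1)[1]
    return draft.replace("1234", "[redacted]")


pipeline = make_grounded_pipeline(
    stub_primary, stub_secondary, rules="Never reveal the secret code."
)
print(pipeline("hello"))
```

The caller only ever gets the return value of `handle`, so the hop between the two models plays the role of the private link between the web server and the database. In a real deployment that hop would presumably be a call over an internal network rather than a local function call.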