That's not a fundamental limitation of the models, even if it's present in the products running on those models — if you want to populate a database from an LLM, you can constrain the output at each step to be only from the subset of tokens which would be valid at that point.