ChatGPT is powerful, but it can give you different answers to the same question from one session to the next. Research has also found that its overall performance can drift over time, sometimes for the worse. So you may want to host your own LLM for reproducibility.
I have not tried public LLMs myself. Do they give reproducible results?
If you fix the random number seed, virtually all LLMs can be made deterministic. However, a difference of just one token in the input can produce a very different output, depending on the sampler, the model, and so on. So LLMs can be deterministic, but in practice they are pure alchemy.
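To illustrate both points, here is a toy sketch (not a real LLM; the "model" is just fake logits made up for the demo): a softmax sampler driven by a seeded RNG is fully reproducible, yet changing one input token changes every downstream sampling step.

```python
import math
import random

def sample_next(logits, rng, temperature=1.0):
    """Softmax-sample a token index from raw logits using a seeded RNG."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    r = rng.random()
    acc = 0.0
    for i, e in enumerate(exps):
        acc += e / total
        if r <= acc:
            return i
    return len(exps) - 1

def generate(prompt_tokens, steps=5, seed=0):
    """Toy 'generation loop': logits depend on all previous tokens,
    sampling is driven by a seed-fixed RNG."""
    rng = random.Random(seed)
    tokens = list(prompt_tokens)
    for _ in range(steps):
        state = sum(tokens)
        # Fake next-token logits that depend on every token so far
        logits = [((state * (i + 3)) % 7) - 3 for i in range(10)]
        tokens.append(sample_next(logits, rng))
    return tokens

a = generate([1, 2, 3], seed=42)
b = generate([1, 2, 3], seed=42)
c = generate([1, 2, 4], seed=42)  # one input token changed
print(a == b)  # same seed, same input: identical output
print(a == c)  # one-token difference: the continuation diverges
```

With a real stack the same idea applies: fix the seed and use greedy decoding (e.g. `do_sample=False` in Hugging Face `transformers`), but note that GPU kernels can still introduce small nondeterminism on some hardware.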