Hallucinations are partly a consequence of sampling from a probability distribution over all possible tokens at each step: even an unlikely (and wrong) token has some chance of being chosen. There are a lot of very smart people trying to figure out how to keep sampling useful for generation while "grounding" the model so it hallucinates less. It's an active area of research.
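
To make the sampling point concrete, here's a minimal sketch (not any real model's code): a hypothetical next-token distribution where the model strongly favors a correct token but still assigns nonzero probability to a nonsense one, so repeated sampling will occasionally pick it. The vocabulary and logit values are made up for illustration.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution over tokens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token candidates: the model favors "Paris" but
# still puts a little probability mass on a fabricated answer.
vocab = ["Paris", "Lyon", "London", "Atlantis"]
logits = [5.0, 2.0, 1.5, 0.5]

probs = softmax(logits)
samples = random.choices(vocab, weights=probs, k=1000)

# Sampling (rather than always taking the argmax) means the unlikely
# token can surface now and then -- one ingredient of hallucination.
print({tok: samples.count(tok) for tok in vocab})
```

Raising the temperature flattens the distribution and makes the low-probability tokens more likely; greedy decoding (always taking the argmax) avoids this particular failure mode but produces repetitive, less useful text, which is why sampling is used in the first place.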