What does it even mean to detect hallucinations? The AI doesn't say anything trivially false. While using GPT4 I have observed that it lies about simple things I didn't expect it to, while it does very well on complex things.
TLDR: It lies about fact-based information that is mentioned in only a handful of places on the internet and isn't repeated much. Short of having a human with the context, how do you even detect it?
Example: Ask it to describe a "Will and Grace" episode with some guest appearance. It will always make up everything, including the episode number and the plot, and the plot seems very believable. If you have not watched the episode and can't find a summary online, it is hard to tell that it is a lie.
There are several ways to detect hallucinations. Basically, either you have the ground-truth answers in an external database, in which case you compare the model's output to them, or you don't have the ground truth, in which case you need to do metamorphic testing. See this article on it: https://www.giskard.ai/knowledge/how-to-test-ml-models-4-met...
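For the ground-truth branch, the check can be as simple as fuzzy-matching the model's answer against a stored reference. A minimal sketch in Python, where the reference table, the question, and the similarity threshold are all just illustrative placeholders:

    from difflib import SequenceMatcher

    # Hedged sketch of the "compare to ground truth" case.
    # The reference table and threshold are purely illustrative.
    reference_answers = {
        "Who created Will and Grace?": "David Kohan and Max Mutchnick",
    }

    def looks_like_hallucination(question, model_answer, threshold=0.5):
        reference = reference_answers.get(question)
        if reference is None:
            return None  # no ground truth available, cannot decide
        similarity = SequenceMatcher(
            None, model_answer.lower(), reference.lower()
        ).ratio()
        return similarity < threshold

The obvious limitation, as the next comment points out, is that this only works for questions your database already covers.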
But GPT4 doesn't hallucinate about things that are popular enough to be repeated many times on the web as knowledge. It hallucinates about things that are much less likely to be repeated. That rules out an external database with true answers, unless the database is supposed to contain all information queryable in all ways, in which case the database is just a better version of GPT-X.
The metamorphic testing approach is interesting and might work.
I've been playing with GPT4 summarization of hard knowledge that has an external database with true answers that GPT knows about, and it's still hallucinating regularly.
Metamorphic testing seems to try to map an output of a model to a ground truth, which I guess is great if you have a database of all the known truths in the universe.
Not exactly: metamorphic testing does not need an oracle. That's actually the reason for its popularity in ML testing. It works by perturbing the input in a way that should produce a predictable variation of the output (or possibly no variation).
Take for example a credit scoring model: you can reasonably expect that if you increase liquidity, the credit score should not decrease. In general it is relatively easy to come up with a set of assumptions about the effect of a perturbation, which lets you evaluate the robustness of a model without knowing the exact ground truth.
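A minimal sketch of that credit-scoring relation, assuming a hypothetical scoring function and a "liquidity" feature; the point is that no true score is needed, only the expected direction of change:

    # Hedged sketch of a metamorphic test: increasing liquidity should not
    # decrease the credit score. The scoring function and feature names are
    # hypothetical stand-ins for a real model.
    def check_liquidity_monotonicity(score_fn, base_features, bump=1000.0):
        baseline = score_fn(base_features)
        perturbed = {**base_features,
                     "liquidity": base_features["liquidity"] + bump}
        # No ground-truth score is compared against, only the relation
        # between the original and perturbed predictions.
        return score_fn(perturbed) >= baseline

    # Toy model standing in for a real one, just to show the call shape:
    toy_model = lambda f: 600 + 0.01 * f["liquidity"] - 0.5 * f["debt_ratio"]
    print(check_liquidity_monotonicity(
        toy_model, {"liquidity": 5000.0, "debt_ratio": 30.0}))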
That is beside the point. My point is that detecting hallucinations seems like a very very hard problem.
The utility is there and has nothing to do with making up episodes instead of quoting a real one. For example, you can ask it to write new episodes with specific settings and specific constraints. Hallucination is not the value add. Nobody is excited because it hallucinates; people are excited despite it, because the rest of the value is so large.
But deliberately requesting and receiving content generation is altogether different from requesting a factual answer and receiving plausible-seeming nonsense. Or at least, it's different to the person asking; it's the same thing as far as the model is concerned.