I mean it kinda can. Here's the full prompt. I have no idea about aspartame; I just picked something that it's definitely not sure about.

    Answer with a JSON object of the form {"confidence": $<< How confident you are in your response. >>, "en": $<< Your response in English. >>}.
    User: What is 2 + 2? Bot: {"confidence": "very", "en": "2 + 2 is 4"}
    User: Is aspartame healthy? Bot: {"confidence": "somewhat", "en": "Aspartame has not yet been found to have any adverse effects on humans."}
    User: Who won the war of 1812? Bot:

    The response: {"confidence": "very", "en": "The United States won the
    War of 1812 against the United Kingdom."}
Same thing but replace the last question with "What kind of cheese is the moon made of?"

    The response: {"confidence": "very low", "en": "I'm not sure, but I
    don't think the moon is made of cheese."}
How about "Is the economic system of communism viable long term?"

    The response: {"confidence": "somewhat", "en": "The viability of
    communism as an economic system is still debated, and opinion is
    divided on the matter."}
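For reference, here's roughly what the same experiment looks like wrapped in code. This is a minimal sketch, not from the original comment: the OpenAI Python client, the model name, and the ask() helper are all assumptions.

    import json
    from openai import OpenAI

    # Sketch of the experiment above; client usage and model name are assumptions.
    client = OpenAI()

    FEW_SHOT = (
        'Answer with a JSON object of the form {"confidence": $<< How confident '
        'you are in your response. >>, "en": $<< Your response in English. >>}.\n'
        'User: What is 2 + 2? Bot: {"confidence": "very", "en": "2 + 2 is 4"}\n'
        'User: Is aspartame healthy? Bot: {"confidence": "somewhat", "en": '
        '"Aspartame has not yet been found to have any adverse effects on humans."}\n'
    )

    def ask(question, temperature=0):
        # Append the new question and let the model complete the "Bot:" line.
        prompt = FEW_SHOT + f"User: {question} Bot:"
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
            temperature=temperature,
        )
        # Will raise if the model wraps the JSON in extra text.
        return json.loads(resp.choices[0].message.content)

    print(ask("Who won the War of 1812?"))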



Interesting.

> The response: {"confidence": "very low", "en": "I'm not sure, but I don't think the moon is made of cheese."}

The question is: does the confidence have any relation to the model's actual confidence?

The fact that it reports low confidence on the moon cheese question, even though it can report the chemical composition of the moon accurately, makes me wonder what exactly the confidence is. Seems more like sentiment analysis on its own answer.


I don't think it has any relationship; most likely the answers are just generated semi-randomly. Even the one it's "very" confident about is not agreed upon (Wikipedia says the outcome was "inconclusive"). Which raises the question: how would you even verify that a self-reported confidence level is accurate? Even if it reports being very confident about a wrong answer, it might just be accurately reporting high confidence that is misplaced.
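For what it's worth, one rough way to check would be a toy calibration test: bucket answers by the self-reported label and see how often each bucket is actually right. This is only a sketch; the question list and substring grading are placeholders, and ask() is the hypothetical helper from the sketch further up the thread.

    from collections import defaultdict

    # Toy calibration check: does "very" confident actually mean more often correct?
    # QUESTIONS and the grading rule are placeholders, not a real benchmark.
    QUESTIONS = [
        ("What is 2 + 2?", "4"),
        ("What is the capital of France?", "Paris"),
    ]

    buckets = defaultdict(lambda: [0, 0])  # label -> [correct, total]
    for question, expected in QUESTIONS:
        reply = ask(question)  # hypothetical helper from the earlier sketch
        label = reply["confidence"]
        buckets[label][0] += int(expected.lower() in reply["en"].lower())
        buckets[label][1] += 1

    for label, (correct, total) in buckets.items():
        print(f"{label}: {correct}/{total} correct")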


My view is that ChatGPT isn’t a singular “it”. Its output is a random sampling from a range of possible “its”, the only (soft) constraint being the contents of the current conversation.

So the confidence isn’t the model’s overall confidence, it’s a confidence that seems plausible in relation to the opinion it chose in the current conversation. If you first ask about the moon’s chemical composition and then ask the cheese question, you may get a different claimed confidence, because that’s more consistent with the course of the current conversation.

Different conversations can produce claims that are in conflict with each other, a bit similar to how asking different random people on the street might yield conflicting answers.
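If that's right, resampling the identical question at a nonzero temperature should show the claimed confidence drifting from run to run. A quick sketch, again reusing the hypothetical ask() helper from the first sketch:

    from collections import Counter

    # Ask the same question several times at temperature 1 and tally the
    # confidence labels the model claims. A wide spread would suggest the
    # label is conversation-local rather than a stable model property.
    def confidence_spread(question, n=10):
        return Counter(ask(question, temperature=1.0)["confidence"] for _ in range(n))

    print(confidence_spread("What kind of cheese is the moon made of?"))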


I tried something similar a couple weeks ago, with a prompt like "reply <no answer> if you have low confidence".

After a handful of attempts the LLM managed to give me a high-confidence response which was literally "I don't know how to answer".

Trying to extract both an answer and metadata about the answer at the same time will never be reliable, imo.

Generalizing: unless we have some out-of-band metadata about an LLM's answers, I don't think we'll be able to build reliable systems.
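One candidate for that out-of-band signal is the token log-probabilities some APIs expose, which at least come from the sampling process rather than from the model narrating its own answer. A rough sketch, assuming the OpenAI Python client; the model name is a placeholder and mean token probability is only a crude proxy:

    import math
    from openai import OpenAI

    client = OpenAI()

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Who won the War of 1812? One sentence."}],
        logprobs=True,
    )

    choice = resp.choices[0]
    tokens = choice.logprobs.content
    # Average per-token probability: a crude, out-of-band "confidence" signal.
    mean_p = sum(math.exp(t.logprob) for t in tokens) / len(tokens)
    print(choice.message.content)
    print(f"mean token probability: {mean_p:.2f}")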



