> I’d also note this isn’t confidence in the answer but in the token prediction.
I really don't understand the distinction you're trying to make here. Nor do I understand how you define "computable confidence" - when you ask an LLM to give you a confidence value, it is indeed computed. (It may not be the value you want, but... it exists)
The assertion that token likelihood is some sort of accuracy metric is false. More traditional AI techniques do compute probabilistic reasoning scores that genuinely are likelihoods of accuracy; token likelihood isn't one of them.
I’d note you can’t ask an LLM for a confidence value and get an answer that isn’t total nonsense. The likelihood scores for token predictions given prior tokens aren’t directly accessible to the LLM, and they aren’t intrinsically meaningful in the way people hope they might be anyway. A model can quite confidently produce nonsense with a high likelihood score.
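To make that concrete, here’s a rough sketch (assuming Hugging Face transformers and the small gpt2 checkpoint, both purely as stand-ins) of where those per-token likelihoods actually live: in the decoding loop, visible to whoever runs the model, not to the model’s own generated text.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "The capital of Australia is"
    inputs = tok(prompt, return_tensors="pt")

    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=5,
            do_sample=False,
            output_scores=True,
            return_dict_in_generate=True,
        )

    # Per-step log-probabilities of the chosen tokens: log P(token_t | tokens_<t).
    # These describe the sampling process, not the truth of the resulting claim,
    # and nothing in the generated text has access to them.
    gen_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
    for score, tok_id in zip(out.scores, gen_tokens):
        logprob = torch.log_softmax(score[0], dim=-1)[tok_id]
        print(tok.decode(tok_id), float(logprob))

A confidently stated wrong answer can come out of that loop with very high per-token log-probabilities, which is exactly why they shouldn’t be read as answer-level confidence.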