
Yes... but we humans generally have to explain ourselves. At the societal level, it's usually not good enough for someone to make an arbitrary decision; they have to justify it with the reasoning that led to it. The better the reasoning, the more other people believe them.

Whereas with regard to tech, society is currently at the level of "computer says foo" (so it must be true). Requiring explanatory output to justify decisions is a likely path forward for making the use of LLMs accountable. But given how little success we've had holding traditional human-guided tech companies accountable for things like sharing scoring formulas and abusing personal information, I'm not hopeful.




It is well established, with experimental evidence, that when people justify their decisions with the reasoning that led to them, that reasoning does not necessarily have anything to do with the actual causes of the decision. Even ignoring lying, we do not have privileged insight into all the factors our brain used to make a decision. When asked to justify any decision that wasn't 100% the result of explicit rational planning (which is almost every decision, as even high-planning decisions are demonstrably influenced by non-rational factors, intuition, etc.), we use our social skills to invent a post-hoc rationale that both sounds plausible and is socially acceptable - just as language models can do. We often deceive ourselves; we have biases that discourage us from admitting and acknowledging certain reasons for why we did or do something, and in some cases extensive therapy can help us recognize that a justification or reason we gave others and ourselves was a false one.

In essence, dealing with plausible but potentially misleading justifications is something we have always had to do, and we will have to keep doing it with future artificial agents as well.


I fully agree [0]. And I can see how my original comment was worded such that you'd think I wasn't taking that into account ("the reasoning").

When using ChatGPT, I can't help but think "this sounds like a high school essay." But really, it's not. Rather, it's someone with college+ level reading and study who has never had their ideas tested or scrutinized. A ChatGPT user mitigates this somewhat by asking for clarification, which is a form of learning within the specific session. But in the real world, ChatGPT-as-student would be adjusted by that feedback and then incorporate it into its overall model before serving the next user.

[0] For example, the SSC post about "predictive processing" resonated strongly with me


If they just add a feedback loop, the output will improve a hundredfold in a very short time, because then people, rather than endless unstructured input, will suddenly guide the answers towards the expected outcome. That is definitely still a very far cry from actual intelligence, but it may well become a lot more useful that way.


I feel you are making a valid point here, but I have to demur where you say “…just as language models can do.” Current LLMs do not appear to have any concept of self, let alone a theory of mind - the notion that other people think of themselves the same way one thinks about oneself. (In fact, they don't appear to have any concept of language as being about an external world that exists independently of what is said about it.)

An LLM can sometimes be prompted to give a response that looks like an explanation of how it came up with a previous response, but it is important to realize that the actual process by which it produced the original response bears no relationship to the “explanation” it gives. Because everyone's experience of language has been with human-generated language - which is often not completely rational but, when it isn't, deviates in ways we intuitively understand - it is difficult to see the responses of current LLMs as fundamentally different, even when we know they are.


Your intuitive understanding is an illusion.


What you describe is quite different: humans may well mislead themselves or others when there is a reason to do so. But most of the time, if there is no cost or benefit attached to an explanation, a human is fairly good (with great variability across people and contexts) at assessing why they did or said something, or why they reached a conclusion - and they can get better at this with practice.

The fact that the brain reaches decisions before we are consciously aware of them does not mean it fails to tag those decisions with motives and other information needed to explain them. Obviously we have to trust this information, and it may sometimes be wrong, but it has a pretty good track record, and there is no equivalent facility in AI today.

This holds especially true for the kinds of actions taken by AI today, which are usually deliberate thought processes for humans.

This isn’t the same thing as saying all actions are explainable, but it is quite different from being a black box. In general, one only needs to question a person's explanations if there is a good objective reason to do so.

> It is well established, with experimental evidence, that when people justify their decisions with the reasoning that led to them, that reasoning does not necessarily have anything to do with the actual causes of the decision

Out of interest, what’s the highest quality evidence you are aware of in this area?




