
Yes, you can stretch your model to try to explain why humans go to work.

In reality, people do not need to go to work to “not starve to death” as you say. There are a myriad of ways to survive without working a daily job.

Humans have to be socialized and trained to work a 9 to 5 job - there’s an entire education system structured to help create humans who view that as an acceptable goal.

No, what you are saying may not be controversial in your little community, but the AI panic is mostly isolated to a small community in a small corner of the USA.




Between "I go to the grocery store because I've been socialized and trained that going to the grocery store regularly is a Good Thing, and I have adopted it as a terminal goal to which I know I should dedicate efforts", vs "I go to the grocery store because I'm out of carrots, my dinner recipe calls for carrots, and I think I can get some there", the latter model is not the one that strikes me as being stretched to explain human behaviour.

As for there being "a myriad of ways to survive without working a daily job", congratulations! Your intelligence has allowed you to identify alternative instrumental goals that provide a path to your terminal goal; now you can rank them and choose the best option. You can also grow carrots in the garden or ask your neighbour if they have any, or ask your spouse to pick some up on the way home. Your intelligence will do the work and find a way. But your intelligence isn't what will guide you toward preferring carrot soup over parsnip soup, and preferring parsnip soup over fasting.

I'm not American and not in the USA, btw.


You’re arguing again from a position that assumes that entities have clearly defined terminal versus instrumental goals - which is precisely the position I reject. For instance, in your example of “terminal goal is groceries” versus “terminal goal is hunger”, neither of these describes how actual humans make decisions. Instead there’s a process, part biological and part environmental. The human checks the fridge, then thinks “oh, I feel hungry”. Is hunger the terminal goal or the fridge? That question doesn’t even make sense - it’s an interaction between the agent and the environment. Do Pavlov’s dogs have a terminal goal of “salivating to bells” or “salivating to food”? Again the question doesn’t make sense - the agent has built a habit from within a certain environment and the salivating is not goal directed. That’s why training works for dogs, and putting the fridge out of sight reduces hunger in humans.

Think more carefully about the implications of there being multiple ways to survive here. Why do people pick one over the other? In a terminal/instrumental goal model, agents would pick the instrumental route that maximizes the return on the terminal goal. In reality, we see instead that humans adopt habits, processes, and heuristics that guide them through daily life even when those do not lead to any specific goal.


Yeah, that's because re-evaluating your entire life plan and belief structure every second is expensive, and heuristics are cheap. People certainly have flaws in their thinking, which is why we fall prey to pyramid schemes, gambling, responding to pointless comment chains on HN, and so on. I don't disagree with this, and again, it's why we publish so many self-help books. But I believe it's our weakness and stupidity, not our superior intelligence and clear thinking, that traps us in bad habits.

So remind me of your original point? I believe you said it's "obviously wrong and laughable" that "an intelligent being can pursue stupid goals". Now here you are trying to convince me that humans are the ones who, like Pavlov's dogs, "adopt habits, processes, and heuristics that guide them through daily life even when those do not lead to any specific goal". Even when those habits involve repeatedly re-opening a fridge that you already know has no carrots in it, or salivating at a bell when you already know no food is coming.

So I'm confused how that proves your point about AGI. If I accept your view, it seems that if an AGI does no better than a human on this metric, I should anticipate all sorts of strange and irrational behaviour, including the pursuit of goals that would appear stupid, such as addiction to a reward channel. That does not seem to undermine the orthogonality thesis.

And the smarter the AGI gets, presumably the less it should lean on Pavlovian heuristics and the more it should make use of clear thought, which puts it more in my camp.

So that would apparently put the lower bound at "the AGI takes unexpected and irrational actions because it's not a rational agent and doesn't think coherently", and the upper bound at "the AGI takes unexpected and dangerous actions as rational steps toward an unaligned terminal goal".

I'm not sure where in this chain of thought it becomes laughably obvious that intelligence and goals are correlated, such that an AGI's increasing intelligence will tend it toward actions that we humans approve of, because anything else would be a "stupid goal".



