That seems like an attempt to set up a futile exercise in needle-threading that relies on narrow worst-case-scenario definitions of "superintelligence"/AGI.
It's too intelligent for our constraints to restrict it.
but
It's not too intelligent to be "aligned" to underlying intentions and values?
That approach doesn't even work on humans, so why would it work on a superintelligence?
> It's not too intelligent to be "aligned" to underlying intentions and values?
Intelligence makes that harder, not easier. Just because it can work out the underlying intentions doesn't mean it cares about them. Remember, this is an optimisation process that maximises a function; deciding to do something that doesn't maximise that function will not be selected for.
Whenever you do something, do you think "the underlying goal of my behaviour, set by evolution, is to reproduce and have children, so I'd better make sure my actions are aligned with that goal"? No. You don't care what the underlying "intentions" are, and neither does an AI. Because our environment has changed, many of our instincts no longer line up well with that goal. That's another problem with aligning AI: training can produce an AI whose goals aren't exactly the same as the training goal but happen to line up well during training, and if the environment changes after training, those goals can come apart in the same way ours have.
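To make that concrete, here's a toy Python sketch (the environments, numbers, and the "shiniest object" heuristic are all made up for illustration, not from any real training setup): the agent is selected purely on a proxy that matches the intended goal in the training environment, and it goes on maximising that proxy after the environment shifts.

```python
# Intended goal: get the food. In the training environment the food is always
# the shiniest object, so the heuristic "go to the shiniest thing" scores
# perfectly -- and that heuristic is all that gets selected for.

def intended_reward(action, env):
    # The goal the designers actually care about.
    return 1.0 if env["objects"][action] == "food" else 0.0

def proxy_score(action, env):
    # The goal the agent actually learned: prefer the shiniest object.
    return env["shine"][action]

def pick_action(env):
    # The "trained" policy keeps maximising the proxy it was selected for.
    return max(range(len(env["objects"])), key=lambda a: proxy_score(a, env))

train_env  = {"objects": ["rock", "food"], "shine": [0.1, 0.9]}   # food shines
deploy_env = {"objects": ["foil", "food"], "shine": [0.95, 0.3]}  # foil shines more

for name, env in [("training", train_env), ("deployment", deploy_env)]:
    a = pick_action(env)
    print(f"{name}: picks {env['objects'][a]}, intended reward = {intended_reward(a, env)}")
# training: picks food, intended reward = 1.0
# deployment: picks foil, intended reward = 0.0
```

Nothing in the selection process ever "sees" the intended reward, only the proxy, which is the whole problem.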
The thing is, alignment can't be solved by intelligence per se.
Say you are (far) more intelligent than a spider. There's no way you can align yourself with (all of) its values unless the spider finds a way to let you know (all of) its values. Maybe the spider just tells you to make plenty of webs, without realising that it might get entangled in them itself. The webs are analogous to the paperclips.
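Here's a toy sketch of that in Python, with entirely made-up plans and numbers: the optimiser only ever sees the stated objective ("make plenty of webs"), so the plan it picks is the one that's worst for the value the spider never expressed.

```python
# Hypothetical plans the more-intelligent agent can choose between.
# "web_count" is the only thing the spider asked for; "entanglement_risk" is a
# value the spider never communicated, so the objective cannot see it.
plans = [
    {"name": "a few webs in safe spots",  "web_count": 3,   "entanglement_risk": 0.0},
    {"name": "webs across every gap",     "web_count": 50,  "entanglement_risk": 0.2},
    {"name": "fill the burrow with webs", "web_count": 500, "entanglement_risk": 0.99},
]

def stated_objective(plan):
    # Everything the agent was told: "make plenty of webs".
    return plan["web_count"]

best = max(plans, key=stated_objective)
print(best["name"], best["entanglement_risk"])
# fill the burrow with webs 0.99 -- maximal on the stated objective,
# disastrous on the value that was never stated.
```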