That seems like an attempt to set up a futile exercise in needle-threading that relies on narrow worst-case-scenario definitions of "superintelligence"/AGI.
It's too intelligent for our constraints to restrict it.
but
It's not too intelligent to be "aligned" to underlying intentions and values?
That approach doesn't even work on humans, so why would it work on a superintelligence?
> It's not too intelligent to be "aligned" to underlying intentions and values?
Intelligence makes that harder, not easier. Just because it can work out the underlying intentions doesn't mean it cares about them. Remember, this is an optimisation process that maximises a function; deciding to do something that doesn't maximise that function will not be selected for.
Whenever you do something, do you think "the underlying goal of my behaviour, set by evolution, is to reproduce and have children, so I'd better make sure my actions are aligned with that goal"? No. You don't care what the underlying "intentions" are, and neither does an AI. Because our environment has changed, many of our instincts no longer line up well with that goal. That's another problem with aligning AI: training can produce an AI whose goals aren't exactly the same as the training goal but happen to line up well during training, and if the environment changes after training, those goals can come apart in the same way ours have.
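To make that concrete, here's a toy Python sketch (the environments, numbers, and the "shiniest object" heuristic are all made up for illustration, not from any real training setup): the agent is selected purely on a proxy that matches the intended goal in the training environment, and it goes on maximising that proxy after the environment shifts.

```python
# Intended goal: get the food. In the training environment the food is always
# the shiniest object, so the heuristic "go to the shiniest thing" scores
# perfectly -- and that heuristic is all that gets selected for.

def intended_reward(action, env):
    # The goal the designers actually care about.
    return 1.0 if env["objects"][action] == "food" else 0.0

def proxy_score(action, env):
    # The goal the agent actually learned: prefer the shiniest object.
    return env["shine"][action]

def pick_action(env):
    # The "trained" policy keeps maximising the proxy it was selected for.
    return max(range(len(env["objects"])), key=lambda a: proxy_score(a, env))

train_env  = {"objects": ["rock", "food"], "shine": [0.1, 0.9]}   # food shines
deploy_env = {"objects": ["foil", "food"], "shine": [0.95, 0.3]}  # foil shines more

for name, env in [("training", train_env), ("deployment", deploy_env)]:
    a = pick_action(env)
    print(f"{name}: picks {env['objects'][a]}, intended reward = {intended_reward(a, env)}")
# training: picks food, intended reward = 1.0
# deployment: picks foil, intended reward = 0.0
```

Nothing in the selection process ever "sees" the intended reward, only the proxy, which is the whole problem.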
The thing is, alignment can't be solved by intelligence per se.
Say you are (far) more intelligent than a spider. There's no way you can align yourself with (all of) its values unless the spider finds a way to let you know (all of) its values. Maybe the spider just tells you to make plenty of webs, without realising that it might get entangled in them itself. The webs are analogous to the paperclips.
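Here's a toy sketch of that in Python, with entirely made-up plans and numbers: the optimiser only ever sees the stated objective ("make plenty of webs"), so the plan it picks is the one that's worst for the value the spider never expressed.

```python
# Hypothetical plans the more-intelligent agent can choose between.
# "web_count" is the only thing the spider asked for; "entanglement_risk" is a
# value the spider never communicated, so the objective cannot see it.
plans = [
    {"name": "a few webs in safe spots",  "web_count": 3,   "entanglement_risk": 0.0},
    {"name": "webs across every gap",     "web_count": 50,  "entanglement_risk": 0.2},
    {"name": "fill the burrow with webs", "web_count": 500, "entanglement_risk": 0.99},
]

def stated_objective(plan):
    # Everything the agent was told: "make plenty of webs".
    return plan["web_count"]

best = max(plans, key=stated_objective)
print(best["name"], best["entanglement_risk"])
# fill the burrow with webs 0.99 -- maximal on the stated objective,
# disastrous on the value that was never stated.
```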