Or the researchers don't think existential threats are realistic, and paperclip-maximizing thought experiments are silly. Maybe they're wrong, but maybe not. It's easy to imagine AI takeover scenarios by giving the AIs unlimited powers; it's hard to show the actual path to such abilities.
It's also hard to understand why an AI smart enough to paperclip the world wouldn't also be smart enough to realize the futility of doing so. So while alignment remains an issue, the existential alignment threats are too ill-specified. AGIs would understand we don't want to paperclip the world.
I agree completely with your first paragraph, and disagree completely with your second.
"Futility" is subjective, and the whole purpose of the thought experiment is to point out that our predication of "futility" or really any other purely mental construct does not become automatically inherited by a mind we create. These imaginary arbitrarily powerful AIs would definitely be able to model a human being describing something as futile. Whether or not it persues that objective has nothing to do with it understanding what we do or don't want.
> It's also hard to understand why an AI smart enough to paperclip the world wouldn't also be smart enough to realize the futility in doing so.
Terminal goals can't be futile, since they do not serve to achieve other (instrumental) goals. Compare: Humans like to have protected sex, watch movies, eat ice cream, even though these activities might be called "futile" or "useless" (by someone who doesn't have those goals) as they don't serve any further purpose. But criticizing terminal goals for not being instrumentally useful is a category error. For a paperclipper, us having sex would seem just as futile as creating paperclips seems to us. Increased intelligence won't let you abandon any of your terminal goals, since they do not depend on your intelligence, unlike instrumental goals.
It's not like you want to eat ice cream constantly, even if it means making everything into ice cream.
Of course, the premise is that the AI has been instructed to make paperclips. They should have hired a better prompt engineer, capable of actually specifying the goals more clearly. I don't think an AI that eradicates humankind will have such simplistic goals, if an AI ever becomes the end of humans. Cybermen, though, are inevitable.
> AGIs would understand we don't want to paperclip the world.
Even if they did, what if they aren't smart enough to resist eloquent humans convincing them it's for the greater good? True AGIs will need a moral code to match their intelligence, and someone will have to decide what's good and bad to make that moral code.
I've seen people calculate, for fun, how much human blood would be needed to make an iron sword. AGIs won't need the capability to transmute all matter into iron, just enough capability to become significantly dangerous.
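For scale, here's my own back-of-the-envelope sketch of that calculation, assuming roughly 0.5 g of iron per litre of blood, about 5 litres of blood per adult, and a ~1 kg sword (all approximate figures I'm supplying, not taken from the thread):

    # Rough back-of-the-envelope: how many people's blood to supply the iron for one sword.
    # All figures are approximate assumptions for illustration only.
    iron_per_litre_blood_g = 0.5   # grams of iron per litre of human blood (approx.)
    blood_per_person_l = 5.0       # litres of blood in an average adult (approx.)
    sword_mass_g = 1000.0          # a roughly 1 kg sword

    iron_per_person_g = iron_per_litre_blood_g * blood_per_person_l  # ~2.5 g per person
    people_needed = sword_mass_g / iron_per_person_g                 # ~400 people
    print(f"Roughly {people_needed:.0f} people's blood per sword")

On those assumptions it works out to a few hundred people per sword, which is the usual ballpark this calculation lands in.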
That would mean not accepting the premise that the deployments are irresponsible. I guess there could be a situation where every researcher thinks everyone else's deployment is irresponsible and theirs is fine, but I don't think that's what you're saying.
> It's also hard to understand why an AI smart enough to paperclip the world wouldn't also be smart enough to realize the futility of doing so. So while alignment remains an issue, the existential alignment threats are too ill-specified. AGIs would understand we don't want to paperclip the world.
Fun game though.