This reads to me like begging the question, by assuming the existence of a "superintelligent AI" without addressing how a goal-optimizing machine becomes a superintelligent AI in the first place.
The exercise of fearing future AIs seems like the South Park underpants gnomes:
1. Work on goal-optimizing machinery.
2. ??
3. Fear superintelligent AI.
Or maybe it's like the courtroom scene in A Few Good Men:
> If you ordered that Santiago wasn't to be touched, and your orders are always followed, then why was Santiago in danger?
If a paperclip AI is so dedicated to the order to produce paperclips, why wouldn't it be just as dedicated to any other order? Like "don't throw me in that incinerator!"
> assuming the existence of a "superintelligent AI" without addressing how a goal-optimizing machine becomes a superintelligent AI
I'm just talking about the fallout if one did exist, saw ways to achieve goals that you didn't foresee, and did exactly what you asked it to do. I have no idea how the progression from better-than-humans-in-specific-cases to significantly-better-than-humans-at-planning-and-executing-in-the-real-world will play out. It's not relevant to what I'm claiming.
> why wouldn't it be just as dedicated to any other order?
It would be just as dedicated to those other orders. The problem is that we don't know how to write the right ones. "Don't throw me into that incinerator" is straightforward, but there's a billion ways for the AI to do horrible things. (A super-optimizer does horrible things by default because maximizing a function usually involves pushing variables to extreme values.) Listing all the ways to be horrible is hopeless. You need to communicate the general concept of not creating a dystopia. Which is safely-wishing-on-monkey's-paw hard.
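To make that concrete, here's a toy sketch (my own illustration, not from anyone in the thread; the `paperclips` objective and the variable names are invented) of how maximizing a narrow objective drives a variable to the edge of its allowed range:

```python
# Toy sketch of "pushing variables to extreme values": the objective only
# counts paperclips, so the best choice is to divert ALL available steel.
# (Names and numbers here are invented for illustration.)

def paperclips(steel_fraction):
    # Proxy objective: paperclips produced are proportional to the share of
    # steel diverted to the paperclip factory. Nothing in it says "leave
    # some steel for anything else."
    return 1000 * steel_fraction

candidates = [i / 100 for i in range(101)]   # allowed values: 0.00 .. 1.00
best = max(candidates, key=paperclips)
print(best)  # 1.0: the maximizer pushes the variable to its extreme value
```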
Part 2 is when the AI reaches the point where it's smarter than its creators, then starts improving its own code and bootstraps its way to superintelligence. This idea is referred to as "the intelligence explosion": https://wiki.lesswrong.com/wiki/Intelligence_explosion
> If a paperclip AI is so dedicated to the order to produce paperclips, why wouldn't it be just as dedicated to any other order? Like "don't throw me in that incinerator!"
The paperclipper scenario is meant to indicate that even a goal which seems benign could have extremely bad implications if pursued by a superintelligence.
People concerned with AI risk typically argue that of the universe of possible goals that could be given to an AI, the vast majority of goals in that universe are functionally equivalent to paperclipping. For example, an AI could be programmed to maximize the number of happy people, but without a sufficiently precise specification of what "happy people" means, this could result in something like manufacturing lots of tiny smiley faces. An AI given that order could avoid throwing you in an incinerator and instead throw you into the thing that's closest to being an incinerator without technically qualifying as an incinerator. Etc.
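As a toy illustration of that kind of proxy gap (my own sketch; the `reward` function and the plan names are invented for the example), a reward written down as "count the smiley faces you observe" is maximized by manufacturing smileys rather than by making anyone happy:

```python
# Toy sketch of a misspecified goal: the reward we actually wrote down counts
# smiley faces, so the "best" plan is mass-producing smiley tokens, not making
# people happy. (Function and plan names are invented for illustration.)

def reward(observed_world):
    return observed_world.count(":)")   # the proxy, not the intent

plans = {
    "improve_healthcare": ":) " * 3,          # a few genuinely happier people
    "print_tiny_smileys": ":)" * 1_000_000,   # mass-produced smiley tokens
}
best_plan = max(plans, key=lambda p: reward(plans[p]))
print(best_plan)  # print_tiny_smileys
```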
I think you're just asserting that part 2 exists. What matters is how an optimizing machine bootstraps superintelligence, because the machine you fear in part 3 has a very specific peculiarity: it's smart enough to be dangerous to humans, but so dumb that it will follow a simple instruction like "make paperclips" without any independent judgment about whether it should, or about the implications of how it does so.
Udik highlighted this contradiction more succinctly than I have been able to:
If we stipulate the existence of such a machine, we can then discuss how it might be scary. But we can stipulate the existence of many scary things; that doesn't mean they will ever actually exist.
Strilanc above made the analogy between a scary AI and the Monkey's Paw. This is instructive: the Monkey's Paw does not actually exist, and by the physical laws of the universe as we know them, cannot exist.
I think the analogy actually goes the other way. The paperclip AI is itself just an allegory, a modern fairytale analogous to the Monkey's Paw.