The idea is that if you build a system that poses an existential risk, you want to be reasonably sure it's safe before you turn it on, not afterwards. It would have been irresponsible, for example, for the scientists at Los Alamos to wait until after their first test to do the math on whether an atomic explosion would ignite a sustained fusion reaction in the atmosphere.
I don't think it's possible for a large language model, operating in a conventional feed-forward way, to pose a significant danger. But I do think it's hard to say exactly which advances could lead to a dangerous intelligence, and with the current state of the art it looks, to me at least, like we might very well be only one breakthrough away from that. Hence the calls for prudence.
The scientists creating the atomic bomb knew a lot more about what they were doing than we do. Their computations sometimes gave the wrong result (see Castle Bravo), but they had a good framework for understanding everything that was happening. We're more like cavemen who've learned to reliably make fire but still don't understand what it is. Why can current versions of GPT reliably add large numbers together when previous versions couldn't? We're still a very long way from being able to answer questions like that.