If somebody needs step-by-step instructions from an LLM to synthesize strychnine, they don't have the practical laboratory skills to synthesize strychnine [1]. There's no increased real-world risk of strychnine poisonings whether or not an LLM refuses to answer questions like that.
However, journalists and regulators may not understand why superficially dangerous-looking instructions carry such negligible real-world risk, because they probably haven't spent much time doing bench chemistry in a laboratory. Since real chemists don't need "explain like I'm five" instructions for syntheses, and critics might use pseudo-dangerous information against the company in the court of public opinion, refusing prompts like that guards against reputational risk without really impairing professional users doing scientific research.
That said, I have seen full-strength frontier models suggest nonsense for novel syntheses of benign compounds. Professional chemists should be using an LLM as an idea generator or a way to search for publications, rather than trusting whatever it spits out when it doesn't refuse a prompt.
I would think that the risk isn’t of a human being reading those instructions, but of those instructions being automatically piped into an API request to some service that makes chemicals on demand and then sends them by mail, all fully automated with no human supervision.
Not that there is such a service… for chemicals. But there do exist analogous systems, like a service that will turn whatever DNA sequence you send it (say, a viral genome) into a plasmid, helpfully package it into some E. coli, and then mail that to you.
Or, if you’re working purely in the digital domain, you don’t even need a service. Just show the thing the code of some Linux kernel driver and ask it to discover a vuln in it and generate code to exploit it.
(I assume part of the thinking here is that these approaches are analogous, so if the model isn't unilaterally refusing all of them, you could potentially talk it around into being okay with X by pointing out that it's already okay with Y, and that it should strive to hold to a consistent, coherent ethics.)
I remember Dario Amodei mentioning in a podcast once that most models won't tell you the practical lab skills you need, but that sufficiently capable models would and do volunteer those practical lab skills (without you needing to know to ask for them in the first place), in addition to the formal steps.
The kind of harm they're worried about stems from questioning, from first principles, the foundations of protected status for certain peoples, and from other questions that touch the identities of entire peoples. I can't be more specific without being banned here.
[1] https://en.wikipedia.org/wiki/Strychnine_total_synthesis