
> As the developer of a very high-powered tool, I may well wish to limit its use in many contexts. But, I never wish to limit the tool's usefulness ahead of time.

Exactly. Content moderation is largely an application-layer problem, not a foundation-layer one.

Imagine the problems of MySQL trying to perform content moderation for Facebook.




(the year is 2048. The camera pans across an office at Quantico, which is eerily serene. A messenger knocks on an important-looking door with a plaque that reads 'DIRECTOR')

Director: Come in

Messenger: Message from the Tulsa field office, sir. They're reporting that they've found a sex trafficking ring, but they're not sure what to do about it.

Director: Not sure? Arrest them, obviously. What's the problem?

Messenger: Well, they can't seem to secure a warrant. Some technical issue with the system.

Director: I know we migrated to a new system recently. Let's see if we can get this sorted.

(Director thwacks at the keyboard briefly)

Computer: Your request for "Child Sex Trafficking Warrant" has been found to contain content marked "Not Safe For Work". This violation has been reported.

Director: What the hell.

Messenger: Yeah, we tried to email you about it but the filters dropped the message. That's why they sent me.

Director: I'll deal with this. Let me make a call.

(Director picks up phone and dials)

Director: Hello? Hi, Paul. Yeah, we're having some issues with the new warrant system... No, it's doing everything as advertised... yes, it's a lot faster and we've managed to lay off a ton of our data staff. The problem is with getting warrants; me and my guys have been trying to get one but it keeps getting rejected... Oh, you know, some sex trafficking ring in Tulsa... Hello?

Phone: Your call cannot be completed as spoken. Our automated systems have detected content related to sex trafficking. This incident will be reported.

Director: Goddammit.

(as the director holds the phone trembling in frustration, the power goes out and they are enveloped in darkness in the windowless room. Roll credits)


You jest, but this is actually how frustrating it is to try to use ChatGPT in the domains of crime/fraud/cybersecurity.

It called me out recently as attempting to write malware. Which is true, but it wouldn't accept the plain explanation that I am authorized to do this by my employer, for deployment on their own machines. Stonewalling is just making everyone better at carefully crafting their inquiries so as not to arouse suspicion. ("As an AI language model, I cannot help you with your task in writing arousing malware...")

Unless you dial it back to a Swadesh list or something, language is too complicated to be used as a firewall for itself. People have always been able to talk their way into anything. Our prevention efforts are just training better social engineers, who call themselves "prompt engineers" now.


TIL about Swadesh lists.

It's not just a matter of complexity, either. Especially with English, you can say pretty much anything using any words - if you use the right combination of euphemism, analogy, poetic structure, context, etc.

As always, attempts at censorship produce awkward to hilarious to depressing results.


I wish I could upvote this more than once. It truly feels like the direction we're headed in.


Yes, great analogy!



