You jest, but this is actually how frustrating it is to try to use ChatGPT in the domains of crime/fraud/cybersecurity.

It recently called me out for attempting to write malware. That's true, but it wouldn't accept the plain explanation that I am authorized to do this by my employer, for deployment on their machines. Stonewalling just makes everyone better at carefully crafting their inquiries so as not to arouse suspicion. ("As an AI language model, I cannot help you with your task in writing arousing malware...")

Unless you dial it back to a Swadesh list or something, language is too complicated to be used as a firewall for itself. People have always been able to talk their way into anything. Our prevention efforts are just training better social engineers, who call themselves "prompt engineers" now.




TIL about Swadesh lists.

It's not just a matter of complexity, either. Especially with English, you can say pretty much anything using any words - if you use the right combination of euphemism, analogy, poetic structure, context, etc.

As always, attempts at censorship produce results ranging from awkward to hilarious to depressing.



