Hacker News new | past | comments | ask | show | jobs | submit login

Yeah, this is my fault. When I coined the term "prompt injection" I thought that it was a close match for SQL injection, and that the fix would end up looking the same - like parameterized queries, where data and instructions are cleanly separated.

That was back in September - https://simonwillison.net/2022/Sep/12/prompt-injection/ - It's become clear to me since then that the data v.s. instructions separation likely isn't feasible for LLMs - once you've concatenated everything together into a stream of tokens for the LLM to complete there just isn't a robust way of telling the difference between the two.

So "prompt injection" is actually quite a misleading name, because it implies a fix that's similar to SQL injection - when such a fix apparently isn't feasible.




What did they call this for GANs? Pixel attacks or manipulation?

Injection is not terrible, they have provided a system prompt, these attacks work against that injection




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: