Yeah, this is my fault. When I coined the term "prompt injection" I thought that it was a close match for SQL injection, and that the fix would end up looking the same - like parameterized queries, where data and instructions are cleanly separated.
That was back in September - https://simonwillison.net/2022/Sep/12/prompt-injection/ - and it's become clear to me since then that the data vs. instructions separation likely isn't feasible for LLMs: once you've concatenated everything together into a stream of tokens for the LLM to complete, there just isn't a robust way of telling the difference between the two.
So "prompt injection" is actually quite a misleading name, because it implies a fix that's similar to SQL injection - when such a fix apparently isn't feasible.