Hacker News new | past | comments | ask | show | jobs | submit login

Huh, this is the least interesting thing I've written about prompt injection in the last few weeks, but the only one to make it to the Hacker News homepage.

Better recent posts:

- Delimiters won’t save you from prompt injection - https://simonwillison.net/2023/May/11/delimiters-wont-save-y... - talks about why telling a model to follow delimiters like ``` won't protect against prompt injection, despite that being mentioned as a solution in a recent OpenAI training series

- Prompt injection explained, with video, slides, and a transcript - https://simonwillison.net/2023/May/2/prompt-injection-explai... - a 12 minute video from a recent LangChain webinar I participated in where I explain the problem and why none of the proposed solutions are effective (yet)

- The Dual LLM pattern for building AI assistants that can resist prompt injection - https://simonwillison.net/2023/Apr/25/dual-llm-pattern/ - my attempt at describing a way of building AI assistants that can safely perform privileged actions even in the absence of a 100% reliable defense against prompt injection

More of my writing about prompt injection:

- https://simonwillison.net/series/prompt-injection/

- https://simonwillison.net/tags/promptinjection/




Those posts are great! I've put https://news.ycombinator.com/item?id=35911595 ("Delimiters won’t save you from prompt injection") in the second-chance pool (https://news.ycombinator.com/pool, explained at https://news.ycombinator.com/item?id=26998308), so it will get a random placement on HN's front page. (I know you posted it earlier, but I prefer to spread the love by letting karma rain down on less-prolific submitters (love being one thing that isn't a power law).

I've emailed a repost invite to the submitter of https://news.ycombinator.com/item?id=35803564 ("Prompt Injection Explained"). Invited reposts go into the second-chance pool once they're submitted. If the article hasn't appeared after (say) a couple weeks or so, someone else is welcome to post it and email hn@ycombinator.com and we'll put it in the SCP.

I've emailed you a repost invite for https://news.ycombinator.com/item?id=35705159 ("The Dual LLM pattern for building AI assistants that can resist prompt injection"). It would be good to space these out, so maybe wait to use that link until a few days have gone by without one of your posts basking in front page glory?

Thanks for all the work figuring out this stuff and explaining it to the rest of us! It's amazing what a good writer can do when self-employed (https://news.ycombinator.com/item?id=35925266).


Thanks so much! Really appreciate it.


Thanks! Love your writing. One question for you - how do you absorb these new concepts and experiment with them so quickly? It seems like you have the output of a small team, not just one person.


I'm "self-employed" aka I don't have anyone to tell me what else to spend my time on!


you don't really know how new that is for different people. I'd imagine 3 months of learning about something interesting gives you quite a good idea about the topic, at least good enough to write about it.


Delimiters are shown quite often as possible mitigations, but they do not work. I had the same observation when doing the Prompt Engineering class from OpenAI/DeepLearningAI.

Basically every example was vulnerable, and I made it a special challenge to perform an indirect prompt injection for each one of them. This led to interesting exploits such as JSON object injections, HTML injection and even XSS. Overwriting order prices with the OrderBot was also quite fun. :)

Here is a post and Notebook I used to learn/repro and experiment with these issues (incl. JSON Object injection and XSS): https://embracethered.com/blog/posts/2023/adversarial-prompt...

Also, an older post about data exfil for bots (with a Discord bot as an example): https://embracethered.com/blog/posts/2023/ai-injections-thre...


I think many people, like me, found your other content through this link and took that as an opportunity to vote this link.


(This was posted in https://news.ycombinator.com/item?id=35924293, which we merged hither. It makes more sense there but I don't want to leave it stranded, so moved it over.)




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: