Without removing the functionality as it currently exists, I don't see a way to ...

simonw · on Dec 15, 2023

Writer.com could make this a lot less harmful by closing the exfiltration vulnerability it's using: they should disallow rendering of Markdown images, or, if they're allowed, make sure that they can only be loaded from domains directly controlled by Writer.com - so not a CSP header that allows *.cloudfront.net.
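
To illustrate the image restriction described above, here is a minimal sketch of an output filter that keeps a Markdown image only when its host is on an operator-controlled allowlist. The host `images.example.com`, the regex, and the function name are assumptions made for the example, not anything Writer.com is known to use. The regex only covers inline `![alt](url)` images; reference-style images and raw HTML `<img>` tags would need separate handling.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist: only hosts the operator fully controls.
ALLOWED_IMAGE_HOSTS = {"images.example.com"}

# Inline Markdown images: ![alt](url "optional title")
MD_IMAGE = re.compile(r'!\[([^\]]*)\]\(\s*<?([^)\s>]+)>?[^)]*\)')


def strip_untrusted_images(markdown: str) -> str:
    """Replace images from non-allowlisted hosts with their alt text."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        try:
            parts = urlsplit(url)
            host = parts.hostname
        except ValueError:
            # Malformed URL: treat it as untrusted.
            return alt
        if parts.scheme == "https" and host in ALLOWED_IMAGE_HOSTS:
            return match.group(0)  # keep the image unchanged
        return alt  # drop the image, keep only its alt text

    return MD_IMAGE.sub(replace, markdown)
```

A filter like this complements a CSP `img-src` restriction rather than replacing it: the CSP stops the browser from loading the image even if the filter misses a case.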

There's no current reliable solution to the threat of extra malicious instructions sneaking in via web page summarization etc, so the key thing is to limit the damage that those instructions can do - which means avoiding exposing harmful actions that the language model can carry out and cutting off exfiltration vectors.

ranguna · on Dec 16, 2023

Just prompt the user every time an image needs to be rendered and show the call details. The users will see the full URL with all their text in it and they can report it.

This works for images and any other outbound call, like normal HTTP REST calls.
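
A rough sketch of that confirmation step, assuming the application routes every outbound request through one gate. The names `describe_request` and `fetch_if_approved`, and the `confirm` and `fetch` callbacks, are hypothetical. The point is that the query string is decoded and shown in full, so any smuggled text is visible before the request goes out.

```python
from typing import Callable, Optional
from urllib.parse import parse_qsl, urlsplit


def describe_request(url: str) -> str:
    """Build a human-readable summary of an outbound URL."""
    parts = urlsplit(url)
    lines = [f"Host: {parts.hostname}", f"Path: {parts.path}"]
    # Decode each query parameter so encoded payloads are readable.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        lines.append(f"Query {key} = {value}")
    return "\n".join(lines)


def fetch_if_approved(
    url: str,
    confirm: Callable[[str], bool],
    fetch: Callable[[str], object],
) -> Optional[object]:
    """Call fetch(url) only if the user approves the summary."""
    if not confirm(describe_request(url)):
        return None  # request is never sent
    return fetch(url)
```

In a real interface `confirm` would show a dialog; here it is just a callback that returns `True` or `False`.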

jcparkyn · on Dec 16, 2023

I would think that a fairly reliable fix would be "only render markdown links that appear verbatim in the retrieved HTML", perhaps with an additional whitelist for known safe image hosts. The significant majority of legitimate images would meet one or both of these criteria, meaning the feature would be mostly unaffected.
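
One possible reading of that rule, sketched with the standard library: collect every `<img src>` value from the retrieved page, then keep a Markdown image only if its URL matches one of them exactly or its host is on an optional safe list. The function and class names are made up for the example. Note that `html.parser` decodes character references in attribute values, so "verbatim" here means "equal to the decoded `src`", which is an assumption about how the rule would be applied.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Inline Markdown images: ![alt](url "optional title")
MD_IMAGE = re.compile(r'!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)')


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = set()

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.add(value)


def keep_verbatim_images(markdown, retrieved_html, safe_hosts=frozenset()):
    """Keep only images whose URL is on the page or on a safe host."""
    collector = ImgSrcCollector()
    collector.feed(retrieved_html)
    collector.close()

    def replace(match):
        alt, url = match.group(1), match.group(2)
        if url in collector.sources:
            return match.group(0)  # URL came from the page itself
        try:
            host = urlsplit(url).hostname
        except ValueError:
            host = None  # malformed URL: not trusted
        if host in safe_hosts:
            return match.group(0)
        return alt  # anything else is reduced to its alt text

    return MD_IMAGE.sub(replace, markdown)
```

A URL with user data appended to it cannot pass the first check, because that exact string was never in the page.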

This way, the maximum theoretical amount of information exfiltrated would be log2(number of images on page) bits, making it much less dangerous.
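
As a quick check of that bound, assuming the model can only pick one URL out of those already on the page: choosing one of n options conveys at most log2(n) bits. The helper name below is invented for the example, and the figure is per rendered image under that assumption.

```python
import math


def max_bits_per_image(images_on_page: int) -> float:
    """Upper bound on bits leaked by choosing one image out of n."""
    if images_on_page < 1:
        return 0.0  # nothing to choose from, nothing to leak
    return math.log2(images_on_page)


# A page with 16 images allows at most 4 bits per rendered image.
print(max_bits_per_image(16))  # 4.0
```

Compare that with an unrestricted image URL, where the query string can carry an arbitrary amount of the user's text.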