Executable PNGs (djharper.dev)
236 points by todsacerdoti on Dec 26, 2020 | 39 comments



In my own system, Octo[0], I encode programs and their metadata in a similar steganographic fashion in GIF files[1]. As others have noted here, both GIF and PNG offer extension mechanisms and ways to embed "comments", but popular image-sharing sites universally re-encode images and discard this data. The advantage of GIF over PNG (for my purposes) is that I store an arbitrary payload in a fixed-looking image by creating additional frames of animation.

In the past, I've also used a different technique- if you simply concatenate a PNG onto a JAR (which is really just a ZIP archive) you end up with a file that acts like a PNG unless you change the extension to JAR, in which case it acts like a Java executable. This works because the PNG header is at the beginning of the file, while the ZIP header is at the end. Nowadays, though, desktop Java is pretty much dead, so it's a less exciting party trick.

[0] https://github.com/JohnEarnest/Octo

[1] https://github.com/JohnEarnest/Octo/blob/gh-pages/js/sharing...
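A minimal sketch of the PNG-plus-ZIP trick described above, assuming the layout is PNG bytes first and ZIP bytes appended after them. The 1x1 image, the `hello.txt` entry, and the helper names are all made up for illustration; a real JAR would also carry a manifest.

```python
import io
import struct
import zipfile
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type+data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def tiny_png() -> bytes:
    """Build a 1x1 opaque red RGBA PNG, just to have an image to prepend."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)  # 8-bit RGBA
    scanline = b"\x00" + bytes([255, 0, 0, 255])  # filter byte + one pixel
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(scanline))
        + png_chunk(b"IEND", b"")
    )


def tiny_zip() -> bytes:
    """Build a small ZIP archive in memory (hypothetical contents)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hidden payload")
    return buf.getvalue()


polyglot = tiny_png() + tiny_zip()

# Image viewers identify the file by the signature at its start ...
print(polyglot.startswith(PNG_SIGNATURE))

# ... while ZIP readers locate the end-of-central-directory record at the end.
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    print(zf.read("hello.txt"))
```

Python's `zipfile` tolerates data prepended to an archive, which is the same property self-extracting archives rely on.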


> In the past, I've also used a different technique- if you simply concatenate a PNG onto a JAR (which is really just a ZIP archive) you end up with a file that acts like a PNG unless you change the extension to JAR, in which case it acts like a Java executable. This works because the PNG header is at the beginning of the file, while the ZIP header is at the end. Nowadays, though, desktop Java is pretty much dead, so it's a less exciting party trick.

It was a common party trick on image boards, for the reasons you outlined: by concatenating an image (usually a JPEG) and a ZIP, you could smuggle small-ish ZIP files, as long as the board made the original file available as-is somehow.


Author here, hello.

The post glosses over some aspects of the implementation. For example, to fit some executables into an image you can optionally compress the data with gzip and encode that instead.

It also supports encoding at two-bit and four-bit levels, but obviously the grain starts becoming apparent in the output images.
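To make the trade-off between bit depth and visible grain concrete, here is a rough sketch of least-significant-bit embedding at 1, 2, or 4 bits per channel value. It is not the tool's actual code: the function names, the most-significant-group-first ordering, and the flat byte string standing in for pixel data are all assumptions.

```python
def embed(carrier: bytes, payload: bytes, bits: int) -> bytes:
    """Hide payload in the low `bits` bits of each carrier byte."""
    if bits not in (1, 2, 4):
        raise ValueError("bits must be 1, 2 or 4")
    per_byte = 8 // bits  # carrier bytes consumed per payload byte
    if len(payload) * per_byte > len(carrier):
        raise ValueError("carrier too small for payload")
    mask = (1 << bits) - 1
    out = bytearray(carrier)
    i = 0
    for byte in payload:
        # Write each payload byte most-significant group first.
        for shift in range(8 - bits, -1, -bits):
            out[i] = (out[i] & ~mask & 0xFF) | ((byte >> shift) & mask)
            i += 1
    return bytes(out)


def extract(carrier: bytes, length: int, bits: int) -> bytes:
    """Recover `length` payload bytes written by embed()."""
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    out = bytearray()
    for n in range(length):
        value = 0
        for sample in carrier[n * per_byte:(n + 1) * per_byte]:
            value = (value << bits) | (sample & mask)
        out.append(value)
    return bytes(out)


carrier = bytes(range(256))  # stand-in for raw 8-bit channel values
secret = b"ELF"
for bits in (1, 2, 4):
    stego = embed(carrier, secret, bits)
    worst = max(abs(a - b) for a, b in zip(carrier, stego))
    print(bits, extract(stego, len(secret), bits), "max channel error:", worst)
```

Each channel value can move by at most 2^bits - 1, so four-bit encoding can shift a channel by up to 15 levels out of 255, which is where the grain comes from.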

Also it puts a 40 byte or so "header" in the encoded output so the decoder can see how many bytes to read up to, validate a hash and check a magic. It's a bit basic but it's enough to get it to work.
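One way such a header could be laid out, purely as an illustration: a 4-byte magic, a 4-byte length, and a 32-byte SHA-256 digest happen to add up to 40 bytes. The field order, the choice of SHA-256, and the `XPNG` magic are assumptions, not the tool's real format.

```python
import hashlib
import struct

MAGIC = b"XPNG"  # hypothetical magic value, not the tool's real one
HEADER = struct.Struct(">4sI32s")  # magic, payload length, SHA-256 digest


def wrap(payload: bytes) -> bytes:
    """Prefix the payload with a fixed-size header describing it."""
    digest = hashlib.sha256(payload).digest()
    return HEADER.pack(MAGIC, len(payload), digest) + payload


def unwrap(blob: bytes) -> bytes:
    """Validate the header and return exactly the payload bytes."""
    magic, length, digest = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad magic: not an encoded payload")
    payload = blob[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("hash mismatch: payload corrupted")
    return payload


# Trailing zero bytes mimic pixels that were never written to.
blob = wrap(b"\x7fELF...") + b"\x00" * 16
print(HEADER.size)
print(unwrap(blob))
```

The length field is what lets the decoder stop before the unused tail of the image, and the hash catches images that were re-encoded along the way.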

Anyway this was fun, thanks for reading!


> It also supports encoding at two-bit [...] levels

Excellent! That was going to be my question after having just read the blog post and comments. Nice work!


This might be of interest - https://github.com/chinarulezzz/pixload


Vehicle for viruses/malware?


I’ve seen Android malware that imports and exfiltrates compressed data disguised as GIFs


> to fit some executables into an image you can optionally compress the data with gzip and encode that instead

We already know this kind of solution as "Rarjpeg"[0]: "letter file" + "letterbox file"[1]

Supported "letter files" (which could make box files self-executable):

- archives (self-executable): .rar/.7z

- other (not self-executable): any other files

Actually, supported "letterbox files" are:

- audio: .wav/.mp3/.aac/.amr

- image: .jpg/.png/.gif/.webp

- other: .torrent , .html

So, there is nothing new in your solution.

[0] http://lurkmore.to/Rarjpeg

[1] https://news.ycombinator.com/item?id=25329600


Thanks. I didn't go into this expecting it to be new or production-ready; I did it because it was dumb and I learned a few things along the way.


I appreciate you sharing your explorations. Novelty is overrated when it comes to learning.


"Novelty is overrated when it comes to learning."

I'm still trying to understand what this means.

I must admit to being aesthetically offended by any dismissal of novelty, but this is clearly an emotional reaction and I recognize that.

I agree that making an implementation of an existing thing can be a useful way to learn about its structure. I really don't think that novelty and original work (i.e. a hacking exploit and its writeup) are opposed to the previous sentence.

What does it mean to oppose novelty and learning?


For what it's worth, I read this less as "dismissal" and more as a statement of orthogonality. Novelty is neither positively nor negatively correlated with learning; in other words, if your goal is to learn, novelty simply doesn't need to enter the equation.

> I really don't think that novelty and original work [...] are opposed to the previous sentence.

So I don't think you disagree with your parent comment.


E.g. when learning one should not also need to implement something new.

As an example, making tic-tac-toe, a calculator, or bubble sort are all tried-and-true exercises, with plenty of prior art to compare and contrast your solution with.

Perhaps too many people don't try something because "it's been done to death".


I can't believe someone flagged a citation-laden dry-as-a-bone statement with no insults, rude words, etc.

I do think that the author addressed this comment by saying that they don't CLAIM original work, but I find such use of the "flag post" tool to be disturbingly motivated and illogical.


> I can't believe someone flagged a citation-laden dry-as-a-bone statement with no insults, rude words, etc.

This is actually pretty common. Although it appears that it has been unflagged at this point.


I would think PICO-8 would just use PNG's ability to add chunks of data that are ignored if not recognized by a regular decoder. Why bother with steganography?

Separately, this is pretty nifty

edit: well, I stand corrected by the comment below[1]: PICO-8 does in fact use a steganographic technique. I am curious why they took such a route, though, considering nothing in the cart is intended to be secret or copy-proof.

[1]https://news.ycombinator.com/item?id=25543336


I'd expect a not insignificant number of image hosting sites to throw away unrecognized chunks, especially if they contain a few dozen kilobytes of data. It might even make sense for privacy or security reasons, just like EXIF data is usually stripped.


One core concept of PICO-8 is the artificial limit to 32 KiB per cartridge.

While that would certainly also be enforceable with PNG data chunks, I think the 32K restriction lends itself to the steganographic solution more naturally than using a chunk would.


"why bother" isn't a very relevant question about anything PICO-8. The entire thing is about artificial limits for "because we can" reasons.

Basically, the answer would be something akin to "because that would feel like cheating"


As other posters mentioned, using extra chunks might make it incompatible with hosting sites that remove them.

Plus, consider that they can be removed unintentionally. If the upload process manipulates the image, it might convert the PNG file into a format-independent "Image" structure in memory, which may discard this extra information.

Steganography is a bit more resilient since it will survive basically everything but scaling or lossy compression, as the information is in the image itself (and thus a format-independent program doesn't need to take that data into account).

Naturally, it is still vulnerable to websites that assume images are photos and thus can be scaled and otherwise tampered with without much consequence.


“Because Zep thought it was more fun that way” is always a good guess at the reason anything in Pico-8 works the way it does, tbh.


I liked the integration with binfmt_misc! There are several different approaches to bundling data files with a single executable, as the author probably knows. My favorite one is AppImage. That said, I've also looked into alternative approaches to software distribution in the recent past, and ended up creating an extra section in the ELF file to store ancillary data. `objcopy --add-section <params> --set-section-flags <params>` appends the payload to the executable, and libelf can be used at some later point to retrieve the data. Works like a charm, as long as you don't let `strip` remove unneeded sections from that executable file.


This is a bit old but I used to write my JavaScript to PNG and load that instead. I don’t recommend it, it’s just for fun.

I think the encoder still works:

https://donohoe.dev/project/jspng-encoder/


This is really cool but I doubt this is how PICO-8 works, since PNG lets you create arbitrary chunks of data that standard decoders will ignore:

> All ancillary chunks are optional, in the sense that encoders need not write them and decoders can ignore them.

http://www.libpng.org/pub/png/spec/1.2/PNG-Chunks.html


You don't need to doubt, you can just check:

https://pico-8.fandom.com/wiki/P8PNGFileFormat

> The cart data is stored using a steganographic process. Each PICO-8 byte is stored as the two least significant bits of each of the four color channels, ordered ARGB
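As a rough illustration of the quoted scheme, the sketch below packs one byte into the two low bits of each channel of a single (A, R, G, B) pixel. The quote only gives the channel order; putting the most significant bit pair in the A channel is an assumption here, and the function names are made up.

```python
from typing import Tuple

Pixel = Tuple[int, int, int, int]  # (alpha, red, green, blue), 0-255 each


def byte_to_lsbs(value: int) -> Pixel:
    """Split a byte into four 2-bit groups, highest pair first (assumed order)."""
    return (value >> 6) & 3, (value >> 4) & 3, (value >> 2) & 3, value & 3


def hide_byte(pixel: Pixel, value: int) -> Pixel:
    """Return the pixel with `value` stored in the low two bits of each channel."""
    groups = byte_to_lsbs(value)
    return tuple((channel & 0xFC) | bits for channel, bits in zip(pixel, groups))


def reveal_byte(pixel: Pixel) -> int:
    """Read back the byte hidden by hide_byte()."""
    a, r, g, b = (channel & 3 for channel in pixel)
    return (a << 6) | (r << 4) | (g << 2) | b


original = (255, 200, 100, 50)
stego = hide_byte(original, 0xA7)
print(stego, hex(reveal_byte(stego)))
```

One pixel carries exactly one byte this way, and no channel changes by more than 3 levels, which is why the cartridge art still looks intact.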


I stand corrected! I maintain it’s an odd choice given the availability of alternatives, but I suppose it’s all in good fun.


> I maintain it’s an odd choice given the availability of alternatives

The problem is that the "alternatives" are not reliable: image hosting sites commonly will optimise files by removing ancillary and unknown chunks, losing the data. However as long as they don't rewrite the image data itself (e.g. by resizing the PNG or putting it through a quantizer) the steganography method will ensure the data propagates correctly.
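A small sketch of what such an "optimising" pass might do, assuming it simply drops every ancillary chunk and leaves the critical ones untouched. The `caRt` chunk type and its payload are hypothetical, as are the helper names.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type+data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk in a PNG byte string."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack_from(">I", png, pos)
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length  # length field + type + data + CRC


def strip_ancillary(png: bytes) -> bytes:
    """Keep only critical chunks (first type letter uppercase)."""
    kept = [png_chunk(t, d) for t, d in iter_chunks(png) if not t[0] & 0x20]
    return PNG_SIGNATURE + b"".join(kept)


ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)  # 1x1, 8-bit RGBA
idat = zlib.compress(b"\x00" + bytes([255, 0, 0, 255]))
original = PNG_SIGNATURE + b"".join([
    png_chunk(b"IHDR", ihdr),
    png_chunk(b"caRt", b"game code lives here"),  # hypothetical private chunk
    png_chunk(b"IDAT", idat),
    png_chunk(b"IEND", b""),
])

cleaned = strip_ancillary(original)
print([t for t, _ in iter_chunks(original)])
print([t for t, _ in iter_chunks(cleaned)])
```

The custom chunk disappears while the IDAT bytes pass through unchanged, so anything stored in the pixel values themselves would survive this kind of processing.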


They could have created their own chunk type and set the safe-to-copy bit to 1, permitting apps to round-trip the chunk without knowing exactly what was in it. From the spec:

> If a chunk's safe-to-copy bit is 1, the chunk may be copied to a modified PNG file whether or not the software recognizes the chunk type, and regardless of the extent of the file modifications.
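For reference, the safe-to-copy bit is bit 5 of the fourth byte of the chunk type, which is why it shows up as the case of the last letter. The sketch below decodes all four property bits; `caRt` and `caRT` are made-up chunk names used only to show the difference.

```python
def chunk_properties(chunk_type: bytes) -> dict:
    """Decode the property bits (bit 5 of each type byte) of a PNG chunk type."""
    if len(chunk_type) != 4 or not chunk_type.isalpha():
        raise ValueError("chunk type must be four ASCII letters")
    lower = [bool(b & 0x20) for b in chunk_type]  # lowercase letter = bit set
    return {
        "ancillary": lower[0],
        "private": lower[1],
        "reserved_bit_set": lower[2],
        "safe_to_copy": lower[3],
    }


# IDAT and tEXt are standard chunks; caRt and caRT are hypothetical.
for name in (b"IDAT", b"tEXt", b"caRt", b"caRT"):
    print(name, chunk_properties(name))
```

Note the spec wording is "may be copied": the bit grants permission to editors that want to preserve unknown chunks, it does not oblige a hosting site to keep them.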


I don’t have a link to back it up but I recall that yes, PICO-8 in fact uses the two least significant bits of each color channel, respectively.


> I don’t have a link to back it up but I recall that yes, PICO-8 in fact uses the two least significant bits of each color channel, respectively.

"The cart data is stored using a steganographic process. Each PICO-8 byte is stored as the two least significant bits of each of the four color channels"

https://pico-8.fandom.com/wiki/P8PNGFileFormat


My poor man's version from this HN discussion[1]:

  curl -s https://pbs.twimg.com/media/Dq2sPGNU0AEKyyC.jpg | dd status=none bs=1 skip=599 count=40
[1] https://news.ycombinator.com/item?id=18342042


This is cool. I've always loved pico8 and been in some level of awe about the idea of distributing games as pngs, so it's really interesting to read more about it, even if "pointless" :-)


I suspect the clang issue is because it expects argv[0] to be a path to a real binary, and internally reruns itself as a subprocess. It might start working if instead of using /proc/self, you fetched your pid and used it explicitly in the path, so the subprocess could run it too.


Hmm good suggestion thanks, I just tried this but it exhibits the same behaviour.

I think it's running into this problem https://stackoverflow.com/questions/27494866/llvm-cannot-fin... - it looks like it looks at /proc/self/exe which I'm guessing returns a bad value? Not sure.


Ah, maybe it's using readlink on /proc/self/exe to get the actual path to the binary, which would obviously fail if it's not a real file. If it's something along those lines then I guess it's probably not fixable. Oh well, still a bit of fun anyway :p


Well, at least it's intentional, unlike the Windows Metafile exploit, or 4chan.jse. (Intentional by the user, at least; the developer's intention is assumed)


Makecode Arcade also saves your code in a downloadable png.

https://arcade.makecode.com/


A PNG could be modified into a 16-bit executable on Windows. The extension is .com, and the file could start with any header bytes.

You can even access some 32bit APIs I assume?


any file is executable on linux, if you are brave enough



