OxiPNG: PNG optimizer written in Rust (github.com/shssoichiro)
130 points by 19h on Sept 7, 2017 | 45 comments



For the curious, I replicated some benchmarks from ~1yr ago (https://www.reddit.com/r/rust/comments/48vkjy/oxipng_a_multi...).

Note that in that thread the author implies the optimization levels of the two tools are not necessarily equivalent or stable.

    $ optipng --version
    OptiPNG version 0.7.6
    $ oxipng --version
    oxipng 0.16.3
Ran on a 2016 MBP, i7.

Time:

    opt-level    oxipng    optipng
    2            148.63s   30.80s
    3            9.56s     45.99s
    4            23.82s    90.57s
Size of resulting image:

    original size: 12M

    opt-level    oxipng    optipng
    2            12M       12M
    3            12M       "already optimized"
    4            12M       "already optimized"

TBH I don't really know how to interpret these results. But with the strange exception of `oxipng -o 2`, it does seem like the naive parallelism offers some real performance wins. Disclaimer: this is an incredibly amateur attempt at benchmarking; I really don't have enough knowledge to produce something authoritative.


It would be more useful if you didn't round the sizes to "12M", because I have a hard time believing they were all exactly the same. Bytes are a discrete and exact quantity, and ideally would be displayed to full precision to allow useful comparison.


It would probably be more useful, yes. But that would require remembering how to run ls without my `-h` alias, and it might also lend the quick 'n' dirty analysis more confidence than is justified.


You can type `/bin/ls -l` to skip your shell aliases.


Comparing what are supposedly the best results from each:

          oxipng         optipng        optimage
  size    12 231 612B    12 231 612B    11 027 901B
  time    23.82s         90.57s         50.8s
I keep on improving my single-threaded image compressor (http://getoptimage.com).


Is that optimage result for lossless or lossy mode?


Lossless in this case.


With that file size, I'm guessing you're using a big picture without transparency and with many colors, which is the opposite of the usual use case of optimizing PNGs.


With significantly smaller file sizes, it could be more challenging to see timing differences. Also, this image is one that someone used the last time OxiPNG was discussed, which seemed relevant to me.


Then the ideal test would be a bunch of icons with transparency, merged into a single image.


"complete rewrite of the OptiPNG project, which was assumed to be dead as no commit had been made to it since March 2014"

Version 0.7.6 of OptiPNG was released 2016-apr-03.

Not super recent, but it seems reasonable, given the limited scope, that it would stabilize and not change much after a certain point.


First commit of OxiPNG was 2015-12-15. The way I read the part you quoted is that they originally assumed OptiPNG to be dead, not that they are saying now that it's dead.

Absolutely agree with your point about reaching a stage where not so many big changes are necessary though.


Ahh, I did miss that, but the OxiPNG readme doesn't mention that OptiPNG has had a release since their initial observation either.


Great example of the use of Rust to add parallelism to existing algorithms.


Did the increased parallelism lead to meaningful performance gains?


It appears so. I did a quick test with a huge PNG (80 megabytes), just to see what would happen. optipng took 5m9s, oxipng with "--threads 1" took 3m21s, and oxipng with the default number of threads took 2m00s. Not quite "half the time", but not bad.


I imagine most use cases where this performance would matter involve compressing a ton of PNGs at once. In which case, wouldn't it be much faster just to run several different processes instead of one process using multiple cores?


Not necessarily. Back in the day I generated thumbnails for lots (and lots) of very large images, and getting the best performance is tricky. Most of the time, running multiple processes just ends up causing massive context switching and slowing the whole thing down. Even multithreading turns out to be bad, most of the time. I got the best performance by forcing GraphicsMagick to use a single thread and spinning up multiple operations. I wonder how this app would deal with something like that.


Latency could matter. Perhaps you're storing an image and want to ack that the image is written to disk before responding to the caller.


But running several different processes can be a nuisance...better performance by default is still great.


Well it must have taken them a ton of work to parallelize their algorithm also. It would have been better to just have a batch mode that lets you process multiple images at once on different cores.


> Well it must have taken them a ton of work to parallelize their algorithm also.

Not really. Both optipng and oxipng do "trials": pick a set of parameters, try to compress the image with them, repeat with the next set of parameters. Each trial is completely independent from the others, so it's trivial to run them all in parallel.

Most of the work of running the trials in parallel is done by the "rayon" crate; the oxipng author only had to do an .into_par_iter() to get a rayon parallel iterator, and then do a map/reduce on it.
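Roughly, that pattern looks like the following sketch (not oxipng's actual code; `TrialParams` and `compress_with` are hypothetical stand-ins, and it assumes the rayon crate as a dependency):

    // Minimal sketch: run independent compression trials on rayon's
    // thread pool and keep the smallest output.
    use rayon::prelude::*;

    struct TrialParams {
        zlib_level: u8, // one set of compression settings per trial
    }

    // Stand-in for a real PNG compression trial.
    fn compress_with(params: &TrialParams, image: &[u8]) -> Vec<u8> {
        // Toy "compression": pretend higher levels shave a few bytes off.
        image[..image.len() - params.zlib_level as usize].to_vec()
    }

    fn best_trial(params: Vec<TrialParams>, image: &[u8]) -> Option<Vec<u8>> {
        params
            .into_par_iter()                   // each trial runs on a rayon worker thread
            .map(|p| compress_with(&p, image)) // trials are fully independent
            .min_by_key(|out| out.len())       // reduce: keep the smallest result
    }

Since each trial only reads the input image and produces its own output, there is no shared mutable state to synchronize; that's essentially the whole trick.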

The difficulty of doing that with optipng was global state. Quoting from https://www.reddit.com/r/rust/comments/48vkjy/oxipng_a_multi...:

"My initial plan as well was to fork optipng and implement multithreading on it, but optipng, like many C programs, takes pride in using global mutable shared state in every place possible, and I didn't want to try to untangle that spaghetti code to make it thread-safe."


Well, not really, because implementing a batch mode is more work than letting xargs or GNU Parallel or Make handle the multiprocessing for the "many files" case.

Parallelizing the algorithm helps in the "single file" case, which is still important.


Well, the parent comment was complaining that it would be more work for the end user. I was just noting that it would be less work to include that convenience in the tool itself than to make the algorithm parallel for less benefit.


Managing lots of processes is a ton of work...


Is it? I believe it can be done in like a line or two of bash.


It takes more than a line or two of Bash just to shut them all down after Ctrl-C.


Does this include the work from zopflipng? That usually makes more of a difference in size than OptiPNG does.


There are CLI options for zopfli support; I'm not sure whether it comes from zopflipng.


It is not. It uses the native Rust implementation of zopfli [0], which doesn't include anything from zopflipng.

[0]: https://crates.io/crates/zopfli


>Windows users will need to ensure they have the Visual C++ 2015 Runtime installed.

Why?


> I'm not sure how easy this is to resolve in Appveyor, but I would also prefer to have Windows binaries compiled statically. I'm much less familiar with compiling Rust binaries on Windows and don't have a Windows machine readily available, so I'll have to fire up a virtual machine and experiment.

> Worst case scenario, I'll need to add an instruction for Windows users to install the Visual C++ 2015 runtime. I'd prefer to avoid that though. This also likely won't be a problem with the GCC-compiled version, but this runtime is a dependency for all programs compiled by MSVC.

https://github.com/shssoichiro/oxipng/issues/49


You should be able to link with MSVCRT.DLL, yielding a small binary that will "just work" on a variety of systems going all the way back to Win95(!) (provided you don't use newer API calls):

https://stackoverflow.com/questions/10166412/how-to-link-aga...

Microsoft doesn't seem to like this, and will say it's "officially not supported" (if I remember correctly, there's a very strongly worded post by Raymond Chen on the topic...), but this is how essentially all the apps that come with Windows are compiled, and I believe MinGW does it too.

http://planet.jboss.org/post/fighting_the_msvcrt_dll_hell


Microsoft's the one to blame here. If they made sure their systems automatically come with MSVCRTs pre-installed, and new ones downloaded through the update system, this problem wouldn't exist in the first place.

Asking your users to install some sketchy 'runtime' won't work. And not every app should need or want a specific install script to verify for the umpteenth time that something as basic as the C/C++ runtime library is available.


MS does supply MSVCRT.DLL with Windows, and even links their own apps with it, but has said that you (as in, non-MS apps) aren't supposed to use it in lieu of the MSVCRxxx (and now, the even worse "split the libc into several dozen pieces, each in its own file"[1]) mess.

Most developers grudgingly obeyed, bloating their apps either through static linking, or by including a copy of the runtime DLLs or the .MSI installer in their own distribution.

But then there are those of us who saw through that ruse, gave MS the proverbial finger and linked with the "system MSVCRT.DLL". The result being tiny, highly portable binaries which don't require any installation or dependencies beyond what Windows already has.

[1] https://blogs.msdn.microsoft.com/vcblog/2014/06/10/the-great...


Many OSes, including Windows, traditionally don't have a system C library the way UNIX does; instead there are multiple compiler vendors, each with its own C library.

Which is one reason why, when you look at Windows 3.x code, it was common to call Win16 APIs directly instead of ANSI C ones.

For example, ZeroMemory() instead of memset().


Win10 does come with a preinstalled CRT, which is going to remain ABI-compatible.

https://blogs.msdn.microsoft.com/vcblog/2015/03/03/introduci...

Also, please don't call them "MSVCRTs". That - MSVCRT.DLL - is the name of the binary of one particular version of the runtime, corresponding to VS 6.0. All further versions had different names. The proper generic term is CRT (as in, "C runtime").


AFAIK, you are supposed to distribute the MSVC runtimes as merge modules within your MSI package, which is what for instance LibreOffice does.


> you are supposed to distribute the MSVC runtimes as merge modules

Right, that's what MS tells you.

What they don't tell you is that these merge modules aren't compatible enough. E.g. it's impossible to install the VC 2015 runtime DLLs (the so-called Universal CRT) on a never-updated Windows 7 SP1 machine.


Well, my work laptop is an up-to-date Windows 7 machine, with VS 2015 Update 3.

So it got installed there somehow.


The problem only affects PCs that aren’t up to date.

Specifically, PCs without KB2999226 installed.

Not all people update their Windows. Some just don't care. Other people don't want to connect their PC to the internet. There are also people who use mobile internet and pay for bandwidth.

When they're unable to run the software they've paid money for, I don't want to update Windows for them; I want my software to work. Hence, no dynamic CRT.


So you're refusing to have a basic update installed? Let me guess: you're going to complain that Windows is insecure because you got WannaCry'd after not updating your system, and blame it on MS instead?

(you = whoever "doesn't care" about updates)


It's unlikely to get WannaCry unless the PC is connected to the internet or a LAN.

The software we're offering doesn't require being online. It normally works unattended for hours, controlling some specialized industrial hardware. Not unlike embedded software. For this particular use case, being offline has its upsides: no downloads, no updates, no reboots.

The users who discovered that bug (initially I did include these CRT merge modules in my installer, as recommended by MS) didn't even send us a screenshot. Instead, they took a photo of the PC's screen with the error message and sent us that. Then I was able to reproduce it on an offline Win7 VMware machine and deliver a fixed version with a statically linked CRT, which BTW worked fine even on a vanilla Windows 7 SP0 from 2009.

Personally, I would recommend updating Windows instead, and I do update the PCs I own. But I can, and do, support running my software on a never updated system.


> Not all people update their Windows. Some just don't care. Other people don't want to connect their PC to the internet. There are also people who use mobile internet and pay for bandwidth.

And sometimes Windows Update simply stops working. I've seen it happen more than once.


You can actually just compile by setting $env:RUSTFLAGS="-C target-feature=+crt-static" and it will completely drop the dependency on vcruntime.dll. Rust is basically only pulling in memcpy/memset/etc. Not really things you'd be worried about statically linking.





