OxiPNG: PNG optimizer written in Rust (github.com/shssoichiro)
130 points by 19h on Sept 7, 2017 | 45 comments



For the curious, I replicated some benchmarks from ~1yr ago (https://www.reddit.com/r/rust/comments/48vkjy/oxipng_a_multi...).

Note that in that thread the author implies the optimization levels of the two tools are not necessarily equivalent or stable.

    $ optipng --version
    OptiPNG version 0.7.6
    $ oxipng --version
    oxipng 0.16.3
Ran on a 2016 MBP, i7.

Time:

    opt-level    oxipng    optipng
    2            148.63s   30.80s
    3            9.56s     45.99s
    4            23.82s    90.57s
Size of resulting image:

    original size: 12M

    opt-level    oxipng    optipng
    2            12M       12M
    3            12M       "already optimized"
    4            12M       "already optimized"

TBH I don't really know how to interpret these results. But with the strange exception of `oxipng -o 2`, it does seem like the naive parallelism offers some real performance wins. Disclaimer: this is an incredibly amateur attempt at benchmarking; I really don't have enough knowledge to produce something authoritative.


It would be more useful if you didn't round the sizes to "12M", because I have a hard time believing they were all exactly the same. Bytes are a discrete and exact quantity, and ideally would be displayed to full precision to allow useful comparison.


It would probably be more useful, yes. But that would require remembering how to run ls without my `-h` alias, and it might also lend the quick 'n' dirty analysis more confidence than is justified.


You can type `/bin/ls -l` to skip your shell aliases.


Comparing what are supposedly the best results from each:

          oxipng         optipng        optimage
  size    12 231 612B    12 231 612B    11 027 901B
  time    23.82s         90.57s         50.8s
I keep on improving my single-threaded image compressor (http://getoptimage.com).


Is that optimage result for lossless or lossy mode?


Lossless in this case.


With that file size, I'm guessing you're using a big picture without transparency and with many colors, which is the opposite of the usual use case of optimizing PNGs.


With significantly smaller file sizes, it could be more challenging to see timing differences. Also, this image is one that someone used the last time OxiPNG was discussed, which seemed relevant to me.


Then the ideal test would be a bunch of icons with transparency, merged into a single image.


"complete rewrite of the OptiPNG project, which was assumed to be dead as no commit had been made to it since March 2014"

Version 0.7.6 of OptiPNG was released 2016-apr-03.

Not super recent, but it seems reasonable, given the limited scope, that it would stabilize and not change much after a certain point.


First commit of OxiPNG was 2015-12-15. The way I read the part you quoted is that they originally assumed OptiPNG to be dead, not that they are saying now that it's dead.

Absolutely agree with your point about reaching a stage where not so many big changes are necessary though.


Ahh, I did miss that, but the OxiPNG readme doesn't mention that OptiPNG has had a release since their initial observation either.


Great example of the use of Rust to add parallelism to existing algorithms.


Did the increased parallelism lead to meaningful performance gains?


It appears so. I did a quick test with a huge PNG (80 megabytes), just to see what would happen. optipng took 5m9s, oxipng with "--threads 1" took 3m21s, and oxipng with the default number of threads took 2m00s. Not quite "half the time", but not bad.


I imagine most use cases where this performance would matter involve compressing a ton of PNGs at once. In which case, wouldn't it be much faster just to run several different processes instead of one process using multiple cores?


Not necessarily. Back in the day I generated thumbnails for lots (and lots) of very large images, and getting the best performance is tricky. Most of the time, running multiple processes just ends up causing massive context switching and slowing the whole thing down. Even multithreading turns out to be bad, most of the time. I got the best performance by forcing GraphicsMagick to use a single thread and spinning up multiple operations. I wonder how this app would deal with something like that.


Latency could matter. Perhaps you're storing an image and want to ack that the image is written to disk before responding to the caller.


But running several different processes can be a nuisance...better performance by default is still great.


Well it must have taken them a ton of work to parallelize their algorithm also. It would have been better to just have a batch mode that lets you process multiple images at once on different cores.


> Well it must have taken them a ton of work to parallelize their algorithm also.

Not really. Both optipng and oxipng do "trials": pick a set of parameters, try to compress the image with them, repeat with the next set of parameters. Each trial is completely independent from the others, so it's trivial to run them all in parallel.

Most of the work of running the trials in parallel is done by the "rayon" crate; the oxipng author only had to do an .into_par_iter() to get a rayon parallel iterator, and then do a map/reduce on it.
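Roughly, that pattern looks like the following sketch (not oxipng's actual code; `TrialParams` and `compress_with` are hypothetical stand-ins, and it assumes the rayon crate as a dependency):

    // Minimal sketch: run independent compression trials on rayon's
    // thread pool and keep the smallest output.
    use rayon::prelude::*;

    struct TrialParams {
        zlib_level: u8, // one set of compression settings per trial
    }

    // Stand-in for a real PNG compression trial.
    fn compress_with(params: &TrialParams, image: &[u8]) -> Vec<u8> {
        // Toy "compression": pretend higher levels shave a few bytes off.
        image[..image.len() - params.zlib_level as usize].to_vec()
    }

    fn best_trial(params: Vec<TrialParams>, image: &[u8]) -> Option<Vec<u8>> {
        params
            .into_par_iter()                   // each trial runs on a rayon worker thread
            .map(|p| compress_with(&p, image)) // trials are fully independent
            .min_by_key(|out| out.len())       // reduce: keep the smallest result
    }

Since each trial only reads the input image and produces its own output, there is no shared mutable state to synchronize; that's essentially the whole trick.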

The difficulty of doing that with optipng was global state. Quoting from https://www.reddit.com/r/rust/comments/48vkjy/oxipng_a_multi...:

"My initial plan as well was to fork optipng and implement multithreading on it, but optipng, like many C programs, takes pride in using global mutable shared state in every place possible, and I didn't want to try to untangle that spaghetti code to make it thread-safe."


Well, not really, because implementing a batch mode is more work than letting xargs or GNU Parallel or Make handle the multiprocessing for the "many files" case.

Parallelizing the algorithm helps in the "single file" case, which is still important.


Well, the parent comment was complaining that it would be more work for the end user. I was just noting that it would be less work to include that convenience in the tool itself than to make the algorithm parallel for less benefit.


Managing lots of processes is a ton of work...


Is it? I believe it can be done in like a line or two of bash.


It takes more than a line or two of Bash just to shut them all down after Ctrl-C.


Does this include the work from zopflipng? That usually makes more of a difference in size than OptiPNG does.


There are CLI options for zopfli support; I'm not sure whether it comes from zopflipng.


It is not. It uses the native Rust implementation of zopfli [0], which doesn't include anything from zopflipng.

[0]: https://crates.io/crates/zopfli


>Windows users will need to ensure they have the Visual C++ 2015 Runtime installed.

Why?


> I'm not sure how easy this is to resolve in Appveyor, but I would also prefer to have Windows binaries compiled statically. I'm much less familiar with compiling Rust binaries on Windows and don't have a Windows machine readily available, so I'll have to fire up a virtual machine and experiment.

> Worst case scenario, I'll need to add an instruction for Windows users to install the Visual C++ 2015 runtime. I'd prefer to avoid that though. This also likely won't be a problem with the GCC-compiled version, but this runtime is a dependency for all programs compiled by MSVC.

https://github.com/shssoichiro/oxipng/issues/49


You should be able to link with MSVCRT.DLL, yielding a small binary that will "just work" on a variety of systems going all the way back to Win95(!) (provided you don't use newer API calls):

https://stackoverflow.com/questions/10166412/how-to-link-aga...

Microsoft doesn't seem to like this, and will say it's "officially not supported" (if I remember correctly, there's a very strongly worded post by Raymond Chen on the topic...), but this is how essentially all the apps that come with Windows are compiled, and I believe MinGW does it too.

http://planet.jboss.org/post/fighting_the_msvcrt_dll_hell


Microsoft's the one to blame here. If they made sure their systems automatically come with MSVCRTs pre-installed, and new ones downloaded through the update system, this problem wouldn't exist in the first place.

Asking your users to install some sketchy 'runtime' won't work. And not every app should need or want a specific install script to verify for the umpteenth time that something as basic as the C/C++ runtime library is available.


MS does supply MSVCRT.DLL with Windows, and even links their own apps with it, but has said that you (as in, non-MS apps) aren't supposed to use it in lieu of the MSVCRxxx (and now, the even worse "split the libc into several dozen pieces, each in its own file"[1]) mess.

Most developers grudgingly obeyed, bloating their apps either through static linking, or by including a copy of the runtime DLLs or the .MSI installer in their own distribution.

But then there are those of us who saw through that ruse, gave MS the proverbial finger and linked with the "system MSVCRT.DLL". The result being tiny, highly portable binaries which don't require any installation or dependencies beyond what Windows already has.

[1] https://blogs.msdn.microsoft.com/vcblog/2014/06/10/the-great...


Many OSes, including Windows, traditionally don't have a system C library the way UNIX does; instead there are multiple compiler vendors, each with its own C library.

Which is one reason why, when you look at Windows 3.x code, it was common to call Win16 APIs directly instead of ANSI C ones.

For example, ZeroMemory() instead of memset().


Win10 does come with a preinstalled CRT, which is going to remain ABI-compatible.

https://blogs.msdn.microsoft.com/vcblog/2015/03/03/introduci...

Also, please don't call them "MSVCRTs". That - MSVCRT.DLL - is the name of the binary of one particular version of the runtime, corresponding to VS 6.0. All further versions had different names. The proper generic term is CRT (as in, "C runtime").


AFAIK, you are supposed to distribute the MSVC runtimes as merge modules within your MSI package, which is what for instance LibreOffice does.


> you are supposed to distribute the MSVC runtimes as merge modules

Right, that's what MS tells you.

What they don't tell you is that these merge modules aren't compatible enough. E.g. it's impossible to install the VC 2015 runtime DLLs (the so-called Universal CRT) on a never-updated Windows 7 SP1 machine.


Well, my work laptop is an up-to-date Windows 7 machine, with VS 2015 Update 3.

So it got installed there somehow.


The problem only affects PCs that aren’t up to date.

Specifically, PCs without KB2999226 installed.

Not all people update their Windows. Some just don't care. Other people don't want to connect their PC to the internet. There are also people who use mobile internet and pay for bandwidth.

When they're unable to run the software they've paid money for, I don't want to update Windows for them; I want my software to work. Hence, no dynamic CRT.


So you're refusing to have a basic update installed? Let me guess: you're going to complain that Windows is insecure because you got WannaCry'd after not updating your system, and blame it on MS instead?

(you = whoever "doesn't care" about updates)


It's unlikely to get WannaCry unless the PC is connected to the internet or a LAN.

The software we're offering doesn't require being online. It normally works unattended for hours, controlling some specialized industrial hardware. Not unlike embedded software. For this particular use case, being offline has its upsides: no downloads, no updates, no reboots.

The users who discovered that bug (initially I did include these CRT merge modules in my installer, as recommended by MS) didn't even send us a screenshot. Instead, they took a photo of the PC's screen with the error message and sent us that. Then I was able to reproduce it on an offline Win7 VMware machine and deliver a fixed version with a statically linked CRT, which BTW worked fine even on a vanilla Windows 7 SP0 from 2009.

Personally, I would recommend updating Windows instead, and I do update the PCs I own. But I can, and do, support running my software on a never updated system.


> Not all people update their Windows. Some just don't care. Other people don't want to connect their PC to the internet. There are also people who use mobile internet and pay for bandwidth.

And sometimes Windows Update simply stops working. I've seen it happen more than once.


You can actually just compile by setting $env:RUSTFLAGS="-C target-feature=+crt-static" and it will completely drop the dependency on vcruntime.dll. Rust is basically only pulling in memcpy/memset/etc. Not really things you'd be worried about statically linking.





