Like the author said this is completely unoptimized. The natural next step in op...

phire · on Nov 25, 2021

The algorithm is extremely resistant to SIMD optimizations.

Every pixel uses a different encoding, 95% of the encodings rely on the value of the previous pixel, or the accumulated state of all previously processed pixels. The number of bytes per pixel and pixels per byte swing wildly.

SIMD optimization would basically require redesigning it from scratch.

sitkack · on Nov 25, 2021

SIMD only gets you up to the width that your hardware platform supports and every SIMD program has to be rewritten for the new width.

Two other immediate avenues are multithreading, which think could be quite effective for this algorithm or GLSL, of that I have no opinion.