JPEG XL: How it started, how it’s going (cloudinary.com)
389 points by ksec on July 20, 2023 | 204 comments



Question: why do we see stable video and audio "container formats" like MKV that persist as encodings come and go (you might not be able to play a new .mkv file on an old player, but the expected answer is to upgrade your player to a new version, and universal support for pretty much any encoding is an inevitability on at least all software players), while every new image encoding seemingly necessitates its own new container format and file extension, and a minor format war to decide who will support it?

Is this because almost all AV decoders use libffmpeg or a fork thereof; where libffmpeg is basically an "uber-library" that supports all interesting AV formats and codecs; and therefore you can expect ~everything to get support for a new codec whenever libffmpeg includes it (rather than some programs just never ending up supporting the codec)?

If so — is there a reason that there isn't a libffmpeg-like uber-library for image formats+codecs?


The original entrant in this competition is TIFF, and—just as Matroska or QuickTime add indexing to raw MP3 or MPEG-TS—it does provide useful functionality over raw codec-stream non-formats like JPEG (properly JIF/JFIF/EXIF), in the form of striping or tiling and ready-made downscaled versions of the same image. But where unindexed video is essentially unworkable, an untiled image is in most cases OK, except for a couple of narrow application areas that need to deal with humongous amounts of pixel data.

So you’re absolutely going to see TIFF containers with JPEG or JPEG2000 tiles used for geospatial, medical, or hi-res scanned images, but given the sad state of open tooling for all of these, there’s little to no compatibility between their various subsets of the TIFF spec, especially across vendors, and more or less no FOSS beyond libtiff. (Not even viewers for larger-than-RAM images!) Some other people have used TIFF, but in places where there’s very little to be gained from compatibility (e.g. Canon’s CR2 raw images are TIFF-based, but nobody cares). LogLuv TIFF is a viable HDR format, but it’s in an awkward place between the hobby-renderer-friendly Radiance HDR, the Pixar-backed OpenEXR, and whatever consumer photo thing each of the major vendors is pushing this month; it also doesn’t have a bit-level spec so much as a couple of journal articles and some code in libtiff.

Why did this happen? Aside from the niche character of very large images, Adobe abandoned the TIFF spec fairly quickly after acquiring it as part of Aldus, but IIUC for the first decade or so of that neglect Adobe legal was nevertheless fairly proactive about shutting up anyone who used the trademarked name for an incompatible extension (like TIFF64—and nowadays if you need TIFF you likely have >2G of data). Admittedly TIFF is also an overly flexible mess, but then so are Matroska (thus the need for the WebM profile of it) and QuickTime/BMFF (thus 3GPP, MOV, MP4, ..., which are vaguely speaking all subsets of the same thing).

One way or another, TIFF is to some extent what you want, but it doesn’t get a lot of use these days. No browser support either, which is likely important. Maybe the HEIF container (yet another QuickTime/BMFF profile) is better from a technical standpoint, but the transitive closure of the relevant ISO specs likely comes at $10k or more. So it’s a bit sad all around.


I think TIFF has some unique features that make it more prone to certain security issues[1] compared to other formats, such as storing absolute file offsets instead of relative offsets. So I am not sure TIFF is a good container format, but many camera raws are TIFF-based for some reason.[2]

[1] https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=libtiff

[2] https://exiftool.org/#supported (search for "TIFF-based")


> I think TIFF has some unique features that make it more prone to certain security issues[] compared to other formats, such as storing absolute file offsets instead of relative offsets.

That’s an impressive number of CVEs for a fairly modest piece of code, although the sheer number of them dated ≥ 2022 baffles me—has a high-profile target started using libtiff recently, or has some hero set up a fuzzer? In any case libtiff is surprisingly nice to use but very old and not that carefully coded, so I’m not shocked.

I’m not sure about the absolute offsets, though. In which respect are those more error-prone? If I was coding a TIFF library in C against ISO or POSIX APIs—and without overflow-detecting arithmetic from GCC or C23—I’d probably prefer to deal with absolute offsets rather than relative ones, just to avoid an extra potentially-overflowing addition whenever I needed an absolute offset for some reason.
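For reference, the overflow-checked addition in question is only a couple of lines when the builtins are available (a minimal sketch, assuming GCC/Clang's __builtin_add_overflow; C23's ckd_add in <stdckdint.h> does the same job):

    #include <stdbool.h>
    #include <stdint.h>

    /* Resolve a relative offset to an absolute one without silently wrapping. */
    static bool resolve_offset(uint64_t base, uint64_t rel,
                               uint64_t file_size, uint64_t *out)
    {
        uint64_t abs_off;
        if (__builtin_add_overflow(base, rel, &abs_off))  /* wrapped: reject */
            return false;
        if (abs_off >= file_size)                         /* still range-check */
            return false;
        *out = abs_off;
        return true;
    }

Either way you end up range-checking against the file size, which is why the absolute-vs-relative distinction doesn't obviously change the attack surface.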

There are things I dislike about TIFF, including security-relevant ones. (Perhaps, for example, it’d be better to use a sequential format with some offsets on top, and not TIFF’s sea of offsets with hopefully some sequencing to them. Possibly ISO BMFF is in fact better here; I wouldn’t know, because—well—ISO.) But I don’t understand this particular charge.


Absolute file offsets demand a particular memory layout or some extra bookkeeping that could be avoided with relative offsets. If I were to write a JPEG parser, I could write a function to handle one particular segment and not have to worry about other segments, because relative offsets make parsing them independent; in TIFF, by contrast, I need to maintain a directory of things and make sure the offsets land in the right place.

I think parsing a file format with absolute offsets is similar to handling a program written entirely with GOTOs, compared to relative offsets, which are more like structured control flow.


If you’re interested in BMFF and don’t care to spend ISO prices, you can always go back to the original, Apple’s QuickTime File Format: https://developer.apple.com/standards/qtff-2001.pdf


> Admittedly TIFF is also an overly flexible mess, but then so are Matroska (thus the need for the WebM profile of it)

WebM went way too far when they stripped out support for subtitles. The engineers who made that decision should be ashamed.


As much as I’m fond of my collection of Matroska files with SSA/ASS subtitle tracks, I don’t think those are appropriate for the Web, what with all the font issues; and SRT is a nightmare of encodings. But apparently there’s a WebM-blessed way[1] of embedding WebVTT ( = SRT + UTF-8 − decimal commas) now? Which is of course different[2] from the more recent Matroska.org-blessed way[3], sigh.

[1] https://www.webmproject.org/docs/container/#webvtt-guideline...

[2] https://trac.ffmpeg.org/ticket/5641

[3] https://matroska.org/technical/codec_specs.html#s_textwebvtt


> I don’t think those are appropriate for the Web, what with all the font issues

Fun fact: several broadcast standards use Bitstream TrueDoc Portable Font Resource, which was supported for embedded web fonts way back in Netscape 4:

https://people.apache.org/~jim/NewArchitect/webrevu/1997/11_...

https://web.archive.org/web/20040407162455/http://www.bitstr...

“The PFR specification defines the Bitstream portable font resource (PFR), which is a compact, platform-independent format for representing high-quality, scalable outline fonts.

Many independent organizations responsible for setting digital TV standards have adopted the PFR font format as their standard font format, including:

— ATSC (Advanced Television Systems Committee)

— DAVIC (Digital Audio Visual Council)

— DVB (Digital Video Broadcasting)

— DTG (Digital TV Group)

— MHP (Multimedia Home Platform)

— ISO/IEC 16500-6:1999

— OCAP (OpenCable Application Platform)”


All text-based subtitles share the (non-)issue of encoding. Nothing wrong with SRT; it's UTF-8 in MKV anyway.


Video container formats do something useful: they let you package several streams together (audio, video, subtitles), and they take care of some important aspects of A/V streaming, letting the codec part focus on being a codec. They let you use existing audio codecs with a new video codec.

OTOH a still image container would do nothing useful. If an image is all that needs to be contained, there's no need for a wrapper.


> a still image container would do nothing useful

It would, at least, create a codec-neutral location and format for image metadata, with codec-neutral (and ideally extensible + vendor-namespaced) fields. EXIF is just a JPEG thing. There is a reason that TIFF is still to this day used in medical imaging — it allows embedding of standardized medical-namespace metadata fields.

Also, presuming the container format itself is extensible, it would allow the PNG approach to ancillary data embedding ("allow optional chunks with vendor-specific meanings, for data that can be useful to clients, but which image processors know they can safely strip without understanding, because 'is optional' is a syntactic part of the chunk name") to be used with arbitrary images — in a way where those chunks can even survive the image being transcoded! (If you're unaware, when you transcode a video file between video codecs using e.g. Handbrake, ancillary data like thumbnail and subtitle tracks will be ported as-is to the new file, as long as the new container format also supports those tracks.)
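For the curious, the PNG rule being referenced is literally one bit: bit 5 of the first byte of the 4-byte chunk type (i.e. a lowercase first letter, as in "tEXt") marks the chunk as ancillary. A minimal check, as a sketch:

    #include <stdbool.h>
    #include <stdint.h>

    /* Lowercase first letter => ancillary chunk, safe for a processor
       to strip without understanding it. */
    static bool png_chunk_is_ancillary(const uint8_t type[4])
    {
        return (type[0] & 0x20) != 0;
    }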

Also, speaking of subtitle tracks, here's something most people may have never considered: you know how video containers can embed "soft" subtitle tracks? Why shouldn't images embed "soft" subtitle tracks, in multiple languages? Why shouldn't you expect your OS screen-reader feature to be able to read you your accessibility-enabled comic books in your native language — and in the right order (an order that, for comic books, a simple OCR-driven text extraction could never figure out)?

(There are community image-curation services that allow images to be user-annotated with soft subtitles; but they do it by storing the subtitle data outside of the image file, in a database; sending the subtitle data separately as an XHR response after the image-display view loads; and then overlaying the soft-subtitle interaction-regions onto the image using client-side Javascript. Which makes sense in a world where users are able to freely edit the subtitles... but in a world where the subtitles are burned into the image at publication time by the author or publisher, it should be the browser [or other image viewer] doing this overlaying! Saving the image file should save the soft subtitles along with it! Just like when right-click-Save-ing a <video> element!)


> Why shouldn't images embed "soft" subtitle tracks

That would be a layered image format, like .psd (Photoshop).

It's an interesting idea, memes could become editable :)


GIF89a actually defines something like this https://www.w3.org/Graphics/GIF/spec-gif89a.txt

“The Plain Text Extension contains textual data and the parameters necessary to render that data as a graphic, in a simple form. The textual data will be encoded with the 7-bit printable ASCII characters. Text data are rendered using a grid of character cells defined by the parameters in the block fields. Each character is rendered in an individual cell. The textual data in this block is to be rendered as mono-spaced characters, one character per cell, with a best fitting font and size.”

“The Comment Extension contains textual information which is not part of the actual graphics in the GIF Data Stream. It is suitable for including comments about the graphics, credits, descriptions or any other type of non-control and non-graphic data.”

I hesitate to say GIF89a "supported" it since in practice approximately zero percent of software can use either extension. `gIFt` was dropped from the PNG spec for this reason: https://w3c.github.io/PNG-spec/extensions/Overview.html#DC.g...

If it had been well-supported we might have avoided the whole GIF pronunciation war. Load up http://cd.textfiles.com/arcadebbs/GIFS/BOB-89A.GIF in http://ata4.github.io/gifiddle/ and check out the last frame :)


While funny, I think parent meant more like alt text.


Correct. Not subtitles as a vector layer of the image, but rather subtitles as regions of the image annotated with textual gloss information — information which has no required presentation as part of the rendering of the image, but which the UA is free to use as it pleases in response to user configuration — by presenting the gloss on hover/tap like alt text, yes; or by reading the gloss aloud; or by search-indexing pages of a graphic novel by their textual glosses like how you can search an ePub by text, etc.

In the alt-text case specifically, you could allow for optional styling info so that the gloss can be laid out as a visual replacement for the original text that was on the page. But that's not really necessary, and might even be counterproductive to some use-cases (like when interpretation of the meaning of the text depends on details of typography/calligraphy that can't be conveyed by the gloss, and so the user needs to see the original text with the gloss side-by-side; or when the gloss is a translation and the original is written with poetic meter, such that the user wants the gloss for understanding the words but the original for appreciating the poesy of the work.)

Concrete use-cases:

• the "cleaner" and "layout" roles in the (digitally-distributed) manga localization process only continue to exist because soft-subbed images (as standalone documents) aren't a thing. Nobody who has any respect for art wants to be "destructively restoring" an artist's original work and vision just to translate some text within that work. They'd much rather be able to just hand you the original work, untouched, with some translation "sticky notes" on top that you can toggle on and off.

• in the case of webcomic images that have a textual "bonus joke" (e.g. XKCD, Dinosaur Comics), where this is currently implemented as alt/title-attribute text — this could be moved into the image itself as a whole-image annotation, such that the "bonus joke" would be archivally preserved alongside the image document.


Region annotation is used for some images on Wikimedia Commons and a lot of Manga pages on the booru sites[1]. It's really, really good for translations.

[1]: https://danbooru.donmai.us/posts/6510411?q=arknights


I think we could do that with SVG, no? SVG is a vector format of course, but can also have raster parts embedded.



That's a very basic view; take a look at the TIFF or DICOM specs. It can be useful to have multiple images, resolutions, channels, z or t dimensions, metadata, ... all in a single container, as it's all one "image".


captions / alt-text could also very reasonably be part of the image, as well as descriptions of regions and other metadata.

there are LOTS of uses for "image containers" that go beyond just pixels. heck, look at EXIF, which is extremely widespread - it's often stripped to save space on the web, but it's definitely useful and used.


Container formats for video often need to:

- contain multiple streams of synced video, audio, and subtitles

- contain alternate streams of audio

- contain chapter information

- contain metadata such as artist information

For web distribution of static images, you want almost none of those things, especially regarding alternate streams. You just want to download the one stream you want. Easiest way to do that is to just serve each stream as a separate file, and not mux different streams into a single container in the first place.

Also, I could be wrong on this part, but my understanding is that for web streaming video, you don't really want those mkv* features either. You typically serve individual and separate streams of video, audio, and text, sourced from separate files, and your player/browser syncs them. The alternative would be unnecessary demux on the server side, or the client unnecessarily downloads irrelevant streams.

The metadata is the only case where I see the potential benefit of a single container format.

* Not specific to mkv, other containers have them of course


Container formats increase size. Now for video that doesn't matter much because it doesn't move the needle. For images a container format could be a significant percentage of the total image size.


Yes, I focused mostly on the lack of benefit, but even for a single stream, size is another important cost.


> The alternative would be unnecessary demux on the server side, or the client unnecessarily downloads irrelevant streams.

HTTP supports partial downloads (range requests). A client can choose just not to download irrelevant audio. I think most common web platforms already work this way: when you open a video it is likely to be in .mp4 format, and you need to get the end of it to play it, so your browser gets that part first. I am not entirely sure.


I believe mp4 files can be repackaged to put the bookkeeping data at the front of the file, which makes them playable while doing a sequential download.


That metadata is usually put around the end of the file for compatibility reasons, but one can use ffmpeg's `-movflags faststart` option to move it to the beginning (very common in files that are meant to be served on the web).


> You typically serve individual and separate streams of video, audio, and text, sourced from separate files, and your player/browser syncs them.

That's one school of thought. Some of the biggest streaming providers simply serve a single muxed video+audio HLS stream based on bandwidth detection. Doesn't work very well for multi-language prerecorded content of course, but that's just one use case.


That's true, but my understanding is they serve a specific mux for a specific bandwidth profile, and serve it by just transmitting bytes, no demux required. I didn't mean to imply that wasn't a common option. I only meant to say I don't think a common option is to have a giant mux of all possible bandwidth profiles into one container file, that has to be demuxed at serve time.

My understanding is that YouTube supports both the "separate streams" and "specific mux per-bandwidth profile" methods, and picks one based on the codec support/preferences of the client.


Containers are just containers — you still need a decoder for their payload codec. This is the same for video and images. For video, containers are more important because you typically have several different codecs being used together (in particular video and audio) and the different bitstreams need to be interleaved.

The ISOBMFF format is used as a container for MP4, JPEG 2000, JPEG XL, HEIF, AVIF, etc.

And yes, there are ffmpeg-like "uber-libraries" for images: ImageMagick, GraphicsMagick, libvips, imlib2 and gdk-pixbuf are examples of those. They support basically all image formats, and applications based on one of these will 'automatically' get JPEG XL support.

Apple also has such an "uber-library" called CoreMedia, which means any application that uses this library will also get JPEG XL support automatically.


I'm guessing it's mostly down to tradition/momentum in how the formats were initially created and maintained.

Video has (most of the time, at least) at least two tracks that have to be synchronized, and most of the time it's one video track and one audio track. With that in mind, it makes sense to wrap those in a "container" and allow the video and audio to be different formats. You can also have multiple audio/video tracks in one file, but I digress.

With images it didn't make sense, at least in the beginning, to have a container, because you just have one image (or many, in the case of .gif).


We're starting to see a move towards this with HEIF / AVIF containers. However, in cases where "every bit must be saved", general-purpose containers like ISO-BMFF introduce some wastage that is unappealing.


> However, in cases where "every bit must be saved", general-purpose containers like ISO-BMFF introduce some wastage that is unappealing.

Sure, but I don't mean general-purpose multimedia containers (which put a lot of work into making multiple streams seekable with shared timing info). I mean bit-efficient, image-oriented, but image-encoding-neutral container formats.

There are at least two already-existing extensible image file formats that could be used for this: PNG and TIFF. In fact, TIFF was designed for this purpose — and even has several different encodings it supports!

But in practice, you don't see the people who create new image codecs these days thinking of themselves as creating image codecs — they think of themselves as creating vertically-integrated image formats-plus-codecs. You don't see the authors of these new image specifications thinking "maybe I should be neutral on container format for this codec, and instead just specify what the bitstream for the image data looks like and what metadata would need to be stored about said bitstream to decode it in the abstract; and leave containerizing it to someone else." Let alone do you ever see someone think "hey, maybe I should invent a codec... and then create multiple reference implementations for how it would be stored inside a TIFF container, a PNG container, an MKV container..."


But HEIC/AVIF did exactly that: defined an image format on top of a standard container (ISOBMFF/HEIF). JPEG-XL is the odd one out because it doesn't have a standardized HEIF encapsulation, but JPEG-XS and JPEG-XR, for example, are supported in HEIF.


JPEG XL uses the ISOBMFF container, with an option to skip the container completely and just use a raw codestream.

HEIF is also ISOBMFF-based but adds more mandatory stuff, so you end up with more header overhead, and it adds some functionality at the container level (like layers, or using one codestream for the color image and another for the alpha channel) that is useful for codecs that don't have that functionality at the codestream level — video codecs typically only support YUV, so if you want alpha you have to do it with one YUV frame and one YUV 4:0:0 frame, and use a container like HEIF to indicate that the second frame represents the alpha channel. So if you want to use a video codec like HEVC or AV1 for still images and have functionality like alpha channels, ICC profiles, or orientation, then you need such a container, since these codecs do not natively support those things.

But for JPEG XL this is not needed since JPEG XL already does have native support for all of these things — it was designed to be a still image codec after all. It's also more effective for compression to support these things at the codec level: e.g. in JPEG XL you can have an RGBA palette, which can be useful for lossless compression of certain images, while in HEIC/AVIF this is impossible since the RGB and A are in two different codestreams which are independent from one another and only combined at the container level.

It would be possible to define a JPEG XL payload for the HEIF container but it would not really bring anything except a few hundred bytes of extra header overhead and possibly some risk of patent infringement since the IP situation of HEIF is not super clear (Nokia claims it has relevant patents on it, and those are not expired yet).


> JPEG XL uses the ISOBMFF container, with an option to skip the container completely and just use a raw codestream

Hey, thanks for the clarification! I was basing my info on Wikipedia (my bad): the ISO BMFF page doesn't mention JXL at all, and even the JPEG XL page has only small print in the infobox saying that it's "based on" ISO BMFF, while the main article text doesn't mention that at all.

> But for JPEG XL this is not needed since JPEG XL already does have native support for all of these things — it was designed to be a still image codec after all

I suppose that is a bit the thing the grandparent comment was complaining about: a format not designed for general-purpose containers but rather as a standalone thing. I suppose it could be a fun thought experiment to imagine what JXL would look like if it were specifically designed to be used in HEIF.

Of course it is quite understandable that making a tailored, purpose-built format ends up better in many ways than trying to fit into some existing generic thing.

> It would be possible to define a JPEG XL payload for the HEIF container but it would not really bring anything except a few hundred bytes of extra header overhead and possibly some risk of patent infringement since the IP situation of HEIF is not super clear (Nokia claims it has relevant patents on it, and those are not expired yet).

I suppose JXL-in-HEIF would allow some image management tools to have a common code path for handling JXL and HEIC/AVIF files, grabbing metadata, etc., and possibly would not need any specific JXL support. But that is probably not a practical concern in reality.


And at the same time, we are likely going to use codec-specific extensions for all AOM video codecs (.av1, .av2) as well as for images (.webp2, not sure if .avif2 will ever exist but I guess so), even when the same container is used, as we did with .webm (which was a subset of .mkv)


> with universal support for pretty much any encoding being an inevitability on at least all software players

This is not a good assumption. MKV supports a loooot of things which many video players will not support at all.

And IIRC some browsers do not support MKV.



But do they all support all of ISOBMFF?


I think that's because video is a much more active and complex topic than still images.

We are still using image formats from the 90s, and their matching containers, and they are good enough, so there is not much push to go beyond that. There is no real incentive for making a more flexible format. By comparison, video is the biggest bandwidth hog and people care a lot.

And MKV supports video, multiple audio tracks, subtitles, ... all using different codecs made by different people (e.g. H.265+Opus or VP9+Vorbis, or any other combination). An image container usually only holds the image and a bit of metadata.


Videos can have their own "containers" too: for instance, in AV1 the stream is stored inside OBUs, which are wrapped in an external container (such as Matroska). If you really wanted to, you could (and can) put images into containers too; PNGs in a Matroska file are actually a pretty useful way of transferring PNG sequences.

You can also, with a simple mod on an older commit of ffmpeg (the commit that added animated JXL broke this method and I haven't gotten around to fixing it), mux JXL sequences into MKV by simply adding the JXL FourCC.


There is ImageMagick, which last I checked (a year ago?) didn't support KTX, but it did have DDS and a lot of other niche image formats.


Question: why don't we:

1. put a reference to the decoder into the header of the compressed file

2. download the decoder only when needed, and cache it if required

3. run the decoder inside a sandbox

4. allow multiple implementations, based on hardware, but at least one reference implementation that runs everywhere

Then we never need any new formats. The system will just support any format. When you're not online, make sure you cached the decoders for whatever files you installed on your system.
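As a purely hypothetical sketch of step 1 (none of these fields exist in any real format; the names are made up), the header could be as small as:

    #include <stdint.h>

    struct self_describing_header {
        char     magic[4];           /* hypothetical magic, e.g. "SDIM" */
        uint32_t header_size;        /* bytes, including this struct */
        char     decoder_uri[256];   /* where to fetch a sandboxed decoder (step 2) */
        uint8_t  decoder_sha256[32]; /* pins the exact decoder build: cache key + integrity check */
        uint64_t payload_size;       /* encoded bitstream follows immediately after */
    };

Pinning the decoder by hash gives the cache in step 2 a stable key and an integrity check.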


We used to have formats like this, and then the attacker points to his decoder/malware package.

Apart from that, of course the decoder has to be fast and thus native and interfacing with the OS, so the decoder is x86 code for today's version of Windows, until the company hosting it dies and the patented, copyrighted decoder disappears from the internet.


Formats like what? You're not skipping the "sandbox" part are you?

A decoder can be extremely isolated. It's much easier to sandbox a decoder than to sandbox javascript, for example.


If you assume you just say the magic word ‘sandbox’ and everything is safe, then yes, security is a solved problem. This is however a prime example of the saying that in theory, there is no difference between theory and practice but in practice there is.


You're treating sandboxing like it's all the same, but it's not. Multimedia decoders are one of the absolute easiest things to sandbox. Webassembly was designed for sandboxing and even that's very overcomplicated for what a decoder needs.

If we had standard headers, then reading metadata wouldn't be part of the decoder. The decoder would only need to take in bytes and output a bitmap, or take in bytes and output PCM audio. It doesn't need to be able to call any functions, or run any system calls, and the data it outputs can safely contain any bytes because nothing will interpret it.

It's like taking the very core of webassembly and then not attaching it to anything. The attack surface is astoundingly small.

You just need to give it a big array of memory and let it run arithmetic within that array, plus some control flow instructions. Easy to interpret, easy to safely JIT compile.
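A toy sketch of what "a big array plus a few opcodes" means in practice (the opcode set and encoding here are made up, not any real codec VM):

    #include <stddef.h>
    #include <stdint.h>

    enum { OP_HALT, OP_LOAD, OP_STORE, OP_ADD, OP_JNZ };

    typedef struct { uint8_t *mem; size_t len; } Sandbox;

    /* Every access into the sandbox memory is bounds-checked; the guest
       sees nothing but this one array. Returns 0 on halt, -1 on violation. */
    static int run(Sandbox *s, const uint8_t *prog, size_t n)
    {
        size_t pc = 0;
        uint8_t acc = 0;
        while (pc + 3 <= n) {                       /* opcode + 16-bit operand */
            uint8_t op = prog[pc];
            size_t  a  = ((size_t)prog[pc + 1] << 8) | prog[pc + 2];
            switch (op) {
            case OP_HALT:  return 0;
            case OP_LOAD:  if (a >= s->len) return -1; acc = s->mem[a];  break;
            case OP_STORE: if (a >= s->len) return -1; s->mem[a] = acc;  break;
            case OP_ADD:   if (a >= s->len) return -1; acc += s->mem[a]; break;
            case OP_JNZ:   if (acc) { pc = a; continue; }                break;
            default:       return -1;               /* unknown opcode */
            }
            pc += 3;
        }
        return 0;
    }

The host copies the encoded bytes into mem, runs the program, and copies the decoded pixels back out; there is no other interface for hostile input to reach.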


I suggest you try building it and see ‘just’ how easy it is.


I have made secure emulators for simple CPUs before. Seriously, you only need a handful of opcodes and they only need to operate on a big array. It's hard to do wrong!

The part of sandboxing that's hard is dealing with I/O, or giving useful tools to the sandboxed code, or implementing data structures for the sandboxed code. You don't need any of that for a multimedia decoder. You just let it manipulate its big block of bytes, and make sure you bounds check.

A Java VM exposes tens of thousands of functions to the code inside it. A barebones sandbox exposes zero. It just waits for the HLT opcode.

And when it gives you raw RGB data, or raw PCM data, there's no way to hide a triggerable malicious payload inside. If the code does something bad, the worst it can do is show you the wrong image.


Yes, and you can also download a million projects like this; the sandbox exists.

But my suggestion would be that you build a video codec out of it. Preferably one that has the properties the market demands: performance and energy efficiency.


Not a codec, the only thing customized for this would be the container format.

You'd use existing codecs, and the way you get good performance and energy efficiency on a video codec is by having a hardware implementation. Software decoding doesn't even come into that picture.

As far as practical software decoding outside of battery-powered video, can I just point at webassembly? Especially the upcoming version with vector instructions. You could use normal webassembly, or even an extra-restricted version. It gets pretty good performance, and when you remove its ability to talk to the outside world it goes from pretty good to extremely good security.


The container format would be decoded by a sandboxed codec that can be found by decoding the container?

WebAssembly codecs indeed exist, and they are impractical due to a lack of performance.


> The container format would be decoded by a sandboxed codec that can be found by decoding the container?

The container parser would not be dynamically downloaded, and may or may not be sandboxed.

We don't need a new container with almost every codec. We just need the new codec itself.

> WebAssembly codecs indeed exist, and they are impractical due to a lack of performance.

Mostly because they don't have vector instructions yet, I bet. But plenty of webassembly is within 50% of native, which is good enough for lots of things, which includes image decoding for sure.


So now container decoders have been magically vetted and secured so they don’t need the sandbox. Which is quite surprising considering most vulnerabilities in streams are in the container decoders and their multitude of hardly used features, but okay.

The challenge remains for you to actually provide the codec you describe. Which a few comments ago was trivial because it was a hardware codec anyway; now it’s just a bit of WebAssembly away. Well, that should be trivial because cross compilers to WebAssembly exist. So why don’t you just provide a few real world examples? You’re probably not the first to think of these ideas; there has to be a reason why it hasn’t been done yet.


> So now container decoders have been magically vetted and secured so they don’t need the sandbox.

Not "magically". But you only need one or two, and they don't need to be very fast, so you can put a lot of effort into making them secure.

But more importantly, browsers already have many container decoders. This is not an expansion in attack surface. The goal here is allowing a lot more codecs compared to current browsers without a significant increase in attack surface compared to current browsers. Pointing out flaws that already exist doesn't disqualify the idea.

> So why don’t you just provide a few real world examples? You’re probably not the first to think of these ideas; there has to be a reason why it hasn’t been done yet.

Image decoders in webassembly already exist. Did you even look? Including JXL!

Video decoding needs more support structure in the browser, and I already said some decoders need things that are being added to webassembly but aren't done yet. Even then, the first google result for "av1 webassembly" is a working decoder from five years ago.


This is what PrintNightmare did?

You no longer need "printer drivers"; they're supposed to be automatically downloaded, installed, and run in a sandbox. You never need any "new drivers". The system will support any printer.

Except the "sandbox" was pretty weak and full of holes.


> Except the "sandbox" was pretty weak and full of holes.

Nothing prevents you from installing only the trusted ones.

Second, software is getting so complicated that if we don't build secure sandboxes anyway then at some point people will be bitten by a supply chain attack.


You can’t cache things online anymore as that is a way to track people (generate a unique “decoder” per user and check if they have it cached or not)


The solution is to cache things per user, not system wide. Or per "container" like in Firefox containers.


Because that is complex, unnecessary, and dangerous?


See also: https://dennisforbes.ca/articles/jpegxl_just_won_the_image_w...

It loads JXL if your client supports it.

Recent builds of Chrome and Edge now support and display JXL on iOS 17. They have to use the Safari engine underneath, but previously they suppressed JXL, or maybe the shared engine did.


Afaik WebKit added support in iOS 17 so it’s just a transitive win


There's only Safari on iOS. EVERY browser on iOS is just a skin on top of Safari's WebKit.

See 2.5.6 here - https://developer.apple.com/app-store/review/guidelines/


Literally specifically said that. Yet Edge and Chrome on iOS actively suppressed JXL in prior builds (yes they can do that — actually calling it a skin is technically wrong as while the engine is Safari, they have a lot of flexibility), and very recently exposed it.


Wasn’t this supposed to change with iOS 17?


Apple is dropping the requirement for all browsers to use WebKit, but that doesn’t mean Mozilla/Google/Brave have started releasing them, especially with iOS 17 still being a beta.

https://9to5mac.com/2023/02/07/new-iphone-browsers/


The problem with trying to replace JPEG is that for most people it's "good enough". We already had "JPEG 2000", which would have been a step up in terms of performance, but it never saw any real adoption. Meanwhile, "JPEG XL" is at best an incremental improvement over "JPEG 2000" from the user's POV, which raises the question of why people would care about this one if they didn't care about the previous one.


The big reason is that JPEG XL is a seamless migration/lossless conversion from JPEG.

You get better compression and services can deliver multiple resolutions/qualities from the same stored image (reducing storage or compute costs), all transparent to the user.

So your average user will not care but your cloud and web service companies will. They are going to want to adopt this tech once there's widespread support so they can reduce operating costs.


On top of not being backward compatible, JPEG 2000 was significantly slower and required more RAM to decode, which at the time it was released was a much bigger deal than it is today. And for all of its technical improvements in some domains (transparency, large images without tiling, multiple color spaces), it was not substantially better at compressing images with high-contrast edges and high-frequency texture regions at low bitrates, because it just replaced JPEG's block artifacts with its own substantial smoothing and ringing artifacts.


JPEG 2000 had very bad implementations for a long time.

Second Life went with JPEG2000 for textures, and when they open sourced the client, they had to switch to an open source library that was dog slow. Going into a new area pretty much froze the client for several minutes until the textures finally got decoded.


JPEG2000 ran into a patent license trap. JPEG XL is explicitly royalty free.


But that doesn't address the point that JPEG XL is only marginally better and has a gigantic mountain to climb if it ever hopes to displace JPEG (and likely never will given the vast set of JPEG files that exist and will never be converted).


JPEG XL can losslessly transcode JPEG into a smaller format. JPEG 2000 (or WebP, or anything but Lepton[0]) didn't offer that. Besides, we have GIF and PNG for approximately the same space; GIF still isn't gone. Displacement isn't necessary for a new format to become useful.

[0] https://github.com/dropbox/lepton


Even if JPEG-XL could be considered only 'marginally' better than JPEG, and as good or even worse than AVIF/WebP in some specific contexts, it is unique in the fact that it can also do lossless, HDR, extremely high resolutions, complex color channels, generation loss protection, multiple layers, advanced authoring features, etc. It's not a format only meant for the web but for plenty of other use cases, such as science, medicine, art printing, etc.

And not only that, it's reasonably fast to encode on consumer hardware.

JPEG XL has the ambition to supplant all the image formats of the next 20+ years.


> JPEG XL is only marginally better

You will have to define what is marginally better. WebP is definitely only marginally better than JPEG. And JPEG XL easily gives a 30-40% BD-rate saving at the same quality at BPP 0.8 or above.


It's actually pretty remarkable how successful and long-lasting JPEG has been when you think about it. Relatively simple, elegant compression, but still quite sufficient.

(Having said that I do wish for JPEG XL to become a true successor)


JPEG was, and remains, "alien technology from the future" (Tim Terriberry)


The majority of users wouldn't even notice (save for possibly faster page loads). JPEG XL has mostly only benefits: it's backward compatible, can be converted losslessly to and from JPEG, has better compression and thus smaller sizes and less data to transfer/store, and it has nice licensing. JPEG2000 had nothing of that...


> JPEG2000 had nothing of that...

That’s not true. JPEG 2000 had substantially smaller file sizes and better progressive decoding – something like responsive images could have just been an attribute telling your browser how many bytes to request for a given resolution. It also had numerous technical benefits for certain types of images - one codec could handle bitonal images more efficiently than GIF, compress lossless images better than TIFF, effortlessly handle colorspaces and bit depths we’re just starting to use on the web, etc.

What doomed it was clumsy attempts to extract as much license revenue as possible. The companies behind it assumed adoption was inevitable, so everything was expensive - pay thousands for the spec, commercial codecs charged pretty high rates, etc. - and everyone was so busy faffing around with that that they forgot to work on things like interoperability or performance until the 2010s. Faced with paying money to deal with that, most people didn’t, and the market moved on with only a few exceptions like certain medical imaging or archival image applications. In the early 2000s the cost of storage and disk/network bandwidth meant you could maybe try to see the numbers as plausibly break-even, but over time that faded while the hassle of dealing with the format did not.


There is also JPEG-XR! Life is confusing


JPEG XR started off as a Microsoft format about 20 years ago, but no one trusted them to not sue if reimplemented. Microsoft supported XR, but almost nothing else did.


They gave up on it too. Internet Explorer was the last browser to support JPEG XR.


Image files generated by Zeiss microscopes use JPEG XR.


Yes, I wonder if they got a good deal from MS or why that is. Most of the other vendors went with JPEG 2000 (Aperio, Olympus, even Motic).


Yes, if it were just about compression ratio nobody would bother, but that's not its only feature.


Even if it could only encode JPG into a smaller format losslessly it would still be a major selling point that basically everyone in at least the e-commerce world would want. Think about how many hundreds of terabytes of JPG files eBay and Alibaba send per day, and cut that size in half for no quality loss.

Amazon alone sold over 375 million items this last Prime Day. Let’s say that was 200 million items loaded/day (ignoring the unpublished number of failed sales), with 9 images (the maximum from a cursory glance) at 2000x2000 for a 1:1 ratio and zoomability. For a 90% quality JPG at 24-bit color that’s 410 KB per image, so (410 KB × 9) × 200,000,000 ≈ 738 TB. Now imagine cutting that in half with no perceptible difference except faster loading for the end-user.

For end users the other options may be more desirable, but I would argue the importance is in the compression itself.


great writeup. i wish it had started with the intro of "wtf is JPEG XL" for those of us not as close to it. but the ending somewhat approximates it. i'm still left not knowing when to use webp, avif, or jxl, and mostly know that they are difficult files to work with because most websites' image file uploaders etc don't support them anyway, so i end up having to open up the file and take a screenshot of them to convert them to jpeg for upload.

so do we think Chrome will reverse their decision to drop support?


> so do we think Chrome will reverse their decision to drop support?

The argument was that there's no industry support (apparently this means: beyond words in an issue tracker); let's see how acceptance goes with Safari supporting it.

An uptick in JXL use sounds like a good-enough reason to re-add JXL support, this time not behind an experimental flag. Maybe Firefox even decides to provide it without a flag and in their regular user build.


We all know what the real argument is: NIH.


One of the three main authors of JXL works for Google. But in the Google Research Zurich office, so he might as well not exist to the Chrome team, I guess.


Last time I looked, all the top committers were from Google.


Ah yes, NIH from a contributor to the spec, makes complete sense.


I've seen people rage-quit because their work was standardized but with modifications they themselves didn't approve of.


[flagged]


This is definitely not the case, the team behind Pik is eager to get JPEG XL deployed, but Chrome is blocking it.


I'm not sure what we are misunderstanding about each other. I'm saying Google could very well be blocking its deployment in Chrome, if they are that petty and NIH, because it is not the implementation they originally came forward with.


The team that wrote the original Google proposal (PIK) joined forces with the team that wrote FUIF and together they created JXL, so no, those particular Googlers are not petty about this.

They're a distinct team from the Chrome team though.


While that's a possibility, the article made it seem like Google's PIK team and Cloudinary worked together on JPEG XL.


Same with AV1. It's not verbatim VP10, but incorporates concepts developed by Xiph.Org and Cisco (and whoever else).


What I don’t understand is: why still push for JPEG XL when WebP already has a lot of support and AVIF has a lot of momentum?


JPEG XL and AVIF have tradeoffs.

AVIF works extremely well at compressing images down to very small sizes with minimal losses in quality, but loses comparatively to JPEG XL when it comes to compression at higher quality. Also, I believe AVIF has an upper limit on canvas sizes (2^16 by 2^16 pixels, I think), whereas JPEG XL doesn't have that limitation.

Also, existing JPEGs can be losslessly migrated to JPEG XL, which is preferable to a lossy conversion to AVIF.

So it's preferable to have JPEG XL, WebP, and AVIF.

- WebP fills the PNG role while providing better lossless compression.

- AVIF fills the JPEG role for most of your standard web content.

- JPEG XL migrates old JPEG content to get most of the benefits of JPEG XL or AVIF without lossy conversion.

- JPEG XL fills your very-high-fidelity image role (currently filled by very large JPEGs or uncompressed TIFFs) while providing very good lossless and lossy compression options.


Possibly an underrated but potentially very useful unique feature of JXL is that it completely eliminates the need to use a third party thumbnail/image-scaling rendering site or workflow. If you need a full size JXL image rendered down to 25% size for one of your web views, you literally just truncate the bitstream at 1/4 the total (or whatever percentage of the total number of pixels of the full-size image you need, that's a trivial math calculation) and send just that.

That's tremendously simpler, both from an architectural and maintenance standpoint (for any site that deals with images), than what you would usually have to do, such as relying on either a third party host (and added cost, latency (without caching), and potential downtime/outage) or pushing it through the (very terrible and memory/cpu-wasteful codebase at this point) ImageMagick/GraphicsMagick library (and potentially managing that conversion as a background job which incurs additional maintenance overhead), or getting VIPS to actually successfully build in your CI/CD workflow (an issue I struggled with in the past while trying to get away from "ImageTragick").

You get to chuck ALL of that and simply hold onto the originals in your choice of stateful store (S3, DB, etc.), possibly caching it locally to the webserver, and just... compute the number of pixels you need given the requested dimensions (which is basically just: ((requested x)*(requested y))/((full-size x)*(full-size y)) percentage of the total binary size, capping at 100%), and bam, truncate.

Having built out multiple image-scaling (and caching, and sometimes third-party-hosted) workflows at this point, this is a very attractive feature, speaking as a developer.
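A rough sketch of the truncation arithmetic being described (the helper name is mine, and it assumes a progressively encoded codestream; a real server would presumably also want to cut on a progressive-pass boundary rather than at an arbitrary byte):

    #include <stddef.h>

    /* Fraction of the file to serve ~ fraction of the pixels requested. */
    static size_t bytes_to_serve(size_t total_bytes,
                                 unsigned full_w, unsigned full_h,
                                 unsigned want_w, unsigned want_h)
    {
        double frac = ((double)want_w * want_h) / ((double)full_w * full_h);
        if (frac >= 1.0)
            return total_bytes;                 /* cap at 100% */
        size_t n = (size_t)(frac * (double)total_bytes);
        return n > 0 ? n : 1;
    }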


That's just progressive decoding, though, and is only possible if you encoded the image correctly (which is optional). You can also do similar things with progressive jpeg, png, and webp, with jpeg being the most flexible.

The unique part AFAIK is that you can order the data blocks however you want, allowing progressive loading that prioritizes more important or higher detailed areas: https://opensource.googleblog.com/2021/09/using-saliency-in-...

(your thumbnails may or may not look terrible this way, as well. really better suited for progressive loading)


The thing with JPEG XL, though, is that its design is inherently progressive. Even when there is no reordering you will get an 8x downsampled image before everything else (and the format itself exploits the heck out of this fact for better compression).


Apart from limited resolution, probably the biggest problem with AVIF is that it doesn't support progressive decoding, which could effectively cancel out its smaller file size for many web applications. AVIF only shows once it is 100% finished. See:

https://www.youtube.com/watch?v=UphN1_7nP8U

This comparison video is admittedly a little unfair though, because AVIF would have easily 30% lower file size than JPEG XL on ordinary images with medium quality.


Hehe, I see we have been down the same route. Sad to say, but ImageMagick is awful at resource usage. VIPS can do 100x better in many specific cases, but is a little brittle. I do not find it that incredibly difficult to build, though.


This is fascinating, I had never heard of this aspect of JXL.


Or JPEG XL can take over all of it.

- JPEG XL can do lossless compression better than PNG, if I’m right.

- At low bit rates, JPEG XL isn’t that far from AVIF quality. You will only use those rates for less important stuff like “decorations” and previews anyway, so we can be less picky about the quality.

- For the main content, you will want high bit rates which is where JPEG XL excels.

- Legacy JPEG can be converted to JPEG XL for space savings at no quality loss.


Thank you both. Couldn't have said it better.

The use cases for WebP are limited, the actual advantage over a decent JPEG isn't that big, and unless you use a lot of lossless PNGs I would argue it should never have been pushed as the replacement for JPEG. To this day I still don't know why people are happy about WebP.

According to Google Chrome, 80% of images transferred have a BPP of 1.0 or above. The so-called "low bit rate" range is below BPP 0.5. The current JPEG XL encoder is still not optimised for low bitrates, and judging from the author's tweet I don't think they intend to do that any time soon. And I can understand why.


AVIF is even more limited in resolution than that, just 8.9 megapixels in baseline profile or 35 megapixels in the advanced profile.

If you have image-heavy workflows and care about storage and/or bandwidth then JPEG-XL pairs great with AVIF: JPEG-XL is great for originals and detail views due to its great performance at high quality settings and high resolution support, meanwhile AVIF excels at thumbnails where resolution doesn't matter and you need good performance at low quality settings.


JPEG XL Lossless: about 35% smaller than PNG (50% smaller for HDR). Source: https://jpegxl.info/ So with JPEG XL WebP may not serve any real purpose anymore.


My memory is hazy, but doesn’t JXL have better color or color profile support?


You can scroll down (on mobile) to see an overview image comparing technical features on https://jpegxl.info/. It doesn't mention color profiles (although I presume that just means they're all equal there), but jxl does support higher max bit depth per channel (32 vs 10 for AVIF) and more channels (4099 vs 10). So for raw sensor data, and intermediate formats for image processing, where information loss should be avoided, it should be a lot better.

I'm hoping it gets adopted as a better underlying technology for various RAW formats, and hopefully a better successor to the DNG format while we're at it (currently these are TIFF based). I'm not even a professional photographer, and my hard drive is still mostly occupied by RAW files.


Yeah, the points you mention are what I remember photographers really digging about JXL. Also, higher bit depth is a big deal for some pro photographers.


I actually studied photography (technically contemporary art, but photography was my main medium) but chose to not pursue a career in it. You are correct, bit depth matters. It is unlikely 32 bits will ever be needed for RAW files though.

Specifically, it matters for source files and intermediate files.

With RAW files from the camera, the higher the bit depth of the analog-to-digital conversion (ADC) step, the less posterization this introduces on the signal. Theoretically at least, you're still limited by the sensor's dynamic range, and there are other subtleties involved, like light perception being logarithmic instead of linear, but RAW encodings being linear[0][1]. But in simple terms: paired with a sensor with high dynamic range and good ADC, a higher bit depth results in less noise and higher dynamic range. Which allows one to recover more fine detail from shadows and highlights. Which makes the camera more forgiving in normally difficult lighting scenes (low light and/or high contrast). So a higher bit depth can aid in giving photographers creative freedom when shooting, and more flexibility in editing their photos without loss of fidelity.

So yes, it is an important cog in the machine that is the whole processing pipeline.

Having said that, as I mentioned our eyes perceive light logarithmically. The dynamic range of the human eye is... complicated to determine, because it adjusts so quickly. At night it may go up to 20 stops, during the day 14 stops is likely to be the typical range[2]. So it's probably not a coincidence that digital cameras have "stalled" at using 14 bits for their RAW files, typically: the photographer likely wouldn't be able to see more contrast in the lights and shadows before taking a photo anyway!
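To put a number on that (a back-of-envelope figure, assuming a linear encoding and a noise floor of roughly one least-significant bit):

    dynamic range ≈ log2(2^N / 1) = N stops

so a 14-bit linear ADC tops out at about 14 stops, which lines up with the daytime figure above.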

[0] https://www.dpreview.com/articles/4653441881/bit-depth-is-ab...

[1] No I don't understand why floating point ADCs aren't used either, seems like it would be a more sensible approach to me and they do exist: https://ieeexplore.ieee.org/abstract/document/776106

[2] https://clarkvision.com/imagedetail/eye-resolution.html


According to the article, WebP requires more CPU to decode. JPEG XL also supports lossless transcoding from JPEG, so it could be used for old image sets with no loss in image fidelity.

There are arguments for the new format, but the Chrome people seemed unwilling to maintain support for it when pick-up was non-existent (Firefox could have moved it out of their purgatory. Safari could have implemented it earlier. Edge could have enabled it by default. Sites could use polyfills to demonstrate that they want the desirable properties. And so on.)

To me, the situation was one of "If Chrome enables it, people will whine how Chrome forces file formats onto everybody, making the web platform harder to reimplement, a clear signal of domination. If they don't enable it, people will whine how Chrome doesn't push the format, a clear signal of domination", and they chose to use the variant of the lose-lose scenario that means less work down the road.


> There are arguments for the new format, but the Chrome people seemed unwilling to maintain support for it when pick-up was non-existent

Of course there is no pick-up when Chrome, with its massive market share, doesn't support it. Demanding pick-up before support makes no sense for an entity with such a large dominance.


- Polyfills (there _is_ polyfill-enabling code - maintained by Chrome devs.)

- Microsoft enabling the flag in Edge by default and telling people that websites can be 30% smaller/faster in Edge, automatically adding JXL conversion in their web frameworks

- Apple doing the same with Safari (what they're _now_ doing)

- Mozilla doing the same with Firefox (instead of hiding that feature in a developer-only build behind a flag)

None of that happened so far, only the mixed signal of "lead and we'll follow" and "you are too powerful, stop dominating us." in some issue tracker _after_ the code has been removed.


Why are you talking about Microsoft, Apple, and Mozilla, when Chrome has a larger market share than all of them?

> "you are too powerful, stop dominating us."

That's twisting things. The problem was that the argument of the Chrome team against JPEG XL was self-refuting. They were themselves the main cause of what they complained about.


Because Microsoft, Apple and Mozilla can still exert pressure: "Support this feature we enabled and benefit from 20% less traffic with users of our browsers" and "Use Edge/Safari/Firefox to browse the web faster (and with metered connections: cheaper)" still has an effect on Chrome's decision making.

Chrome had that code, hidden behind a flag. There wasn't any kind of activity. No questions "when will you put it in by default in Chrome?". No other Blink-based browser (Edge, Brave, Vivaldi, Opera) that could easily pick up the support by enabling that damn flag by default did so. Firefox hid JXL support even better than Chrome. No image sharing site that did the math and considered "200KB for a polyfill saves us and our users megabytes in traffic on each visit" and acted on that.

That doesn't look like anybody is interested in JXL support.

I'm bringing this up again and again because I dislike that notion of "Chrome is the market leader and we're powerless to do anything about it. Bad Google." It neither encourages the Chrome folks to do better nor anybody else to pick up the slack. It's 100% complaint, no matter what Chrome does.


That's why I hate Chrome's monopoly.


> so do we think Chrome will reverse their decision to drop support?

Nope.

Microsoft could probably push Google over the Edge. They have a lot of influence over Chrome with Edge/Windows defaults, business apps and such.


If they don't, they're going to look pretty stupid… Chrome leadership stated "we'll only support JXL if Safari do" to at least one large tech company that was unhappy with JXL being dropped by Chrome.

(and no I can’t tell you how I know this)


Microsoft also weirdly go out of their way to strip AVIF support from Edge


I noticed that.

It feels disturbingly tribal.


The only way to strip Chrome of its monopoly power is to make its decision-making stop mattering: switch all the images on your websites to JXL, let Chrome provide a bad experience, and then it's up to them whether they fix it.


Sadly, unless many webmasters do it, it'll likely feel like "your site is broken" instead of "Chrome is broken" to users.

Maybe with a banner like "You are using Chrome. You might have a degraded experience due to the lack of support for better image formats. Consider Firefox".


Provide alternative formats, making the browser autoselect. Or provide a JXL polyfill.

In either case: Have Chrome telemetry report home that "user could have 20% faster page load with JXL support".
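For what it's worth, here's a rough sketch of the detection side of such a polyfill, assuming a tiny probe file at /img/probe.jxl and a data-fallback attribute on the images (both made up for illustration, not anything shipping today):

    // Probe native JXL support by trying to decode a tiny known-good .jxl
    // served from our own origin (the path is hypothetical).
    async function supportsJxl(): Promise<boolean> {
      try {
        const blob = await fetch('/img/probe.jxl').then(r => r.blob());
        await createImageBitmap(blob); // rejects if the codec is unsupported
        return true;
      } catch {
        return false;
      }
    }

    // If JXL is unsupported, swap every <img data-fallback="..."> to its
    // fallback source (or hand the bytes to a WASM decoder -- not shown here).
    async function applyImageFallback(): Promise<void> {
      if (await supportsJxl()) return;
      document.querySelectorAll<HTMLImageElement>('img[data-fallback]').forEach(img => {
        img.src = img.dataset.fallback!; // e.g. the same image as .webp or .jpg
      });
    }

    applyImageFallback();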


JXL polyfill is definitely the way to go to force Chrome's hand.


It’s not going to force Google’s hand. Using a polyfill will slow page loads and add additional fragility.

Nothing changes for Chrome users, especially on sites using the <picture> element, where the first supported image format is used.


It will slow page loads for Chrome and other browsers not supporting it, but it will be blazing fast on Safari. You keep the file size savings and high image quality.

I’m sure YouTubers and tech sites will love to do Safari vs Chrome (and Co.) content to spread the message that Chrome is inferior.


Feels a little like a déjà vu from an older browser by a big company that had a monopoly.. IMO, the sad truth is that JXL simply won't be used, because it's not worth losing customers in exchange for a few kilobytes saved. Google has won: it has a monopoly on search, advertising, and the browser, and it de facto decides all standards.


Chrome did not settle things with its decision to not use JXL at this time.

(WASM) Polyfill and we're done.


> Feels a little like a déjà vu from an older browser by a big company that had a monopoly..

Any resemblance with previous events would be totally unintentional, of course :-)


This gives me a pang of nostalgia from back in the day when JPEG was new and you had to have an external application to see .jpg files until the browsers started adopting the standard. Then you had to decide if there were enough .jpg images on the sites you liked to warrant changing browsers!


I feel like JPEG XL's problem is branding. The name suggests it's like JPEG, but the file size will be bigger which isn't something I want.


Don't think that's the problem, but agree with what the name immediately suggests. It wouldn't have been very hard to come up with a name that implies "these files are better" instead of "these files are extra large."


XL is pretty well understood to mean extra large, no?


Yeah, that's the only thing I (and I would bet 99.999% of everyone else) have ever understood it to mean.


HF, like a Lancia.


Should've been JPEG SX.


Sounds like a budget car sold in SE Asia / Latin America in the base trim level.

"Can't decide between a Nissan March, Jpeg SX, or the Honda Jazz EX"


JPEG DX2/66?


I am not sure I believe the results from models like SSIMULACRA.

It might be that I am not encoding properly, but when I did trials with a small number of photos, with the goal of compressing pictures I took with my Sony α7ii at high quality, I came to the conclusion that WEBP was consistently better than JPEG but AVIF was not better than WEBP. I did think AVIF came out ahead at lower qualities, as you might use for a hero image on a blog.

Lately I've been thinking about publishing wide color gamut images to the web. This started out with my discovery that a (roughly) Adobe RGB monitor adds red when you ask for an sRGB green, because the sRGB green is yellower than the Adobe RGB green, and this is disastrous if you are making red-cyan stereograms.

Once I got this phenomenon under control, I got interested in publishing my flat photos in wide color gamut. I usually process in ProPhoto RGB, so the first part is straightforward. A lot of mobile devices are close to Display P3, and many TV sets and newer monitors approach Rec 2020, but I don't think they cover it that well, except for a crazy expensive monitor from Dolby.

Color space diagram here: https://en.wikipedia.org/wiki/Rec._2020#/media/File:CIE1931x...

Adobe RGB and Display P3 aren't much bigger than the sRGB space, so they still work OK with 8-bit color channels, but if you want to work in ProPhoto RGB or Rec 2020 you really need more bits. My mastering is done in 16 bits, but to publish, people usually use 10-bit or 12-bit formats, which has re-awakened my interest in AVIF and JPEG XL.

I'm not so sure it is worth it, though, because the space of colors that appear in natural scenes is only a bit bigger than sRGB

https://tftcentral.co.uk/articles/pointers_gamut

but much smaller than the space of colors that you could perceive in theory (like the green of a green laser pointer). Adobe RGB definitely covers the colors you can print with a CMYK process well, but people aren't screaming out for extreme colors, although I expect to increasingly be able to deliver them. So on one hand I am thinking of how to use those colors in a meaningful way, and on the other of the risk of screwing up my images with glitchy software.


Display P3, which is what most good but still cheap monitors support, is very noticeably bigger than sRGB, i.e. the red of Display P3 looks reasonably pure, while the red of sRGB is unacceptably washed out and yellowish.

Adobe RGB was conceived for printing better images and it is not useful on monitors because it does not correct the main defect of sRGB, which is the red.

Moreover, if I switch my Dell Display P3 monitor (U2720Q) from 30-bit color to 24-bit color, it becomes obviously worse.

So, at least in my experience, 10-bit per color component is always necessary for Display P3 in order to benefit from its improvements, and on monitors there is a very visible difference between Display P3 (or DCI P3) and sRGB.

There are a lot of red objects that you can see every day and which have a more saturated red than what can be reproduced by an sRGB monitor, e.g. clothes, flowers or even blood.

For distributing images or movies, I agree that the Rec. 2020 color space is the right choice, even if only a few people have laser projectors that can reproduce the entire Rec. 2020 color space.

The few with appropriate devices can reproduce the images as distributed, while for the others it is very simple to convert the color space. This is unlike the case when images are distributed in an obsolete color space like sRGB, or even Adobe RGB, where all those with better displays are forced to view an image of inferior quality.


Personally, I think these days you should ideally be able to just publish in Rec 2020 and let devices convert that to their native color space. I'd consider Adobe RGB a purely legacy thing that doesn't really have relevance these days. Display P3 makes sense if you are living in and targeting exclusively the Apple ecosystem, but not much otherwise. ProPhoto is good in itself, but I don't know if it really makes sense to have a separate processing (RGB) color space anymore when Rec 2020 is already so wide. Of course, if you have a working ProPhoto workflow, then I suppose it doesn't make sense to change it.


I agree with you, except that Display P3 is not exclusive to Apple.

A lot of monitors from most vendors support Display P3, even if it is usually, and slightly erroneously, called DCI P3.

Display P3 differs from the original DCI P3 specification by having the same white color and the same gamma as sRGB, which is convenient for the manufacturers because all such monitors can be switched between the sRGB mode (which is normally the default mode) and the Display P3 mode.

Nonetheless, even if today most people that have something better than junk sRGB displays have Display P3 monitors (many even without knowing this, because they have not attempted to change the default sRGB color space of their monitors), images or movies should be distributed as you say, using the Rec. 2020 color space, so that those with the best displays shall be able to see the best available quality of the image, while the others will be able to see an image with a quality as good as allowed by their displays.


I don't think it's fair to equate colors in natural scenes with the space of colors you find with diffuse reflection. There are tons of things (fireworks, light shows, the sky, your 1337 RGB LED setup, fluorescent art, etc.) people may want to take photos of that include emission, scattering, specular reflection, etc.

In practice that larger space of things you could perceive "in theory" is full of everyday phenomena, and very brilliant colors and HDR scenes (e.g. fireworks against a dark sky) tend to be something people particularly enjoy looking at/taking pictures of.


Fireworks look like a good subject, but specular reflections give me the creeps (e.g. they blow out the sensor of my camera and/or aren't well reproduced).


> I came to the conclusion that WEBP was consistently better than JPEG

This surprises me greatly if you're talking about image quality. I've always found WebP to be consistently worse than JPEG in quality.

I only use WebP for lossless images, because at least then being smaller than PNG is an advantage.


Eh... The Apple ecosystem is relatively isolated.

They adopted HEIF, and have not adopted AV1 video.


They also adopted HEIC, which is actually quite a dangerous thing for the open web to have supported by a browser, given how heavily patented the standard is.


They adopted it at the OS level years ago (when iPhones started saving it as an option), but I think it's only been in iOS 17 Safari this year (i.e. in beta) that Safari itself has started supporting it...

(At least Safari 16.5.2 on Ventura 13.4.1 won't open .heif / .heic files for me).


Yeah, that's what I was thinking of; I forgot the distinction (and the other HEIF variants).


> Eh... The Apple ecosystem is relatively isolated.

Sure, Apple shipped the first consumer computer that supported Display P3 in 2015 [1].

And while there are several other vendors including Google with devices that support Display P3, Apple’s 2 billion devices is not nothin’.

[1]: https://en.m.wikipedia.org/wiki/DCI-P3#History


I wish they would include the BPG format from Bellard, even though I don't know if that format is free of any encumbrances: https://bellard.org/bpg/

Note that JPEG XL is different from JPEG 2000 and JPEG XR.


BPG is based on HEVC which has patent/licensing baggage.


In a previous test by Cloudinary, BPG was surprisingly better than AV1 and JPEG XL in some/many categories, which leads me to believe VVC in BPG would have done even better.


I can't find what you're referring to, but I would be interested if you could share it.


I truly think JPEG XL would have done better with a better name.


It's a pretty stylish and parsimonious file extension, and even though long file extensions are well-supported (and they are), I think *.jpegxl should also be recognized. I have a feeling it will be de facto recognized eventually, if the format itself gets traction.


I like JXL better than Jpeg XL - if it is brought to the masses I can imagine "jexel" to replace "jaypeg"


The extension for the file format is .jxl and is pronounced "jixel" :)


It's hilarious to me that 25ish years after the death of DOS, we still define dot-three-letter file extensions in new standards.


Right?

Well, at least in the tiny part of the IT world I get to control, I always try to validate based on both the three-letter extension and any common or sensible expansion of it. So ".jpg" or ".jpeg", ".jxl" or ".jpegxl", etc. (And in most cases, I actually try to parse the binary itself, because you can't trust the extension much anyway.)
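For the parse-the-binary part, a rough sketch (TypeScript; the JXL signatures here are from memory, so double-check them against the spec before relying on this):

    // Sniff a few well-known image signatures instead of trusting the extension.
    function sniffImageType(bytes: Uint8Array): string | null {
      const startsWith = (sig: number[]) => sig.every((b, i) => bytes[i] === b);

      if (startsWith([0xff, 0xd8, 0xff])) return 'image/jpeg';      // JPEG/JFIF/EXIF
      if (startsWith([0x89, 0x50, 0x4e, 0x47])) return 'image/png'; // PNG
      if (startsWith([0xff, 0x0a])) return 'image/jxl';             // bare JXL codestream
      if (startsWith([0x00, 0x00, 0x00, 0x0c, 0x4a, 0x58, 0x4c, 0x20,
                      0x0d, 0x0a, 0x87, 0x0a])) return 'image/jxl'; // JXL ISO BMFF container
      return null; // unknown -- fall back to checking the extension(s)
    }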


"I thought we were free to choose our filename extensions to our liking?"

"Well, three characters is the bare minimum. If you feel that three characters is enough, then okay. But some people choose to have longer filename extensions, and we encourage that, okay? You do want to express yourself, don't you?"


Especially funny since JPEG images commonly have .jpeg as an extension!


If it ain't broke...


Ah, following in the footsteps of .png (which is pronounced "ping", according to the standard).


I actually think it referencing JPEG is smart as it's immediately recognizable even by regular users as "image" and positions the format as a successor.


JPEGX or JPEG2 would be more marketable than JPEG XL.


TBH I just don't see a difference.


Maybe a dumb question: if JPEG XL beats AVIF, and both are royalty-free, shouldn't the AV group create a new video format based on AV1 that uses JPEG XL for I-frames?

I mean, it feels like the same static image codec should be used in whatever free standard is being pushed for both video I-frames and images, since the problem is basically the same.


IIRC, JPEG XL beats AVIF on high-quality images, and AVIF is better at low quality. For typical video encoding, you don't care about perfection that much.


Specifically, the biggest users of video codecs are video streaming websites/services where bitrate, not quality, is king. Logically, their codecs of choice are optimized toward the low bitrate, "sorta acceptable, but I thought I bought a 4K TV?" quality corner.


Yeah, while in the case of images, the quality requirements are usually way higher.


Oh man. We’re still dealing with the .heic debacle, where you can’t use photos from your iPhone with many applications (like Gmail) unless you manually convert them to .jpg.

So crazy to me that Apple and Google fight over image formats like this.

I guess this is just the next round.


Can’t you just set your iPhone to use the most compatible format (forcing it to use JPEG/h.264) for its camera?


A short explanation of what JPEG XL is or does at the beginning of the article would have been nice. Saying:

"""Google PIK + Cloudinary FUIF = JPEG XL"""

before saying what it is was a little off-putting.


It's a section header in an article written as a story. It's normal for those to not be understood until you read the section. And the explanation begins with the first sentence of that section. I don't think this is a reasonable complaint.


But they didn't even explain what FUIF or PIK might be in that section or even the entire article!

Understanding that article required me to search for FUIF [1], PIK [2], and a brief explanation of what JPEG XL is trying to achieve.

I double down on my "complaint" - I'd call it constructive criticism - that the article was poorly written. It's actually quite a good story that their Free Universal Image Format (FUIF) has achieved what it has. That's a great acronym, especially for a world that thinks JPEG XL is a good acronym! Why not put it in the article?

To save anyone else time:

[1] https://github.com/cloudinary/fuif [2] https://github.com/google/pik


That's a fair criticism, though personally I feel like "it was a candidate to be the next JPEG" is enough information here, even if it's a bit barebones. There are several levels of detail you could go into about how the codec works, and this article decided to stay quite light.


I feel like Apple came to support JPEG XL too late; it will never take over like JPEG did, because Google dropped support for it in Chrome in favor of its own WebP and AVIF.


JPEG had a pretty slow start too, though. I remember getting a viewer program on some shareware CD-ROM in the early nineties that would take forever to decode a high-res image of an astronaut, and not understanding what this was useful for.


Oddly, Safari Tech Preview on Ventura advertises support for JXL but the images don't actually render. So the linked page has almost no images, just broken placeholders.


The image formats (at least newer ones) in Safari defer to OS support, so you'll need Sonoma to view JXL in Safari.


Then you'd think STP wouldn't advertise support for JXL. But it does, both to the site itself and in the release notes.


The images shown to me are .avif (Firefox and Chrome on Windows)


The header image I see is indeed an AVIF, but it depends on what your browser sends in the `Accept` header. Chrome sends image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8 but if you drop image/avif from there you get a WebP, and if you also drop image/webp from the header, you finally end up with a JPEG.

However, if you manually request that image with a custom `image/jxl` at the start of the `Accept:` header, you get a JPEG XL result. So GP is correct, but you won't see that behavior except on their PC (errr, Mac) -- unless you use Firefox and enable JPEG XL support in about:config, of course.
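If you want to reproduce that outside the browser dev tools, a quick sketch of the negotiation (TypeScript with the standard fetch API; the URL is a placeholder, not the article's actual image path):

    // Ask for the same image with different Accept headers and see what the
    // server decides to send back.
    const url = 'https://example.com/header-image'; // placeholder URL

    async function negotiatedType(accept: string): Promise<string | null> {
      const res = await fetch(url, { headers: { Accept: accept } });
      return res.headers.get('content-type');
    }

    // Advertise JXL first -- on this CDN it should come back as image/jxl.
    negotiatedType('image/jxl,image/avif,image/webp,image/*,*/*;q=0.8').then(console.log);
    // Drop jxl and avif -- it should fall back to webp, and then to jpeg.
    negotiatedType('image/webp,image/*,*/*;q=0.8').then(console.log);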


The STP request includes jxl.

image/webp,image/avif,image/jxl,image/heic,image/heic-sequence,video/*;q=0.8,image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5


Yeah, I was explaining how you could reproduce those results without being on the Safari preview.


How does mozjpeg compare to libjpeg-turbo? At what quality is JXL faster than mozjpeg/libjpeg-turbo?


> MozJPEG is a patch for libjpeg-turbo. Please send pull requests to libjpeg-turbo if the changes aren't specific to newly-added MozJPEG-only compression code.

https://github.com/mozilla/mozjpeg#mozilla-jpeg-encoder-proj...


Libjxl also includes a better JPEG encoder/decoder called jpegli. It can be used as a drop-in replacement for mozjpeg or libjpeg(-turbo). It gives ~25% more density at the high end and allows for 10+ bits (important for HDR use).


Apples and oranges. JPEG XL is a new codec entirely (though it allows lossless conversion from JPEG).


In general, the legacy hardware codec deployments are more important than what some ambitious software vendors think is "better". The primary inertia of media publishing markets is content that will deliver properly on all platforms with legacy compatibility.

Initially, a new software codec will grind the CPU and battery life like it's on a 20-year-old phone. Then it often becomes pipelined into premium GPUs for fringe users, and is finally mainstreamed by mobile publishers to save quality/bandwidth when the market is viable (i.e. above 80% of users).

If anyone thinks they can shortcut this process, or repeat a lock-down of the market with 1990s licensing models... then it will end badly for the project. There are decades of media content and free codecs keeping the distribution standards firmly anchored in compatibility mode. These popular choices become entrenched as old patents expire on "good-enough" popular formats.

Best of luck, =)


Doesn't seem too relevant for image codecs though, no? Decoding tens of still images on a CPU for a webpage that'll be used for minutes versus tens of delta images in a much more complicated video codec aren't quite comparable.

I don't think we have much deployment of WebP or AVIF hardware decoders yet the formats have widespread use and adoption.


WebP has problems (several unreported CVEs too), and is essentially another TGA disaster under the hood. Anyone smart keeps those little monsters in a box. =)

What I am seeing is that the high-water mark for WebP was 3 years ago... It is already dead as a format, but may find a niche use case for some users.

Consider built-in webcams that have hardware H.264 codecs, as the main CPU just has to stream the data with one core. Better battery life for both the sender and receiver.

Keep in mind a single web page may have hundreds of images, and MJPEG streams are still popular in machine-vision use cases. As most media/GPU hardware is now integrated into most modern browsers, the inertia of "good enough" will likely remain... and become permanent as patents expire. =)


You are thinking of video codecs.

Is hardware AVIF decoding done anywhere? The only example I can think of where this is done is HEIF on iOS devices, maybe.

Some cloud GPUs have JPEG decoding blocks for ingesting tons of images, but that's not really the same thing.


Actually, right now I'm thinking about whether cold pizza is still edible.

MJPEG streams are still popular in machine-vision use cases. And you can be fairly certain your webcam has that codec built in if the device was made in the past 15 years. Note most media/GPU hardware is now integrated into modern browsers. =)


How much better do you think a new codec needs to be to make it all the way to mainstream? 2x? 10x?


Better aesthetics and even an 18% reduction in file sizes refuse to move the sleepy elephant (H.265 is likely still at the fringe stage). Even a trivial codec licensing fee of $2.50 for Pi users was not very successful for many media formats (i.e. 10% of the retail price kills most of the market). However, H.264 support was important enough to wrap into the Pi 4 retail price, and even at 11M pcs a month there is still no stock available.

https://www.youtube.com/watch?v=ygU2bCx2Z3g


I would like to think that integrating reconfigurable logic into chips will help. But I have no idea if the economics make sense. And the ecosystem around managing that pretty much does not exist.


ASICs are very space/energy efficient compared to FPGAs.

The choice a chip maker has is to include popular legacy/free codecs like MP3, or pay some IP holder that won't even pick up a phone unless there is over $3M on the table. H.264 was easy by comparison, but hardly ideal. NVIDIA was a miracle when you consider what they likely had to endure. =)



