Those ideas fail for anyone with a modern screen, which goes far beyond sRGB and its ancient 80-nit reference brightness. I doubt there's a phone, laptop, PC monitor, or TV made with such low limits now.
Ah so that's why so many movies, shows and even videogames got so dark you can barely see a thing, unless you're viewing them on a relatively recent TV?
Sooo... there's a whole story here of interacting forces, technology advancements, etc...
I've recently dipped a toe into this space because I got myself a camera that can shoot 8K HDR "raw" footage with 12-bit color. As I learned to edit and process this into something I can upload to YouTube I got a glimpse into the madness that's going on in the industry.
Basically, digital cameras got really good really quickly, and at the same time OLED monitors and super-bright LCDs with local dimming became available for mere mortals.
The industry meanwhile was stuck in the mindset that there is only one standard, and it is basically "whatever the RGB phosphors of a CRT TV made in the 1980s did". The software, the hardware, the entire pipeline revolved around some very old assumptions that were all violated by advancing technology. The changes were so rapid that mistakes are still common.
Essentially, video editing tools had to "grow up" and deal with color management properly, but there was an awful lot of push-back from both the editors/colorists, and the vendors themselves.
Examples:
- Professional grading monitors have buttons on the side to emulate various color spaces. These buttons generally don't "report back" the active color space to the OS or the grading software. It's possible to complete an entire project and not notice that what you see on your setup is not at all what anyone else will see. ("Oops.")
- Some OLED grading monitors are so good now that in a dark room you won't notice that you've accidentally packed the brightness range into the bottom 10% of the signal range. (This is basically what happened with that episode of Game of Thrones.)
- Both recording devices and color grading software like DaVinci Resolve treat color "artistically" and are basically incapable of the equivalent of "strong typing". Video from the camera is often not tagged with the color space used, which is just crazy to me, but this is "the way things are done"! Similarly, these tools generally don't have a strict mapping into the working space; they allow overrides and it's all very flexible and squishy.
- Colorists are so used to the specifics of old formats that they think in terms of RGB values in the encoded output space. Same as Photoshop users who think of 255,127,0 as a specific color instead of a specific encoding. ("In what space!?") This extends to tooling like Resolve, which shows output-space values in all its controls instead of values in the actual working color space.
- Video cards and monitors "do their own thing". Without closing the loop with a hardware calibrator, there is absolutely no way to know whether what you're actually seeing is what others will see.
- The software has a mind of its own too. It's absurdly difficult to avoid "black crush" especially. It just happens. Forums are full of people basically fiddling with every combination of checkboxes and drop-down options until it "goes away"... on their specific monitor and driver setup. Then it looks too bright on YouTube... but not Netflix. Or whatever.
- Basic assumptions one might have about HDR editing are still mired in the SDR world. E.g.: I would expect a fade-out to be equivalent to turning down exposure of the camera. It isn't! Instead it's the equivalent of blending the output tone mapped image with black, so bright parts of the scene turn grey instead of less bright.
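To make that last bullet concrete, here's a minimal sketch (plain Python, with a simple 2.2 gamma standing in for a real tone map, so the numbers are illustrative rather than anything a specific NLE does) of the two kinds of fade: scaling exposure in linear light versus blending the already-encoded output with black:

```python
# Two ways to "fade to black" a bright pixel, with a plain 2.2 gamma
# standing in for a real tone map / transfer function (an assumption for
# illustration, not what any particular tool does internally).
GAMMA = 2.2

def encode(linear):   # linear light -> display-encoded value (0..1)
    return linear ** (1.0 / GAMMA)

def fade_exposure(linear, fade):
    """Fade like stopping down the lens: scale the light, then encode."""
    return encode(linear * (1.0 - fade))

def fade_output_blend(linear, fade):
    """Fade like most NLEs: blend the already-encoded output with black."""
    return encode(linear) * (1.0 - fade)

bright = 0.9   # a bright part of the scene, in linear light
for fade in (0.0, 0.5, 0.9):
    print(f"fade={fade:.1f}  exposure-style={fade_exposure(bright, fade):.3f}  "
          f"output-blend={fade_output_blend(bright, fade):.3f}")
```

With a real HDR tone map the gap tends to be even larger, because highlights are compressed far more aggressively than a 2.2 gamma, so the output-blend version drags them straight to grey.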
For reference, outside of the video editing space, tools like Adobe Lightroom (for stills photo editing) are much more strict and consistent in the way they work. RAW stills photos are always tagged with the input gamut and color space, are automatically mapped to a very wide gamut that won't "clip" during editing, and all editing controls operate in this ultra-HDR space. It's only the final output step that tone maps into the target HDR or SDR space.
As a random example of this, switching from SDR to HDR mode in Lightroom will just make the highlights get brighter and more colorful. In DaVinci Resolve, unpredictable brightness shifts will occur throughout the image.
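Roughly what that "strict" stills-style discipline could look like, sketched as toy Python. The space names, function names, and conversion stubs are all made up for illustration (Lightroom's internals aren't public); the point is just that every buffer carries its color space, edits are only defined on the wide linear working space, and tone mapping happens exactly once at export:

```python
# A toy sketch of the "strict" stills-style pipeline described above.
# Space names and conversion stubs are illustrative, not a real API.
from dataclasses import dataclass

@dataclass(frozen=True)
class Image:
    pixels: list          # placeholder for real pixel data
    space: str            # e.g. "camera-log3", "linear-wide", "rec709", "pq-rec2020"

def to_working(img: Image) -> Image:
    # A real tool would apply the camera's input transform here.
    assert img.space != "unknown", "refuse to guess: footage must be tagged"
    return Image(img.pixels, "linear-wide")

def adjust_exposure(img: Image, stops: float) -> Image:
    # Edits are only defined on the working space -- the "strong typing".
    assert img.space == "linear-wide"
    return Image([p * 2.0 ** stops for p in img.pixels], img.space)

def export(img: Image, target: str) -> Image:
    # Tone map / encode exactly once, at the very end.
    assert img.space == "linear-wide"
    return Image(img.pixels, target)   # real code would tone map here

clip = Image([0.1, 0.5, 2.0], "camera-log3")   # hypothetical tagged footage
out = export(adjust_exposure(to_working(clip), +1.0), "pq-rec2020")
```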
Let me add two things to this madness that come to mind right off the bat:
- observer metamerism (somewhat obvious)
- I don't have a name for this one, but once (I tried to reproduce it later and couldn't) I dragged a window between my 2 monitors. Once it was mostly on the 2nd monitor, the image colors suddenly shifted in hue. Not just on the 2nd monitor, on the 1st one too. And the colors were obviously wrong (way too purple) on both monitors.
If you shoot video raw, do yourself a favour and use proper development tools to deal with footage. As an example, RawTherapee uses probably the most “strongly typed” approach to colour (and RawPedia is a treasure trove of advice for nearly every step starting with pre-production, such as creating relevant calibration profiles and flat/dark frames).
Still image RAW editing has mostly been correct for the popular tools for about a decade now, maybe longer.
The history behind this is that a 12-bit or even 14-bit still image (photo) is big but "not that big" (100 MB at most), so it was possible to process them in a wide-gamut HDR space for a long time now even on very slow hardware.
For video, even 10-bit support is pushing the limits of computer power! Sure, playback has been sorted for a while now, but editing of multiple overlapping pieces of footage during a transition can be a problem. Unlike with a still photo, the editing software has to be real-time, otherwise it stutters and is unusable.
Consider that Lightroom does everything in a floating point 32-bit linear space, because... why not? It takes a lot of computer power, sure, but cheap enough not to matter in practice.
Video editing tools try to keep video in the original 8-bit or 10-bit encoding as much as possible for performance.
There's also another reason: older cameras would output only 8-bit files with maybe 7 bits of dynamic range in them, so if you manipulated them too much things would fall apart. Just the rounding error of converting to a linear space and back at the limited precision of the time would cause visible banding! So all editing tools tried to operate in the camera's native encoding space to minimise precision issues. For example, many video editing tools have controls that just add a constant value in the output space: 255,127,10 is mapped to 255-5, 127-5, 10-5 = 250,122,5. This is naive and "wrong", but it preserves the dynamic range as much as possible.
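A rough sketch of why that was the path of least resistance: the naive offset below matches the 250,122,5 arithmetic above, while a round trip through a low-precision linear intermediate (simulated here as an 8-bit linear buffer, with a plain 2.2 gamma standing in for the real transfer curve) collapses shadow codes together, which is exactly the banding the older tools were trying to avoid:

```python
# Naive but cheap: add a constant directly in the output encoding,
# the 255,127,10 -> 250,122,5 arithmetic described above.
def offset_in_output_space(rgb8, delta):
    return tuple(max(0, min(255, c + delta)) for c in rgb8)

print(offset_in_output_space((255, 127, 10), -5))   # (250, 122, 5)

# The precision argument for staying there: push 8-bit gamma footage through
# a low-precision linear intermediate and dark codes merge -- visible banding.
GAMMA = 2.2

def via_8bit_linear(code):
    linear8 = round(((code / 255.0) ** GAMMA) * 255.0)             # quantised linear value
    return round(((linear8 / 255.0) ** (1.0 / GAMMA)) * 255.0)     # back to gamma encoding

survivors = {via_8bit_linear(c) for c in range(64)}                 # darkest quarter of the range
print(len(survivors), "distinct values survive out of the 64 darkest input codes")
```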
This isn't just "fine", it's essentially the only sane thing to do. That intermediate space ought to be "linear light" so that operations like resize or blur work properly.
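The standard demonstration of why linear light matters for those operations: blur and resize are just weighted averages, and averaging gamma-encoded values gives a visibly darker result than averaging the light itself. A minimal sketch with a 2.2 gamma (not any particular tool's math):

```python
# Averaging is the core of any resize or blur. Done on gamma-encoded
# values, the result comes out too dark compared to averaging the light.
GAMMA = 2.2

def decode(v): return v ** GAMMA          # encoded (0..1) -> linear light
def encode(v): return v ** (1.0 / GAMMA)  # linear light -> encoded (0..1)

black, white = 0.0, 1.0

naive   = (black + white) / 2.0                           # average the encoded values
correct = encode((decode(black) + decode(white)) / 2.0)   # average the light, then encode

print(f"gamma-space average: {naive:.3f}   linear-light average: {correct:.3f}")
# gamma-space average: 0.500   linear-light average: 0.730  (approx.)
```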
This is the only mode in which Lightroom operates. There's basically no other way of using it.
It's not only not the default in video editing tools, it's decidedly non-standard and you have to go out of your way to enable it. Then everything breaks because the tools still have too many baked-in assumptions that you're operating in the output space or some other non-linear space.
In a tutorial video one professional Hollywood colorist said something like: "This color space has a gamma curve of blah-blah-blah, which means that the offset control will apply an exposure compensation."
That blew my mind: edit controls that change behaviour based on the input and output formats to the point that "addition" becomes "multiplication"!!
Addition becoming multiplication is simply the difference between a linear and a gamma-encoded space. Gamma curves are mostly log-based since that maximizes bit usage in the format, and given two linear colors A and B, adding them in the log-encoded space is log A + log B = log(A·B).
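Which is the colorist's remark in miniature: add a constant to the log-encoded signal and you've multiplied the linear light, i.e. applied exposure compensation. A tiny sketch using a base-2 log so the offset reads in stops (real camera log curves add slopes and offsets, but behave the same way in principle):

```python
# Adding an offset in a log-encoded space multiplies the linear light:
# 2 ** (log2(x) + d) == x * 2 ** d, so a +1.0 "offset" is +1 stop of exposure.
import math

def to_log(linear):    return math.log2(linear)
def to_linear(logval): return 2.0 ** logval

scene = 0.18                      # 18% grey, in linear light
offset = 1.0                      # "offset" control adds 1.0 in log space

print(to_linear(to_log(scene) + offset))   # 0.36 -- exactly one stop brighter
```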
Sure. So back to your previous point: when you develop the source CinemaDNG footage, you don't deal with any transitions, aren't cutting it, and don't need it to be real-time. It is simply a bunch of stills, which you process into a straightforward sequence of pre-graded PNGs in your desired colour space, and those you are free to edit together however you like at blazing speed. Unless you go crazy on some very specific transitions (and who uses transitions in film these days anyway) that don't work in the final colour space, you don't need the raw footage at edit time, do you?
I guess I know the answer, if you work in this industry then punching in/out or stabilizing in post is a common requirement and you mentioned some tools can mess up even at resizing stage.
Something I’ve noticed in fields where I am a professional is that it’s the “WTFs per hour” from intelligent beginners that best measures how broken some ecosystem is.
Then try the workflow I described. Too cumbersome and exotic for a pro, with somewhat clumsy or nonexistent GUIs, but almost entirely open-source and very few WTFs.
Madness is the right word. The situation is mad, and you'll drive yourself mad trying to get things to "look right."
I hate this situation because all the hardware available now is great, but none of the software works properly together.
And of course, it looks great on your hardware calibrated HDR OLED display with every piece of software using the correct color profiles, but then it looks like a turd on grandma's Windows 7 PC.
Don't even read this or it's straight to the asylum:
Sharing HDR content as intended is basically impossible outside of a major streaming vendor like Netflix or Apple TV. That's academic anyway, because Dolby Vision is unavailable to mere mortals. YouTube mostly works most of the time, but still has maddening issues like processing SDR first, and then HDR some unspecified time later. Eventually. Maybe. Just keep refreshing!
It blows my mind that pretty much the only easy consumer-grade HDR content sharing that Just Works is Apple iMessage.
There's basically nothing else.
Try uploading an HDR still image or an HDR video to any social site and see what happens!
Or email it...
Or put it in a web page.
Or anything that involves another, unspecified device.