I'm all for people forging ahead on AV1, but I'm not personally bothering with it yet. Most of my library is currently in HEVC and I'm not going to switch until I don't even have to think about whether all my equipment supports hardware AV1 decoding.
The more stuff you store away in a format that you can't decode for free, the worse off you're gonna be, trust me. I recently started self-hosting my family's photos and videos, and the HEVC and HEIF stuff was a huge pain in my ass to deal with. Lucky for me everything before 2016 was untainted, and no one had yet bothered to turn on 4k-60p on any of their phones (making transcoding on my poor NAS's CPU nearly impossible), or I'd have been in real trouble.
> I recently started self-hosting my family's photos and videos
No offense, but HEVC vs AV1 is not in any way consequential to this trivial scenario. Those photos and videos will be decodable and re-encodable for you _forever_. Don't think you need to store all that stuff in some half-baked encoding just because it's theoretically free-er.
But they're referring to _current_ pains with HEVC and HEIF, and the pains are directly related to the non-free nature; most software (especially web browsers) doesn't support HEVC and HEIF due to their non-freeness, which means the server has to transcode the files on the fly, which decreases quality, requires lots of CPU resources, and might not even be possible to do in real time if the files are big enough.
> Those photos and videos will be decodable... forever.
It will be possible forever, but not easy. JPEG is built into the standard library of every scripting language, and viewers come with every OS. I'm just lazy, that's all. I've tried "better" formats, but my life is easier when I stick with JPEG, MP4 (even though it's non-free), MP3, FLAC, etc.
Just yesterday I came across an old backup I made of some CDs 20 years ago... in Monkey's Audio. The files were 2% smaller, but they also reached out from the past and annoyed the hell out of me. haha
What do you mean? Open-source, free-as-in-beer implementations exist for HEVC. They may or may not be patent-license-compliant (I don't know either way), but they exist, and I sincerely doubt it's even remotely possible for these decoders to be scrubbed from the internet.
I agree that I would prefer to use a patent-unencumbered codec, but I'm not willing to use AV1 until it has common hardware decoding support. My battery life and power consumption are much more important to me than a theoretical, unlikely, future inability to decode it.
macOS and Windows 10 have included a license for H.264 and H.265 for a few years now (encode/decode are OS API calls). Linux does not, however GPUs like Intel's QuickSync, nVidia's NVEnc/NVDec and AMD's VCEEnc/VCEDec have one. If you paid for the GPU or CPU with a built-in codec, then that included a license.
ffmpeg can be compiled with support for your hardware encoder.
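As a rough sketch of what that looks like in practice: the Python below picks a hardware H.264 encoder from the text printed by `ffmpeg -hide_banner -encoders` and builds a transcode command line. The encoder names are real FFmpeg encoder names, but the helper functions, the preference order, and the file names are made up for illustration. An encoder being listed only means it was compiled in, not that the matching hardware or driver is actually present.

```python
# Real FFmpeg encoder names; which ones appear depends on how your build was configured.
# The preference order here is arbitrary (an assumption for this sketch).
HW_H264_ENCODERS = ["h264_nvenc", "h264_qsv", "h264_amf", "h264_videotoolbox"]


def pick_h264_encoder(encoders_listing, fallback="libx264"):
    """Return the first hardware H.264 encoder named in `ffmpeg -encoders` output.

    Falls back to the software encoder name if none of them is listed.
    """
    available = set()
    for line in encoders_listing.splitlines():
        fields = line.split()
        # Video encoder rows look like: " V....D h264_nvenc   NVIDIA NVENC H.264 encoder"
        if len(fields) >= 2 and fields[0].startswith("V"):
            available.add(fields[1])
    for name in HW_H264_ENCODERS:
        if name in available:
            return name
    return fallback


def build_transcode_cmd(src, dst, encoder):
    """Build (but do not run) an ffmpeg command that re-encodes video and copies audio."""
    return ["ffmpeg", "-i", src, "-c:v", encoder, "-c:a", "copy", dst]
```

Feed `pick_h264_encoder` the captured output of `ffmpeg -hide_banner -encoders`, then hand the resulting list to `subprocess.run`. Rate-control options differ per encoder, so they are deliberately left out here.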
Windows 10 does not include a license for H.265; you have to purchase it separately[1]. There is a very well hidden, free "from device manufacturer" version[2], so unless you know about it, you have nearly zero chance of finding it.
Linux distributions and Firefox do provide H.264 support via openh264; it is sponsored by Cisco, who are already in the flat license territory, so all they have to do is track binary downloads (that's why they are a separate download, or repository respectively). It also plugs seamlessly into the Linux multimedia framework (Gstreamer), so it is on the same level of integration as Media Foundation codecs in Windows. There is no H.265 equivalent, still.
The hardware based encoding is not equivalent to software one; it is optimized for dumping compliant streams with low latency, it does not concern itself with effective use of the bits available. I.e. exactly what software based, offline encoders are good at.
> The hardware based encoding is not equivalent to software one; it is optimized for dumping compliant streams with low latency, it does not concern itself with effective use of the bits available. I.e. exactly what software based, offline encoders are good at.
This isn't my experience with NVENC for h.264. Sure, software encoding can do somewhat (but not a lot) better (quality per bits), but at a fraction of the performance and using much more energy. For most cases where you're personally encoding video it's likely better to use a hardware encoder.
I have to agree with NVENC. I've benchmarked dozens of HW and SW encoders for quality (not just encoding speed) and it gives the best BD-Rate on average.
For H.264, hardware can brute-force every possible Intra encoded macroblock and make rate-distortion-optimised QP/prediction decisions in real-time or better. This gets much harder for H.265, where brute-force is fairly impractical and the clever algorithm wins.
A well-designed hardware codec can also run inter-frame block matching searches vastly more efficiently than software can.
Even if you somehow are pro-IP, implying that archival shouldn't use patented codecs makes no sense since they are going to expire sometime in the future anyway (and especially for archival, the timescales for patent expiry seem very soon --- 20 years at most, basically.)
I believe the fashionable thing in the patent world is "evergreening", i.e. doing everything to make sure that patent doesn't expire. That often involves filing a few more patents 19 years later. Yes, there was prior art, but do you have deep enough pockets to go up against their lawyers to prove that?
Other techniques involve using copyright (the bitstream decode tables could be considered copyrightable for example).
Future techniques might even involve trying to use trademark law (for example, the first frame of any encoded video is the 'MPAA' logo in big bold letters, but without decoding that frame you can't decode the other frames). Trademarks don't expire.
> That often involves filing a few more patents 19 years later. Yes, there was prior art, but do you have deep enough pockets to go up against their lawyers to prove that?
At most that means you'd need to use old versions of software. An old publicly available file can't violate a patent issued 5 years later.
For the rest, Sega v. Accolade? And just putting a trademark inside the video wouldn't mean a decoder is using the trademark...
The world bloats storage by 50%, which harms user experience and ecology. That's a lot of regression when all it takes is one symbolic dollar for a lifetime license.
The user experience and ecology of the internet is built on royalty-free open standards. It's up to video codecs to meet that standard.
H.265 isn't interested in being royalty-free. But, luckily, AV1 is very interested in being royalty-free. So this time around we don't need to repeat the mistakes and compromises of the past.
There's some law which states that when the price of something halves, the usage doubles.
Look at LED lighting, it's more energy efficient, so now we deploy more of it, ending up using exactly as much energy as before (but now everything is well lit).
The same with video codecs, as they get better, we generate more video.
Though with LED lighting, it really depends. I still am using far less power for all of my lighting vs 1 incandescent light bulb. I am thinking specifically of the room/desk lamp of my childhood. Though I haven't counted the blinkenlights on my router, switch, etc. Maybe all of those added up will tip the balance.
There are free and open source transcoders for H264 and H265 available now, most notably FFmpeg. The symbolic dollar is for a commercially released decoder binary, but that's not the only option.
These closed formats are typically patent protected, not kept secret through proprietary encoding/decoding software.
As you probably know: (1) Those patents will eventually expire, (2) The price of a patent is immediate and complete disclosure of the technical method underpinning the claims.
Are you legally allowed to distribute an open-source implementation?
For the last, say, 25 years I've seen many parties trying to hamper that, even in a non-commercial setting (e.g. the original mp3 format encoder). If they don't, good!
Some years ago when H.264 was new and uncommon, I thought about archiving tapes in XviD so I could share more easily with friends and family. I tested it and found that the quality vs x264 was much too bad. I begrudgingly went ahead with using x264 even if my friends and family wouldn’t be able to play it back as I didn’t want to save smudged videos.
Now extrapolate to today: today it sounds ridiculous to use XviD and even H.264. For ease of use I’d use x265-10 bit, for future proofing I would need to read up on av1. Think what it will look like in 10 years (2032) as you will have those files in 10 years for sure.
There's a slightly counter-intuitive thing with H.264 and H.265 encoding: for any given bit-rate, you'll get better quality if you encode at 10-bits instead of 8-bits, even if the source clip was 8-bit.
The reason for this is the DCT transform results in 16-bit numbers for every pixel. The reason that doesn't make things worse is (partly) because it only sends the non zero values.
There are many more reasons why, but that's the quick summary.
How is this possibly true? The argument that 16-bit DCT somehow gives better precision AND doesn't change the size of the encode makes no sense. If you get better precision you need to keep those extra-precise bits that are no longer zero due to truncation.
I haven't seen this argued as 16-bit DCT, but in color space conversion. The gist is that all 8-bit RGB values cannot be represented properly in 8-bit YUV420, so you're supposed to use 10-bit to get "proper" YUV values. But if you start with an 8-bit encode you've already thrown away the extra precision, so why waste the (considerable) extra compute on 10-bit just to make sure you don't truncate the already-truncated YUV?
I have a project in progress to measure all of the variations, but from quick testing with CRF encoding the same value results in much longer compute AND a larger file in 10-bit versus 8-bit. The larger file has a slightly higher VMAF score, as would be expected from spending more bits. The work is in finding a set of encoding parameters to measure the quality difference at the same output size, and to measure the relative improvement across CRF vs size vs bit depth.
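A minimal sketch of how such a comparison could be set up, assuming an FFmpeg build whose `libx265` supports both 8-bit and 10-bit output. It generates one encode command per bit depth and CRF, then pairs each 8-bit encode with the 10-bit encode closest in file size, so that quality is compared at roughly matched sizes and not at matched CRF. The function names and the output naming scheme are hypothetical; the quality measurement itself (VMAF or otherwise) is left out.

```python
from itertools import product

# Pixel formats for 4:2:0 video at each bit depth (real FFmpeg pix_fmt names).
PIX_FMTS = {8: "yuv420p", 10: "yuv420p10le"}


def encode_grid(src, crfs, depths=(8, 10)):
    """Yield (depth, crf, ffmpeg_args) for every bit-depth / CRF combination."""
    for depth, crf in product(depths, crfs):
        out = f"out_{depth}bit_crf{crf}.mkv"  # hypothetical naming scheme
        args = ["ffmpeg", "-i", src, "-an", "-c:v", "libx265",
                "-crf", str(crf), "-pix_fmt", PIX_FMTS[depth], out]
        yield depth, crf, args


def nearest_size_pairs(sizes_8bit, sizes_10bit):
    """Pair each 8-bit encode with the 10-bit encode closest in file size.

    Both arguments map CRF -> output size in bytes.
    Returns a list of (crf_8bit, crf_10bit) tuples.
    """
    pairs = []
    for crf8, size8 in sorted(sizes_8bit.items()):
        crf10 = min(sizes_10bit, key=lambda c: abs(sizes_10bit[c] - size8))
        pairs.append((crf8, crf10))
    return pairs
```

The pairing is only as good as the CRF grid is dense; a finer grid on the 10-bit side gives closer size matches.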
The process is: Raw input pixels (8 or 10 bit) minus predicted pixels (8 or 10 bit) -> residual pixels (8 or 10 bit + 1 sign bit).
You take these residual pixels and pass them through a 2D DCT, then scale and quantise them. At the end of this, the quantised DCT residual values are signed 16-bit numbers - you don't get to choose the bit-depth here; it's part of the standard (section 8.6). For every 16x16 pixel input, you get a 16x16 array of signed 16-bit numbers.
The last step is to pass all non-zero quantised DCT residual values through an entropy coder (usually an arithmetic coder), then you get the final bitstream.
The key point is that it didn't matter if the original raw pixel input was 8-bit or 10-bit; the quantised DCT residual values became 16 bits before being compressed and transmitted. This is also true for 12-bit raw pixel inputs.
This seems impossible; for 8-bit inputs, you've doubled the size of the data (slightly less than double for 10-bits), so you must be making things worse! The key is that after scaling and quantisation, most of those 16-bit words are zero. Those that are non-zero are statistically closer to zero so that the entropy encoder won't have to spend a lot of bits signalling them.
The last part comes when you reverse this process. The mathematical losses from scaling and quantising 10-bit inputs into the transmitted 16-bit values are less than the losses for 8-bit inputs. When you run the inverse quant, scale and iDCT, you end up with values that are closer to the original residual values at 10-bit than you do at 8-bit.
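To make the steps described above concrete, here is a toy Python sketch of residual -> 2D DCT -> quantise -> inverse. It is not the transform from the standard: it uses a floating-point orthonormal DCT-II where the standards use integer transforms, and a plain uniform quantiser with a made-up step size `qstep`. It only shows the shape of the pipeline and how quantisation zeroes out most coefficients; it does not by itself settle the 8-bit versus 10-bit question.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (row k = frequency k)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def residual_block(source, predicted):
    """Residual = source pixels minus predicted pixels (signed values)."""
    return [[s - p for s, p in zip(srow, prow)]
            for srow, prow in zip(source, predicted)]


def forward(residual, qstep):
    """2D DCT (C * X * C^T) followed by uniform quantisation to integer levels."""
    c = dct_matrix(len(residual))
    coeffs = matmul(matmul(c, residual), transpose(c))
    return [[round(v / qstep) for v in row] for row in coeffs]


def inverse(levels, qstep):
    """Rescale the integer levels and apply the inverse 2D DCT (C^T * Y * C)."""
    c = dct_matrix(len(levels))
    coeffs = [[v * qstep for v in row] for row in levels]
    return matmul(matmul(transpose(c), coeffs), c)


def count_nonzero(levels):
    """Number of quantised coefficients an entropy coder would have to signal."""
    return sum(1 for row in levels for v in row if v != 0)
```

A larger `qstep` drives more levels to zero (fewer bits to send) at the cost of a reconstruction from `inverse` that sits further from the original residual.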
You'd be better off using the highest profile h.264 today. It gets you _MOST_ of the way to h.265, is way better on patents, and is slightly better for still-frame quality.
For long term archive work the above or AV1 (if you have infinite time / energy budget) are probably better, depending on settings.
Why would it be ridiculous to use H.264 today? It's still used everywhere. It takes more space than newer codecs, but there is nothing ridiculous about it if you want a widely supported format with low decoding requirements and a very reasonable amount of space taken. It's not even that bad compared to H.265 for the same quality.
No. If I did it wouldn't be a problem, I'd just move format to format over time, but I know from experience that an encode from someone with talent and experience is a whole different experience than an encode from someone who's just clicking a preset.