AV1 Video Codec (aomedia.org)
132 points by doener on May 9, 2022 | 125 comments



I think a more relevant front page link/summary would be that the open-source AV1 codec has reached the v1.0.0 milestone (18 days ago).

https://github.com/AOMediaCodec/SVT-AV1/releases/tag/v1.0.0


For clarity: this is one encoder, not the codec itself.



Complicating matters slightly more, the AOM recently adopted the SVT codebase as their official implementation for future work.

Though obviously other encoders (and decoders) are available for the AV1 format. I think for individual encode enthusiasts the old libaom codebase is still the go-to, but if you are Facebook or Netflix you probably want to be looking at SVT-AV1.


Funny seeing this here. I was playing with ffmpeg settings earlier and decided to encode a game highlight with AV1. It's only 20 seconds and seven hours later, it's just about done. I'm certain I can twiddle more settings to make it faster, but at this point I can't give in to the slow speed. Only ~21 frames to go!


The ffmpeg page for AV1[1] seems to recommend the SVT-AV1 encoder. Using the SVT-AV1 encoder and -crf 17 I was able to get an 8x encoding speed (about 250 FPS) with a very basic set of options similar to this:

ffmpeg -i input.mp4 -c:a copy -c:v libsvtav1 -crf 17 svtav1_test.mp4

Hopefully using the SVT-AV1 encoder helps speed up your encoding.
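If it's still slower than you'd like, the libsvtav1 wrapper also exposes SVT-AV1's speed presets via -preset (lower = slower/better quality, higher = faster; the upper bound depends on your SVT-AV1 version). Something like the following is a reasonable sketch to start from, though the exact numbers are just my guess rather than a recommendation:

ffmpeg -i input.mp4 -c:a copy -c:v libsvtav1 -preset 8 -crf 35 svtav1_preset_test.mp4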

[1]: https://trac.ffmpeg.org/wiki/Encode/AV1


I concur, both options help a lot. I noticed that the ffmpeg packages are sometimes a bit outdated, yielding a gigantic gap between what the stock ffmpeg can do, and what a freshly compiled one does.

Also, having more cores helps a lot with SVT-AV1.



For some reason, none of the AV1 encoders choose sensible (practically useful without adjustment) settings as the default, but such settings do exist, rest assured.


I've only played with some of the earlier codecs and studied them in detail, but I wonder what exactly makes it so slow? Is it basically doing a lot of bruteforcing to find the best way to compress each frame?


yes.


Probably worth it at scale.



I'm all for people forging ahead on AV1, but I'm not personally bothering with it yet. Most of my library is currently in HEVC and I'm not going to switch until I don't even have to think about whether all my equipment supports hardware AV1 decoding.


The more stuff you store away in a format that you can't decode for free, the worse off you're gonna be, trust me. I recently started self-hosting my family's photos and videos, and the HEVC and HEIF stuff was a huge pain in my ass to deal with. Lucky for me everything before 2016 was untainted, and no one had yet bothered to turn on 4K 60p on any of their phones (making transcoding on my poor NAS's CPU nearly impossible), or I'd have been in real trouble.


> I recently started self-hosting my family's photos and videos

No offense, but HEVC vs AV1 is not in any way consequential to this trivial scenario. Those photos and videos will be decodable and re-encodable for you _forever_. Don't think you need to store all that stuff in some half-baked encoding just because it's theoretically free-er.


But they're referring to _current_ pains with HEVC and HEIF, and the pains are directly related to the non-free nature; most software (especially web browsers) doesn't support HEVC and HEIF due to their non-freeness, which means the server has to transcode the files on the fly, which decreases quality, requires lots of CPU resources, and might not even be possible to do in real time if the files are big enough.


> Those photos and videos will be decodable... forever.

It will be possible forever, but not easy. JPEG is built into the standard library of every scripting language, and viewers come with every OS. I'm just lazy, that's all. I've tried "better" formats, but my life is easier when I stick with JPEG, MP4 (even though it's non-free), MP3, FLAC, etc.

Just yesterday I came across an old backup I made of some CDs 20 years ago... in Monkey's Audio. The files were 2% smaller, but they also reached out from the past and annoyed the hell out of me. haha


AV1 is open source and most major GPU vendors are adding hardware decoding support for it in their newer chips. It's a far safer bet.


What do you mean? Open-source, free-as-in-beer implementations exist for HEVC. They may or may not be patent-license-compliant (I don't know either way), but they exist, and I sincerely doubt it's even remotely possible for these decoders to be scrubbed from the internet.

I agree that I would prefer to use a patent-unencumbered codec, but I'm not willing to use AV1 until it has common hardware decoding support. My battery life and power consumption are much more important to me than a theoretical, unlikely, future inability to decode it.


Why was it a pain? I use Plex for remote streaming and it's never had an issue with HEVC.


Well, now you're stuck with Plex forever. ;)


HEVC/H.265 will be around a lot longer than Plex will be — or you or I will be, for that matter.


macOS and Windows 10 have included a license for H.264 and H.265 for a few years now (encode/decode are OS API calls). Linux does not; however, GPUs like Intel's QuickSync, nVidia's NVEnc/NVDec and AMD's VCEEnc/VCEDec have one. If you paid for the GPU or CPU with a built-in codec, then that included a license.

ffmpeg can be compiled with support for your hardware encoder.
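As a rough sketch of what that looks like once compiled in (assuming an NVIDIA GPU and an ffmpeg build with NVENC enabled; QuickSync and AMD's encoders have analogous ffmpeg encoders):

ffmpeg -i input.mp4 -c:v hevc_nvenc -preset slow -c:a copy output_hevc.mp4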


Windows 10 does not include a license for H.265; you have to purchase it separately[1]. There is a very well hidden, free "from device manufacturer" version[2], so unless you know about it, you have a near-zero chance of finding it.

Linux distributions and Firefox do provide H.264 support via openh264; it is sponsored by Cisco, who are already in flat-license territory, so all they have to do is track binary downloads (that's why it is a separate download, or repository, respectively). It also plugs seamlessly into the Linux multimedia framework (GStreamer), so it is on the same level of integration as the Media Foundation codecs in Windows. There is no H.265 equivalent, still.

Hardware-based encoding is not equivalent to software encoding; it is optimized for producing compliant streams with low latency, and it does not concern itself with effective use of the bits available, i.e. exactly the thing that software-based, offline encoders are good at.

[1] https://apps.microsoft.com/store/detail/hevc-video-extension... [2] https://apps.microsoft.com/store/detail/hevc-video-extension...


> Hardware-based encoding is not equivalent to software encoding; it is optimized for producing compliant streams with low latency, and it does not concern itself with effective use of the bits available, i.e. exactly the thing that software-based, offline encoders are good at.

This isn't my experience with NVENC for h.264. Sure, software encoding can do somewhat (but not a lot) better in quality per bit, but at a fraction of the performance and using much more energy. For most cases where you're personally encoding video it's likely better to use a hardware encoder.


I have to agree about NVENC. I've benchmarked dozens of HW and SW encoders for quality (not just encoding speed) and it gives the best BD-Rate on average.

For H.264, hardware can brute-force every possible Intra encoded macroblock and make rate-distortion-optimised QP/prediction decisions in real-time or better. This gets much harder for H.265, where brute-force is fairly impractical and the clever algorithm wins.

A well-designed hardware codec can also run inter-frame block matching searches vastly more efficiently than software can.


Even if you are somehow pro-IP, arguing that archival shouldn't use patented codecs makes no sense, since the patents are going to expire anyway (and on archival timescales, patent expiry is soon: 20 years at most, basically).


I believe the fashionable thing in the patent world is "evergreening", i.e. doing everything to make sure the patent doesn't expire. That often involves filing a few more patents 19 years later. Yes, there was prior art, but do you have deep enough pockets to go up against their lawyers to prove that?

Other techniques involve using copyright (the bitstream decode tables could be considered copyrightable for example).

Future techniques might even involve trying to use trademark law (for example, the first frame of any encoded video is the 'MPAA' logo in big bold letters, but without decoding that frame you can't decode the other frames). Trademarks don't expire.


> That often involves filing a few more patents 19 years later. Yes, there was prior art, but do you have deep enough pockets to go up against their lawyers to prove that?

At most that means you'd need to use old versions of software. An old publicly available file can't violate a patent issued 5 years later.

For the rest, Sega v. Accolade? And just putting a trademark inside the video wouldn't mean a decoder is using the trademark...


The world bloats storage by 50%, which harms user experience and the environment. That's a lot of regression, when all it takes is one symbolic dollar for a lifetime license.


The user experience and ecology of the internet is built on royalty-free open standards. It's up to video codecs to meet that standard.

H.265 isn't interested in being royalty-free. But, luckily, AV1 is very interested in being royalty-free. So this time around we don't need to repeat the mistakes and compromises of the past.


There's some law which states that when the price of something halves, the usage doubles.

Look at LED lighting, it's more energy efficient, so now we deploy more of it, ending up using exactly as much energy as before (but now everything is well lit).

The same with video codecs, as they get better, we generate more video.


Perhaps the Jevons paradox? https://en.wikipedia.org/wiki/Jevons_paradox

Though with LED lighting, it really depends. I still am using far less power for all of my lighting vs 1 incandescent light bulb. I am thinking specifically of the room/desk lamp of my childhood. Though I haven't counted the blinkenlights on my router, switch, etc. Maybe all of those added up will tip the balance.


> Though I haven't counted the blinkenlights on my router, switch, etc. Maybe all of those added up will tip the balance.

No way those total over a watt.


A blinkenlight LED is about 0.7V at 20mA, totaling about 14mW per LED, or roughly 70 LEDs per watt.


Sounds like a form of Parkinson's Law https://en.wikipedia.org/wiki/Parkinson%27s_law


For a lifetime licence for a binary blob, or for code you can compile as you move to a new hardware platform?


There are free and open source transcoders for H264 and H265 available now, most notably FFmpeg. The symbolic dollar is for a commercially released decoder binary, but that's not the only option.

These closed formats are typically patent protected, not kept secret through proprietary encoding/decoding software. As you probably know: (1) Those patents will eventually expire, (2) The price of a patent is immediate and complete disclosure of the technical method underpinning the claims.


What have binary blobs got to do with it?


Are you legally allowed to distribute an open-source implementation?

For the last, say, 25 years I've seen many parties trying to hamper that, even in non-commercial settings (e.g. the original MP3 encoder). If they don't, good!


Some years ago, when H.264 was new and uncommon, I thought about archiving tapes in XviD so I could share them more easily with friends and family. I tested it and found that the quality vs x264 was much worse. I begrudgingly went ahead with x264 even though my friends and family wouldn't be able to play it back, as I didn't want to save smudged videos.

Now extrapolate to today: it sounds ridiculous to use XviD and even H.264. For ease of use I'd use x265 10-bit; for future proofing I would need to read up on AV1. Think what it will look like in 10 years (2032), as you will certainly still have those files then.


There's a slightly counter-intuitive thing with H.264 and H.265 encoding: for any given bit-rate, you'll get better quality if you encode at 10-bits instead of 8-bits, even if the source clip was 8-bit.

The reason for this is that the DCT transform results in 16-bit numbers for every pixel. The reason that doesn't make things worse is (partly) that only the non-zero values are sent.

There are many more reasons, but that's the quick summary.


How is this possibly true? The argument that 16-bit DCT somehow gives better precision AND doesn't change the size of the encode makes no sense. If you get better precision you need to keep those extra-precise bits that are no longer zero due to truncation.

I haven't seen this argued as 16-bit DCT, but in color space conversion. The gist is that all 8-bit RGB values cannot be represented properly in 8-bit YUV420, so you're supposed to use 10-bit to get "proper" YUV values. But if you start with an 8-bit encode you've already thrown away the extra precision, so why waste the (considerable) extra compute on 10-bit just to make sure you don't truncate the already-truncated YUV?

I have a project in progress to measure all of the variations, but from quick testing with CRF encoding the same value results in much longer compute AND a larger file in 10-bit versus 8-bit. The larger file has a slightly higher VMAF score, as would be expected from spending more bits. The work is in finding a set of encoding parameters to measure the quality difference at the same output size, and to measure the relative improvement across CRF vs size vs bit depth.
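For anyone who wants to run the same comparison, the minimal ffmpeg version of the experiment looks something like this (assuming a libx265 build with 10-bit support; the CRF value is arbitrary, and you would compare the resulting sizes plus VMAF/SSIM against the source):

ffmpeg -i source.mkv -c:v libx265 -crf 20 -pix_fmt yuv420p x265_8bit.mkv

ffmpeg -i source.mkv -c:v libx265 -crf 20 -pix_fmt yuv420p10le x265_10bit.mkv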


Replying to just your 1st paragraph:

The process is: Raw input pixels (8 or 10 bit) minus predicted pixels (8 or 10 bit) -> residual pixels (8 or 10 bit + 1 sign bit).

You take these residual pixels and pass them through a 2D DCT, then scale and quantise them. At the end of this, the quantised DCT residual values are signed 16-bit numbers - you don't get to choose the bit-depth here; it's part of the standard (section 8.6). For every 16x16 pixel input, you get a 16x16 array of signed 16-bit numbers.

The last step is to pass all non-zero quantised DCT residual values through an entropy coder (usually an arithmetic coder), then you get the final bitstream.

The key point is that it didn't matter if the original raw pixel input was 8-bit or 10-bit; the quantised DCT residual values became 16 bits before being compressed and transmitted. This is also true for 12-bit raw pixel inputs.

This seems impossible; for 8-bit inputs, you've doubled the size of the data (slightly less than double for 10-bits), so you must be making things worse! The key is that after scaling and quantisation, most of those 16-bit words are zero. Those that are non-zero are statistically closer to zero so that the entropy encoder won't have to spend a lot of bits signalling them.

The last part comes when you reverse this process. The mathematical losses from scaling and quantising 10-bit inputs into the transmitted 16-bit values are less than the losses for 8-bit inputs. When you run the inverse quant, scale and iDCT, you end up with values that are closer to the original residual values at 10-bit than you do at 8-bit.


Sorry, I don't understand. I can see why it wouldn't be worse, but why would it be better?


If I remember correctly that is only with x264, and no longer the case with x265.


You'd be better off using the highest profile of h.264 today. It gets you _MOST_ of the way to h.265, is way better on patents, and is slightly better for still-frame quality.

For long term archive work the above or AV1 (if you have infinite time / energy budget) are probably better, depending on settings.


Patents on XviD's underlying standard are also going to expire within a year:

https://meta.wikimedia.org/wiki/Have_the_patents_for_MPEG-4_...


Why would it be ridiculous to use H.264 today? It's still used everywhere. It takes more space than newer codecs, but there is nothing ridiculous about it: if you want a widely supported format with low decoding requirements that takes a very reasonable amount of space, it's not even that bad compared to H.265 at the same quality.


It's not obvious. HEVC is dead on web browsers and doesn't look like it will ever become available there. Jump from AVC straight to AV1.


Ten years from now your first AV1 video might be finished encoding. /s but not really.


This is where I am with h264. Pretty much everything supports it, maybe even your toaster. HEVC and AV1, not so much.


Do you re-encode your media or something?


No. If I did it wouldn't be a problem, I'd just move from format to format over time, but I know from experience that an encode from someone with talent and experience is a whole different thing than an encode from someone who's just clicking a preset.


What's the state of encoding? I'd like to try moving some videos to AV1, but I'm not willing to wait... IIRC, it was literally days of encode time for minutes of video last time I tried.


With ffmpeg 5 using libaom it's actually reasonable. Obviously it depends on quality and whatnot, but I'm able to get about .25X of realtime playback speed with AV1 encoding. The magic switch is setting -cpu-used to something higher than the default of 1, which is go-on-vacation-and-maybe-it-will-finish slow.

Rough example: `ffmpeg -i video.mkv -c:v libaom-av1 -cpu-used 5 av1_test.mkv`


On my PC I get .329x of realtime playback speed with AV1, and 4.85x with libx264. So AV1 encoding is 15 times slower than H.264 encoding!

I used ffmpeg "-cpu-used 8" for AV1 (higher than this and I get "Error setting option cpu-used to value X"). I removed the -cpu-used command for H.264 as the encoder defaults to autodetecting the number of threads to max out CPU usage. My CPU is an 8-core 16-thread AMD Ryzen 7 PRO 5750G. My source material was the first 20 seconds of the H.264 blu-ray rip of Titanic. top(1) shows the AV1 encoder uses around 500% CPU (meaning only 5 of the 16 hardware threads are utilized) while the H.264 encoder uses around 1600% (exactly 16 of 16 threads utilized). So there is a lot of potential parallelization optimization that could be exploited. But even assuming perfect scaling from 5 to 16 threads, the AV1 encoder would get to 1.1x of realtime playback speed. It would still be about 5 times slower than the H.264 encoder.


cpu-used is (somewhat confusingly) not the number of threads to use, but a speed/quality trade-off on a scale from 0 to 9, where 0 is best quality and slowest, and 9 is worst quality and fastest. Very similar to how -preset was used for x264.

https://github.com/AOMediaCodec/community/wiki#how-to-make-e... explains the different settings that most change the speed of encoding.


Shouldn't we be comparing AV1 to HEVC, not H.264? Seems like an unfair comparison. I would expect that AV1 is computationally (and quality-size-ratio) much more comparable to HEVC than to H.264.

If we're going to do AV1 vs. H.264, and complain that AV1 is much slower, we might as well compare XviD and H.264 and complain about the latter.


HEVC is roughly the same generation as VP9.

AV1 is one generation newer. It was made by merging the projects for VP10, Thor, and Daala.

AV1 is computationally better than HEVC/VP9 and more comparable to H.264. Visually it’s better than either.


We should rather compare AV1 to VVC aka H.266.


Try adding `-row-mt 1 -tiles 2x2`. That will essentially tile the video into parallel streams that can be encoded on different threads. You should get better CPU utilization that way.

I don't think AV1 will ever get to the encoding speed of a codec like h264 because, in a very general sense, simpler math is easier to do. AV1 can encode the same information into fewer bits, but that efficiency has a computational cost.

It really depends on what you're doing. If you're trying to livestream it could be a big issue. If you're going to encode a video once and then store it for years, it's probably not a big deal. Totally up to you. For me AV1 has become a pretty good option.
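For example, combined with the earlier libaom invocation (same caveat: tune -cpu-used and quality settings to taste):

ffmpeg -i video.mkv -c:v libaom-av1 -cpu-used 5 -row-mt 1 -tiles 2x2 av1_test.mkv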


Nah, tiles hurt quality. Use Av1an (or Nmkoder on Windows if you want a GUI) with AOM to multithread the encoding by scene for offline encodes.

For livestreaming, SVT-AV1 1.0 is now usable on good CPUs (eg. AMD 5800x+ desktop CPUs, Intel 12600+ desktop CPUs) at higher CPU presets (8+, depending on your CPU and what you're streaming), just currently no one allows for AV1 ingest for livestreams.


That is not entirely true. Using scene-chunking to multi-thread the encode means that the video buffer cannot be carried over between scenes, making the start of each scene visually worse in video streaming situations where the bit-rate is capped.

For example, a 2Mbps video stream will usually have a 4Mb video cache that can be used to preload data. Having this video cache means that a 2Mbps video stream can burst up to 6Mbps for one second while still maintaining a 2Mbps cap.
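In ffmpeg terms that capped-stream scenario is roughly the following (using x264 here purely because it's the most familiar example of the generic rate-control flags; the same -maxrate/-bufsize idea applies to any encoder that supports it):

ffmpeg -i input.mkv -c:v libx264 -b:v 2M -maxrate 2M -bufsize 4M -c:a copy capped.mp4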


My bad, I assumed -cpu-used controlled the number of threads.

I retried with "-row-mt 1 -tiles 2x2" (keeping "-cpu-used 8" so the benchmark is comparable to my previous test): the encoding speed is 0.443x of playback speed; top(1) shows about 8 of my 16 cpu threads are utilized.

Without "-row-mt 1 -tiles 2x2" the encoding speed was 0.329x. So these options only increases speed by 34%. This doesn't match the increased in cpu utilization of +60% (5 to 8 threads). Contention on shared data structures? Looks like it's better to just spawn multiple ffmpeg instances working on different source files instead of leveraging the encoder's multi-threading. That way I could get close to 1x of playback speed.

I have 3500 hours of video content. At 1x I need 5 months to reencode all. Heavy. But doable I guess.


What cpu do you have?


The current preferred encoder is SVT-AV1 [1]. It has pretty good CPU/space tradeoffs, so you can pick how long you are willing to wait for a video to encode.

[1] https://github.com/AOMediaCodec/SVT-AV1


I've used this with FFmpeg with great success. Encoding time and video quality are comparable to other encoders, and the file size is almost always smaller.


This chart is well known; has it been reproduced on other datasets?


You can see nVidia's hardware support here: https://developer.nvidia.com/video-encode-and-decode-gpu-sup...

The take-away is that nVidia only has hardware AV1 decode support on the RTX 3000 Ampere series cards.

Intel supported AV1 decoding in its Xe-LP GPUs in 2020.

Intel also reached v1.0.0 of its open-source SVT-AV1 encoder 2 weeks ago, here: https://github.com/AOMediaCodec/SVT-AV1

I am not aware of any other CPU or GPU that has an AV1 codec built in.


AFAIK state of the art is using fast presets of SVT-AV1, combined with Av1an [0] for parallelism.

[0] https://github.com/master-of-zen/Av1an


https://people.videolan.org/~unlord/SVT-AV1_BD-rate.png

Comparison between codecs and encoder presets. At the same speed, SVT-AV1 gives better quality vs. H.264 and H.265. M10-M8 looks like a nice spot.


Worth highlighting the hardware used for that graph had 96 logical cores.

Newer codecs can generally take better advantage of that parallelism and SVT in particular has this as a core design element.

Still cool, but might not fully apply depending on your use case.


Can you really save much space by re-encoding an already encoded video in AV1? Wouldn’t the efficiency improvement be offset by the doubling up of compression artifacts?


I just googled a bit to see how much space could be saved by converting Blu-ray AVC content (1080p non-HDR) to AV1 with subjectively same-ish quality, and from what I gathered one can expect to save 40-50% storage space on average (with a spread of maybe more like 20-80% depending on the video). So it’s substantial, but maybe not substantial enough to run and reencode all your movies, unless moving to a free codec is the important motivation.


That seems like a CQ-based encode. Try av1an in its target-quality mode - I've gotten some movies down to a tenth of the Blu-ray size with a VMAF of 98, with smudged detail only under pixel-level scrutiny.

The trick is that if your encoder can adapt the quantizer based on a metric comparing encode quality, you can save a ton more space without visual degradation.
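For reference, an Av1an target-quality run looks roughly like the line below. I'm going from memory on the flag names, so treat them as an assumption and double-check against av1an --help:

av1an -i movie.mkv -e svt-av1 -v "--preset 6" --target-quality 95 -o movie_av1.mkv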


You wouldn't save much going from h.265 -> AV1 but h.264 -> AV1 should be a significant savings.

AV1 is ~30% more efficient than h.265. And h.265 is ~40% more efficient than h.264. The lower the resolution of the video, the less savings you will get.

I personally will wait until more devices support hardware AV1 decoding. I believe only Intel 12th gen, and Nvidia 3000 support AV1 currently. I don't plan on purchasing a new laptop or smartphone for 3-4 years, at that point I may start encoding in AV1.


AMD 6000 series cards also support hardware decoding of AV1.


Except those based on Navi 24 (6400/6500) IIRC.


See here for performance benchmarks on dav1d (av1 e̶n̶c̶o̶d̶e̶r̶):

https://openbenchmarking.org/test/pts/dav1d#results

Edit: that's wrong! This is just a decoder, not an encoder; I just learned that.

Here is the link for the performance benchmarks for SVT-AV1, the most popular encoder as I just learned:

https://openbenchmarking.org/test/pts/svt-av1


dav1d is an av1 decoder


We just need to hold out hope for the release of dav1e


Does Team Videolan have anything like it on their roadmap?


It's not a videolan project, but is rav1e the kind of thing you're looking for?


But is the psychovisual output any good, or do they only target, say, speed? Because back in the day, both VideoLAN projects x264 and x265 (especially x264) had much better psychovisual quality than commercial encoders.

Back in those days, x264 felt to me like software written by aliens from the future.


(sorry for venting time :)

> Back in those days, x264 felt to me like software written by aliens from the future.

Indeed it may be :) H.264 might not be the latest and best video coding standard anymore, but in my opinion x264 is, and always will be, by far the best encoder ever written for a video format.

Which brings me to...

> Because back in the day, both VideoLAN projects x264 and x265 (especially x264) had much better psychovisual quality than commercial encoders.

Unfortunately x265 isn't a VideoLAN project (it's developed by MulticoreWare Inc.), and it's a very mediocre encoder which doesn't hold a candle to x264, which IMO is kind of a shame considering its legacy.

Also, after 2018 it practically became maintenance-only and was surpassed by proprietary encoders in the MSU encoder tests in the following years. That's a big loss considering x264 was still seeing significant efficiency and performance improvements as late as 2013 (when the H.264 format was 10 years old), so compared to x264 I assume a good 4-5 years of potential improvements have been left on the table for x265.


Yup.

I think the big issue is x264 was very obviously a labor of love from some very talented developers. I just haven't seen that sort of love dumped into other encoders.

Newer codecs have been relying on the format to provide more obvious tools for compression (and mostly giving benefits for HD+ resolutions).


Would you mind giving a few more details as to why x264 is so good, and x265 isn't?


I wouldn't necessarily say x265 is bad. Rather, they simply haven't taken it to the extreme levels of optimization that the x264 devs took x264 to. [1]

Up until the end of development, x264 was hyper focused on getting the best possible subjective quality with the smallest possible bitrate. To date, the x264 CRF metrics are (IMO) unparalleled in consistency. With other codecs a similar CRF mode is simply, well, shit. I can't just set stuff to "CRF 20" and expect the output to hit roughly the same level of quality. VP9, in particular, is terrible with this. In VP9 CRF is more closely related to the bitrate than the actual quality of the scenes being encoded.

To be clear, even with these critiques you SHOULD choose x265, vp9, or AV1 over x264 for your encoding choices. They have better specs that allow for better compression. However, they are also leaving a lot on the table for what they COULD do.

I currently do VP9 + VMAF on each scene to set a CRF value (using my own thing similar to Av1an). That gives good, consistent results at minimal bitrates. It's just a little terrible (IMO) that I have to do so much work that the encoder should theoretically be able to do better itself.
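In case it's useful, the core of that per-scene loop is something like this (assuming an ffmpeg build with libvpx-vp9 and libvmaf; scene.mkv stands in for one already-split scene, and the CRF value is whatever your outer search loop is currently trying):

ffmpeg -y -i scene.mkv -c:v libvpx-vp9 -crf 32 -b:v 0 -an scene_test.webm

ffmpeg -i scene_test.webm -i scene.mkv -lavfi libvmaf -f null -

The libvmaf filter takes the distorted file as the first input and the reference as the second, and prints a "VMAF score" line at the end of the run; you raise or lower the CRF until that score hits your target and keep the result.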

[1] https://web.archive.org/web/20100105000031/http://x264dev.mu...


My understanding is that there is no intrinsic meaning to "crf", and that it is just a rough way of controlling bitrate (in that it refers to internal variables in the specific implementation of the encoder), am I mistaken about this?

What are up-to-date AV1 encoders still leaving on the table as far as optimization is concerned?


> My understanding is that there is no intrinsic meaning to "crf", and that it is just a rough way of controlling bitrate (in that it refers to internal variables in the specific implementation of the encoder), am I mistaken about this?

You are not mistaken. The difference is in how reliable the control is regardless of input video.

For libvpx, the CRF control is garbage. A CRF of 30 will be good for some scenes and horrible for scenes that are too dark or have too much motion. It means if you want to just use libvpx (or ffmpeg), you are often setting that CRF way lower than you need to so scenes where it fails don't end up looking like smooth color blobs. It's bad enough that they introduced a "minimum bitrate" flag.

x264 is not like that. The amount of adjustment you have to do to CRF for a given input is extremely minor; I found between 20 and 24 to be more than acceptable. For vpx, you need to come up with a value anywhere from 10 to 50 depending on the source.
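To make that concrete, the ffmpeg incantations for the two modes look roughly like this (values are illustrative, not recommendations):

ffmpeg -i input.mkv -c:v libx264 -crf 22 out_x264.mkv

ffmpeg -i input.mkv -c:v libvpx-vp9 -crf 30 -b:v 0 out_vp9_cq.webm

ffmpeg -i input.mkv -c:v libvpx-vp9 -crf 30 -b:v 2M out_vp9_constrained.webm

The first is plain x264 CRF, the second is libvpx's "constant quality" mode (which needs -b:v 0), and the third is the constrained-quality fallback where the CRF is paired with a bitrate ceiling because the CRF alone can't be trusted.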

I get that a lot of this is subjective experience, but it's what I've experienced doing a bunch of dvd rips.

> What are up-to-date AV1 encoders still leaving on the table as far as optimization is concerned?

The biggest seems to be good quality controls that have been tuned by someone with a good subjective eye for that sort of thing. Beyond that, IDK, the bitstreams allow for a LOT more transformations than H.264 allowed for, yet the encoders don't seem to have the same level of complexity. For example, x264 came up with a bunch of motion vector search patterns over its evolution. You don't see those sorts of developments with the other encoders.

Heck, you even saw that sort of care for quality output in the fact that x264 has tuning guides for the (at the time) common objective measures of quality, SSIM and PSNR (which returned worse quality than the x264 subjective quality tuning).

IDK, this may also be that I don't have as much time to geek out over video codecs :).


Thank you for the writeup.


Dark Shikari's blog was great. Defunct now, but I think it's all on the Internet Archive. https://web.archive.org/web/20100104193513/http://x264dev.mu...

I don't know of any benchmarks, but rav1e does have --tune psychovisual, and there are issues raised against it, so it seems they take it seriously.


SVT-AV1 is pretty fast. Well, depending on CRF/preset etc. of course, but in general, the more cores you have, the easier it is to get realtime encoding with decent quality. You may want to try ab-av1 first, and then go and play with ffmpeg manually if the result does not satisfy your needs.
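If it helps, ab-av1's whole purpose is automating that CRF/preset search against a VMAF target; from memory (so treat the exact flags as an assumption and check ab-av1 --help), usage is along the lines of:

ab-av1 crf-search -i input.mkv --preset 8 --min-vmaf 95

ab-av1 auto-encode -i input.mkv --preset 8 --min-vmaf 95

The first just reports the CRF it settles on; the second goes ahead and does the full encode with it.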

For other encoders that do not do parallel encoding that well there are things like av1an.


I tried a few days ago (libaom, SVT, and rav1e), and I'm getting much slower speeds than with VP9, at comparable quality.


AV1 is great, and I'm excited about it, but for my purposes I'm even more excited about AVIF. I hope Apple adds support to their OSes soon (which in turn means Safari supporting it).


I'm pretty excited for it. I encoded my blog's images to AVIF a year or two ago, and redid them a few weeks ago. (I use <picture> to allow browsers to choose which format to get.)

JPEG XL looks pretty good, too, but current browser support is a flag in Chrome. I wonder if they can squeeze better quality out of low filesizes (0.3~0.6 bits per pixel) to compete better with AVIF.


I find it incredibly difficult to find out from this site what exactly differentiates AV1 from, say, x264/5.

The introduction appears to only be a legal disclaimer?


For starters, x264/x265 are software video encoders, h264/h265 are the respective standards. AV1 is a standard, libaom is the reference software encoder.

H264 and h265 are patent encumbered, so depending what you are doing you need to pay royalties to use them. You are free to use AV1 without paying any royalties. Also, AV1 can deliver better quality video at the same bitrate compared to h264/h265 (at the expense of encoding time). This makes it useful for things like Netflix, where they can put lots of effort into encoding something once and then reap the bandwidth reduction reward for each person who watches it.


> H264 and h265 are patent encumbered

Same as AV1 [1].

> use AV1 without paying any royalties

Same as H.264/5 for almost all normal usage.

https://www.streamingmediaglobal.com/Articles/ReadArticle.as...


Yeah, a bunch of parasites who had nothing to do with the development of the codec and yet they're desperately trying to extract rent.

It doesn't really matter though, because AOmedia's patent license here:

https://aomedia.googlesource.com/aom/+/refs/heads/main/PATEN...

has a defensive clause:

> Defensive Termination. If any Licensee, its Affiliates, or its agents initiates patent litigation or files, maintains, or voluntarily participates in a lawsuit against another entity or any person asserting that any Implementation infringes Necessary Claims, any patent licenses granted under this License directly to the Licensee are immediately terminated as of the date of the initiation of action

So if you sue anybody for patents you essentially lose access to all of the relevant patents by any of those companies:

https://aomedia.org/membership/members/

So please stop spreading baseless FUD.


Basically, it's NOT patent-free, it is completely patent-encumbered, but since most usage of AV1 is on YouTube, Google will help you defend yourself against suers.

However, make a competitor of YouTube, and I wish you good luck not paying those fees.

With regards to "from people who had nothing to do with development", there are many research centers, not patent trolls, who actually published their research publicly. Maybe the people who did AV1 didn't read the literature, but I'll presume they did.


> Google will help you defend yourself against suers

They don't indemnify you from patent claims.

They merely offer you the ability to use their patents to help defend yourself. But against the likes of NTT, Orange, Philips etc. the patents aren't the issue; running out of money to litigate the issue is.


> Yeah, a bunch of parasites

I think it's worth tempering the language here.

These aren't patent trolls but companies who have been involved in codec design for decades and will be involved in ones in the future e.g. NTT, Dolby, Toshiba.

> So please stop spreading baseless FUD.

No one is spreading FUD. You have outlined a worthless termination clause given that it only applies to those who are interested in implementing the codec. That offers zero protection against third party claims like an indemnification would.


Anyone spamming patents and going after independent reinventors is being a parasite. It doesn't matter how much legitimate research they also do. No tempering is needed.

And that's assuming validity in the first place.


> I think it's worth tempering the language here.

Normally I would agree with you, but not in this case. Software patents need to die, and anyone who tries to extract rent through them *is* a parasite. Especially if they employ mafia-like tactics like those patent pools. Oh, quite a nice codec you have there; it'd be a shame if anything happened to it.

Nowadays you can't even fart without infringing someone else's patent. So, sorry, but I'm not going to apologize for calling them parasites. This is one of very few strong opinions I have, and is a hill I'll gladly die on.


This isn’t quite right. While AV1 is “open”, patent holders have still claimed patents covering aspects of it. So there are still issues with trying to use/distribute it as those patent holders can still claim infringement. Linux is another example of this.


Relative to VP9, https://arxiv.org/pdf/2008.06091.pdf is a good reference. The technical details on the official AOM site are pretty much just the raw specs and code.


This is fantastic, do you have one of these for VP9, H265, and H264 perchance?


AV1 is an evolution of VP9 (based on unreleased VP10 + contributions from other codecs).

In terms of compression efficiency it's on par with H.265, but it's free to use.


What would be a good way to go about testing this codec? Isn't video conversion a lossy activity? Would I have to start by re-ripping Blu-ray discs to negate loss in quality?

My usual playback targets are almost exclusively a web browser or Chromecast, so storing media in formats that would allow for more direct play opportunities would be fantastic.


Keep in mind that even your Blu-ray disc is heavily lossy-compressed.

Almost every source you can get your hands on will be lossy compressed - unless you generate your own - but even then, most consumer devices don't even allow recording into lossless formats.

Uncompressed video is mind bogglingly huge.
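As a back-of-the-envelope illustration (assuming plain 8-bit 4:2:0, i.e. 1.5 bytes per pixel, which is what most consumer video uses):

echo $(( 1920 * 1080 * 3 / 2 * 24 ))   # bytes per second of raw 1080p24: ~75 MB/s, i.e. roughly 600 Mbit/s

That's very roughly 20x the video bitrate of a typical Blu-ray, and a couple of orders of magnitude more than a typical streaming bitrate.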


What is new here?


I assume it's because this hit the front page: https://news.ycombinator.com/item?id=31317989


Given that discussion around licensing, what's the licensing story around AV1?


> non-sublicensable, perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable

According to its license which is very short and can be read here:

https://code.videolan.org/videolan/dav1d/-/blob/master/doc/P...


That is the license by the creators but does not mean other companies/individuals won’t claim that it infringes on their patents. So if you adopt it, you’re still likely to be accused of infringing their patents.


That link is confusing, because it's not a license by dav1d creators, but a copy of the official AOMedia license: https://aomedia.org/license/patent-license/

and it's backed by these companies: https://aomedia.org/membership/members/


That's a hypothetical legal issue, not an actual one. Also, the companies involved in AV1 did their best to avoid that pitfall in development and have also shown a willingness to go after patent trolls who might try to pull something.


https://www.streamingmedia.com/Articles/ReadArticle.aspx?Art...

https://www.sisvel.com/blog/audio-video-coding-decoding/sisv...

It became a real legal issue before the first consumer AV1 hardware was ever launched.


That's true of every license ever. You can treat it as implied.


Is it possible that AV1 will reach the encoding speeds of h264/h265?


No, all else being equal. AV1 is more complex to encode and decode than HEVC/H.265, which is the tradeoff for higher efficiency.

Yes, in the sense that some AV1 codec implementations will be faster than some HEVC/H.265 implementations.




