But is the psychovisual output any good or do they only target, say, speed? Because back in the days, both Videolan projects x264 and x265 (especially x264) had much better psychovisual quality than commercial encoders.
Back in those days, x264 felt to me like a software written by aliens from the future.
> Back in those days, x264 felt to me like a software written by aliens from the future.
Indeed it may be :) H.264 might not be the latest and best video coding standard anymore, but in my opinion x264 is, and will always be by far the best encoder ever written for a video format.
Which brings me to...
> Because back in the days, both Videolan projects x264 and x265 (especially x264) had much better psychovisual quality than commercial encoders.
Unfortunately x265 isn't a Videolan project (it's developed by MulticoreWare Inc.), and it's a very mediocre encoder that doesn't hold a candle to x264, which IMO is kind of a shame considering its legacy.
Also, after 2018 it practically became maintenance-only and was surpassed by proprietary encoders in the MSU encoder tests in the following years. That's a big loss considering x264 was still seeing significant efficiency and performance improvements as late as 2013 (when the H.264 format was 10 years old), so compared to x264 I assume a good 4-5 years of potential improvements have been left on the table for x265.
I think the big issue is x264 was very obviously a labor of love from some very talented developers. I just haven't seen that sort of love dumped into other encoders.
Newer codecs have been relying on the format to provide more obvious tools for compression (and mostly giving benefits for HD+ resolutions).
I wouldn't necessarily say x265 is bad. Rather, they simply haven't taken it to the extreme levels of optimization that the x264 devs took x264 to. [1]
Up until the end of development, x264 was hyper focused on getting the best possible subjective quality with the smallest possible bitrate. To date, the x264 CRF metrics are (IMO) unparalleled in consistency. With other codecs a similar CRF mode is simply, well, shit. I can't just set stuff to "CRF 20" and expect the output to hit roughly the same level of quality. VP9, in particular, is terrible with this. In VP9 CRF is more closely related to the bitrate than the actual quality of the scenes being encoded.
To be clear, even with these critiques you SHOULD choose x265, vp9, or AV1 over x264 for your encoding choices. They have better specs that allow for better compression. However, they are also leaving a lot on the table for what they COULD do.
I currently do VP9 + VMAF on each scene to set a CRF value (using my own thing similar to AV1AN). That gives good, consistent results at minimal bitrates. It's just a little terrible (IMO) that I have to do so much work that the encoder should theoretically be able to do better itself.
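For anyone curious what a per-scene search like that can look like, here is a rough sketch. It is not my actual tool or how AV1AN does it. `pick_crf` and `score_at` are hypothetical names: `score_at(crf)` stands in for "encode this scene at that CRF and return its VMAF", and the search assumes the score never goes up as CRF goes up.

```python
from typing import Callable


def pick_crf(score_at: Callable[[int], float], target: float,
             lo: int = 10, hi: int = 50) -> int:
    """Return the highest CRF in [lo, hi] whose score still meets `target`.

    Assumption: score_at(crf) is non-increasing in crf (higher CRF, lower
    quality). If even the lowest CRF misses the target, `lo` is returned.
    """
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        if score_at(mid) >= target:
            best = mid        # good enough, try spending fewer bits
            lo = mid + 1
        else:
            hi = mid - 1      # too lossy, back off to a lower CRF
    return best


if __name__ == "__main__":
    # Synthetic stand-in for a real encode + VMAF measurement.
    def fake_vmaf(crf: int) -> float:
        return 100.0 - 0.5 * crf

    print(pick_crf(fake_vmaf, target=93.0))
```

Each probe is a full encode plus a VMAF run, so the binary search matters: the 10 to 50 range takes at most six probes per scene instead of forty-one.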
My understanding is that there is no intrinsic meaning to "crf", and that it is just a rough way of controlling bitrate (in that it refers to internal variables in the specific implementation of the encoder), am I mistaken about this?
What are up-to-date AV1 encoders still leaving on the table as far as optimization is concerned?
> My understanding is that there is no intrinsic meaning to "crf", and that it is just a rough way of controlling bitrate (in that it refers to internal variables in the specific implementation of the encoder), am I mistaken about this?
You are not mistaken. The difference is in how reliable the control is regardless of input video.
For libvpx, the CRF control is garbage. A CRF of 30 will be good for some scenes and horrible for scenes that are too dark or have too much motion. It means if you want to just use libvpx (or ffmpeg), you are often setting that CRF way lower than you need to so scenes where it fails don't end up looking like smooth color blobs. It's bad enough that they introduced a "minimum bitrate" flag.
x264 is not that experience. The amount of CRF adjustment you have to do for a given input is extremely minor; I found values between 20 and 24 to be more than acceptable. For vpx, you need to come up with a value anywhere from 10 to 50 depending on the source.
I get that a lot of this is subjective experience, but it's what I've experienced doing a bunch of dvd rips.
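To make the comparison concrete, this is roughly how the two CRF modes get invoked through ffmpeg, sketched as Python helpers that only build the argument lists. The helper names and default CRF values are mine, for illustration. The flags are the standard ffmpeg ones as I understand them: `-crf` for libx264, and `-crf` plus `-b:v 0` for libvpx-vp9's constant-quality mode.

```python
from typing import List


def x264_cmd(src: str, dst: str, crf: int = 22) -> List[str]:
    """ffmpeg arguments for a libx264 CRF encode (hypothetical helper)."""
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-crf", str(crf), dst]


def vp9_cmd(src: str, dst: str, crf: int = 30) -> List[str]:
    """ffmpeg arguments for a libvpx-vp9 constant-quality encode.

    The "-b:v 0" part is what switches libvpx-vp9 into pure constant-quality
    mode; without it the CRF value acts as a quality cap on a bitrate target.
    """
    return ["ffmpeg", "-i", src, "-c:v", "libvpx-vp9",
            "-crf", str(crf), "-b:v", "0", dst]


if __name__ == "__main__":
    print(" ".join(x264_cmd("in.mkv", "out_x264.mkv")))
    print(" ".join(vp9_cmd("in.mkv", "out_vp9.webm")))
```

The commands look almost identical, which is kind of the point: the knob is the same, but how far you can trust a single value across sources is not.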
> What are up-to-date AV1 encoders still leaving on the table as far as optimization is concerned?
The biggest seems to be good quality controls that have been tuned by someone with a good subjective eye for that sort of thing. Beyond that, IDK, the bitstreams allow for a LOT more transformations than H.264 allowed for, yet the encoders don't seem to have the same level of complexity. For example, x264 came up with a bunch of motion vector search patterns over its evolution. You don't see those sorts of developments with the other encoders.
Heck, you even saw that sort of care for quality output in the fact that x264 has tuning guides for the (at the time) common objective measures of quality, SSIM and PSNR (which returned worse quality than the x264 subjective quality metrics).
IDK, this may also be that I don't have as much time to geek out over video codecs :).