
Related question: is there any decent OS software that can intelligently pick encoding parameters that preserve quality while minimizing size?

I recently tried to implement video uploading for an open source project, but naively choosing ffmpeg parameters can often result in noticeable quality loss / large output file size / long encoding time. And easily all three of those at the same time.




This is a known and hard problem. A lot of companies are trying to internally analyze their own content libraries and intelligently tune their encoders. Netflix is probably the best known for its work on video analysis and for introducing VMAF [1], but other metrics that let you compare the original/master against the encoded variant also exist (PSNR, SSIM, etc.). The bottom line is that you need a lot of trial and error and fitting curves to your bitrate/quality graphs, and often what the metrics consider good quality doesn't agree 100% with the human visual system. It's a very interesting problem nonetheless, and I recommend [2] if you want to learn more (a minimal scoring sketch follows the links).

[1] https://en.wikipedia.org/wiki/Video_Multimethod_Assessment_F...

[2] https://netflixtechblog.com/toward-a-practical-perceptual-vi...
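
For what it's worth, here is a minimal sketch of scoring an encode against its master with ffmpeg's libvmaf filter (assuming an ffmpeg build compiled with libvmaf; the filenames are placeholders and the exact log line being parsed varies between versions):

    # Rough sketch: score an encode against its source with ffmpeg's libvmaf
    # filter. Assumes ffmpeg was built with --enable-libvmaf; paths are
    # placeholders.
    import re
    import subprocess

    def vmaf_score(distorted: str, reference: str) -> float:
        # libvmaf takes the distorted clip as the first input and the
        # reference as the second; the score is printed on stderr.
        cmd = ["ffmpeg", "-i", distorted, "-i", reference,
               "-lavfi", "libvmaf", "-f", "null", "-"]
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
        # Log format differs a bit across ffmpeg/libvmaf versions.
        match = re.search(r"VMAF score: ([\d.]+)", out.stderr)
        if not match:
            raise RuntimeError("could not find VMAF score in ffmpeg output")
        return float(match.group(1))

    print(vmaf_score("encoded.mp4", "master.mp4"))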


> but naively choosing ffmpeg parameters can often result in noticeable quality loss

Welcome to the world of video compression. If you're not able to take the time to learn the ins and outs of how to use a codec, as well as take each incoming video's specifics into consideration, then you'll need to borrow someone's middle-of-the-road presets. Dedicated settings make decisions based on frame size, bitrate restrictions, things like HLS vs. download/play, 1-pass/multi-pass, etc. All of that determines GOP size, reference frames, and so on.
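
To make that concrete, here's an illustrative (not tuned) sketch of how two use cases might map to different libx264 settings via ffmpeg; the profile names and numbers are made up for the example:

    # Illustrative only: use-case-specific decisions (GOP size, reference
    # frames, rate control) expressed as libx264 settings. Values are
    # placeholders, not recommendations.
    import subprocess

    PROFILES = {
        # HLS: fixed keyframe interval so segments cut cleanly, plus a
        # bitrate cap (VBV) so players can buffer predictably.
        "hls_1080p": [
            "-c:v", "libx264", "-preset", "fast", "-crf", "21",
            "-maxrate", "6M", "-bufsize", "12M",
            "-x264-params", "keyint=48:min-keyint=48:scenecut=0:ref=3",
        ],
        # Download/play: no segmenting constraints, so longer GOPs and more
        # reference frames for better compression.
        "download_1080p": [
            "-c:v", "libx264", "-preset", "slow", "-crf", "20",
            "-x264-params", "keyint=250:ref=5",
        ],
    }

    def encode(src: str, dst: str, profile: str) -> None:
        subprocess.run(["ffmpeg", "-i", src, *PROFILES[profile], dst],
                       check=True)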


The problem really is encoding from compressed video to compressed video again. It really isn't great whatever you do (Facebook/Twitter definitely haven't solved this problem either).

The quality you get going from source -> 15 Mbit/s H.265 -> 2 Mbit/s H.264 (like on a typical social media site) is absolutely terrible compared to going straight from source -> 2 Mbit/s H.264.


Facebook/YouTube don't care about how their compressed video looks. They just need it in a format they can get in front of their millions of users' eyeballs. I'd be willing to bet that >95% of users don't "care" about compression quality. They just want the content, hence those decisions. There's just no way to properly encode that much content with a "we care about compression" mindset.


Why is that? Isn't 15 mbit to 2 mbit a big enough difference in quality that you'd expect to retain all the good parts?


Not really - "raw" 1080p video is ~3 Gbit/s - so going from 3000 to 15 is already a huge amount of compression.
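
Back-of-envelope, assuming 1080p60 at 24 bits per pixel (8-bit RGB/4:4:4; 8-bit 4:2:0 would be half that):

    # Uncompressed 1080p bitrate, roughly. Assumptions: 60 fps, 24 bits/pixel.
    width, height, fps, bits_per_pixel = 1920, 1080, 60, 24
    raw_bps = width * height * fps * bits_per_pixel
    print(raw_bps / 1e9)  # ~2.99 Gbit/s, vs 0.015 for a 15 Mbit/s encode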

The problem when you go from 15 to 2, as in my example, is that the encoder spends all its time basically trying to encode the artefacts of the first encode. It really doesn't work well at all.


Hm, that's too bad. Is there a better way to compress already-compressed streams? I find myself having to do that a lot due to a use case that's not very relevant here.


Handbrake [0] is a pretty user-friendly GUI with some good presets. It's not quite the automatic optimizing tool you're describing but does a good job for most basic tasks.

0: https://handbrake.fr/


I spent about 1-2 years on this and built an ML model. It's a pretty deep problem, and there's no clear winner in terms of OSS software to do this.


But what exactly was the problem you were trying to solve?


The parameters for x264 already expose everything you need along the right dimensions (speed, compatibility, quality, type of content like animation/film/screen recording). The problem is solved; the issue is that people built frontends on top of it that present the options in the wrong way and ruin them.


Yes, the x264 params are all useful. It's selecting the optimal combination for a given use case that presents a challenge. Vimeo's video quality (and therefore file size) is much higher than Facebook's, because their use cases are completely different.


I second this question. I also recently had this experience [0] and the results are disheartening.

[0] https://stackoverflow.com/q/61784204/741970


The correct answer for that is CRF 22 or so, plus a speed preset your hardware can tolerate. The presets come with x264/x265/ffmpeg.
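
In ffmpeg terms that might look like the sketch below (libx264 with placeholder filenames and audio settings; pick the slowest preset your hardware tolerates):

    # The parent's suggestion as an ffmpeg call: constant quality (CRF) plus
    # a speed preset. Filenames and audio settings are placeholders.
    import subprocess

    subprocess.run([
        "ffmpeg", "-i", "input.mp4",
        "-c:v", "libx264", "-crf", "22", "-preset", "medium",
        "-c:a", "aac", "-b:a", "128k",
        "output.mp4",
    ], check=True)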


I am not sure if this is still the case, but be aware that the same CRF value on x264 vs. x265 can mean a different (perceived) quality. It was certainly the case a few years ago.


I think so. It's also different in 8-bit and 10-bit.


Plus maybe denoise in front to keep the size lower.
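
For example, something like this with ffmpeg's hqdn3d filter ahead of the encoder (the filter strengths here are just illustrative):

    # Light denoise before encoding; less noise/grain means fewer wasted bits.
    # Strengths are illustrative, tune to taste.
    import subprocess

    subprocess.run([
        "ffmpeg", "-i", "input.mp4",
        "-vf", "hqdn3d=1.5:1.5:6:6",
        "-c:v", "libx264", "-crf", "22", "-preset", "medium",
        "-c:a", "copy",
        "denoised.mp4",
    ], check=True)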


I have been interested in this and to my surprise I was able to find just a single project that does something similar: https://github.com/master-of-zen/Av1an

It has the ability to do trial compression (w/ scene splitting) and evaluate quality loss up to a desired factor.
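
Going from memory, a run that targets a VMAF score looks roughly like this (check av1an --help, since the flags may have changed; the encoder choice is just an example):

    # Av1an probes each scene at several settings and picks one that hits the
    # requested VMAF target. Flags written from memory, so verify them.
    import subprocess

    subprocess.run([
        "av1an", "-i", "input.mkv", "-o", "output.mkv",
        "--encoder", "x265",        # example encoder choice
        "--target-quality", "95",   # aim for ~95 VMAF per scene
    ], check=True)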


this is where 2-pass encoding would be of benefit


There really isn't a point in using 2-pass when you have CRF/CQP.


unless file size is a factor, like here


Minimizing size isn’t a constraint that needs 2pass encoding. Only targeting a specific size is.

(Which could include max size over time constraints like VBV.)
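
For completeness, a size-targeted 2-pass run might look roughly like the sketch below (average bitrate derived from the size budget, VBV caps layered on top; every number here is a placeholder):

    # Size-targeted 2-pass encode: derive an average bitrate from the budget,
    # then cap peaks with -maxrate/-bufsize (VBV). Values are placeholders.
    import subprocess

    target_megabytes = 50
    duration_seconds = 300
    # Leave ~5% of the budget for audio and container overhead.
    video_kbps = int(target_megabytes * 8192 / duration_seconds * 0.95)

    common = ["-c:v", "libx264", "-b:v", f"{video_kbps}k",
              "-maxrate", f"{int(video_kbps * 1.5)}k",
              "-bufsize", f"{video_kbps * 2}k",
              "-preset", "slow"]

    # Pass 1: analysis only, no real output needed (/dev/null on Unix).
    subprocess.run(["ffmpeg", "-y", "-i", "input.mp4", *common,
                    "-pass", "1", "-an", "-f", "null", "/dev/null"], check=True)
    # Pass 2: the actual encode, reusing the pass-1 stats file.
    subprocess.run(["ffmpeg", "-i", "input.mp4", *common,
                    "-pass", "2", "-c:a", "aac", "-b:a", "128k",
                    "output.mp4"], check=True)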



