Digging in a little, based on similarities in the readme and the mailing list being the same, it appears to be related to Intel[1]. Looks to be an extension of Intel's Visual Cloud Computing efforts[2].
Edit: Feeling dumb, but confirmed, Intel is also in the license[3] :)
Anyway, can't wait to test this. Right now my library is still mainly h264. I wanted to move everything to vp9 a while back, but it was too slow; then I tried hevc, which was faster, but not really satisfactory either. Hope this gets down to vp9 encoding speed so that it's at least feasible...
I was disappointed that your submission hadn't caught traction, so I reposted it with "Intel releases ..." in the title, because for me that was the newsworthy bit. I was interested in what the community had to say about this Intel code being optimized for Intel CPUs.
It seems the mods changed the title again, though, and reading the comments, people seem surprised that this is Intel's.
That is still way too much. A 4K RGB 8-bit frame is about 25 MB, and many frames can be operated on at once, but I doubt the equivalent of around 2,000 uncompressed 4K frames (fewer with 10-bit color) need to be in memory all at once.
Scaling video encode to 112 CPU cores is hard. I haven't looked too hard into this encoder, but the usual way to scale that high is to encode entire segments in parallel. (YouTube in particular supposedly encodes each segment single-threaded, which is why libvpx's poor scaling doesn't matter to them.) Which effectively means encoding up to 112 independent 4K streams.
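Segment-parallel encoding is easy to picture as a job dispatcher: cut the frame sequence into fixed-length segments and hand each one to its own single-threaded encoder instance. A minimal sketch, where `encode_segment` is a hypothetical stand-in for a real encoder call:

```python
from concurrent.futures import ProcessPoolExecutor

def encode_segment(segment_id, frames):
    """Stand-in for one single-threaded encoder instance.

    Each segment is an independent stream: it carries its own
    references, lookahead, and rate control, so no cross-segment
    synchronization is needed.
    """
    return segment_id, len(frames)

def parallel_encode(all_frames, segment_len=60, workers=112):
    # Split the source into fixed-length segments, then encode each
    # one in its own process, up to `workers` at a time.
    segments = [all_frames[i:i + segment_len]
                for i in range(0, len(all_frames), segment_len)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_segment,
                             range(len(segments)), segments))
```

The memory cost follows directly: every in-flight segment holds a full encoder state, so peak usage scales with the worker count rather than with the length of the video.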
Each stream could need:
- one source frame
- additional source frames for reordering (3-7 is pretty normal)
- additional source frames for rate control (x264's default is 40)
- recon for the frame being encoded
- reference frames (IIRC AV1 allows up to 8 to be stored)
Plus MVs, modes, maybe subpel caches, etc.
That's easily 50-60 frames per stream. Times maybe 112 streams for 6000 frames. Easily tunable of course, especially with even a little intra-segment parallelism.
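The back-of-envelope math above can be written out explicitly. All the per-stream counts here are the illustrative numbers from the list (reordering depth, x264-style lookahead of 40, up to 8 AV1 references), not measurements of this encoder:

```python
def frame_bytes(width=3840, height=2160, bytes_per_sample=1, planes=3):
    # ~25 MB for an uncompressed 8-bit 4K RGB frame, as in the thread.
    return width * height * bytes_per_sample * planes

def per_stream_frames(source=1, reorder=7, lookahead=40, recon=1, refs=8):
    # One source frame, reordering buffer, rate-control lookahead,
    # the recon of the frame being encoded, and reference frames.
    return source + reorder + lookahead + recon + refs  # 57 frames

streams = 112
total = streams * per_stream_frames() * frame_bytes()
print(f"{total / 2**30:.0f} GiB")  # roughly 148 GiB at these assumptions
```

So ~57 frames per stream times 112 streams lands in the same ballpark as the parent's "around 2000 uncompressed 4K frames," before counting MVs, mode buffers, or subpel caches.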
I understand how an encoder could eat up so much memory and justify it in some way, but I can't buy that it's a necessity, or even acceptable in the long run (though maybe it's stated to be at the prototype stage).
From what I've seen, AV1 breaks frames/segments up into a kd-tree and brute-forces the leaves to find the transformation that looks best at the smallest size. An oversimplification, obviously, but with everything encoders are doing, I still think it's naive to design them with such a simplistic view of concurrency that they have to be treated as a hundred small files for a hundred CPU cores.
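The partition search described above is essentially a recursive min over "code the block whole" versus "split and recurse." A toy sketch, where `cost` is a hypothetical stand-in for a real rate-distortion score:

```python
def search(block_size, cost, min_size=8):
    """Return the cheapest cost of coding a square block.

    Compares coding the block as-is against splitting it into four
    half-size sub-blocks and recursing, keeping whichever is cheaper.
    Real encoders evaluate rate + distortion across many prediction
    modes at each node; `cost(size)` is a placeholder for that score.
    """
    whole = cost(block_size)
    if block_size <= min_size:
        return whole
    split = 4 * search(block_size // 2, cost, min_size)
    return min(whole, split)
```

Nothing in this recursion forces segment-level parallelism: the subtrees under a node are independent, which is exactly the commenter's point about finer-grained concurrency being available inside a single stream.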
[1]: https://github.com/intel/SVT-HEVC
[2]: https://www.intel.com/content/www/us/en/cloud-computing/visu...
[3]: https://github.com/OpenVisualCloud/SVT-AV1/blob/master/LICEN...