1.) Why can't you buffer on the client side for WebRTC? That sounds like a clien...

sumy23 · on June 6, 2022

1) The point of WebRTC is that it’s real-time. If you buffer then it’s not real-time.

2) Adding key frames increases the bitrate greatly which exacerbates problem 1.

Sean-Der · on June 6, 2022

1) I don't think WebRTC has a specific point. Lots of users came together with their use cases, and it was designed by consensus. WebRTC can (and does) have toggles around latency/buffering.

2) I am not aware of a way you can have no keyframes but still be decodable at any time. I have only done it 'HLS style' or WebRTC 1:1. Curious if anyone else has different solutions.

sumy23 · on June 6, 2022

1) WebRTC and RTP both have RT in their name. RT stands for real-time. If I recall correctly, the only buffer WebRTC has is the jitter buffer, which is used for packet ordering, not for ensuring that enough has buffered to handle bitrate spikes.
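To make the "packet ordering" role concrete, here is a minimal sketch of a reorder buffer. It is not any WebRTC implementation's jitter buffer: the class name `ReorderBuffer` is hypothetical, and it assumes sequence numbers never wrap, no packet is lost, and there is no playout timing.

```python
class ReorderBuffer:
    """Toy reorder buffer: releases payloads in sequence-number order.

    Assumptions (not true of real jitter buffers): sequence numbers never
    wrap around, no packet is ever lost, and there is no playout clock.
    """

    def __init__(self, first_seq=0):
        self._pending = {}          # seq -> payload, held until in order
        self._next_seq = first_seq  # next sequence number to release

    def push(self, seq, payload):
        """Store one packet and return the payloads that are now in order."""
        if seq >= self._next_seq:
            # Keep the first copy of a packet; later duplicates are ignored.
            self._pending.setdefault(seq, payload)
        released = []
        while self._next_seq in self._pending:
            released.append(self._pending.pop(self._next_seq))
            self._next_seq += 1
        return released


buf = ReorderBuffer()
print(buf.push(1, "b"))  # held back, packet 0 has not arrived yet
print(buf.push(0, "a"))  # both packets are released, in order
```

Note that nothing here absorbs a bitrate spike: the buffer only holds packets until the gap in front of them is filled.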

2) Yes, you either need frequent keyframes (a short keyframe interval) or some type of out-of-band signaling to request keyframes on demand. WebRTC uses RTCP for this. A good question is why WebRTC considers RTCP necessary at all. Why not generate a keyframe every N seconds like you do with HLS and remove the complexity of RTCP entirely? The answer is that many clients cannot handle the resulting bitrate at real-time speeds.
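A rough back-of-the-envelope sketch of that bitrate argument follows. The function name and all frame sizes are made-up illustrative numbers, not measurements; the "peak" figure assumes a keyframe must be delivered within a single frame interval, with no buffering or pacing.

```python
def stream_bitrates(fps, p_frame_bits, keyframe_bits, keyframe_interval_s):
    """Return (average_bps, peak_bps) for a stream with periodic keyframes.

    peak_bps is the rate needed to deliver one keyframe within a single
    frame interval, i.e. with no buffering and no pacing.
    """
    frames_per_gop = round(fps * keyframe_interval_s)
    # One keyframe plus (frames_per_gop - 1) predicted frames per interval.
    gop_bits = keyframe_bits + (frames_per_gop - 1) * p_frame_bits
    average_bps = gop_bits / keyframe_interval_s
    peak_bps = keyframe_bits * fps
    return average_bps, peak_bps


# Hypothetical sizes: 20 kbit predicted frames, 200 kbit keyframes, 30 fps.
for interval in (1, 2, 10):
    avg, peak = stream_bitrates(30, 20_000, 200_000, interval)
    print(f"keyframe every {interval:>2}s: avg {avg / 1e3:.0f} kbps, peak {peak / 1e6:.1f} Mbps")
```

Under these assumed sizes, a shorter keyframe interval raises the average bitrate, while the instantaneous peak is set by the keyframe size regardless of the interval.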

saurik · on June 6, 2022

1) That is a specific implementation, and has nothing to do with the protocol, which certainly doesn't define a "jitter buffer". People routinely use RTMP--which also has RT in the name--to transfer content to streaming services with massive buffers at every step in the pipeline.

vr000m · on June 6, 2022

Most common browser implementations use an Open GOP. That means an I-frame is inserted when needed: on a scene change or when there's high motion.

Only naive implementations would burst an I-frame onto the network; most pace them. And if needed, you could spread your I-frame across several frame intervals and decode it without creating a bitrate burst.

Actually, a lot of WebRTC implementations use a 1s or 2s GOP length. Again, it depends on how much control you have over your pipeline. Browser implementations do make some assumptions about the use case.
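As an illustration of pacing an I-frame rather than bursting it, here is a simplified sketch that spreads a frame's packets evenly over a number of frame intervals. The function `pace_frame` and its parameters are hypothetical; real pacers are typically budget-based and react to congestion feedback, which this does not model.

```python
def pace_frame(frame_bytes, mtu_payload, fps, spread_intervals=1):
    """Return a list of (send_offset_seconds, packet_size_bytes).

    The frame is split into packets of at most mtu_payload bytes, and the
    packets are spaced evenly across spread_intervals frame intervals.
    """
    if frame_bytes <= 0:
        return []
    n_packets = -(-frame_bytes // mtu_payload)  # ceiling division
    window = spread_intervals / fps             # seconds available to send
    spacing = window / n_packets
    schedule = []
    remaining = frame_bytes
    for i in range(n_packets):
        size = min(mtu_payload, remaining)
        schedule.append((i * spacing, size))
        remaining -= size
    return schedule


# Hypothetical 25 kB I-frame, 1200-byte payloads, 30 fps, spread over 3 intervals.
plan = pace_frame(25_000, 1200, 30, spread_intervals=3)
print(len(plan), "packets, last one sent at", round(plan[-1][0] * 1000, 1), "ms")
```

The trade-off is latency: spreading the frame over more intervals lowers the send rate but delays the moment the receiver has the complete frame to decode.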

solar-ice · on June 6, 2022

That is not what open GOP means. Open GOP means pictures can reference IDR frames other than the most recent one in decode order, and is a pain in the ass for various reasons, but is technically more efficient. You're referring to a dynamic GOP.