
Having attempted to use WebRTC as a generic video transport, I can say that it has insurmountable problems. The two biggest issues are:

1) Lack of client-side buffering. This is a benefit in real-time communication, but it limits your maximum bitrate to your maximum download speed. It also makes playback incredibly sensitive to network blips.

2) Extremely expensive. To keep bitrate down, video codecs only send key frames every so often. When a new client starts consuming a video stream, it needs to notify the sender that a new key frame is needed. For a video call this is fine, because the sender is already encoding their stream in real time, so inserting a key frame isn't a big deal. For a static video, needing to transcode the entire thing in real time with dynamic key frames is expensive and unnecessary.


The WebRTC protocol doesn't dictate 1 or 2, although browsers do bake some of their own assumptions into their implementations. By default the client-side buffer can be on the order of hundreds of milliseconds; as you pointed out, this is tuned for real-time and live applications.
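
For example, a minimal sketch assuming a Chromium-based browser; jitterBufferTarget comes from the WebRTC Extensions spec and is a hint in milliseconds (older Chrome builds exposed a similar playoutDelayHint in seconds):

    // Sketch: ask the browser to hold ~2 s in the receive-side buffer
    // instead of the default few hundred milliseconds.
    // jitterBufferTarget is a hint, not a guarantee.
    function deepenReceiveBuffer(pc: RTCPeerConnection, targetMs: number) {
      for (const receiver of pc.getReceivers()) {
        // Not yet in every TypeScript lib definition, hence the cast.
        (receiver as any).jitterBufferTarget = targetMs;
      }
    }

    // Trade two seconds of latency for resilience to network blips:
    // deepenReceiveBuffer(pc, 2000);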

If you're doing something like YouTube/Netflix and want to avoid dropping to a lower-definition rendition of the stream, that too can be tuned, although you'd want to use simulcast and implement your own player (to feed the video and audio frames into the decoder at the pace you dictate), as sketched below.
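
A rough sketch of the shape the own-player route takes. It assumes Chromium's non-standard createEncodedStreams() (the peer connection must be created with encodedInsertableStreams: true; the standard path is RTCRtpScriptTransform in a worker) and a WebCodecs VideoDecoder already configured for the negotiated codec:

    // Sketch: intercept encoded WebRTC video frames and decode them at our
    // own pace with WebCodecs, instead of at the jitter buffer's pace.
    const BUFFER_MS = 2000; // deliberately deep, unlike WebRTC's default

    function pacedPlayback(receiver: RTCRtpReceiver, decoder: VideoDecoder) {
      // Chromium-only API; requires { encodedInsertableStreams: true }.
      const { readable } = (receiver as any).createEncodedStreams();
      const queue: { frame: any; arrived: number }[] = [];

      readable.pipeTo(new WritableStream({
        write(frame) { // RTCEncodedVideoFrame
          queue.push({ frame, arrived: performance.now() });
        },
      }));

      // Drain on our own schedule: hold every frame for BUFFER_MS first.
      setInterval(() => {
        while (queue.length && performance.now() - queue[0].arrived >= BUFFER_MS) {
          const { frame } = queue.shift()!;
          decoder.decode(new EncodedVideoChunk({
            type: frame.type === 'key' ? 'key' : 'delta',
            timestamp: frame.timestamp,
            data: frame.data,
          }));
        }
      }, 10);
    }

The decoder's output callback would paint the resulting VideoFrames to a canvas; the point is just that buffering depth becomes your decision rather than the jitter buffer's.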


None of these problems are specific to WebRTC. You'll run into them in a WebRTC implementation, you'll run into them with QUIC, and even with ffmpeg on the CLI you'll need to specify buffer sizes. As you mention, these are both livestreaming problems, and the more you buffer, the less "live" your stream becomes. If you're interested in transmitting static videos, why not go with HLS, or even just make the static file available for direct download over HTTP, instead of a live technology?


IIRC, the buffer sizes in ffmpeg are more about ensuring that the calculated bitrate is accurate than about ensuring smooth streaming (although you do need your bitrate enforced to guarantee smooth streaming).


IIRC (it's been a bit since I've configured this), you can specify both codec buffers and buffers for streaming to smooth out issues reading from the codec output. I could be wrong though.
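
For reference, a sketch of the flags in question, wrapped in a Node spawn (file names and numbers are placeholders; -b:v, -maxrate, and -bufsize are the real libx264 VBV rate-control knobs):

    // Sketch: x264 rate control via ffmpeg's VBV flags.
    import { spawn } from 'node:child_process';

    spawn('ffmpeg', [
      '-i', 'input.mp4',
      '-c:v', 'libx264',
      '-b:v', '4M',      // average target bitrate
      '-maxrate', '4M',  // ceiling enforced through the VBV model
      '-bufsize', '8M',  // the VBV "codec buffer": larger = smoother average,
                         // but spikier short-term bitrate
      'output.mp4',
    ], { stdio: 'inherit' });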


1) Why can't you buffer on the client side for WebRTC? That sounds like a client issue (what library were you using?), not the protocol.

2) I use the same tactic as HLS: generate your video with a reasonable (~2 second) keyframe interval, and when a new client connects, start sending from the most recent keyframe. See the sketch below.
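
As a sketch (file names are placeholders; -force_key_frames with a time expression is the codec-agnostic route, and -sc_threshold 0 keeps x264 from inserting extra scene-cut keyframes):

    // Sketch: bake a fixed ~2 s keyframe interval into the encode, HLS-style.
    import { spawn } from 'node:child_process';

    spawn('ffmpeg', [
      '-i', 'input.mp4',
      '-c:v', 'libx264',
      '-force_key_frames', 'expr:gte(t,n_forced*2)', // keyframe every 2 s
      '-sc_threshold', '0', // no surprise keyframes in between
      'output.mp4',
    ], { stdio: 'inherit' });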


1) The point of WebRTC is that it’s real-time. If you buffer then it’s not real-time.

2) Adding key frames increases the bitrate greatly which exacerbates problem 1.


1) I don't think WebRTC has one specific point. Lots of users came together with their use cases, and it was designed by consensus. WebRTC can (and does) have toggles around latency/buffering.

2) I am not aware of a way you can have no keyframes but still be decodable at any time. I have just done it 'HLS style' or WebRTC 1:1. Curious if anyone else has different solutions.


1) WebRTC and RTP both have RT in their names; RT stands for real-time. If I recall correctly, the only buffer WebRTC has is the jitter buffer, which is used for packet reordering, not for ensuring that enough has buffered to handle bitrate spikes.

2) Yes, you either need a short keyframe interval (frequent keyframes) or some type of out-of-band signaling framework to request keyframes; WebRTC uses RTCP for this. A good question is why WebRTC considers RTCP necessary at all. Why not generate a keyframe every N seconds, as you do with HLS, and remove the complexity of RTCP entirely? The answer is that many clients cannot handle the added bitrate at real-time speeds.
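
You can watch that signaling from the receive side with standard getStats() fields; a sketch (pliCount and firCount count the keyframe requests this client has sent back to the encoder):

    // Sketch: count the RTCP keyframe requests (PLI/FIR) a receiver has sent.
    async function logKeyframeRequests(pc: RTCPeerConnection) {
      const stats = await pc.getStats();
      stats.forEach((report) => {
        if (report.type === 'inbound-rtp' && report.kind === 'video') {
          console.log(`PLIs: ${report.pliCount}, FIRs: ${report.firCount}, ` +
                      `keyframes decoded: ${report.keyFramesDecoded}`);
        }
      });
    }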


That is a specific implementation, and has nothing to do with the protocol, which certainly doesn't define a "jitter buffer". People routinely use RTMP (which also has RT in the name) to transfer content to streaming services with massive buffers at every step of the pipeline.


Most common browser implementations use an open GOP. That means an I-frame is inserted when needed: on a scene change, or when there's high motion.

Only naive implementations would burst an I-frame onto the network; most pace it. And if needed, you could split your I-frame across several frame intervals and decode it without creating a bitrate burst.

Actually, a lot of WebRTC implementations use a 1 s or 2 s GOP length. Again, it depends on how much control you have over your pipeline. Browser implementations do make some assumptions about the use case.


That is not what open GOP means. Open GOP means pictures can reference IDR frames other than the most recent one in decode order, and is a pain in the ass for various reasons, but is technically more efficient. You're referring to a dynamic GOP.

