
Great article, thank you! Minor nit about footnote #10 (the very bad network failure mode): it really depends on where the bottleneck is and how the client is built to react. There's a huge spectrum of client implementations out there, ranging from nearly dumb to those with wizard-level heuristics.

If it's the user's last-mile connection (between, e.g., their home and their ISP), then the big HLS/DASH/etc. buffer translates into a lot of time to react. Clients can shift quite low, and do so quickly, if very low bandwidth variants exist; in theory they can even switch down to an audio-only or nearly-audio-only stream if one is provided. They can also be optimistic/aggressive about resuming playback as soon as one full chunk is downloaded, and some implementations will resume with even less than a full chunk buffered. The client-side logic has a lot of latitude here to balance fast start/resume times against sustaining playback.
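
To make that concrete, here's a rough sketch of that kind of downswitch/resume heuristic. Everything here is invented for illustration (the Rendition type, the 0.8 safety factor, the one-segment resume threshold); it's not any particular player's API:

    // Hypothetical sketch only -- not hls.js, dash.js, or any real player's
    // logic; each of those does this differently.

    interface Rendition {
      bandwidthBps: number;  // advertised bitrate of this variant
      audioOnly: boolean;    // true for an audio-only fallback stream
    }

    // Pick the variant to fetch next given a throughput estimate. On a
    // collapsing last-mile link this jumps straight to the cheapest variant
    // (possibly audio-only) rather than stepping down one rung at a time.
    function pickRendition(renditions: Rendition[], estBps: number): Rendition {
      const sorted = [...renditions].sort((a, b) => a.bandwidthBps - b.bandwidthBps);
      const affordable = sorted.filter((r) => r.bandwidthBps < estBps * 0.8);
      return affordable.length > 0 ? affordable[affordable.length - 1] : sorted[0];
    }

    // Optimistic resume: restart playback as soon as one full segment is
    // buffered, trading the risk of a second stall for faster recovery.
    // (Some implementations resume with even less than this.)
    function canResume(bufferedSec: number, segmentDurationSec: number): boolean {
      return bufferedSec >= segmentDurationSec;
    }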

When the bottleneck or failure is elsewhere, HLS can be incredibly durable. For extremely high-profile events, for example, there are typically multiple CDNs involved, multiple sources going to independent encoders, etc. So an HLS/DASH client might talk to many different servers on a given CDN, as well as servers on alternate CDNs, and even grab what amount to different copies of the stream spit out by different encoders. It's not uncommon for a client to test different CDN endpoints throughout playback and migrate away from congestion automatically.
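
The per-segment failover loop is conceptually simple; something like this sketch, where the hosts and timeout are invented, and where a real client would also score endpoints on observed throughput and periodically re-probe the ones it migrated away from rather than just trying a fixed list in order:

    // Hypothetical multi-CDN failover sketch -- hosts/timeout are made up.

    const cdnHosts = [
      "https://cdn-a.example.com",
      "https://cdn-b.example.com",
      "https://cdn-c.example.com",
    ];

    async function fetchSegment(path: string, timeoutMs = 3000): Promise<ArrayBuffer> {
      let lastError: unknown;
      for (const host of cdnHosts) {
        try {
          const resp = await fetch(host + path, {
            // bail out of a congested/unresponsive host quickly
            signal: AbortSignal.timeout(timeoutMs),
          });
          if (!resp.ok) throw new Error(`HTTP ${resp.status} from ${host}`);
          return await resp.arrayBuffer();  // success on this host
        } catch (err) {
          lastError = err;  // fall through and retry the same path on the next CDN
        }
      }
      throw lastError;  // every CDN failed for this segment
    }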




That's a really great point about multi-CDN architectures.

Partly because big parts of WebRTC are not standardized (session setup signaling, of course, but in practice also a lot of necessary state management), it's a little hard to imagine how to build an equivalent multi-CDN setup for WebRTC.

Until relatively recently, I would have said that our experience running large-scale WebRTC systems in production put "core" infrastructure failure fairly low on our list of concerns. The two components of the last-mile connection (the ISP link and the local Wifi), on the other hand, are always a huge pain point because of the long tail of bad ISPs and bad Wifi setups.

However ... many of us who try to deliver always-available video services got something of a wake-up call in November and December last year, when AWS had two pretty big outages two months in a row.



