
I liked this article very much (and I share many of the points it makes).

However, some claims in the article bothered me:

> if you’re watching 1080p video and your network takes a dump, well you still need to download seconds of unsustainable 1080p video before you can switch down to a reasonable 360p.

A client can theoretically detect a drop in bandwidth (or even anticipate it) while loading a segment, abort its request (which may close the TCP socket, an event the server may or may not process), and directly switch to a 360p segment instead (or an even lower quality). In any case, you don't "need to" wait for a request to finish before starting another.
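
A minimal sketch of that abort-and-downswitch idea in the browser (the URLs, the threshold, and the one-second grace period are all assumptions of mine, not something from the article):

    // Sketch only: the threshold and URLs are hypothetical.
    const MIN_BYTES_PER_SEC = 250_000; // well below what 1080p needs

    function concat(chunks: Uint8Array[]): Uint8Array {
      const out = new Uint8Array(chunks.reduce((n, c) => n + c.byteLength, 0));
      let off = 0;
      for (const c of chunks) { out.set(c, off); off += c.byteLength; }
      return out;
    }

    async function loadSegment(url1080: string, url360: string): Promise<Uint8Array> {
      const controller = new AbortController();
      const started = performance.now();
      const chunks: Uint8Array[] = [];
      let received = 0;
      let completed = false;
      try {
        const res = await fetch(url1080, { signal: controller.signal });
        const reader = res.body!.getReader();
        for (;;) {
          const { done, value } = await reader.read();
          if (done) { completed = true; break; }
          if (value) { chunks.push(value); received += value.byteLength; }
          const elapsed = (performance.now() - started) / 1000;
          // Mid-download throughput check: if we clearly can't keep up, bail out.
          if (elapsed > 1 && received / elapsed < MIN_BYTES_PER_SEC) {
            controller.abort(); // the server may or may not observe this
            break;
          }
        }
      } catch { /* aborted or network error: fall through to the lower quality */ }
      if (completed) return concat(chunks);
      const fallback = await fetch(url360);
      return new Uint8Array(await fallback.arrayBuffer());
    }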

> For live media, you want to prioritize new media over old media in order to skip old content

From this, I'm under the impression that the article only represents the point of view of applications where latency is by far the most important aspect, like Twitch I suppose, but I found that this is not the general case for companies relying on live media.

Though I guess the tl;dr properly acknowledges that, I still want to make my point, as I found that sentence not precise enough.

On my side, and in the majority of cases I've seen professionally, latency may be an important aspect for some specific content (mainly sports, just for the "neighbor shouting before you" effect, and some very specific events). But in the great majority of cases there were far more important features for live content: timeshifting (e.g. being able to seek back to the beginning of the current program, or the previous one, even if you "zapped" to the channel after it started), ad-switching (basically in-stream targeted ads), different encryption keys depending on the quality, type of media AND the program in question, different tracks, codecs and qualities also depending on the program, and surely many other things I'm forgetting. All of those were, in my case, much more important aspects of live content than seeing broadcast content a few seconds sooner.

Not to say that a new way of broadcasting live content with much less latency wouldn't be appreciated there, but to me, that part of the article complained about DASH/HLS by only considering the ""simpler"" (I mean in terms of features, not in terms of complexity) live-streaming cases in which they are used.

> You also want to prioritize audio over video

Likewise, in the cases I encountered, we largely prefer re-buffering over losing video for even less than a second, even for content where latency matters (e.g. football games). But I understand that Twitch may not have the same needs and would prefer a more direct interaction (like other more socially-oriented media apps).

> LL-DASH can be configured down to +0ms added latency, delivering frame-by-frame with chunked-transfer. However it absolutely wrecks client-side ABR algorithms.

For live content where low latency is important, I do agree that this is the main pain point I've seen.

But perhaps another solution here would be to update DASH/HLS, or exploit some of their existing features, to reduce that issue. As you wrote about giving more control to the server, both standards do not seem totally opposed to making the server side more in control in specific cases, especially lately with features like content steering.
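
For illustration, the steering server in HLS content steering hands the client a tiny JSON manifest that re-ranks CDN pathways; roughly this shape (written as a TypeScript literal, values made up, field names as I remember them from the spec):

    // A content-steering manifest, approximately; treat the field names as
    // illustrative rather than authoritative.
    const steeringManifest = {
      "VERSION": 1,
      "TTL": 300, // seconds until the client re-fetches this manifest
      "RELOAD-URI": "https://steering.example/manifest.json",
      "PATHWAY-PRIORITY": ["CDN-B", "CDN-A"], // server-chosen ranking
    };

The client polls this on the TTL and shifts traffic accordingly, which is already a form of server-side control, albeit at CDN granularity rather than per-segment.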

---

Though this is just me being grumpy over unimportant bits, we're on HN after all! In reality it does seem very interesting and I thank you for sharing, I'll probably dive a little more into it, be humbled, and then be grumpy about something else I think I know :p




I'm glad you liked it.

> A client can theoretically detect a drop in bandwidth (or even anticipate it) while loading a segment, abort its request (which may close the TCP socket, an event the server may or may not process), and directly switch to a 360p segment instead (or an even lower quality). In any case, you don't "need to" wait for a request to finish before starting another.

HESP works like that, as far as I understand. The problem is that dialing a new TCP/TLS connection is expensive and starts with a small initial congestion window (slow start). You would need to have a second connection warmed up and ready to go, which is something you can do in the browser, since HTTP abstracts away which connection a request uses.
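
For what it's worth, a page can at least ask the browser to warm a connection ahead of time with a preconnect hint (sketch; the fallback origin is a placeholder of mine):

    // Hint the browser to open and warm a connection to a second origin,
    // so a later fetch against it skips the TCP/TLS handshake.
    const link = document.createElement("link");
    link.rel = "preconnect";
    link.href = "https://backup-origin.example";
    document.head.appendChild(link);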

HTTP/3 gives you the ability to cancel requests without this penalty, so you could utilize it if you can detect the HTTP version. Canceling HTTP/1 requests, especially during congestion, will never work though.
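
In the browser you can sniff the negotiated version after the fact through resource timing (sketch; note that cross-origin entries only expose this when the server sends Timing-Allow-Origin):

    // Look up the transport a previous fetch actually used.
    function transportFor(url: string): string {
      const entries =
        performance.getEntriesByType("resource") as PerformanceResourceTiming[];
      return entries.find((e) => e.name === url)?.nextHopProtocol ?? "";
      // e.g. "h3", "h2", "http/1.1", or "" when hidden by CORS
    }

    // Only treat mid-segment aborts as cheap when we know we're on HTTP/3.
    function canAbortCheaply(segmentUrl: string): boolean {
      return transportFor(segmentUrl) === "h3";
    }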

Oh, and predicting congestion is virtually impossible, ESPECIALLY on the receiver side and in application space. The server also has an incentive to keep the TCP socket full to maximize throughput and minimize context switching.

> From this, I'm under the impression that the article only represents the point of view of applications where latency is by far the most important aspect, like Twitch I suppose, but I found that this is not the general case for companies relying on live media.

Yeah, I probably should have gone into more detail, but MoQ also uses a configurable buffer size. Basically, media is delivered based on importance, and if a frame is not delivered within X seconds, the player skips over it. You can make X quite large or quite small depending on your preferences, without altering the server behavior.
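
A toy illustration of that knob on the player side (the frame shape and the deadline check are my assumptions, not MoQ's actual wire format):

    // Toy model: drop any frame that blows its delivery budget of X seconds.
    interface Frame { pts: number; sentAtSec: number; data: Uint8Array; }

    function acceptFrame(frame: Frame, maxLatencySec: number): Frame | null {
      const lateBySec = Date.now() / 1000 - frame.sentAtSec;
      // Prefer new media over old: a late frame is skipped, not queued.
      return lateBySec > maxLatencySec ? null : frame;
    }

    // maxLatencySec = 0.5 approximates the low-latency case; 30 tolerates
    // long stalls, and neither setting changes what the server sends.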

> But perhaps another solution here would be to update DASH/HLS, or exploit some of their existing features, to reduce that issue. As you wrote about giving more control to the server, both standards do not seem totally opposed to making the server side more in control in specific cases, especially lately with features like content steering.

A server side bandwidth estimate absolutely helps. My implementation at Twitch went a step further and used server-side ABR to great effect.

Ultimately, the sender sets the maximum number of bytes allowed in flight (e.g. via BBR). If the receiver also independently determines that limit, you can only end up with a sub-optimal split-brain decision. The tricky part is finding the right balance between a smart client and a smart server.
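
A rough sketch of the server-side ABR idea (the renditions and the headroom factor are illustrative; a real sender would read the delivery rate out of its congestion controller, e.g. BBR):

    // Pick the highest rendition the measured delivery rate can sustain.
    const renditions = [
      { name: "1080p", bitsPerSec: 6_000_000 },
      { name: "720p", bitsPerSec: 3_000_000 },
      { name: "360p", bitsPerSec: 800_000 },
    ];

    function pickRendition(deliveryRateBps: number, headroom = 0.8) {
      const budget = deliveryRateBps * headroom; // leave room for bursts
      return renditions.find((r) => r.bitsPerSec <= budget)
        ?? renditions[renditions.length - 1]; // worst case: lowest quality
    }

Since the decision is made where the congestion signal actually lives, there is no split brain between client and server estimates.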


Interesting points and insight, thanks. I should probably look more into it.

My point was more that I found the article unfairly critical of live streaming over DASH/HLS by focusing mainly on latency, which I understand may be one of the most important points (or even the most important one) for some use cases, but wasn't for the cases I've worked on, where replacing DASH would be very complex.

This was kind of acknowledged in the tl;dr, but I still found the article unfair on that point.



