The WebSocket Handbook (ably.com)
303 points by AlexTDiaconu on Jan 11, 2022 | 119 comments



We use WebSockets in two ways: handling live page updates via Phoenix LiveView for users (e.g. real-time chat messages, viewer count, etc.) and as a transport medium for our real-time API. The former is very easy to handle because for the most part users are navigating around pages, which can terminate the WS connection and create a new one (though most of the time it does not). The advantage LiveView gives us is not having to write duplicated logic in the client and server; instead we just push data to users and their browsers automatically reflect the changes.

However, the latter use offers very powerful benefits with some difficult downsides. On the positive side you get a "real time" API to work with, and you can handle events as they happen and send updates back to them. In some cases our API users can even respond to a chat message faster than we can!

Since WebSockets are virtually just a transport, it's up to you to write a protocol for handling heartbeats, authentication, and communication. In addition when you have a horizontally scaled service it can make balancing the WebSocket connections a bit more challenging since they are long lived. Deployments are even more inconvenient since (in our case) we disconnect the WebSocket consumers whenever a server is restarted for the update. It can also be difficult to fully measure and understand how many WebSocket connections you have open, and how many resources they are consuming. It's important to really push down the number of computations you are doing for users who are subscribed to the same topics so that when you send out 10,000 updates with the same message it's just the text being sent, not 10,000 DB queries :D.
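
A minimal sketch of that last point, assuming a hypothetical `Broadcaster` helper and using plain lists as stand-ins for socket connections (none of this is a Phoenix or LiveView API): the expensive lookup and the serialization happen once per publish, and every subscriber receives the same text.

```python
import json
from collections import defaultdict

class Broadcaster:
    """Fan out one pre-serialized payload per topic (hypothetical helper)."""

    def __init__(self, load_state):
        self.load_state = load_state          # expensive call, e.g. a DB query
        self.subscribers = defaultdict(list)  # topic -> list of outboxes

    def subscribe(self, topic, outbox):
        self.subscribers[topic].append(outbox)

    def publish(self, topic):
        # Query and serialize once, no matter how many subscribers there are.
        payload = json.dumps(self.load_state(topic))
        for outbox in self.subscribers[topic]:
            outbox.append(payload)  # stand-in for ws.send(payload)
        return len(self.subscribers[topic])

calls = {"db": 0}

def load_viewer_count(topic):
    # Hypothetical loader that counts how often the "database" is hit.
    calls["db"] += 1
    return {"topic": topic, "viewers": 10_000}

broadcaster = Broadcaster(load_viewer_count)
outboxes = [[] for _ in range(10_000)]
for outbox in outboxes:
    broadcaster.subscribe("stream:42", outbox)
sent = broadcaster.publish("stream:42")
```

Here 10,000 subscribers are served by a single call to the loader.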


> it can make balancing the WebSocket connections a bit more challenging since they are long lived. Deployments are even more inconvenient since (in our case) we disconnect the WebSocket consumers whenever a server is restarted

May I suggest that the solution could be to design your WS protocol to be reconnect-friendly, like the Phoenix LiveView protocol? Maybe make a protocol which assumes and expects that a connection may be dropped by either side at any time, with a robust context-restoration API such as “full snapshot” or “all updates since event ID”.
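
One way such a context-restoration API could look on the server side, as a rough sketch. The `EventLog` class, its field names and the message shapes are assumptions made up for illustration, not the LiveView protocol: the server keeps a bounded log of numbered events, replays what a reconnecting client missed, and falls back to a full snapshot when the gap is too large.

```python
from collections import deque

class EventLog:
    """Bounded log of updates; each event gets a monotonically increasing ID."""

    def __init__(self, max_events=1000):
        self.events = deque(maxlen=max_events)  # (event_id, payload)
        self.next_id = 1
        self.state = {}                          # latest value per key

    def append(self, key, value):
        event_id = self.next_id
        self.next_id += 1
        self.state[key] = value
        self.events.append((event_id, {key: value}))
        return event_id

    def resume(self, last_seen_id):
        """Return what a reconnecting client needs to catch up."""
        oldest = self.events[0][0] if self.events else self.next_id
        if last_seen_id is None or last_seen_id < oldest - 1:
            # First connect, or the missed events were already evicted.
            return {"type": "snapshot", "id": self.next_id - 1,
                    "state": dict(self.state)}
        missed = [(i, p) for i, p in self.events if i > last_seen_id]
        return {"type": "updates", "events": missed}

log = EventLog(max_events=3)
for n in range(1, 6):
    log.append("viewers", n)       # IDs 1..5; only 3, 4, 5 are retained

catch_up = log.resume(4)           # client saw up to event 4
too_old = log.resume(1)            # events 2.. are needed, but 2 is gone
```

The client only has to remember the last event ID it processed and send it when it reconnects.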


I think your last paragraph is a good point for using services like the OP. All three of the examples you mentioned are already solved. Granted, you're buying into their paradigm particularly regarding communication model but that's a small price to pay for the upsides.

Disclaimer: we've been using Ably for years and their service and reliability have been outstanding. We have worked closely with their engineers, and I've found their expertise to be above what you may expect from experience with other companies' support personnel.


I'm generally reluctant to add dependencies on third-party services such as Ably in the core application, because that would add another barrier to providing an on-prem deployment option, or even allowing a customer to deploy on their own AWS infrastructure instead of ours. I'm actually contemplating this decision right now for a new product.


This is one reason I'm making my up and coming SaaS entirely open source, so you can run a clone in any environment: http://www.adama-lang.org/


That's an understandable concern you have IMO. As with anything, the solution should fit the problem. Those aren't issues for us but I can empathize with the concern with adding more external services. In our case, we actually tried to implement our own socket "stuff" in one project and ended up switching that to Ably as well just because it was very difficult to both implement and debug over AWS.


I am surprised to hear about on-prem deploys when the direction of travel is towards serverless/cloud solutions that don't require orchestration. Out of interest, why is on-prem so important for your use case? Separately, I'd be interested in how you'd view that decision if there was a less scalable but mostly fully functioning open source version of the third-party service you could deploy if you needed to?

Matt, co-founder of Ably


Sorry to butt in like this, but cloud is just one direction, not the direction. Pretty much any company that values their privacy and independence will want on-prem. This should not come as a surprise to anyone.


For one of my company's products, we're already working with a security-sensitive customer on a deployment on their own AWS infrastructure (in their own VPC). This customer wouldn't be comfortable using our public cloud deployment unless we get a SOC2 certification or similar. For the product I'm developing now, some potential customers will definitely want to keep it physically located within their corporate network. That's all I can say publicly.

If there was a less scalable, but still mostly functioning open-source substitute for Ably, then I'd be way more comfortable using Ably in the main public cloud deployment.


I understand the SOC2 requirement, which is why we're SOC2 compliant. A long and painful process, but one that has opened up opportunities like the one you said would otherwise require an on-prem installation. We have in fact also got some VPN-like links (AWS PrivateLink) with customers who are particularly sensitive to security and data leaving their network, and that coupled with SOC2 has made it possible.

> If there was a less scalable, but still mostly functioning open-source substitute for Ably, then I'd be way more comfortable using Ably in the main public cloud deployment.

Thanks for the feedback, I will pass on to the product team!


Where I'm based (in Asia) on-prem is definitely a need for some sectors: banks and financial institutions, healthcare, government, etc. They may not need it for all solutions, but they sure want to know they can have it if needed. It's the very reason we (a SaaS company) make sure we don't rely on any cloud PaaS services for our product.


Thanks ddoolin. I am not sure which company you are with, but I appreciate your comments and am even happier to hear you feel we're doing a fantastic job! You've sort of encapsulated why we exist: we know software engineers can build and run all sorts of things on their own, i.e. we're not the only option, from open source solutions to custom builds. However, we built Ably to be the easiest, most scalable, and most reliable realtime option, one that removes complexity for developers and helps them manage costs as they grow (pay for what you use).

Matt, co-founder at Ably


>we disconnect the WebSocket consumers whenever a server is restarted for the update

Isn't avoiding this the main selling point for BEAM? As in Erlang: the movie. Can't that be done with websockets?


You could totally do this with BEAM, via hot swapping. The WebSocket processes don't really change; it's the implementation of what happens when messages come in that changes. So you'd set up your hot swap with this in mind. (The connection processes stay connected and the behavior module is swapped out?)

However, hot swapping is not super common in practice. Mainly because it's added complexity that most people can live without.


Right, this is the comment I was going to make as well. It's certainly possible, but there are tradeoffs (mainly around complexity) to code hot-swapping. We haven't implemented it because restarting servers to upgrade code helps everyone prepare for when servers go down for unexpected reasons, or so I like to think :P!


I use websockets quite a lot, for real-time dashboard kind of purposes.

The one thing i really wish websockets had is some kind of application-level acknowledgement or backpressure.

At the server end, you're blasting out messages to the client, but you have no idea if it is keeping up with them. Most of the time, it will be, but if there is a sudden spike of activity, suddenly all your dashboards are going wild, and the client may start to struggle. At that point, you want to be able to shed some load - delay messages a bit, then drop any message which gets superseded (eg if "reactor core temperature is 1050K" is buffered and you get "reactor core temperature is 1100K", you can drop the former). To do that, you need feedback about how far the client has got with processing messages.

You can build a feedback mechanism like this into your application protocol on top of websockets easily enough. But you probably want to do that from the start, or else you will, like me, one day look around and realise that retrofitting it to all your dashboards is a monumental effort.
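
As a rough sketch of what such a feedback mechanism could look like on the server, assuming the client acknowledges each message by sequence number (the `CoalescingSender` class, the `window` parameter and the message shape are made up for illustration): only a small number of messages may be unacknowledged at once, and while the client lags, a newer value for the same key overwrites the buffered one.

```python
from collections import OrderedDict

class CoalescingSender:
    """Keeps at most `window` unacknowledged messages in flight; while the
    client lags, newer values overwrite buffered ones for the same key."""

    def __init__(self, send, window=2):
        self.send = send              # callable(seq, key, value), e.g. wraps ws.send
        self.window = window
        self.in_flight = set()        # sequence numbers awaiting an ack
        self.pending = OrderedDict()  # key -> latest value not yet sent
        self.next_seq = 1

    def publish(self, key, value):
        # Overwriting keeps the key's place in the queue and drops the
        # superseded value.
        self.pending[key] = value
        self._flush()

    def ack(self, seq):
        self.in_flight.discard(seq)   # client reports it processed `seq`
        self._flush()

    def _flush(self):
        while self.pending and len(self.in_flight) < self.window:
            key, value = self.pending.popitem(last=False)  # oldest key first
            seq = self.next_seq
            self.next_seq += 1
            self.in_flight.add(seq)
            self.send(seq, key, value)

sent = []
sender = CoalescingSender(lambda seq, k, v: sent.append((seq, k, v)), window=1)
sender.publish("core_temp_K", 1000)   # sent immediately as seq 1
sender.publish("core_temp_K", 1050)   # buffered: seq 1 is not acked yet
sender.publish("core_temp_K", 1100)   # replaces the buffered 1050
sender.ack(1)                         # client caught up, 1100 goes out
```

The 1050K reading is never transmitted, which is the load shedding described above.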

The RSocket protocol might be a good start - it provides reactive streams semantics, and has a binding to websockets:

https://rsocket.io/guides/rsocket-js


There should be a law for message passing systems, which says that everyone will eventually want ordered delivery, multiplexing (with priorities), exactly-once semantics, acknowledgements and backpressure. (Maybe more?)

I'm pretty convinced all these popular features could be layered in a reasonable way that could be implemented in most messaging systems, and have standardized semantics and conventions. It seems like every time, we're reinventing the wheel, and half the time people talk over one-another because we're using inexact language.

Basically, what I want is a "message passing a la carte" paper.


"Every sufficiently-complex system eventually includes a badly-implemented email / lisp / kafka (/zmq/etc)"

Something like a combination of Greenspun's Tenth Rule[1] and Zawinski's Law[2]. Plus whatever would include your queueing system of choice.

Though honestly I've seen more bad queues than emails or lisps. By an order of magnitude or two.

[1]: https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule [2]: https://en.wikipedia.org/wiki/Jamie_Zawinski#Zawinski's_Law


I built omnistreams[0] primarily because of the lack of backpressure in browser WebSockets (lots of background information and references in that README). It's what fibridge[1] is built on. We've been using it in production for over 2 years, but I never ended up trying to push omnistreams as a thing. I believe the Rust implementation is actually behind the spec a bit, and the spec itself probably needs some work.

At the end of the day I think RSocket is probably the way to go for most people, though the simplicity of omnistreams is still appealing to me.

EDIT: I just learned about WebSocketStreams[2] from another comment[3], and it sounds like they may solve the backpressure issue natively.

[0]: https://github.com/omnistreams/omnistreams-spec

[1]: https://iobio.io/2019/06/12/introducing-fibridge/

[2]: https://web.dev/websocketstream/

[3]: https://news.ycombinator.com/item?id=29894938


Nice work on omnistreams!

The WebSocketStream API is a small improvement, because it can leave backed-up messages in the socket buffer, but it still means you're depending on socket buffers for backpressure, which i think is not enough. There's still no way to actually set the receive socket buffer size in the browser, is there?


I believe socket backpressure would have worked for my use case. Curious why you think it wouldn't be enough?

As far as I know there's no way to set buffers. IIRC there's a buffer value you can check which is what I tried to use first but don't think that got me very far. Seems like Chrome and Firefox handled it differently or something.


As far as i know, browsers do not set a receive buffer size for websockets, so if a client is not reading from the websocket fast enough, the kernel will expand the buffer until it reaches its maximum size, which on my machine is 6 MB. Say you are writing 60 kB/sec of data to this websocket. It will take ~100 seconds to fill.

That means that you won't even know that a client is stuck for a minute and a half (plus however long it takes to fill your send buffer!), and even if you then throttle back, the client has a minute and a half of high-rate data to work through before it catches up. If you throttle up again once you see that the buffer is clearing, and the client gets overloaded again, you will keep hovering around that buffer full state, and the client will keep reading significantly stale data.

To get a useful real-time signal from socket buffers, you need them to be really small. But to get nice smooth transfers of bulk data, you need them to be big, so big buffers are the default.


If it's streaming data like dashboard statistics then going forward the new WebTransport API might be a much better base: https://github.com/w3c/webtransport/blob/main/explainer.md It's hot off the assembly line, though: it just shipped in Chrome 97, and Firefox is still working on it.


Oh wow this is going to be useful


I don't know anything about websockets, but isn't it over tcp? Meaning if the client isn't keeping up, their buffer should be full and the server should be blocked from sending more until it drains (unless it's queuing the messages somewhere else?). Or is that not how tcp backpressure works?


TCP provides backpressure but depending on it to provide backpressure over the internet will greatly increase latency, in my experience.

In one application I was streaming JPEG frames over a websocket, and by the time the server application experienced backpressure there were tens of seconds of messages buffered between the server and client. So the message rate would eventually settle into a rate the connection could sustain, but messages would take 10+ seconds to reach the client.


Perhaps that sounds like a good time to use TCP_CORK, or TCP_NODELAY flags.

Or, perhaps you need to tune the TCP-Window to your application.


> TCP_CORK, or TCP_NODELAY flags

I'm sending large messages, ~150 kibibytes, so much larger than a typical internet packet. So I'm not sure Nagle's algorithm is the problem.

> tune the TCP-Window to your application.

This is a possibility. I already had to increase wmem_max to handle fast UDP connections.

I'll try to put together a minimal test case when I get the chance.


One problem is that WebSockets in the browser are 100% asynchronous, ie they don't block on send. So if you have a large amount of data client-side you can easily crash a browser window by sending WS data in a tight loop.


It is, but since you don't control the receive buffer size in the browser, or the TCP window size (or, with many web frameworks, the socket buffer size in the server!), you can't rely on socket buffers giving you timely feedback. By the time the server's send buffer fills, there are already masses of stale data buffered in between you and the client.

In my apps i do indeed detect the socket buffers filling, just as you suggest, but pretty much only as a way of detecting completely wedged clients.


That’s right, but the WebSocket browser API is event driven so the browser HAS TO recv from the socket and dispatch a JS event as soon as data is available.

You’ll get proper back pressure on websockets with synchronous clients that read messages actively.

This is a browser API spec problem more than a protocol problem.


It is better to implement it on the front end. Actually, it's not just better, it kind of has to be that way: the page will already be lagging, and if you implement it on the front end there won't be any lag. It is also better to limit update frequency on the back end. If the reactor temperature was already announced 100 ms ago and it went from 1050K to 1051K, maybe it is better to delay the update for a second.


It's probably worth mentioning that WebTransport just shipped in Chrome 97 (2022-01-04), and it seems to be a worthy successor to WebSockets [0]. It offers both reliable and unreliable modes; the lack of an unreliable mode is a problem for games using WebSockets, among other things.

[0] https://web.dev/webtransport/


Only Chrome so far it seems. Firefox's position:

> We are generally in support of a mechanism that addresses the use cases implied by this solution document. While major questions remain open at this time -- notably, multiplexing, the API surface, and available statistics -- we think that prototyping the proposed solution as details become more firm would be worthwhile. We would like see the new WebSocketStream and WebTransport stream APIs to be developed in concert with each other, so as to share as much design as possible.

https://mozilla.github.io/standards-positions/

Unclear what the WebKit (Safari) folks think, based on https://lists.webkit.org/pipermail/webkit-dev/2021-September... that has no replies.

Microsoft is just doing whatever Chrome is doing with Edge, so I guess it'll appear there sooner or later, but can't find any public information.

Bit early to start using WebTransport seems to be the conclusion.


I know, we're very excited about WebTransport and what it can offer. As you say, on the surface it seems to provide both a more performant and more reliable transport for realtime communications. Once it hits the mainstream, we'll certainly be adding it as another transport we support in our stack (currently WebSockets, HTTP, SSE, MQTT, etc.).

Matt, co-founder of Ably


I skimmed the page, but couldn’t quite tell: is this what the WHATWG Streams API is becoming? (ie streams with back-pressure?)


Is this a new standard, or yet another thing Google is pushing and everyone else has to implement?


Linked in OP: https://w3c.github.io/webtransport/

It's a proposal; two editors are from Google and the other one is from Microsoft.


A couple months ago I posted the "Implementer's guide to WebSockets" that I wrote, but it seemingly got shadowbanned. [1]

I wrote the guide with example code for people wanting to know how to implement the complete WS13 protocol from scratch, so you can try it out, fiddle around and modify it to your needs.

The guide is more in-depth and assumes that the reader is willing to read the RFC when they're stuck :)

[1] https://cookie.engineer/weblog/articles/implementers-guide-t...


This is pretty excellent. I'll hold on to both of these resources as I think websockets would be just wonderful for the work I'm doing right now.

> willing to read the RFC when they're stuck :)

I owe RFCs for just about everything I do, they play a small but necessary role, I can't imagine not wanting to dip into one of them even if you aren't stuck.


This looks nice. I think I'll start with it instead of the OP ebook. Thanks for writing and posting it.


Hi HN! I'm Alex, and I've been researching and writing about WebSockets for a while now. I'm the author of the recently released WebSocket Handbook. AMA about the WebSocket tech, the realtime web, Ably or anything related to Liverpool FC.


Hi! I want to create a web app like Google Docs where multiple users can collaborate in real time to edit a document together (using a special link like the ones Google Docs generates). I want to save the docs in a MySQL DB (no Firebase).

My questions are:

1) Since multiple people are working together how does one manage conflicts, i.e. 2 people sending different edits simultaneously.

2) If one client gets disconnected (4G) and then reconnects later, how does it sync the changes it made while it was offline?

I recently watched this Raft presentation (1) and I think I would need to use something like this?

What other alternatives are viable?

Also can I make it happen using just PHP, Javascript and MySQL?

Thanks

(1) http://thesecretlivesofdata.com/raft/


I made a document editing demo here: https://github.com/fanout/editor

It uses operational transformations ("OT") to manage conflicts, and it saves the data in MySQL. Technically any Django DB backend will work for storage, but the public demo instance uses MySQL.

One of the reasons I made this thing was to show that realtime apps don't need to require heavy frameworks or unusual databases. And it loads super fast.

I don't think you need Raft if you have a central database storing the document. You could also consider using CRDTs instead of OT, which may be more powerful but also more challenging to develop.
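
To give a feel for what OT does with two simultaneous edits, here is a deliberately tiny sketch that handles only concurrent inserts. The `Insert` type, the `site` tie-breaker and the function names are assumptions for illustration; this is not the code from the demo above, and a real editor also needs delete operations and a server that orders edits.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where text is inserted
    text: str
    site: int   # unique client ID, used only to break ties

def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]

def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent `other`."""
    if op.pos < other.pos or (op.pos == other.pos and op.site < other.site):
        return op  # `other` landed to the right, so nothing shifts
    return Insert(op.pos + len(other.text), op.text, op.site)

doc = "abc"
a = Insert(1, "X", site=1)   # client 1 inserts "X" after "a"
b = Insert(1, "Y", site=2)   # client 2 inserts "Y" at the same spot

# Each replica applies its own edit first, then the transformed remote edit.
via_a = apply_op(apply_op(doc, a), transform(b, a))
via_b = apply_op(apply_op(doc, b), transform(a, b))
```

Both replicas end up with the same text even though they applied the edits in different orders, which is the property that resolves the conflict.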



Big question! It deserves its own blog post haha.

CRDTs can be the answer. We actually wrote about them recently (https://ably.com/blog/crdts-distributed-data-consistency-cha...), and there is more coming soon as our Chief Scientist and his team are researching CRDTs and building demos!


Have a look at https://m-ld.org/ , it's a CRDT implementation that's all JavaScript. It could help.


Looks very promising. Thanks for sharing!


Yup, and it supports Ably too, see https://js.m-ld.org/#ably-remotes!

Matt, Ably co-founder


CRDTs solve part of your problem, and an important consideration is whether or not you want off-line editing. If you don't need off-line editing, then a WebSocket can do it.

I'm actually using my project to build a collaborative IDE (designer like Figma): http://www.adama-lang.org/

I'm going to be launching it as a SaaS soon so people can spin up a new back-end without managing an infrastructure.


What is the right way to handle authentication over web sockets?


As with REST APIs, you'd want to be able to authenticate to a websocket-based API using either basic auth, or a bearer token-based auth scheme. Unfortunately, the browser websocket API doesn't allow you to specify arbitrary headers in the websocket request, so it's typical instead to have credentials supplied via a query param (such as "accessToken" for a bearer token) in the wss request.


> so it's typical instead to have credentials supplied via a query param (such as "accessToken" for a bearer token) in the wss request.

If someone ends up actually doing this in a production system, remember not to log the accessToken if you're logging full paths/URIs somewhere, as query params are usually part of that type of logging.
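
A small sketch of masking the token before a URL reaches the logs, using only the Python standard library. The parameter names in `SENSITIVE_PARAMS` and the example URL are assumptions; adjust them to whatever your API actually uses.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

SENSITIVE_PARAMS = {"accesstoken", "access_token", "token"}  # assumed names

def redact_url(url: str) -> str:
    """Return `url` with sensitive query parameter values masked for logging."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name.lower() in SENSITIVE_PARAMS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))

safe = redact_url("wss://example.com/socket?accessToken=abc123&room=7")
# safe == "wss://example.com/socket?accessToken=REDACTED&room=7"
```

Calling this in the logging path rather than at each call site makes it harder to forget.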


Yup, true, although tokens should be ephemeral so less of a risk. Authenticating inline over the Websocket connection is valid too, but it does expose the socket connections to slightly more surface area of attack i.e. if you pass in a token as a param, the Websocket request can be rejected immediately. If however you authenticate after establishing a Websocket connection, then there is an attack vector where you simply open Websocket connections and never authenticate. Of course timeouts can be used to disconnect rogue actors, but it is a consideration.

Matt, Ably co-founder


It's the same thing as HTTP. Websocket starts off as an HTTP request with cookies, headers etc. Use those just like HTTP to authenticate, and your Websocket server should pass the user data to the websocket object


Don't have access to the headers from JS.

Best solution might be to generate a short-lived one-time-use ticket and pass it in the querystring.
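
A minimal sketch of such a ticket scheme, assuming a single server process with an in-memory store (the `TicketStore` class and its method names are hypothetical; with several servers the tickets would have to live in shared storage): a normal authenticated HTTP endpoint issues the ticket, and the WebSocket handshake redeems it exactly once.

```python
import secrets
import time

class TicketStore:
    """In-memory one-time tickets; a real deployment would use shared storage."""

    def __init__(self, ttl_seconds=30, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.tickets = {}  # ticket -> (user_id, expires_at)

    def issue(self, user_id):
        # Called from a normal authenticated HTTP endpoint.
        ticket = secrets.token_urlsafe(32)
        self.tickets[ticket] = (user_id, self.clock() + self.ttl)
        return ticket

    def redeem(self, ticket):
        # Called during the WebSocket handshake; pop() makes it single-use.
        entry = self.tickets.pop(ticket, None)
        if entry is None:
            return None
        user_id, expires_at = entry
        return user_id if self.clock() <= expires_at else None

now = [0.0]                                   # fake clock for the example
store = TicketStore(ttl_seconds=30, clock=lambda: now[0])
ticket = store.issue("alice")
first = store.redeem(ticket)                  # "alice"
second = store.redeem(ticket)                 # None: already used
stale = store.issue("bob")
now[0] = 31.0
expired = store.redeem(stale)                 # None: older than 30 seconds
```

Because the ticket is useless after one handshake or after the TTL, a copy that leaks into a log is far less valuable than a long-lived token.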


If you make a normal HTTP request first, the server can issue a standard HTTP cookie to the client. That cookie will then be included when the browser makes the websocket request.

However, websockets are not subject to the same-origin policy, so this exposes you to CSRF [1]. To protect against that, you should check the Origin header on the server side.

[1] https://christian-schneider.net/CrossSiteWebSocketHijacking....
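
A sketch of that server-side Origin check, written as a plain function so it can be dropped into whatever handshake hook your server provides. The allow-list value is an assumption, and the function expects a dict of header names to values as the caller's framework supplies them.

```python
from urllib.parse import urlsplit

ALLOWED_ORIGINS = {"https://app.example.com"}  # assumed allow-list

def origin_allowed(headers: dict) -> bool:
    """Decide whether to accept a WebSocket handshake based on its Origin."""
    origin = headers.get("Origin")
    if origin is None:
        # Browsers send Origin on WebSocket handshakes, so a missing header
        # means a non-browser client; rejecting it here is a policy choice.
        return False
    parts = urlsplit(origin)
    normalized = f"{parts.scheme}://{parts.netloc}".lower()
    return normalized in ALLOWED_ORIGINS

ok = origin_allowed({"Origin": "https://app.example.com"})
hijack = origin_allowed({"Origin": "https://evil.example"})
```

If non-browser clients also need to connect, they would need a separate authentication path, since they can set Origin to anything.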


Cookies will be forwarded though, or..?


In a Golang app I'm writing now, I have middleware that authorizes requests. Authenticated requests have a header with a JWT token. I have an endpoint for websockets where if an authenticated request comes in (the handshake) that request is then upgraded to a Websocket connection. This is the cleanest authentication implementation I've ever used thus far, and I wasn't able to achieve the same thing in Node when I was using socket.io.

I'm sure there are repercussions to this on the client-side, but I haven't gotten to that point yet. I'm still writing the server and testing it using automated integration tests.


If I understand correctly, WebSocket is a thin layer on TCP that buffers data so that the application gets the whole message instead of chunks. I recommend using wss to secure the websocket so that it can't be hijacked; then you don't need to send a token in each message and can do an application-layer handshake once. Basically, the first websocket message from the client would be an authentication message with a password, token, or whatnot.
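
A rough sketch of that first-message handshake as a small state machine, independent of any particular WebSocket library. The `PendingConnection` class, the `{"type": "auth", "token": ...}` message shape and the five-second deadline are all assumptions for illustration; the deadline addresses the concern raised elsewhere in this thread about sockets that connect and never authenticate.

```python
import time

class PendingConnection:
    """Tracks one socket until its first message authenticates it."""

    def __init__(self, verify_token, auth_timeout=5.0, clock=time.monotonic):
        self.verify_token = verify_token      # callable(token) -> user or None
        self.clock = clock
        self.deadline = clock() + auth_timeout
        self.user = None

    def expired(self):
        # A periodic reaper can close sockets that never authenticate.
        return self.user is None and self.clock() > self.deadline

    def on_message(self, message):
        """Return 'authenticated', 'deliver', or 'close' for this message."""
        if self.user is not None:
            return "deliver"
        if self.expired() or message.get("type") != "auth":
            return "close"
        self.user = self.verify_token(message.get("token"))
        return "authenticated" if self.user is not None else "close"

now = [0.0]                                   # fake clock for the example
tokens = {"s3cret": "alice"}                  # hypothetical token table
conn = PendingConnection(tokens.get, clock=lambda: now[0])
steps = [
    conn.on_message({"type": "auth", "token": "s3cret"}),
    conn.on_message({"type": "chat", "text": "hi"}),
]
idle = PendingConnection(tokens.get, clock=lambda: now[0])
now[0] = 6.0                                  # idle never sent anything
```

The caller closes the socket whenever `on_message` returns `"close"` or `expired()` becomes true.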


Yeah this is the technique I've also used.

The first websocket message is the original request, which will have the user's cookies / headers where your session information / bearer token should live.


It's a good introduction, and it's a good document to introduce the problems induced by WebSockets (which Ably can come in and solve at scale).

I recently wrote about the Woes of Websocket: http://www.adama-lang.org/blog/woe-of-websocket with an errata based on HN feedback: http://www.adama-lang.org/blog/more-websocket-woe

The depth of this topic is very interesting, and I'm excited as I'm building some of the final pieces for my SaaS (which could compete with Ably).


Is Ably playing with WebTransport? If it moves up from its current trial status, how do you see that changing Ably’s core?


We are monitoring it closely and super excited about what WebTransport will provide, which is both a more reliable and in many cases more performant transport. However, much like WebSockets, it's still quite low level and only provides a basic communication protocol. As such, like we have done with SSE, HTTP, Websockets and MQTT, our service focusses on what developers can enable on top of these lower level transports, such as presence, deltas, history of streams, limitless scale and fan-out of data, and the list goes on https://ably.com/platform.

When WebTransport reaches prime time, I'm confident we'll be supporting it.


Why does Klopp love to abuse referees and then play dumb?


> or anything related to Liverpool FC.

Do people still watch football?

What are the viewership numbers for Liverpool FC?

Does Man United still matter?

:-)


Man Utd never mattered. Source: City supporter.


YNWA!


Before going all-in on websockets I'd like to caution people to thoroughly consider the server-side scaling challenges that come with it.

HTTP servers have already solved traffic management, load balancing, scaling up and down, zero downtime deployments, A/B tests and experimentation and lots more to such a degree that we don't have to even think about them anymore. All of these problems come to the forefront again when you have to scale websocket connections beyond a single server.


There's a (relatively) easy trick for this: Redis pubsub.

When a message comes into an instance, you push it to Redis and have all of your other instances subscribed to it. Messages sync in real-time and the experience is transparent.

I teach the technique here: https://cheatcode.co/courses/how-to-implement-real-time-data...


To be a bit clearer, you don't want to push websocket messages per se; eg auth negotiation or info requests. You do want to pubsub out conditioned messages that are generated by any given instance for the purpose of broadcast; eg "userX connected" or "userX said Y".
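
The shape of that technique, sketched with an in-memory broker standing in for Redis so the example runs without a server. The `InMemoryBroker` and `Instance` classes and the `"broadcast"` channel name are assumptions; with a real Redis client the broker would be replaced by its publish and subscribe calls.

```python
from collections import defaultdict

class InMemoryBroker:
    """Stand-in for Redis pub/sub so the sketch runs without a server."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, channel, handler):
        self.handlers[channel].append(handler)

    def publish(self, channel, message):
        for handler in list(self.handlers[channel]):
            handler(message)

class Instance:
    """One WebSocket server process holding its own local connections."""

    def __init__(self, name, broker):
        self.name = name
        self.broker = broker
        self.local_clients = []  # each client is a list acting as an outbox
        broker.subscribe("broadcast", self.deliver_locally)

    def on_client_message(self, user, text):
        # Publish only the conditioned broadcast event, not raw socket traffic.
        self.broker.publish("broadcast", f"{user} said {text}")

    def deliver_locally(self, message):
        for outbox in self.local_clients:
            outbox.append(message)  # stand-in for ws.send(message)

broker = InMemoryBroker()
server_a, server_b = Instance("a", broker), Instance("b", broker)
alice_box, bob_box = [], []
server_a.local_clients.append(alice_box)   # alice is connected to server a
server_b.local_clients.append(bob_box)     # bob is connected to server b
server_a.on_client_message("userX", "Y")
```

Every instance, including the one that received the message, delivers it through the same subscription path, so there is only one code path for broadcasts.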


I'm going all-in on WebSockets, but I've also seen how to solve all those problems at massive scale. You're right that these challenges are hard, and I don't believe we have an ideal shared offering yet.

We have a cultural challenge of how to manifest the opportunity and benefits presented by context rich communication over the entrenched ideology of statelessness and HTTP.


In case the email collection form gets hugged to death, here's a mirror https://web.archive.org/web/20220111162712/https://files.abl...

In our experience, many enterprise networks/VPNs/firewalls still break websocket connections even when using wss, and it should not be used as the only communication channel even if you target evergreen browsers.


Disclaimer: I work for Ably. I agree in principle, so the libraries that handle websockets and also fallback transports using comet (eg SocketIO) are still widely used for that reason, and the commercial pub/sub service providers generally also support comet fallbacks. However, we now find that it is really very rare that clients are unable to use wss.


WSS breaking is surprisingly common for clusters of our enterprise users, to the order of 5-10%. Specifically, when business users connect through security networks like Zscaler, often their employers will MITM all connections (similar to how AdGuard works) but in a way that breaks WSS. We have rigorous monitoring for both frontend and backend, and can trace these failures with accuracy - consistently, both the frontend and the ingress firewall think that the other one cancelled the connection attempt, so some network hop in-between did that. Every other network connection during that time works as expected (long polling/SSE over H2), but WSS get interrupted and sometimes can't reconnect. We still love the aliveness of web sockets, but have to architect data fetching in a way that doesn't depend on them working.


That's surprising to hear; we definitely don't see anywhere near the order of 5-10%, at best an order of magnitude less.

Out of interest, what geography and industries are you operating in where you see such a high rate of incompatibility?

Matt, Ably co-founder


China and LATAM are definitely overrepresented. We deal with conglomerates and manufacturers, so it's vaguely industrial + heavy industry, but all these users are corporate folks using standard-issue BigCo laptops (almost certainly loaded with standard "security" layers from a dozen different vendors, making detailed diagnosis nearly impossible for us as a SaaS provider).


Thanks for the explanation, useful to know.


Thank you for the mirror!


I wrote a document explaining how to implement your own web socket service: https://github.com/prettydiff/wisdom/blob/master/websocket_s...

Implementing your own service logic is incredibly helpful in the cases where you have multiple sockets to manage and custom logic associated with identity, reestablishment, custom data handling, and so forth. There are features in the protocol that aren't used in the browser, for example, and allow for custom scaling.

Here are my learnings about web sockets:

* They are session oriented so that means both end points have to agree to connect. That mitigates many security risks associated with HTTP traffic.

* Web socket messages cannot be interleaved. In 99% of cases this isn't an issue, because control frames unrelated to a web socket message can occur anywhere without interruption. This becomes a problem if you are transferring a large file that takes a substantial amount of transfer time. All other messages must wait in a queue, which means long-delayed microservice status updates, or you just break things.

* Web sockets are so much faster to process than HTTP. A web socket is primitive. There is no roundtrip (request/response), no headers, and no additional negotiation. I reduced some test automation in my personal application from 45 seconds to 7 seconds by fully converting from HTTP to web sockets for messaging.

* Reliance on web sockets simplifies so much of a service oriented application. I used to rely upon callbacks to HTTP responses to verify message completion and perform next step actions in an application. Instead I am switching to specific messaging for everything. A response is a specific message when the responding machine is ready. This eliminates response timeouts, flattens the architecture, and eases service testing by moving all messaging concerns to a single listener as opposed to listening for responses versus requests from other machines.

* Since web sockets are session oriented they are potentially more fragile than HTTP. If the pipe drops you have to reestablish the connection before sending/receiving service messages.


Mercure is an alternative to WebSocket that is especially useful for REST/GraphQL APIs. It's a protocol that builds on HTTP and Server-Sent Events thus is supported out of the box by browsers, mobile apps and IoT clients, and it doesn't suffer from most WebSocket limitations (e.g. header/cookie based authorization works):

https://mercure.rocks


I love Server-Sent Events, but just as a heads up in case anyone doesn't know there are some limitations to SSE:

* Doesn't natively support binary data.

* If you're using HTTP/1.1 in the browser, you'll be severely limited in the number of SSE connections you can have going at a time. If you're on HTTP/2 then it's not a problem.


> it doesn't suffer from most WebSocket limitations (e.g. header/cookie based authorization works)

Don't WS connections send headers? What's the limitation here?


Actually neither the WebSocket nor the SSE browser API allows sending custom headers (see https://stackoverflow.com/q/4361173/4363634, https://stackoverflow.com/q/36201347/4363634). The point is that the Mercure protocol specifies the authorization part in detail, so it's handled out of the box. With WebSocket you are on your own.


Can anyone explain why using websockets with anything other than a webstack is so much harder than using regular ol' POSIX sockets, given that at some level websockets live right alongside a UDP or TCP socket? When we looked around for a library to use in Ardour to add websocket support, the choices were slim and none of them provided an API as simple as the one for UDP/TCP sockets.


I think it mostly has to do with:

1) the large "implementation surface" of three components (TLS+HTTP+WebSocket), each of which by itself requires an API more complicated, if provided by a userspace library, than a kernel-provided TCP socket, and maybe

2) the fact that non-Web servers are rare enough, and WebSockets are still recent enough, that no "standard" library has emerged and had its edges honed down over time to support multiple applications, especially when the current era and funders of open source are probably less incentivized to create application-independent libraries than in earlier eras where "let's work together to create a free OS with minimal effort" was a larger share of the driving forces.

- With TLS you can certainly link with OpenSSL, but the API is more complicated than a kernel-provided TCP socket, and async/nonblocking TLS requires a much more complicated API. TLS sometimes requires a write in response to a read, and to do that in an apparently nonblocking fashion either (a) the application needs to include callsites back into the library in its event loop to tell the library when the underlying socket is writeable, just in case the library had something buffered it was hoping to write, (b) the library needs to run its own thread that blocks on the underlying socket, or (c) the library can only be used with languages that support async behavior in a more composable way, which is not C. None of those are good options.

- Parsing the incoming HTTP request is tricky and there's no "standard choice" for this either, e.g. a library that's been distributed in Debian/RedHat/Homebrew for >10 years and is depended-on by a bunch of applications.

- The WebSocket protocol requires that a server write a pong in response to an incoming ping. As with TLS, this means a nonblocking implementation requires a thread or integration with the application's event loop, but it's arguably even worse because WebSocket wants the server to respond soon to a ping. (By contrast, TLS-on-TCP is mostly designed so that an app can ignore the read or write direction as long as it wants.) So you don't just need to possibly queue up that pong and later call into the library when the socket becomes writeable; you need to make sure no other event is going to run or block for a long time in the meantime.
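
A rough sketch of the "queue up that pong and call back in when the socket is writeable" pattern, in Python for brevity. `PongQueue` and its method names are made up for illustration and are not from any particular library; the `send` callable is assumed to behave like a nonblocking `socket.send`, returning how many bytes it accepted.

```python
from collections import deque

PONG = 0xA  # pong opcode (a ping is 0x9)


class PongQueue:
    """Holds pong frames until the event loop says the socket is writable."""

    def __init__(self):
        self._pending = deque()

    def on_ping(self, payload: bytes) -> None:
        # Control frame payloads are limited to 125 bytes.
        if len(payload) > 125:
            raise ValueError("control frame payload too large")
        # FIN bit set, opcode pong, server->client so no mask bit.
        self._pending.append(bytes([0x80 | PONG, len(payload)]) + payload)

    def wants_write(self) -> bool:
        """The event loop should poll for writability while this is true."""
        return bool(self._pending)

    def on_writable(self, send) -> None:
        """Flush as much as the socket accepts; keep the rest for next time."""
        while self._pending:
            frame = self._pending[0]
            sent = send(frame)
            if sent < len(frame):
                self._pending[0] = frame[sent:]
                return
            self._pending.popleft()
```

The application still has to call `on_writable` promptly, which is exactly the event-loop coupling described above.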

So I think the comparison here may not be, "Why isn't there a library that provides an API for WebSockets that's almost as simple as a kernel-provided TCP socket?" (where the kernel basically runs its own thread and does the async work behind the scenes), but maybe more like, "Why isn't there a user-space library that implements QUIC [or nonblocking TLS, or user-space TCP] with a simple API?"

We have implemented a nonblocking C++ WebSocket/TLS server in the cleanest fashion we could (https://github.com/stanford-stagecast/audio/tree/main/src/ht...), also for a low-latency audio project, but it's still a ton of code and has to make its own assumptions/demands on how it gets invoked. If you wanted to adopt a WebSocket implementation into Ardour, I'd be happy to help you make that happen, but it sounds like you very reasonably were looking to outsource this to a library where your application isn't the only user.


Thanks for that very clear and informative answer.

We ended up using libwebsockets and it's fine. It runs in its own thread and the rest of the code doesn't have to care much about it. It would have been nice to just use the socket directly from liblo (OSC library), but the current arrangement seems perfectly OK.


websockets unfortunately have to start at the HTTP layer and negotiate down to the TCP wrapper layer, so you need at least a partial HTTP stack and everything else that involves to get there. This complicates things a lot. It's like stuffing a turkey, cooking it, then throwing away the meat to just eat the stuffing.
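
For a sense of what the HTTP part involves, here is a minimal sketch of the server side of the opening handshake as defined in RFC 6455. The function names are mine, and the header parsing is deliberately naive (no validation of the request line, `Connection`, or `Sec-WebSocket-Version`), so treat it as an illustration rather than a usable server.

```python
import base64
import hashlib

# Fixed GUID from RFC 6455, appended to the client's key before hashing.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(request: str) -> str:
    """Build the 101 response for a raw HTTP upgrade request (naive parsing)."""
    headers = {}
    for line in request.split("\r\n")[1:]:
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip().lower()] = value.strip()
    if headers.get("upgrade", "").lower() != "websocket":
        raise ValueError("not a WebSocket upgrade request")
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(headers['sec-websocket-key'])}\r\n"
        "\r\n"
    )
```

After this exchange the same TCP connection carries WebSocket frames instead of HTTP.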

In my experience, Boost Beast[1] is the easiest library to just get going with, but you have to deal with all the Boost-isms that come with it. libwebsockets is the 'standard' C implementation, but unless you know the websocket RFC front to back it's quite difficult to work with and has a lot of foot-guns.

[1] https://www.boost.org/doc/libs/1_78_0/libs/beast/example/web...


It's related to what you can buy versus build. When you use a WebSocket, there may not be a great solution that fits your need. If you use just HTTP, then there is a wealth of options available to you.

Fundamentally, there is nothing special about a WebSocket over a socket other than a special handshake, some framing, and the layering within an existing HTTP server. The problem is that the market of developers is vastly different. If you are a systems person, then chances are good you know sockets decently. If you are a typical web-dev, then the chances are not so great, and it's easy to make a mess that is then exposed to the world.

I've mentored teams, and the key challenge isn't technical but education on all the gotchas.


Anyone looked at the book? I feel a little bit spammed by this post. The linked page is a pitch for an ebook that is a free download, but you have to sign up for promotional mailings in order to get it, and they want your first and last name. Yes you can unsubscribe but this is still obnoxious. A direct link to a pdf would be much more attractive.

I haven't programmed anything with websockets yet, but I read the wikipedia page about them recently and found it sufficient to understand what they were. The rest is a matter of javascript programming that I've avoided messing with so far.


I just provide a spam email account for things like this.

I have yet to read through it much, but it is interesting.

https://files.ably.com/website/documents/ebook/the-websocket...


If you want something simpler for real-time communication you can use comet-stream. It goes through all firewalls and scales better than most single threaded websocket servers: https://github.com/tinspin/rupy/wiki/Comet-Stream


How does that work in the browser context? Just one long-lived HTTP request that the server streams messages to? How does the browser reply? It's hard to understand how it's duplex and real-time over just HTTP without making more than one HTTP request.


You have two sockets, one eternal response with transfer-encoding: chunked, which is basically <hex_length>\r\n<data>\r\n over and over again, so very compact.
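
As a rough illustration of that downstream framing, here is a sketch of HTTP/1.1 chunk encoding in Python. This is not code from rupy; the helper names are made up, and chunk extensions and trailers are ignored.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one message as an HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    if not data:
        raise ValueError("a zero-length chunk would terminate the stream")
    return f"{len(data):x}".encode("ascii") + b"\r\n" + data + b"\r\n"


# The zero-length chunk that ends a chunked body (never sent on an "eternal" response).
LAST_CHUNK = b"0\r\n\r\n"


def decode_chunks(stream: bytes):
    """Yield payloads from a buffer that holds only complete chunks."""
    pos = 0
    while pos < len(stream):
        eol = stream.index(b"\r\n", pos)
        size = int(stream[pos:eol], 16)
        if size == 0:
            return
        start = eol + 2
        yield stream[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the data
```

So a five-byte message costs five bytes of framing (`5\r\n` before it and `\r\n` after it) on the wire.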

From browser to server it gets a bit heavier because you need GET /path?data=<message> HTTP/1.1\r\nHost: kinda.verbose.com\r\n\r\n and then each request gets a response that can be either empty, so 200 OK\r\nContent-Length: 0\r\n\r\n, or contain a synchronous response. It looks bad, but trust me, that verbosity is a rounding error when it comes to the real bottleneck, which is CPU concurrent atomic parallelism, and for that you basically need to use Java:

https://github.com/tinspin/rupy/wiki (Most people disagree, but the VM + GC and Java's memory model allow for atomic shared memory like none other. Not even C/C++ can compete, because you need a VM with GC to make that memory model work. They tried to copy it into C++11 and that was a faceplant of epic proportions that is still the C++ memory model.)


Diving a little deeper down:

1 - A websocket "frame" has a variable-length header. Client->Server the header can be 6, 8 or 14 bytes. Server->Client it can be 2, 4 or 10. This is to support payloads of up to 125 bytes, less than 2^16 bytes, and anything larger that fits in a 64-bit length field. I wish it was just a fixed 4-byte length.

2 - Messages can be fragmented across frames to support streaming (where the sender or possibly a proxy doesn't know/want to buffer the entire response ahead of time). I feel like this is unnecessary in 99% of the cases. It wouldn't be so annoying... except control frames can be interspersed between the fragments of a message. This is so that you can send a "ping" while streaming a large message over multiple fragments. Why didn't they just use one of those reserved bits for this?

3 - Client->Server payload is masked with 4 bytes (bitwise xor) so every message your server gets has to be unmasked.
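
The three points above can be sketched in a few lines of Python. This is an illustration based on the RFC 6455 frame layout, not production code: it assumes the buffer already holds one complete frame, and the function names are my own.

```python
import struct


def header_size(payload_len: int, masked: bool) -> int:
    """Header bytes for a given payload length: 2, 4 or 10, plus 4 if masked."""
    if payload_len <= 125:
        size = 2
    elif payload_len < 2 ** 16:
        size = 4
    else:
        size = 10
    return size + (4 if masked else 0)


def parse_frame(buf: bytes):
    """Return (fin, opcode, payload) for one complete frame at the start of buf."""
    fin = bool(buf[0] & 0x80)      # FIN=0 means more fragments follow
    opcode = buf[0] & 0x0F
    masked = bool(buf[1] & 0x80)   # always set on client->server frames
    length = buf[1] & 0x7F
    pos = 2
    if length == 126:              # 16-bit extended length
        (length,) = struct.unpack_from("!H", buf, pos)
        pos += 2
    elif length == 127:            # 64-bit extended length
        (length,) = struct.unpack_from("!Q", buf, pos)
        pos += 8
    key = b""
    if masked:
        key = buf[pos:pos + 4]
        pos += 4
    payload = buf[pos:pos + length]
    if masked:
        # Unmask: XOR each payload byte with the matching byte of the 4-byte key.
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return fin, opcode, payload
```

The byte-at-a-time XOR is the simplest form; real servers usually unmask a word at a time.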


Does anyone know why WebSockets are sometimes considered a security risk and blocked by corporate firewalls, even when the rest of the website is considered trustworthy?


Anyone considering augmenting an existing web service with WebSockets should take a look at the Nchan nginx plugin.

https://nchan.io/


Does anyone know when to choose http2 vs websockets?


I'm assuming you mean HTTP/2 with Server-Sent Events, since raw HTTP/2 frames aren't exposed in the browser?

My answer would be use HTTP/2 + SSE whenever you can get away with it. The primary limitation of SSE in this case is you can't natively send binary data (you would have to base64 encode it or something). If you're just using JSON or another text format anyway this isn't an issue.
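
A minimal sketch of the base64 workaround, assuming a hypothetical event name and helper functions of my own. The wire format (`event:` and `data:` lines terminated by a blank line) is the standard text/event-stream format.

```python
import base64


def sse_event(data: bytes, event: str = "") -> str:
    """Format binary data as one text/event-stream event, base64-encoded."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append("data: " + base64.b64encode(data).decode("ascii"))
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def decode_sse_data(event_text: str) -> bytes:
    """Recover the binary payload from an event produced by sse_event()."""
    payload = "".join(
        line[len("data: "):]
        for line in event_text.splitlines()
        if line.startswith("data: ")
    )
    return base64.b64decode(payload)
```

Base64 inflates the payload by roughly a third, which is the main cost of going this route.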


Wouldn't HTTP2 imply https://github.com/grpc/grpc-web instead?


Maybe. I've never seriously considered grpc-web since it requires a special proxy (Envoy) to work. Seems like too many layers of complexity at that point.


Anyone know of any examples of cool cases where websockets have been used? (Maybe other than games.) I feel like in most cases I see them used the latency gained is basically added back with bloat in other parts.


Just this past weekend, I created a service for transferring files that is built on websockets (https://qr.dibble.codes / https://github.com/acdibble/qr-transfer). It's still rough and very MVP, but it's functional.

My use case was I wanted to transfer a PDF to an Android-based tablet but in the moment didn't want to log into my email. I couldn't think of any quick, easy, and cross-platform solutions so I decided to write the service.

It's built on socket.io which is a godsend for websockets because it automatically handles so much grunt work and your app just works.


Nice! I'm curious what you think of https://patchbay.pub.


I just launched this site with websockets: https://hackernews.pro Websockets used for updating Story/Comment data, and User presence


Feel free to sign up for an Ably account and we'll help support your project with a community package, we love what you're doing!


We use them for a variety of intranet realtime dashboard type things. Both user-facing production UI, where users are looking at screens covered with dozens of dashboards all updating several times a second, and developer-facing monitoring.

The client-side code is generally very simple. A dashboard opens a websocket and attaches a handler which parses the message (all our payloads are JSON), then routes the update to the right bit of the UI. We wrote a thin wrapper round the browser websocket API to handle disconnections. We wrote server-side libraries to support our patterns of use.
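
To make the "parse, then route" step concrete, here is a small sketch (in Python rather than browser JavaScript). The `widget` and `data` field names and the `Dashboard` class are invented for the example and are not our actual message schema.

```python
import json


class Dashboard:
    """Routes incoming JSON messages to the handler for one part of the UI."""

    def __init__(self):
        self._handlers = {}

    def on(self, widget, handler):
        """Register the handler for updates addressed to `widget`."""
        self._handlers[widget] = handler

    def handle_message(self, raw):
        """Parse one message and dispatch it; return False if nobody handles it."""
        update = json.loads(raw)
        handler = self._handlers.get(update.get("widget"))
        if handler is None:
            return False
        handler(update["data"])
        return True
```

In a browser the equivalent would be the body of the socket's message handler.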

I initially did a bunch of developer-facing dashboards using server-sent events, because they're slightly easier to work with. However, websockets have a significant advantage over SSE: you can have a lot more of them open at once. Browsers will limit you to a few (six?) connections per origin, including long-lived SSE connections, whereas you can have dozens or hundreds of websockets open [1]. If you are serving lots of different dashboards off a single server, you rapidly get to the point where this matters!

[1] https://stackoverflow.com/questions/26003756/is-there-a-limi...


Crypto exchanges almost exclusively use websockets for pushing out real-time price and orderbook changes. They provide rest apis also, but with limits that prevent bots from keeping the data as close to in sync as possible, which is important for algorithmic trading.

Whether this qualifies as a 'cool' case is probably subjective, but it is important in practice.

I'm not sure if it's as widespread, but many exchanges use websockets in the front-end to make the same data available to the site users without frequent polling.


Pretty much every app that has a real-time communication component relies heavily on websockets. Slack and other messaging apps, document editors, financial tickers, sports sites.


wrt gaming, there was a websocket game of spaceships flying around posted to HN a few times where the clients open two websocket connections to, iirc, get around TCP head-of-line blocking. Maybe they were sending round-robin updates across them.

Couldn't find the Show HN just now but searching "subspace" comments might reveal it ("this reminds me of subspace"). I always wanted to look more into the approach.


Our customers use Ably across such a diverse set of use cases, but here's a few:

1. Sports events streaming live updates (see https://ausopen.com/live-scores, we are streaming live scores for the Tennis Australian Open right now). Companies like Toyota even use us to facilitate engineers tweaking the performance characteristics of their cars in realtime remotely.

2. Edtech - we have numerous customers using us to drive live classroom environments, think shared white boards, collaborative tests, presence, teacher engagement. You may have used Codewars in the past, that uses Ably under the hood for example, https://www.codewars.com/.

3. Live screen sharing and collaborative applications. You may have used the amazing Tuple.app, that uses Ably under the hood https://tuple.app/.

4. Collaborative and live web and mobile SaaS applications, where changes need to occur concurrently, notifications need to be presented, and other realtime updates are needed in the interface. You've probably heard of Hubspot, they use Ably under the hood to power their collaborative and live features, https://hubspot.com

5. Developer infrastructure and platforms that need realtime capabilities at scale. You have probably come across Split, the leading feature flag company backed by Atlassian and Microsoft. They use Ably under the hood to power billions of feature flag updates in realtime each month. https://www.split.io/

6. Financial market data - typically streaming updates to thousands or millions of subscribers with very low latency, and sometimes using features like Deltas (https://ably.com/documentation/realtime/channels/channel-par...) to keep bandwidth & latency as low as possible.

I could keep going, but I hope that gives you the idea, realtime is not just for Christmas or for Games :)

Matt, co-founder of ably.com


I have an app based around some long running resource intensive processes that run on scalable microservices. So you click a button and may have to wait 5+ minutes before you get a response. Websockets let you do notifications and make everything async. They've served me well for request/response that might take 3-30 minutes.


A bunch of things we (Fanout) have seen as a WebSocket provider, other than games: shared document editing, field worker management, live voting, parking space tracking, feature flags, sports updates, home security, ridesharing, financial tickers, notifications, news feeds, exercise classes, realtime audio analysis, and remote controls.


These are great examples. I'd add delivery tracking/any geolocation tracking works great with websockets. And of course chat messaging.



HN is generally reserved for interesting articles. Not self promoting your services with a cookie cutter email harvester for your drip campaign.


I bit and got the book, hoping for something interesting. It’s more or less the documentation on MDN or a WS library docs rephrased.

Finally, after 60 pages of docs I can Google, there’s a section called “Scaling Websockets”, which is an interesting and challenging topic.

Turns out it’s one paragraph long, saying - “Yeah it’s hard. You should consider using Ably. Next book will cover it.”

Shameful.


Hey Aparsons, thanks for your feedback. As mentioned in the post, the Handbook is not finished and we'll continue to evolve it, adding more chapters and more meat to the current chapters, and we have other ideas like exercises.

It's good that you are interested in Scaling WebSockets, because that is the chapter I am writing now! I hope to get it live in a couple of weeks, once it is I can send you the new version.


Thank you for taking the time to confirm my suspicion, greatly appreciated!



