Heroku WebSockets Now in Public Beta (heroku.com)
159 points by coloneltcb on Oct 8, 2013 | 51 comments



Documentation for using this with Python and Flask :)

https://devcenter.heroku.com/articles/python-websockets


I wonder how many WebSocket connections one dyno can take?

Also seems like this would make heroku's routing problems[1] even worse.

[1] http://news.rapgenius.com/Jesper-joergensen-routing-performa... et al.


> Also seems like this would make heroku's routing problems[1] even worse.

No, this will not make routing issues worse.

The routing issues described in that post specifically apply to single-threaded / non-concurrent applications (or those with very low concurrency, such as Unicorn w/ 2 workers).

WebSocket connections are like requests that last forever. If you're using WebSockets in your app, you'll need your app to be highly concurrent in order to maintain lots of open connections to your users. You don't want regular requests to block behind never-ending WebSocket requests.
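
To make that concrete, here's a minimal sketch of the kind of highly concurrent handler WebSockets call for, assuming gevent plus the third-party gevent-websocket package (the port and names are illustrative):

    from gevent import pywsgi
    from geventwebsocket.handler import WebSocketHandler

    def app(environ, start_response):
        ws = environ.get('wsgi.websocket')
        if ws is None:
            start_response('400 Bad Request', [('Content-Type', 'text/plain')])
            return [b'expected a WebSocket handshake']
        # This loop effectively runs forever -- one cheap greenlet per
        # client, so thousands of open sockets don't block other requests.
        while True:
            msg = ws.receive()
            if msg is None:  # client disconnected
                break
            ws.send(msg)
        return []

    pywsgi.WSGIServer(('0.0.0.0', 5000), app,
                      handler_class=WebSocketHandler).serve_forever()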

Random routing should actually work pretty well on apps with high concurrency. Node.js, Play, Go, Erlang, and even Ruby apps with Faye should all work great.

If you're concerned about this for your app, the best way to find out is to test it!


That's a little disingenuous. You've selected a very specific set of frameworks, whereas most of the users here are probably thinking "great! I can run my Rails stack with no problems! Heroku engineer said so!"

Note to readers: the only thing that "fixes" this is if the request-handling code is asynchronous in such a way that it doesn't block a process while connections are held idle. Most of the common web frameworks don't do this, because the coding required to make a fully asynchronous stack is nasty. Even apps written in nominally asynchronous frameworks (like node.js) could be in trouble if the request path is pathological (e.g. the WebSocket handler periodically makes long-running, blocking database queries).
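
A toy illustration of that pathological case (in Python's asyncio, purely for illustration):

    import asyncio
    import time

    async def pathological(client_id):
        # Looks async, but time.sleep() blocks the event loop: while this
        # "database query" runs, every other open connection is frozen.
        time.sleep(1)
        print(client_id, "done (and stalled everyone for 1s)")

    async def well_behaved(client_id):
        # Yields to the loop, so thousands of these can overlap.
        await asyncio.sleep(1)
        print(client_id, "done (cooperatively)")

    async def main():
        # Swap in pathological() and total time goes from ~1s to ~10s:
        # the event loop ends up serializing the blocking calls.
        await asyncio.gather(*(well_behaved(i) for i in range(10)))

    asyncio.run(main())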

That said, most of you will never encounter this problem, because it's the sort of problem that's "nice to have" -- by the time concurrency issues become a limit, your app will be popular.


I don't think that's disingenuous.

It's safe to assume that anyone who hopes to leverage websockets will not be using a blocking application architecture.

We can also assume people are running this software on multiprocessing machines with connections to the internet.

Of course there are always people "doing it wrong," but caveating every potential misunderstanding is a slow way to communicate.

edit: Your post is still valuable though! Thanks for highlighting what makes these frameworks sensible for use with websockets.


"It's safe to assume that anyone who hopes to leverage websockets will not be using a blocking application architecture."

No, it isn't. I'll wager that right this very second, there's someone out there incorporating websockets into their Heroku-based Rails app and not thinking about (or understanding) the consequences.


Wouldn't such an app be hosed on almost any platform due to massive memory waste?


I don't think memory waste is the problem in this case; a WebSocket is a long-lived connection. If you mix it with regular requests and don't think about the concurrency consequences, you'll be able to serve 1 request, then allow 1 WebSocket connection, and you're done. All other connections will be pending until the WebSocket is closed.


bgentry's comment didn't seem disingenuous to me at all, and I was surprised by sync's question. WebSocket connections are long lived; thus, if your framework only supports one (or a few) concurrent connections, you're gonna have a bad time.

Heroku's past routing problems with certain low-concurrency frameworks/servers don't apply with WebSockets, because you'd be crazy to use such a framework for WebSockets.


"if your framework only supports one (or a few) concurrent connection you're gonna have a bad time."

Rails only supports one concurrent connection per process (by default...for good reasons), and there are a great many people using it at scale, including on Heroku. Asynchronous stacks are becoming more common, but they're still exotic in terms of deployment -- and most of those probably aren't written very well.


I'm specifically talking about WebSockets. Do you really want to run one process for every client connected to your WebSocket server? The answer is no. Even one (OS) thread per connection can get unwieldy.

And I think lots of people would disagree that async stacks are still "exotic" or "not written very well".


"Do you really want to run one process for every client to connected to your WebSocket server? The answer is no. Even one (OS) thread per connection can get unwieldy."

Yes, no kidding. But people will still try to do this with frameworks that don't support anything else (like Rails), because that's the shortest path to a working product.

"And I think lots of people would disagree that async stacks are still "exotic" or "not written very well"."

Well, those "lots of people" can disagree all they want, but they're wrong. The problem isn't that the frameworks are badly written, necessarily -- it's all the stuff in the stack, including the app-specific logic. Virtually no one knows how to write asynchronous web apps of any complexity. It's a very hard problem.


This is why greenlets exist.
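
A sketch of what that buys you, assuming gevent (greenlets plus monkey-patching): ordinary blocking socket code becomes cooperative, so one process can hold thousands of connections.

    from gevent import monkey
    monkey.patch_all()  # must run before anything else imports socket

    import socket
    import gevent

    def slow_client(i):
        # Plain blocking socket code, but under gevent each call only
        # suspends this greenlet, not the whole process.
        s = socket.create_connection(("example.com", 80))
        s.sendall(b"HEAD / HTTP/1.0\r\nHost: example.com\r\n\r\n")
        s.recv(1024)
        s.close()

    # 100 "blocking" clients running concurrently in one OS thread.
    gevent.joinall([gevent.spawn(slow_client, i) for i in range(100)])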


As of Rails 4, the current release of Rails, you're incorrect.


Yes, Rails 4 has threading turned on by default now. That eliminates the absolute stupidity of needing one process per concurrent request (finally!).

It's nice that you guys are adding this stuff, but it doesn't invalidate the larger point.


Can anyone explain why Heroku had to enable WebSockets in the first place?


Since we launched the Cedar stack, we've used AWS ELBs as the front layer in our routing stack. Because we had only ever allowed regular, short-lived HTTP requests through our stack, we opted to use these in HTTP(S) listener mode [1]. When used in HTTP(S) mode, ELBs have historically been very strict about what traffic they allow through.

As the WebSocket standard is very recent, it has never been supported by ELBs in HTTP mode.

ELBs in TCP mode can support any TCP traffic. It's become clear that we need this flexibility, so we're moving to TCP-mode ELBs now. This was not trivial, though, as long-lived connections (like those used for WebSockets) have different implications for our HTTP routers. That had to be taken into account.

Nonetheless, we've had a private beta for a long time that worked as described above. But we deemed it insufficient for general customer use because TCP-mode ELBs mean that you lose the client's source IP and protocol. Fortunately, ELBs now have Proxy Protocol support [2], which allows us to keep that request information that Heroku apps typically rely on.

[1] More info on ELB modes: http://docs.aws.amazon.com/ElasticLoadBalancing/latest/Devel...

[2] http://aws.typepad.com/aws/2013/07/elastic-load-balancing-ad...
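
For the curious, the Proxy Protocol [2] just prepends one human-readable line carrying the original source address to the TCP stream. A minimal v1 parsing sketch (not our actual implementation):

    def parse_proxy_line(data):
        # e.g. b"PROXY TCP4 203.0.113.7 10.0.0.1 56324 443\r\n<payload>"
        line, _, rest = data.partition(b"\r\n")
        parts = line.split(b" ")
        if len(parts) != 6 or parts[0] != b"PROXY":
            raise ValueError("not a Proxy Protocol v1 header")
        _, proto, src, dst, src_port, dst_port = parts
        return (src.decode(), int(src_port)), rest

    header = b"PROXY TCP4 203.0.113.7 10.0.0.1 56324 443\r\nGET / HTTP/1.1\r\n"
    print(parse_proxy_line(header))
    # => (('203.0.113.7', 56324), b'GET / HTTP/1.1\r\n')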


How did you solve the ELB's 60s disconnection of idle WebSocket connections? I know the timeout can be increased, but not by much.

EDIT: also, when I did load testing with ELB two years ago and created 1 million concurrent WebSocket connections, I received mail from Amazon asking what the hell I was doing and was asked to stop ;)


The ELB idle connection rule still applies: https://devcenter.heroku.com/articles/heroku-labs-websockets...

You'll need to send some data every ~55s in order to keep the connection alive.
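
For example, a server-side heartbeat (a sketch assuming a gevent-websocket-style `ws` object; the payload and interval are arbitrary):

    import gevent

    def heartbeat(ws, interval=50):
        # Any frame resets the ELB's idle timer; 50s stays safely
        # under the ~55s limit.
        while not ws.closed:
            ws.send('{"type": "ping"}')
            gevent.sleep(interval)

    # inside your WebSocket handler:
    #     gevent.spawn(heartbeat, ws)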


This idle timeout can be increased by at least an order of magnitude now, upon request to AWS. Open a ticket.


> As the WebSocket standard is very recent, it has never been supported by ELBs in HTTP mode.

WebSocket has been around since 2010. This is just confirmation to me that Heroku isn't a good fit for apps targeting new web technologies. It's a good fit for apps targeting IE8.


By that logic AWS isn't a good fit for new web technologies either, since that was the blocker here.


This is why I would never use any PaaS, only IaaS. IaaS providers don't do blog posts proudly declaring that they now support 3 year old technology.


> IaaS providers don't do blog posts proudly declaring that they now support 3 year old technology.

http://aws.typepad.com/aws/2013/07/elastic-load-balancing-ad...


AWS is an IaaS ??


ELB is their PaaS offering.


I think you're thinking of Elastic Beanstalk


WebSocket connections are very different from typical HTTP requests: very long lived, and they generally cannot be buffered.


Woohoo! Finally the weiqi app on Node.js + socket.io we built a year ago is ready for prime time:

http://evening-meadow-5281.herokuapp.com/

It's pretty shoddy, but this might give me some inspiration to move forward.

The nice thing about socket.io is that it prefers WebSockets but will fall back to long polling if they aren't available on both the client and server.

So this didn't even require a redeploy, just a `heroku labs:enable websockets`

And like magic I can support more than 10 concurrent users ;)


Is it possible to use something like Socket.io that falls back to long-polling on older browsers?

The biggest problem I've had with Socket.io on PaaSes is that it pretty much expects every request from a particular client to be routed to the same backend (i.e. sticky sessions), which works OK with a single backend but obviously doesn't scale.

Given Heroku's stateless routing architecture, I'd guess no, but maybe there's something I'm missing?


My little weekend project (http://chatzilla.herokuapp.com/) that I put on Heroku a few weeks ago uses socket.io with Flask and gevent; socket.io would automatically fall back to XHR polling. Using socket.io is great for these cases since it's all done automagically.


If it does require sticky sessions, it will not work with more than one dyno: https://devcenter.heroku.com/articles/heroku-labs-websockets...


It looks like socket.io has a `RedisStore` that you can use on loadbalancing setups without sticky sessions (like Heroku): https://github.com/LearnBoost/Socket.IO/wiki/Configuring-Soc...


I'm not sure that solves the sticky session problem; I think it may just be for broadcasting to channels. See the last line of this answer: http://stackoverflow.com/a/9275798/113

"A potentially big drawback with this is that socket.io sessions are not shared between workers. This will probably mean problems if you use any of the long-polling transports."


Awesome! Maybe I will try out some SockJS on it! I decided last night to use SockJS for a personal web application for streaming some stock analysis.


I suspect some of SockJS's fallback transports won't work, see my other comment on Socket.io: https://news.ycombinator.com/item?id=6517138

Hopefully I'll be proven wrong.


Since they're already likely decoding and parsing the websocket protocol to proxy it, it'd be nice if they offered something like "SockJS-as-a-service", where Heroku handles all the different connection termination types, and just exposes a plain websocket (or maybe a raw TCP socket, even!) to your backend.


The way I've handled it in the past in Varnish is just to look for the upgrade request and do basically: "Whoa, they wanna do websockets? I give up, return(pipe);" which means Varnish will from then on just ferry bytes from one sock to the other.
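
Roughly this, as a VCL sketch (close to the example in the Varnish docs):

    sub vcl_recv {
        if (req.http.Upgrade ~ "(?i)websocket") {
            return (pipe);
        }
    }

    sub vcl_pipe {
        # Pipe mode drops hop-by-hop headers, so copy the handshake
        # headers through to the backend.
        if (req.http.upgrade) {
            set bereq.http.upgrade = req.http.upgrade;
            set bereq.http.connection = req.http.connection;
        }
    }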


What is the average latency for the service?

From my benchmarks, there is a measurable difference between cloud VPS WebSocket latency/bandwidth and dedicated server WebSocket latency/bandwidth.


With a simple echoing Node.js WebSocket server I'm getting about 50 ms (each way) on Heroku, DotCloud, and Engine Yard. That's probably not much more than the network latency.
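
For reference, roughly how I measured it (a sketch using the third-party websocket-client package; the URL is a placeholder):

    import time
    import websocket  # pip install websocket-client

    ws = websocket.create_connection("ws://your-app.herokuapp.com/echo")
    samples = []
    for _ in range(20):
        start = time.time()
        ws.send("ping")
        ws.recv()  # wait for the echo
        samples.append((time.time() - start) * 1000.0)
    ws.close()

    # round trip, so roughly 2x the one-way latency
    print("median round trip: %.1f ms" % sorted(samples)[len(samples) // 2])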


This is awesome! I was literally working on a project today that needed WebSockets, and I was so bummed out that I had to use Socket.IO and fall back to long polling.

If you Heroku guys are still listening... I'm so fucking impressed that you already updated the docs I was using earlier. I was going to suggest you update them... but then I went to grab the link and it was updated. Well done.


I'll forward your remarks to the devcenter staff and the steward of the changes; I'm sure both parties will appreciate it a lot.


Any particular reason why blogs.heroku.com would want to know my location? Isn't the IP generally enough?


Scroll down and you'll see a geo-demo that is better off with your browser's reported location.


Whelp, looks like they just killed Pusher: https://addons.heroku.com/pusher


Good; websockets are a feature, not a company.


Not at all; you still need to manage state across processes/dynos. Either you build that yourself with something like Redis, or you outsource it to something like Pusher and relieve the load on your own web service in doing so.
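
The DIY version, very roughly (a sketch assuming the redis-py package; names are illustrative): each dyno subscribes to Redis and forwards published events to its own connected sockets.

    import redis

    r = redis.StrictRedis()

    def fanout_loop(local_sockets):
        # Runs on every dyno; delivers each published event to the
        # sockets connected to *this* dyno.
        pubsub = r.pubsub()
        pubsub.subscribe("events")
        for msg in pubsub.listen():
            if msg["type"] != "message":
                continue
            for ws in list(local_sockets):
                ws.send(msg["data"])

    # any process can publish:
    # r.publish("events", '{"hello": "world"}')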


I needed this feature 1.5 years ago. Thanks Heroku.


Thanks Amazon.


Thanks Amazon?

This is probably a reference to Amazon forbidding the use of any software load balancer other than ELB.

I also read on reddit that there is in fact just ELB. And if you require a load balancer and don't host on AWS, you're pretty much screwed.

[Note: sarcasm.]


I've been using this with a Play Framework + Akka + Scala app and it's working great! Thanks Heroku.



