Scaling Secret: Real-Time Chat (medium.com/davidbyttow)
136 points by danielalmeida on May 13, 2015 | 57 comments



50000 concurrent users? That's small enough to do with a single thread on a modern server.... What are they scaling?


Your point is valid. This isn't ultra-high scale, just the name of the series I intend to publish more into. It's more about building a simple service for instant usage upon launch.

We also used Pusher for global in-app notifications, which had to scale past 1,000,000 concurrent connections.

Aside: 50000 concurrent requests on a single thread, huh? :) Unless I'm missing something, that's 50 seconds assuming a request takes only 1ms to process. Sounds like magic.


1ms is high for a request, assuming that you touch only RAM in a pattern with high locality (which you can usually manage when routing chat messages). Take a look at the TechEmpower benchmarks: most of the Java and C++ frameworks can manage 1M req/sec for JSON serialization:

https://www.techempower.com/benchmarks/#section=data-r10&hw=...
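
The two figures in this exchange can be put side by side with a rough sketch. It assumes requests are drained strictly one after another on a single thread; the 1 ms and 1M req/sec rates are the ones quoted in the comments above, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the serial-processing numbers quoted above.
REQUESTS = 50_000


def serial_seconds(requests: int, seconds_per_request: float) -> float:
    """Time to drain `requests` one after another on a single thread."""
    return requests * seconds_per_request


# Parent comment's assumption: 1 ms per request.
at_1ms = serial_seconds(REQUESTS, 1e-3)

# Reply's assumption: about 1M requests per second, i.e. 1 microsecond each.
at_1m_rps = serial_seconds(REQUESTS, 1 / 1_000_000)

print(f"{at_1ms:.0f} s at 1 ms per request")
print(f"{at_1m_rps * 1000:.0f} ms at 1M requests/sec")
```

So the disagreement is really about per-request cost: the same 50,000 requests take about 50 seconds or about 50 milliseconds depending on which rate you believe.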


> Aside: 50000 concurrent requests on a single thread, huh? :) Unless I'm missing something, that's 50 seconds assuming a request takes only 1ms to process. Sounds like magic.

1ms to route a chat message is plenty.

e.g. on a single AWS m1.small prosody[1] can process 40,000 stanzas per second.

[1] http://prosody.im


It's not just routing. It has to read-modify-write from the DB first.


Clarified the 1MM number in the post.


It really depends on a lot of things such as how often users send messages on average and what kinds of messages (JSON or plain text?). The latest version of SocketCluster can handle 25000 concurrent users per CPU core each sending a JSON message every 6 seconds so the 50K number does match my own benchmarks if you assume that each user sends a JSON message every 12 seconds on average.
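
A quick sketch of why those two figures line up, using only the numbers in the comment above. The helper name is hypothetical, and this is arithmetic, not a benchmark.

```python
def messages_per_second(users: int, seconds_between_messages: float) -> float:
    """Aggregate inbound message rate if every user sends at a fixed interval."""
    return users / seconds_between_messages


# Quoted benchmark: 25,000 users per core, one JSON message every 6 seconds.
per_core_benchmark = messages_per_second(25_000, 6)

# The article's load: 50,000 users, assuming one message every 12 seconds.
article_load = messages_per_second(50_000, 12)

print(f"{per_core_benchmark:.0f} msg/s vs {article_load:.0f} msg/s")
```

Both work out to roughly 4,167 messages per second, so under that assumption the 50K case is about one core's worth of the quoted benchmark.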


50,000 people chatting + persistence, albeit temporary.


"It grew week-over-week to well over 1,000,000 concurrent connections (chat + notifications). Luckily, Pusher worked well and was fairly inexpensive" I'd love to find out what they mean by inexpensive because from pusher.com/pricing 10,000 connections is $399/month. I have no idea what 100x connections would cost (hopefully not $39,900/month). Frankly, the pricing of pusher and other competitors seems insane to me. I was considering pusher for my startup, but I realized that if my startup became mildly successful, I'd be stuck with a bill that would force my business to rapidly build another solution or take on excessive debt. Why begin with prototype technology that could lead your startup (if successful) to become insolvent?

It's not as if there is a shortage of available tech for handling socket connections. It took about a week for me to learn enough node.js and build a service with socket.io that did everything I needed.


10,000 connections is $399/month is about the price of a dedicated server just for chat.

Depends on your application. Generally, if you have 1,000,000 concurrent users, shouldn't you be making some money or have investment to tide you over?

My app has ~3000 concurrent users, which comes out to 250,000 users per day, making on the order of $XX,XXX per month.

At 1,000,000 concurrent, we'd expect to have 83 million users per day visiting the site. At that rate, we'd be making at the very least hundreds of thousands of dollars per month and could easily afford a $39,000/mo chat service.
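
The scaling in this comment can be reproduced with a small sketch. The concurrent-to-daily ratio comes from the commenter's own app. The linear price extrapolation from the $399 per 10,000 connections tier is an assumption taken from elsewhere in the thread, not Pusher's actual pricing at that volume.

```python
CONCURRENT_NOW = 3_000
DAILY_USERS_NOW = 250_000

# Daily users seen per concurrent user in the commenter's app (about 83).
daily_per_concurrent = DAILY_USERS_NOW / CONCURRENT_NOW


def projected_daily_users(concurrent: int) -> float:
    """Scale daily users linearly with concurrent users."""
    return concurrent * daily_per_concurrent


def linear_monthly_cost(connections: int,
                        price_per_block: int = 399,
                        block_size: int = 10_000) -> float:
    """Assumed linear extrapolation of the quoted $399 per 10,000 connections."""
    return connections * price_per_block / block_size


daily = projected_daily_users(1_000_000)
cost = linear_monthly_cost(1_000_000)
print(f"{daily / 1e6:.0f}M daily users, ${cost:,.0f}/month")
```

That gives roughly 83 million daily users and $39,900 per month, which is where the "$39,000/mo" figure comes from.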


Some apps have lower profit margins than others. Also, some companies need to be able to integrate with their realtime channels on the backend (e.g. for doing custom realtime analytics, integrating with third-party services, applying transformations/filtering to messages in realtime, capturing certain messages for storing in your database, launching various realtime-based background tasks, etc.). You do lose a lot of flexibility by going with a service like Pusher, though it is well designed and very convenient.


"10,000 connections is $399/month is about the price of a dedicated server just for chat." So you're saying that pusher is breaking even on their service and that a custom socket server should also expect to require similar hardware in order to serve 10,000 connections? If so, I highly doubt either is the case. I've seen 50,000 sockets served on a $20/month VPS using socket.io without issue.


For those who think Pusher $399 for 10k concurrent is a lot you have Realtime (http://framework.realtime.co/messaging) with $250 for 25k (including 251 million messages every month). Just saying...


My friend has started making Pushman, a free alternative to Pusher. Worth looking at http://pushman.dfl.mn/


Secret raised $35M.

$399/month is a tiny drop in that bucket. It is almost certainly not a good use of engineering resources for Secret to build their own in-house Pusher alternative, even if it would only take a single engineer a few days.


I suppose if you have boat loads of cash to burn, a couple thousand a month would not be a big deal. For a bootstrapped startup, money is a concern.


SocketCluster's API is similar to that of Pusher so it can be a good self-hosted alternative: http://socketcluster.io/ - It's also designed with scalability in mind. You do have to manage presence yourself, but that shouldn't be too difficult. You can keep track of user presence in your own database, which could actually be an advantage: you can easily use that presence data on your back-end to do other things, so it's easy to integrate.

We're currently working with a couple of fast growing startups so it's getting traction (still early adopter stage though). We'll be announcing our native iOS client near the end of this week. (disclosure: I'm the main author)


If they really had 1 million concurrent active users "assuming 5% of our simultaneous actives were chatting, that we needed to be ready to support up to 50,000 users chatting at once", and were seeing growth "chat was an instant hit. It grew week-over-week to well over 50,000 concurrent connections", why did they shut down?


I would say Secret did not have enough imagination to monetise those numbers, or had a moral reason against that kind of money making.


the simple answer is "not enough growth"


There isn't enough of this information out there. Scaling things is incredibly hard, a lot more difficult than some would think. Not only do you have limitations around the software, but hardware and the problems that come with scaling hardware. Great article, looking forward to some more specifics (load balancing, memory/cpu/disk issues) in future articles.


This isn't great for Layer. I think most companies want to own their chat stack. It's not complex enough, like say Stripe, to need a 3rd party. Has anyone used Layer and felt they added value?


They used Google's database service and Pusher, so only in a limited sense do they "own their chat stack".

Are there any true peer to peer chat apps left, or do they all go through some server now, even after call setup?


I really doubt it. Remember when you could just send someone a file, because their IM client allowed you to connect to each other directly?

I wonder if there's a good way to do this now, as some plug in somewhere... Even XMPP's file forwarding functionality fails for me.


Broadband and wireless routers kinda killed this. It's now typical for every computer on your home network to be behind a NAT, so to enable direct connect, you need to manually punch a hole in your router's firewall. The spyware situation of the early 2000s (which caused Microsoft to bundle a personal firewall with WinXP) didn't help either. Most users are not willing to fiddle with their network connection just so someone on the Internet can directly connect to them.


WebRTC allows for a peer-to-peer data channel and does the heavy lifting of NAT punch through for you. I've mostly been messing with the video chat side of things recently, but would be curious to see if the data channel could be used for large file transfers.


Hardly killed it. There's a reason UPnP exists.


"Are there any true peer to peer chat apps left"

Bittorrent recently released 'Bleep' which supports encrypted peer-to-peer chat including image transfer. It holds the message locally until a direct connection to the receiver is established.


How is this news? Chat has been around for a long time and Pusher is nothing more than a cash grab for lazy developers. You can easily get better performance, scalability, and customizability, for much cheaper with socket.io and a caching and/or messaging service (Redis, *MQ).


Lazy is smart. It allows you to focus on adding value. Not needing better performance, scalability and customizability is not only reasonable: it's highly probable. Building and maintaining is not cheap.


I find it distressing that a "mid-brow dismissal" such as yours is the top comment.


Is it "mid-brow dismissal"? I'll submit to your criticism of my tone, but I believe a strong argument lies behind it. It's not that I mean to discredit what Secret has accomplished, only that people shouldn't be wooed into thinking Pusher is the way to accomplish it. It's a waste of money in my opinion, and the DIY alternatives will, I assume, prove to be much better options in the long run.


>>I'll submit to your criticism of my tone, but I believe a strong argument lies behind it.

The gist of your argument is that Pusher is "nothing more than a cash grab for lazy developers," which insults not just the creators of the service but also people who use it.

Just because DIY alternatives exist for something does not mean that thing should not be used, or that people who use it are "lazy."


Well, it's not exactly "DIY" if you consider that many years of work have been put into building the libraries and software that solve these problems.

I personally find it ridiculous to pay for a service like Pusher, when I know that I can accomplish the same goals with FOSS without much extra effort. For the record, I feel the same way about Heroku, so take my opinion for what it's worth... I don't intend to hide my bias.

I believe people should understand and take responsibility for their systems (when possible) instead of just defaulting to relying on proprietary, closed services and software.

Will Pusher still be supported in 5 years? Will it still be affordable? Will it remain stable? Will the number of connections I'm allowed on my current plan stay the same? These are questions I don't have to ask.

I admit my assertion that Pusher is "...nothing more than a cash grab..." is a little hyperbolic, and I mean no disrespect to the people who created and maintain it, but in some way it is true. Pusher is a business that relies on its users not having the time or know how to implement the service it provides on their own.

Max 20 connections... Want 21? pay $20/month...

I can copy and paste the example chat from socket.io and push it to a VPS and instantly do way better for much less.


> a business that relies on its users not having the time or know how to implement the service it provides on their own.

By that rationale, pretty much every business on the planet that provides a service is nothing more than a cynical cash grab. Maybe you need to grow up a bit.


Of course this is how business operates... but consider the fact that you could pay someone to build you an intricate watch, and that you could also pay someone to brush your teeth for you because it's a little easier than doing it yourself.

I respect Pusher and what they do to make money. There is a market, and Pusher is there to fulfill its needs. I cannot take issue with that. All I'm saying is that there are better options.

Anyways, I see a lot of tearing apart specific things I say, but not a lot of reasons why relying on Pusher is a better option than rolling your own... I welcome that sort of discussion.

For now, it's time for bed.


I'll give the answer: because at that stage you're still trying to find product-market fit. If you can save $1k (say - and the dev time to develop a Pusher alternative probably costs more than that) in the validation stage by using an approach that will cost you 5x more in the long run - then that's a good business decision, because only 10% of the ideas that you try to validate are going to become real products.


This is a mistake that a lot of (DIY-ish) people diving into entrepreneurship make. Time is the most rigid resource we have and the number one asset. Just because you can build or maintain everything yourself (and developers, engineers etc. take great pride in that - and I get it, it's an awesome feeling) doesn't always mean that you should.


I recently discovered Pusher and was able to build and ship an entirely new feature in under a week. They have good documentation, thought provoking example apps and simple pricing.

Couldn't be more impressed with them. Keep it up!


Out of curiosity, if you were to implement the features of Pusher yourself, would it have added too much to your implementation? I guess you would need to set up your own Socket.IO server, and push data from that.

I've never implemented this, which I think is the reason I am underestimating the service that Pusher provides. It would be great if someone that did implement it would convince me that it would be a large hassle to do that internally.


I have come across Slanger[1] while researching for a good websocket server and it worked well for us. It's basically an open source version of Pusher backend. You may be interested in their code.

[1] https://github.com/stevegraham/slanger


I haven't used Pusher or Socket.IO, so I don't know specifically what they offer. However, I have made a simple unidirectional websocket-type server and it was not a terrible hassle.

With a go server on a GCE n1-standard-2 you should be able to handle over 100k connections per server. I haven't done any benchmarks for throughput, but using redis as the messaging backend you can do pretty well.

You take 100k connections on each of the endpoint servers and make a single redis connection, using redis pubsub to subscribe to channels. Publishers can connect to the same redis and publish to those channels.

If you wanted bi-directional it could work a similar way I think, I'd guess the specific architecture depends on your requirements.
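
A minimal sketch of the fan-out shape described above, for anyone trying to picture it. It is written in Python rather than Go, and the `Broker` class is a hypothetical in-memory stand-in for Redis pub/sub so the example runs without a Redis server. The class and method names are made up for illustration and are not any real client library's API.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, str], None]


class Broker:
    """In-memory stand-in for Redis pub/sub (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callback) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to every subscriber; return how many received it."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(channel, message)
        return len(callbacks)


class EndpointServer:
    """Holds many client connections but one broker subscription per channel."""

    def __init__(self, broker: Broker) -> None:
        self._broker = broker
        self._inboxes: Dict[str, List[List[str]]] = defaultdict(list)

    def connect(self, channel: str) -> List[str]:
        """Register a client on a channel; the returned list is its inbox."""
        if channel not in self._inboxes:
            # First local client on this channel: subscribe once upstream.
            self._broker.subscribe(channel, self._fan_out)
        inbox: List[str] = []
        self._inboxes[channel].append(inbox)
        return inbox

    def _fan_out(self, channel: str, message: str) -> None:
        # Copy the message to every client connected to this server.
        for inbox in self._inboxes[channel]:
            inbox.append(message)


broker = Broker()
server_a, server_b = EndpointServer(broker), EndpointServer(broker)
alice = server_a.connect("room:1")
bob = server_a.connect("room:1")
carol = server_b.connect("room:1")

subscriptions_hit = broker.publish("room:1", "hello")
print(subscriptions_hit, alice, bob, carol)
```

The point of the design is that the broker sees one subscriber per endpoint server, not one per client, so the broker's fan-out stays small while each server does the last-hop copy to its own connections.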


A single Socket.IO server is easy.

A scalable cluster of Socket.IO servers isn't.


Really interesting that Pusher didn't choke on that!

Also patting my own back for once coming up, on my own, with chat sessions keyed on sorted user IDs (for 2+ person chat). It felt like a nice way to do it; unsure if that breaks down at any point.
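
For reference, the sorted-ID idea might look something like the sketch below. The `chat:` prefix and the function name are assumptions for illustration; the article may key its sessions differently.

```python
def conversation_key(*user_ids: int) -> str:
    """Build one deterministic key for a set of participants.

    Sorting makes the key independent of who started the chat, so
    (7, 42) and (42, 7) land in the same session.
    """
    participants = sorted(set(user_ids))
    if len(participants) < 2:
        raise ValueError("a conversation needs at least two distinct users")
    return "chat:" + ":".join(str(uid) for uid in participants)


print(conversation_key(42, 7))       # chat:7:42
print(conversation_key(7, 42))       # chat:7:42
print(conversation_key(9, 42, 7))    # chat:7:9:42
```

One place it could get awkward: the key changes whenever a participant is added or removed, so a group chat with changing membership would map to a new session each time.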


So basically they used pusher to scale to around 50,000 concurrent users.


I'm making a real-time communication iOS app, and we're using Firebase[1] to quickly build its real-time features.

There seems to be a lot of hate on here towards using such services. Is it really a better option (flexibility, owning your data & price wise) to spend the time building your own system and having to hire engineers later on to manage/develop it?

[1] https://www.firebase.com


Hey Pusher guys; how much of that flow went through Haskell? Are you using it for your main service yet?


None of it sadly, though the application fetches its config from a haskell service.

The bits we are preparing for the main service are quite large (architecturally) but we hope to be beta testing them soon.


Make sure you keep blogging - I really enjoyed your last post on testing with Haskell!


What is the last post you are referring to? I couldn't find it.



Thanks, I'll pass it on to the author!


Step 1) Take investor's money

Step 2) Take $3MM off the table

Step 3) Profit

Step 4) Shut down


Why not use WebRTC for chat these days?


Secret was a mobile app. WebRTC is not supported on any version of iOS and is supported only on very recent Android (5.x).

edit: source: http://caniuse.com/#search=webrtc


That caniuse link refers to mobile safari. WebRTC in iOS is supported for native apps


Was not aware of the native support. Thanks.



