I appreciate these war stories more than the "look at this great new thing that will take over the world" posts (those have a place as well). We need more war stories in this industry, because everything has pros and cons, and our job as software engineers is to make decisions based on limited information. Case studies are a great way to glean real-world experience from others without having to implement every new technology yourself just to form high-level judgments about it.
This article shouldn't say to you: "See, Go is BAD, Python is GOOD!" It should say, "That's an interesting case study. If I'm working on a project that involves lots of sockets and concurrency, I'll want to take what they said into account when I'm making technology decisions."
I should reach out to our team that took a Python/Twisted service dealing with sockets and lots of concurrency and ported it to Go, and see if they would put together a similar presentation. Our case is a bit different, but we saw over a 130x improvement in throughput going to Go. While they were in there, they also improved monitoring, stability, and maintainability. More case studies to help others make informed choices. Sending that email now :)
I should note, we don't care about throughput for the most part. Our constraint is purely the memory used by holding the connections open. The aim is to hold as many connections as possible within 10-20% of the machine's RAM, and not exceed it. As such, we need to be careful about resource usage and spikes.
Goroutines feel cheap, but if you're holding 140k connections and just 20k of them do something that spins up a goroutine each... you can easily exceed the memory constraint. As such, we had to put goroutine pools in place, with careful select statements on the sending side so that connections couldn't overwhelm external resources, etc. It was a huge pain. It has been drastically easier to control resource usage under these constraints with python/twisted.
YMMV, of course; this is just our experience. Part of the reason for putting it out there is that there are already many people who have talked/blogged about going from Python -> Go. I thought maybe the world could handle just one story about going the other direction.
Typically, if you wish to limit the number of goroutines, you spawn N workers and have them read from a single channel. If 20k of your incoming connections want to do something, they send on the channel instead of spawning a goroutine themselves.
Yep, this is what I meant by 'goroutine pools'. The select statements were on the sending side, to ensure that if the feed channel was full we wouldn't retain too much additional state. It works, but at that point it's starting to look like an async event-loop with a thread-pool...
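(To make that comparison concrete, here's a minimal sketch of the same bounded-pool shape in Python. I'm using asyncio rather than twisted for brevity, and handle() is just a stand-in for the per-job work; the bounded queue plays the role of the Go select-with-default on a full channel.)

    import asyncio

    async def worker(queue: asyncio.Queue) -> None:
        # N of these run forever; total concurrency is capped at N.
        while True:
            job = await queue.get()
            try:
                await handle(job)
            finally:
                queue.task_done()

    async def handle(job) -> None:
        # Stand-in for the real per-connection work.
        await asyncio.sleep(0)

    def submit(queue: asyncio.Queue, job) -> bool:
        # A full queue means "shed load" rather than buffering more
        # state -- the moral equivalent of a Go select with a default
        # branch on a full channel.
        try:
            queue.put_nowait(job)
            return True
        except asyncio.QueueFull:
            return False

    async def main() -> None:
        queue: asyncio.Queue = asyncio.Queue(maxsize=100)
        workers = [asyncio.create_task(worker(queue)) for _ in range(10)]
        accepted = sum(submit(queue, job) for job in range(1000))
        await queue.join()  # wait for the jobs that were accepted
        for w in workers:
            w.cancel()
        print(f"processed {accepted} of 1000 jobs")

    asyncio.run(main())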
Not exactly related to Go/PyPy, but I'm curious whether you can say something about how you handle memory and bandwidth constraints?
E.g. what do you do if you want to send notifications to lots of clients but some of the connections are very slow (you would probably need to buffer the data)? Do you have hard limits on how much data can be buffered before you close the connection? End-to-end backpressure (for which channels are quite good) doesn't seem like the best option for 1:N broadcasts, because then the slowest receiver slows down all the others.
And what do you do with connections that are sending you lots of (probably unexpected) data? Stop reading from that socket?
We're using twisted, but I believe Python 3's asyncio has a similar feature for non-blocking sockets: you can add a hook that gets triggered when too much data accumulates in user space (i.e., it can't be flushed to the kernel's TCP buffer).
In our case, when notifications buffer up for a slow client, this API gets triggered and we mark the client connection as 'paused'. Until more data gets through to the client and clears that state, notifications go to the database instead, and the client connection just carries a flag telling it to check the db once the pending data has been retrieved.
We do a similar thing on the receiving end, pausing reads off the socket if we're already doing more concurrent work on behalf of the client than desired.
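(For the curious, here's roughly what that hook looks like in twisted. This is a minimal sketch, not our actual code; the class name is made up, and the database-fallback logic described above is reduced to flipping a flag. Registering the protocol as a push producer makes the transport call pauseProducing() once its write buffer passes the high-water mark, and resumeProducing() once it drains.)

    from twisted.internet.interfaces import IPushProducer
    from twisted.internet.protocol import Protocol
    from zope.interface import implementer

    @implementer(IPushProducer)
    class NotificationProtocol(Protocol):
        def connectionMade(self):
            self.paused = False
            # Register as a streaming (push) producer so the transport
            # calls pauseProducing()/resumeProducing() on buffer pressure.
            self.transport.registerProducer(self, True)

        def pauseProducing(self):
            # Too much unflushed data buffered for this client; new
            # notifications should be diverted (e.g. to the database).
            self.paused = True

        def resumeProducing(self):
            # The buffer drained; deliver notifications directly again.
            self.paused = False

        def stopProducing(self):
            self.paused = True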
This post reminds me of another post I recently saw on HN, in which the author (someone with an Erlang background) lays out all sorts of reasons why he chose Ruby for a highly concurrent application that launches lots of (heavyweight) threads. Upon seeing the link on HN, my first thought was, Ruby!!?? But then I read the post and the reasons were all very sensible and practical-minded, so in that case Ruby was arguably a much better choice than Erlang, Go, Scala, Rust, etc. for a highly concurrent application.
Just wanted to share my own very small case study. I had a homework assignment to build a polite crawler. I initially built it in Python, and it was awfully slow. I rewrote the same thing in Go, and it turned out to be very fast (at least 10x, IIRC). I liked how quickly I was able to write something (with a not-so-shabby design) in Go despite having much less experience with it. Go is definitely awesome for writing concurrent code quickly. It's not a big industry story, but as a busy student I still feel great about using Go, because we had to use the same crawler for doing other stuff, for which a fast crawler was really handy and saved me hours.
The problems I observed with Go were that its regex engine seemed slower than Python's, and memory usage was way higher. I had to add some explicit GC requests.
I thought it would be IO-bound (that's why I started with Python in the first place), but since I was extracting links as well and doing a bit of work on the graph, it turned out to be more CPU-intensive. Then again, maybe I could have written better code, used better libraries, or tried multiprocessing (though that would have been painful). I admit I didn't look much into how I could improve it within Python. I just went with Go because it was quicker that way for me.
Well... extracting links etc. is super fast with lxml's xpath.
It's written in C, and I don't think writing your own parser would be faster.
For example, to extract links from the Hacker News homepage, you would just do:

    xpath('//tr/td[@class="title"]/a/@href')
This will be really fast. You can do it even faster with a more specific xpath. I extracted about 10k links a second from documents this way and was still network bound. Usually you are primarily limited by websites throttling you.
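(A minimal, self-contained version of that, in case anyone wants to try it; note the xpath assumes HN's markup at the time of writing.)

    from urllib.request import urlopen

    import lxml.html

    # lxml parses with a C-backed parser, so this is fast even for
    # large documents.
    html = urlopen('https://news.ycombinator.com/').read()
    doc = lxml.html.fromstring(html)
    for href in doc.xpath('//tr/td[@class="title"]/a/@href'):
        print(href)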
I was using beautifulsoup with the lxml backend, I believe. I should have mentioned that earlier. There was some other graph-manipulation stuff too, like favoring links with more inlinks and keeping the crawler polite but still busy by looking at other domains. That's more expensive than just extracting links, I guess. I had a submission deadline, and whatever I tried in that time with Python didn't work. It was just easier to write faster code in Go (except maybe where regexes are involved; now I remember I used some Go markup parser instead, which is now in their library).