It's the Latency, Stupid

zhoutong · on April 27, 2011

As an Internet user in Singapore, I have very direct experience with Internet latency. On average, even an excellent Internet connection (Fibre 100 Mbit/s) in Singapore has about 240ms latency when communicating with a server on the US West Coast, or about 320ms on the East Coast.

Skype calls with these latencies are still acceptable. However, many websites use SSL without a cache, which makes the overhead time about 3 times longer (because of the many extra handshake round trips).

Unfortunately, Basecamp uses a lot of redirects (3 redirects to a Whiteboard). This means that I have to wait for 2-3 seconds for each page to start loading!
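A back-of-the-envelope estimate of that wait can be sketched as follows. The round-trip counts are assumptions, not measurements from this thread: one round trip for the TCP handshake, two for a full SSL/TLS handshake (as in TLS 1.2 and earlier), and one for each HTTP request and response. The function name and its parameters are made up for illustration.

```python
def time_to_first_byte_ms(rtt_ms, redirects, setup_rtts=3, reuse_connection=False):
    """Estimate the delay before the final page starts loading.

    setup_rtts: round trips needed to open a connection
                (assumed: 1 for TCP + 2 for a full SSL/TLS handshake).
    reuse_connection: if True, only the first hop pays the setup cost.
    """
    hops = redirects + 1  # every redirect adds one request/response pair
    setups = 1 if reuse_connection else hops
    return rtt_ms * (setups * setup_rtts + hops)


# 240 ms RTT (Singapore to US West Coast) and 3 redirects, as described above.
print(time_to_first_byte_ms(240, 3))                         # 3840
print(time_to_first_byte_ms(240, 3, reuse_connection=True))  # 1680
```

Under these assumptions the reported 2-3 seconds falls between the two bounds: worse than a fully reused connection, better than a fresh full handshake on every hop.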

I know light speed is not something we can change, but the key is, don't use SSL unnecessarily. Even if you have to use SSL, try to implement a cache to save users' time.

I believe that many Internet users outside the US are just like me and have started to hate HTTPS websites instinctively. To us, they are just much, much slower.

Singapore has one of the fastest Internet connections in the world, especially among English-speaking countries (US website visitors). But latency is always a very big problem that many people are not aware of. I'm surprised that this issue was mentioned as far back as 15 years ago!

Edit: spelling, expression.

beder · on April 27, 2011

Slightly off-topic, but what is a cache for SSL? That is, what exactly is being cached, and who's doing it?

I ask because I'm writing a little API (both server/client) that uses SSL, and just connecting can be brutally long - so I might be doing something wrong here.

zhoutong · on April 27, 2011

Well, I think I may have used an inaccurate term. The cache for SSL has two parts:

1. You need to reuse the session, instead of forcing the client to do the full handshake every time.

2. You can consider setting a high Keep-Alive value to improve the performance by avoiding long overhead time.
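The effect of the second point can be sketched with a small local experiment. To stay self-contained it uses plain HTTP on the loopback interface rather than SSL, so it only counts TCP connections; over SSL every extra connection would also pay for a handshake. `CountingHandler`, `fetch` and `demo` are names invented for this sketch.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

connections_opened = 0  # incremented once per accepted TCP connection


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 connections are persistent by default

    def setup(self):
        global connections_opened
        connections_opened += 1  # setup() runs once per connection, not per request
        super().setup()

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(port, paths, reuse):
    """Fetch every path; return how many TCP connections the server accepted."""
    before = connections_opened
    conn = None
    for path in paths:
        if conn is None:
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        conn.getresponse().read()  # drain the body so the socket can be reused
        if not reuse:
            conn.close()
            conn = None
    if conn is not None:
        conn.close()
    return connections_opened - before


def demo():
    """Return (connections without reuse, connections with keep-alive)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        paths = ["/a", "/b", "/c", "/d"]
        return fetch(port, paths, reuse=False), fetch(port, paths, reuse=True)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` should show four connections for four requests without reuse and a single connection with keep-alive; on a 240ms link each avoided connection saves at least one round trip, and more when an SSL handshake is involved.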

pornel · on April 27, 2011

In addition to reuse of SSL sessions, files downloaded over HTTPS can be cacheable as well — you just need to add:

    Cache-Control: max-age=<number of seconds>

This way you'll eliminate the round trip needed for re-download or cache validation.
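As a rough illustration of what that header buys, here is a simplified freshness check. It only looks at `max-age` and deliberately ignores other directives (`no-cache`, `no-store`, `Expires` and so on), so it is a sketch of the idea rather than a complete HTTP cache. Both function names are invented for this example.

```python
import re


def max_age_seconds(cache_control):
    """Return the max-age value of a Cache-Control header, or None if absent."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def needs_round_trip(cache_control, age_seconds):
    """True if a cached copy of this age must be revalidated or downloaded again."""
    max_age = max_age_seconds(cache_control)
    return max_age is None or age_seconds >= max_age


# A file cached 10 minutes ago with a one-day lifetime costs no round trip.
print(needs_round_trip("public, max-age=86400", 600))  # False
# Without max-age, this simplified model always goes back to the server.
print(needs_round_trip("", 600))                       # True
```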

JoeAltmaier · on April 27, 2011

Infiniband was a solution to the latency issue (looking for a problem). It was supposed to provide low-latency data transfer at infinite bandwidth.

I was part of one of the 1st generation solutions. Everybody was attempting to graft infiniband onto existing IP stacks - which totally blew the latency promise. Every day I had to fight our management about this.

The solution of the day was to provide 'virtual adapter' hardware directly mapped into the application memory space, so transfer latency skipped the kernel switch, memory copy and interrupt processing delays. Which required significant changes to the kernel. Which was easy on Windows, hard on Linux and impossible under SunOS.

bartonfink · on April 27, 2011

Kernel changes easy on Windows and hard on Linux? Can you explain a little more? I would have guessed Windows would fall in the impossible category and Linux would be relatively easy.

JoeAltmaier · on April 27, 2011

There is mature installable driver support in Windows. Things like mappings of locked frames to user space that persist after a kernel call were just a flag on Windows; on Linux, the API/kernel changes had to be written.

bartonfink · on April 27, 2011

Interesting - thanks for the explanation. I didn't know that Windows' driver support was that strong.

scott_s · on April 27, 2011

This issue also matters within the computer - the bandwidth between main memory and a GPU connected by PCI Express is close to the bandwidth between main memory and the CPU. (On one machine I used for experiments PCI Express was 8 GB/s while the front-side bus was 12 GB/s.)

But the difference in latency is so big that you have to use different memory transfer strategies when using the GPU for computations than when using the CPU.
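A simple cost model shows why the strategy has to change. The 8 GB/s figure is the PCI Express bandwidth quoted above; the 10 microsecond per-transfer latency is a purely hypothetical placeholder, not a measured value.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_gb_per_s):
    """Time for one transfer: a fixed latency plus size divided by bandwidth."""
    bytes_per_us = bandwidth_gb_per_s * 1e9 / 1e6
    return latency_us + size_bytes / bytes_per_us


PCIE_GB_PER_S = 8.0        # bandwidth quoted in the comment above
ASSUMED_LATENCY_US = 10.0  # hypothetical per-transfer setup cost

# Moving the same 4 MB as 1000 small copies versus one batched copy.
many_small = 1000 * transfer_time_us(4096, ASSUMED_LATENCY_US, PCIE_GB_PER_S)
one_large = transfer_time_us(1000 * 4096, ASSUMED_LATENCY_US, PCIE_GB_PER_S)
print(f"1000 small copies: {many_small:.0f} us, one large copy: {one_large:.0f} us")
```

With these assumed numbers the small copies are dominated by latency, which is why batching data into few large transfers is the usual approach for GPU work.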

sonoffett · on April 27, 2011

Interestingly, this is still relevant today when doing large data transfers between hosts with long RTTs using TCP: the typical TCP congestion control implementations (e.g. bic (http://en.wikipedia.org/wiki/BIC_TCP)) use ACKs to update their window size (which come back one RTT later), and hence their windows grow much more slowly than they would between two hosts with the same capacity but a smaller RTT.

Cubic (http://en.wikipedia.org/wiki/CUBIC_TCP) and htcp (http://smakd.potaroo.net/ietf/all-ids/draft-leith-tcp-htcp-0...) are two congestion control methods which avoid this by not growing the congestion window as a function of the RTT (and are recommended if doing large data transfers across high-capacity links with high RTT). On Linux you can typically check your TCP congestion control algorithm with: "sysctl net.ipv4.tcp_congestion_control".
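To put numbers on why the window matters on long paths, here is a small sketch. The helper names are invented; the 100 Mbit/s and 240ms figures come from earlier in this thread, and 65,535 bytes is the largest TCP window possible without window scaling. The last helper reads the same setting as the sysctl command and returns None where the file does not exist (for example on non-Linux systems).

```python
from pathlib import Path


def bandwidth_delay_product_bytes(link_mbit_per_s, rtt_ms):
    """Bytes that must be in flight to keep the link full."""
    return link_mbit_per_s * 1e6 / 8 * rtt_ms / 1e3


def window_limited_mbit_per_s(window_bytes, rtt_ms):
    """Throughput ceiling when at most one window is sent per round trip."""
    return window_bytes * 8 / (rtt_ms / 1e3) / 1e6


def current_congestion_control():
    """Return the Linux congestion control algorithm name, or None if unavailable."""
    path = Path("/proc/sys/net/ipv4/tcp_congestion_control")
    try:
        return path.read_text().strip()
    except OSError:
        return None


print(bandwidth_delay_product_bytes(100, 240))  # bytes in flight needed
print(window_limited_mbit_per_s(65535, 240))    # Mbit/s with a 64 KB window
print(current_congestion_control())
```

On a 100 Mbit/s link with a 240ms RTT the pipe holds about 3 MB, while a 65,535-byte window caps throughput at roughly 2.2 Mbit/s, so how quickly the window grows decides how much of the link is actually used.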

jrockway · on April 27, 2011

Seems that the default is "cubic", so users of a modern Linux are OK.

sliverstorm · on April 27, 2011

On a purely superficial level, I am amused to see a call for more inline compression and a call for lower latency in the same article.

dexen · on April 27, 2011

To the benevolent downvoters of parent post: please explain what is your point?

Don't we agree that compression requires processing input data stream in a window, which introduces extra latency?

(of course, overall transfer time may be lowered via compression, and thus compound latency of whole application can be lowered, but that's another matter -- and sometimes is not predictable without looking at a sizable chunk of the input stream)
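The buffering effect can be seen with a streaming compressor. This sketch uses Python's zlib as a stand-in for an inline compressor (an assumption; modem compression used other algorithms) and compares two policies: letting the compressor buffer input until it decides to emit a block, and forcing output after every chunk with `Z_SYNC_FLUSH`. The function name `stream` is made up for this example.

```python
import zlib


def stream(chunks, flush_each):
    """Compress chunks one at a time; return the bytes emitted after each chunk."""
    comp = zlib.compressobj()
    emitted = []
    for chunk in chunks:
        out = comp.compress(chunk)
        if flush_each:
            out += comp.flush(zlib.Z_SYNC_FLUSH)  # push out everything seen so far
        emitted.append(out)
    emitted.append(comp.flush())  # final flush ends the stream
    return emitted


chunks = [b"keystroke %d\n" % i for i in range(10)]
buffered = stream(chunks, flush_each=False)
flushed = stream(chunks, flush_each=True)

# How many bytes left the compressor after each small chunk:
print([len(piece) for piece in buffered])
print([len(piece) for piece in flushed])
```

Without flushing, small inputs sit in the compressor's buffer and the receiver cannot decode them until much later. Flushing after every chunk removes that delay but makes the total output larger, which is the latency versus compression ratio trade-off under discussion.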

asynchronous13 · on April 27, 2011

=== please explain what is your point? ===

Quoth the article: "In fact, since most images and sounds on Web pages are compressed already, the modem's attempts to compress the data a second time is futile, and just adds more latency without giving any benefit."

The point is that the article itself addresses this point.

=== Don't we agree that compression requires processing input data stream in a window, which introduces extra latency? ===

No, we don't agree. Your point is correct for a specific use case of compression. However, the article describes using compression such as JPEG for images and MPEG for videos. Typically, images are not stored in raw form and compressed to JPEG only when requested (which would introduce latency, as you suggest). Rather, images are compressed in advance and stored in their compressed form. This use case of compression does not introduce extra latency.

trout · on April 27, 2011

The other side effect of this is that some modern applications are significantly hindered by the speed of light. If you start to look at doing globally distributed storage, speed of light alone between two locations can be in the hundreds of milliseconds. I had not seen the 33% slowdown in fiber, which could make that worse.

I believe for voice the human ear can't detect 20ms differences, but when you talk about video I have not seen numbers. Many HD video applications require latency under 100ms for the interaction to feel natural, and we currently lack a better solution for this. Example: HD video between Shanghai and Buenos Aires would be over 150ms in delay with just the speed of light and the fiber delay.
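The Shanghai to Buenos Aires figure can be checked with a quick estimate. The city coordinates below are approximate, the path is assumed to follow the great circle (real cable routes are longer), and light in fiber is assumed to travel at about two-thirds of its vacuum speed.

```python
import math

C_KM_PER_S = 299_792.458
FIBER_FACTOR = 2 / 3  # assumption: light in fiber travels at about 2/3 of c


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def one_way_delay_ms(distance_km, speed_factor=1.0):
    """Propagation delay at speed_factor times the speed of light."""
    return distance_km / (C_KM_PER_S * speed_factor) * 1e3


# Approximate coordinates: Shanghai (31.23 N, 121.47 E), Buenos Aires (34.60 S, 58.38 W).
distance = great_circle_km(31.23, 121.47, -34.60, -58.38)
print(f"{distance:.0f} km")
print(f"vacuum: {one_way_delay_ms(distance):.0f} ms one way")
print(f"fiber:  {one_way_delay_ms(distance, FIBER_FACTOR):.0f} ms one way")
```

The two cities are almost antipodal, so the distance comes out at roughly 19,600 km: about 65ms one way in a vacuum and close to 100ms one way in fiber, before any routing or equipment delay is added.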

JoeAltmaier · on April 27, 2011

TCP latency is problematic, so at Sococo we do our own protocols on top of UDP. Different problems, sure, but latency is not one of them. And with voice it's important to keep working on the latency issue.

vog · on April 27, 2011

I don't understand why fault-tolerant realtime stuff (like voice calls or live video streams) has ever been done via TCP in the first place.

Sure, TCP-only clients such as web browsers might have contributed to the problem. But from a technical point of view, those applications are exactly the use cases UDP has been designed for.
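The reason datagrams fit realtime media is that a late frame is worthless: the receiver would rather play silence than wait for a retransmission. A minimal, purely illustrative playout buffer (not the protocol of any product mentioned here; `PlayoutBuffer` is an invented name) might look like this:

```python
class PlayoutBuffer:
    """Plays frames in sequence order; late frames are dropped, never waited for."""

    def __init__(self):
        self.next_seq = 0  # sequence number due at the next playout tick
        self.pending = {}  # frames that arrived early, keyed by sequence number

    def receive(self, seq, frame):
        """Store an arriving frame; return False if it arrived too late to play."""
        if seq < self.next_seq:
            return False
        self.pending[seq] = frame
        return True

    def play_next(self):
        """Return the frame due now, or None (play silence) if it has not arrived."""
        frame = self.pending.pop(self.next_seq, None)
        self.next_seq += 1
        return frame


buf = PlayoutBuffer()
buf.receive(0, b"frame-0")
buf.receive(2, b"frame-2")        # frame 1 was lost or delayed
print(buf.play_next())            # b'frame-0'
print(buf.play_next())            # None: a gap, but playback keeps going
print(buf.receive(1, b"frame-1")) # False: too late, discarded
print(buf.play_next())            # b'frame-2'
```

TCP would instead hold back frame 2 until frame 1 had been retransmitted, turning one lost packet into a stall for everything behind it.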

I guess the next step after web sockets (http://www.w3.org/TR/websockets/) will be a UDP-variant of web sockets, enabling web clients to do what native clients have always been able to do: basic networking stuff via TCP and UDP. Alternatively, a UDP variant of HTTP might emerge.

mleonhard · on April 28, 2011

SIP session management messages are like HTTP over UDP.

rodh257 · on April 27, 2011

Very relevant to mobile applications/websites with 3G connectivity. I recall seeing an interesting talk from Google about how the Google Maps team optimised the latest Maps API to perform better by, among other enhancements, sending larger tiles less frequently.

Just found a video that talks about this here: http://www.youtube.com/watch?v=Rcvx5QHTJ5U

johnzabroski · on April 27, 2011

This is very true, but the flipside is that the patterns of messages over the network matter, too. Improperly designed communication protocols result in propagation delay, which is different from latency. Many real-world applications indirectly suffer from propagation delay, and many of the largest sites have tricks for minimizing delay, and some large companies even want older protocols abandoned due to their inefficient communication.

The reason LANs generally haven't cared about propagation delay is that they are not using networks already operating near the ultimate bottleneck: the speed of light. Some college campuses I know of have less bandwidth than an iPhone. But for Internet-scale computing, propagation delay is evident.

hedgehog · on April 27, 2011

This guy wrote one of my all-time favorite papers for its elegant solution to a common problem:

http://www.stuartcheshire.org/papers/COBSforSIGCOMM.ps

wmf · on April 27, 2011

Except that problem shouldn't exist; IMO byte stuffing is evil and all protocols should use byte counting.
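For readers who have not met the two approaches: the linked paper describes Consistent Overhead Byte Stuffing (COBS), which re-encodes a packet so that it contains no zero bytes and a zero can then mark the end of a frame. Byte counting instead prefixes each frame with its length. The code below is an illustrative sketch written for this thread, not taken from the paper, and the 4-byte big-endian length prefix is an arbitrary choice.

```python
import struct


def cobs_encode(data: bytes) -> bytes:
    """Byte stuffing: re-encode data so that it contains no zero bytes."""
    out = bytearray()
    block = bytearray()  # non-zero bytes collected since the last code byte
    for byte in data:
        if byte == 0:
            out.append(len(block) + 1)  # code byte: offset to the removed zero
            out += block
            block.clear()
        else:
            block.append(byte)
            if len(block) == 254:  # longest run a single code byte can describe
                out.append(0xFF)
                out += block
                block.clear()
    out.append(len(block) + 1)
    out += block
    return bytes(out)


def cobs_decode(encoded: bytes) -> bytes:
    """Inverse of cobs_encode."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        block = encoded[i + 1 : i + code]
        if code == 0 or len(block) != code - 1:
            raise ValueError("malformed COBS data")
        out += block
        i += code
        if code != 0xFF and i < len(encoded):
            out.append(0)  # restore the zero this code byte stood for
    return bytes(out)


def frame_by_stuffing(payload: bytes) -> bytes:
    """Frame ends at the first zero byte."""
    return cobs_encode(payload) + b"\x00"


def frame_by_counting(payload: bytes) -> bytes:
    """Frame starts with its own length (4 bytes, big-endian)."""
    return struct.pack("!I", len(payload)) + payload


def split_counted(stream: bytes):
    """Split one length-prefixed frame off the front of stream."""
    (length,) = struct.unpack("!I", stream[:4])
    return stream[4 : 4 + length], stream[4 + length :]
```

The trade-off is roughly this: with stuffing a receiver can resynchronise at the next zero byte after corruption, while with counting nothing needs re-encoding but a damaged length field desynchronises the stream.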

IgorPartola · on April 27, 2011

I tried Clear at one point, thinking I could cut the cord with Time Warner that way. Wi-Max sounds cool on paper, but the latency to the tower was around 30-50ms. This sucks. Browsing the web was not bad, but SSH was a nightmare. Moreover, the latency was inconsistent: 80ms to google.com at one time, 225ms at another. When I talked to the sales rep who sold me the plan, he was aware that he had a "ping" in his demo software, but had no idea what it meant. For anyone that wants to try it out: beware their cancellation fees.

apenwarr · on April 27, 2011

30-50ms isn't nearly enough latency to make ssh suck. Even 225ms is fine for ssh. You almost certainly were suffering from high packet loss, which means you get spikes of super high "latency" in TCP (because after a couple of packet losses, it holds off retransmitting until some time much much longer than your ping time, so you experience multi-second delays when typing).

Ironically, connections that are sending bulk data (eg. downloads) don't run into this, because a lot of packets get sent at once; if one of them in the middle gets dropped, the recipient knows it right away (because the next one has the wrong sequence number) so it can immediately send an ACK that re-requests the missing one. So on a busy connection, packet loss isn't as big a problem. On an interactive connection like ssh, it's completely deadly and makes it almost unusable.
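The multi-second stalls follow from how the retransmission timeout (RTO) backs off. As a hedged sketch: assume an initial RTO of 1 second (the minimum RFC 6298 recommends; real stacks often use other values) that doubles after every consecutive loss of the same segment.

```python
def stall_seconds(initial_rto_s, consecutive_losses):
    """Total time spent waiting on timers when the RTO doubles after each loss."""
    return sum(initial_rto_s * 2 ** i for i in range(consecutive_losses))


for losses in (1, 2, 3):
    print(losses, "loss(es):", stall_seconds(1.0, losses), "s")
```

A keystroke whose packet is lost three times in a row waits 1 + 2 + 4 = 7 seconds under these assumptions, regardless of whether the ping time is 50ms or 225ms. A bulk transfer usually recovers within about one round trip, because the packets that follow the lost one trigger duplicate ACKs.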

All that to say: your problem with Wimax is fixable. It's not a latency problem, but nobody has fixed it yet.

IgorPartola · on April 27, 2011

Thanks for the reply. Just one clarification: the 30-50ms ping was just to the tower, not to the remote host. It was taking the already slowish ping of 80ms and adding another 30-50ms to it. I think this makes ssh suck since I can see it: 100ms latency == 10 frames per second.

squeed · on April 27, 2011

What you were seeing was the dreaded Buffer Bloat. It's really quite common nowadays. Memory is so unbelievably cheap, so device manufacturers all up and down the ISO stack are putting too much memory in their transceiver buffers.

Why is this bad? Because it screws with TCP window adjustments. Remember that your OS will send packets until it starts to see drops, then back off of the transmit rate. You can adjust the parameters (startup rate, backoff rate), but that's the basic principle. However, when there is an overly large buffer somewhere in the middle of the network, this delays drops. This has the effect of making it nearly impossible for your TCP stack to determine the true bandwidth.
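How much delay a full buffer adds is simple arithmetic: the buffer has to drain at the link rate. The buffer size and link speeds below are hypothetical examples, not figures from any particular device.

```python
def queue_delay_ms(buffer_bytes, link_mbit_per_s):
    """Time for a full buffer to drain at the given link rate."""
    return buffer_bytes * 8 / (link_mbit_per_s * 1e6) * 1e3


BUFFER_BYTES = 256 * 1024  # hypothetical 256 KiB device buffer

print(f"{queue_delay_ms(BUFFER_BYTES, 100):.1f} ms at 100 Mbit/s")
print(f"{queue_delay_ms(BUFFER_BYTES, 1):.1f} ms at 1 Mbit/s")
```

The same buffer that is harmless on a fast link adds about two seconds of queueing delay on a 1 Mbit/s uplink, and TCP only sees a drop once that queue has filled.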

For some reason, most mobile device manufacturers are the worst offenders when it comes to bufferbloat.

Anyways, this was a brief summary. For a much better explanation (including pretty graphs), see http://www.bufferbloat.net/

mattgreenrocks · on April 27, 2011

Low latency + adequate bandwidth + well-designed sites make for a pleasant web browsing experience. If any one of those isn't present, it somehow seems less professional. Plus, the time between clicking a link and the next page coming up is hugely important if the visitor is on a nice connection. I feel like a lot of sites with minimalist layouts (esp. blogs) score big in this area with lower-end computer users.