TCP Sucks (bramcohen.com)
137 points by reacocard on May 7, 2012 | 56 comments



Designing networking protocols that are robust in a mathematical sense is unbelievably difficult. In fact, we humans have only found optimal solutions in a few cases if you dig through the mathematics literature. Many real-world networking protocol design scenarios do not have a known non-pathological implementation. Furthermore, there is a large number of decentralized protocol designs that we can prove to have many poor qualities. To bludgeon the equine, people who can significantly advance our understanding of such things tend to win Nobel prizes and similar. It is that difficult.

That said, TCP is not the best we can design given everything we know about designing network protocols. It was good enough for the people that designed it at the time, and possibly (my chronology is fuzzy) was approximately as good as the mathematics would have reasonably allowed when it was developed. We can make it work well enough in many cases -- the economics of inertia. Other narrow use cases are better solved differently but are not general solutions.

It is one of those problems that sounds like it should be easy to solve on the surface but turns into a bloody epic challenge once you start to dig into it. I am not offering a solution, just noting that very few people can.


Hold on a minute

> possibly (my chronology is fuzzy) was approximately as good as the mathematics would have reasonably allowed

Just because Van Jacobson's papers spew forth great volumes of mathematics doesn't mean there is any robust mathematics behind TCP. Read to the end of his paper "Congestion Avoidance and Control". Read past all of the impressive plots, graphs, and equations. Read to the conclusions.

"The 1-packet increase has less justification than the 0.5 decrease. In fact, it’s almost certainly too large."

This statement shows how little formal consideration went into the entire algorithm. The 1-packet increase is not simply too big or too small, it just doesn't make any sense. For starters, how big is the packet? Oh, it isn't defined anywhere. Even if we just go with the de facto internet packet size of 1542 bytes (you know, the old limit for 10Mbit Ethernet)...

Could that one-packet increase per round trip make equal sense for a 10Gbit path to India with a 400ms round trip time as it does for a 56Kb link between Berkeley and MIT (his test case)? Of course it doesn't. And it gives the lie to any notion that there is a formal underpinning for TCP. They tweaked it until it worked, and then put on a nice mathematical show to feel better about it.

Quoth Van Jacobson: "We have simulated the above algorithm and it appears to perform well". Oh now I feel better.

Second point, which Bram is covering: TCP makes the assumption that router queue lengths are reasonable. TCP says, fill up the router queues until they drop packets. But router queues have been getting longer and longer as memory gets cheaper. These queues can create additional seconds of delay to layer on top of the 10ms-400ms speed of light delays we see on the internet itself.

EDIT: In that 10Gb to India example, it takes TCP literally DAYS to fill up the pipe because of that "1 packet per roundtrip" window increase. Days, by the way, of no incidental packet loss, because it all gets reset on a loss.
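To put a rough number on that ramp-up, here is a back-of-the-envelope sketch. It assumes pure additive increase of one packet per round trip, a 1500-byte packet, a window starting at one packet, and no losses along the way; the function name and the 100 ms RTT used for the 56 kbit/s link are made up for illustration.

```python
def ramp_time_seconds(link_bps, rtt_s, packet_bytes=1500, start_window=1):
    """Seconds for a +1 packet/RTT window to reach the bandwidth-delay product.

    Assumes no loss and no slow start (both simplifications).
    """
    # Packets that must be in flight to keep the pipe full.
    bdp_packets = link_bps * rtt_s / (8 * packet_bytes)
    # One extra packet per round trip, so one RTT per missing packet.
    rtts_needed = max(0.0, bdp_packets - start_window)
    return rtts_needed * rtt_s


india = ramp_time_seconds(10e9, 0.400)   # 10 Gbit/s, 400 ms RTT
modem = ramp_time_seconds(56e3, 0.100)   # 56 kbit/s, hypothetical 100 ms RTT

print(f"10 Gbit/s path: {india / 3600:.1f} hours to fill the pipe")
print(f"56 kbit/s link: {modem:.2f} seconds to fill the pipe")
```

Under these assumptions the fast path needs hundreds of thousands of loss-free round trips, which works out to well over a day, while the slow link is already full with a single packet in flight. Restarting from half the window after a loss would roughly halve the figure.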

EDIT: I spent 5 years of my life working on the fact that latency was never really factored into the design of most network protocols.


Did you even read the article?


One thing you have to realize about Hacker News is that people can accumulate pretty good karma points for postings that have a definitive/authoritative tone blended with some truisms. Not to bludgeon the equine or anything, but I think people find them comforting.


I don't think his comment in any way contradicts your post. I thought it added to it; his point is that not only are good protocols hard in practice (which you covered), they are theoretically hard (which you did not cover, hence adding to your point).


For people who don't know what the parent is talking about:

Take a simple example.

http://en.wikipedia.org/wiki/Two_Generals_Problem

My summary:

The two generals problem proves that, if there exists any nonzero probability of packet loss, two people cannot even coordinate to both have a state 1 at sunrise tomorrow (attack!!!) or both have the state 0 if it is not 100% mathematically guaranteed that they both believe this has been coordinated (since an uncoordinated attack will be a catastrophic loss for them).

(In other words, the guarantee must be such that by sunrise tomorrow state 0-1 or 1-0, in other words one general thinking the attack has been coordinated with certitude and attacks, but the other general thinking the attack has not been confirmed with certitude and does not attack, must be a mathematical impossibility.)

Take a simple approach. The following packets are all encrypted, but any or all may be lost.

1) first general sends: "Let's both be in state 1 tomorrow (coordinated attack). Since an uncoordinated attack is so catastrophic to us, I will only enter state 1 if I receive your reply. Please include the random number 25984357892 in your reply. As soon as I get this the attack is ON. If I don't get such a packet within the hour I will assume this post was intercepted (lost), and I will send another. I will remain in state 0 until I receive that packet."

2) second general sends: "Got your packet with 25984357892. This is my acknowledgment! I will attack as well. In case you don't get this, I know you won't attack thinking I didn't get your message, so I am sending this message continuously."

Great. But what if all messages from the second to the first are intercepted? Now the first thinks all of HIS were intercepted (has received no acks) and doesn't attack, but the second one does. Failure.

So, we have to emend 2) to:

2) second general sends: "Got your packet with 25984357892. This is my acknowledgment! I will attack as well. In case you don't get this, I know you won't attack thinking I didn't get your message, so I am sending this message continuously. In case you don't get any of THESE messages, however, I will not attack. Therefore acknowledge ANY of them with random number 458972984323..."

Oops. What if all the first general's acks of the acks are intercepted or lost? (Perhaps the first general is able to send messages until receiving (2), but just as the first general gets (2), conditions change and the general no longer has any of his messages delivered.)

Now the first general thinks he has acknowledged the ack, but the second general doesn't even know if his ack-(cum-request-for-an-ack-back) message was even delivered...

and so it goes...

Of course, in practice you can simply say: "Let's do this for a certain number of acks of acks of acks, 3 let's say, and then just keep sending the same ack to each other, assuming that if the connection was reliable enough to get to three deep, then it will be reliable enough for one of the final acks to make it through." That's a false assumption (mathematically - what guarantee do you have that if 3 of your encrypted messages made it across, at least one of the next 217 that you send by sunrise all with the same message will), but a reasonable one.

So it is not a practical problem. This is a mathematical problem. Although you cannot even do something as simple as "let's agree to both be in state 1 (or neither if we fail to agree), OK?" over a less than guaranteed reliable connection, if the connection has any reliability at all you can get to within a practical level.
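The argument above can be sketched numerically. This is a simplified model, not a proof: it assumes a fixed chain of `depth` messages alternating between the generals, each sent only if the previous one arrived, independent loss with probability `loss`, and a final message repeated `final_copies` times. The sender of the final message commits to attack on sending it; the receiver attacks only if at least one copy arrives. All names here are made up for illustration.

```python
def outcome_probabilities(depth, loss, final_copies=1):
    """Return (both_attack, split, neither) for a fixed-depth ack chain."""
    deliver = 1.0 - loss
    # Every message before the last one got through.
    reach_last = deliver ** (depth - 1)
    # The final message fails only if every copy of it is lost.
    last_lost = loss ** final_copies
    both_attack = reach_last * (1.0 - last_lost)
    split = reach_last * last_lost      # one general attacks alone
    neither = 1.0 - reach_last          # the chain broke earlier
    return both_attack, split, neither


for copies in (1, 10, 217):
    both, split, neither = outcome_probabilities(3, 0.5, final_copies=copies)
    print(f"copies={copies}: both={both:.3g} split={split:.3g} neither={neither:.3g}")
```

Repeating the last ack drives the chance of a lone attack down very quickly, but for any loss rate above zero it never reaches exactly zero, which is the gap between "practically fine" and "mathematically guaranteed".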

Once you realize that, PROVABLY, you can't even do the most mundane things no matter what, the mathematics the parent is talking about do not seem all that interesting anymore. :)


While it's true that network protocols are neat challenging puzzles with nontrivial solutions, the hardest parts end up being the mundane: how does any given protocol change interoperate with all the existing implementations out there, especially changes to congestion control, and across the range of optional protocol features.

Uncertainties introduced by packet loss are actually pretty easy to work past.


Shouldn't there be a mathematically solid probabilistic solution? Start at 50% certainty, increase it with every positive message, decrease it with every unit of time. Stop when you're close to 100% or 0%.


You could still prove some estimates of probabilities of the outcomes given reliability of the connection and such. It still can be mathematically interesting.


Ok, maybe I'm missing something, but reading the article I see some weird ideas:

RED is hard to deploy, so let's change the base protocol instead. - how does that make sense? Everyone would have to start using new libraries and for backward compatibility we'd have to preserve the TCP layer too. That means standards like HTTP would have to get extensions to use SRV records or suffer delays while uTP availability is probed.

There's also a complaint that RED will drop packets once the queue is full. I don't get that at all - it will always happen...

In addition I get an impression there is some tension/implied superiority between us (people doing uTP) and them (ones doing RED). Why does it look so ugly? There's a known problem, there's an interesting solution for new software (uTP) and some plan to migrate old protocols transparently (RED). When did that turn into some bizarre conflict and why?


BitTorrent is using uTP just fine, which is only, you know, most of the upload from consumer internet connections, and we're working on getting the same things crammed into TCP with LEDBAT, but that's a slow process.

I wasn't complaining about RED dropping packets, just describing how it works.

As for the tension, my point is that my solution works and the other one doesn't. If you want to know why the person I quoted was being such a dismissive jerk, you'll have to ask him.


Ok, I don't understand this part then: "With RED the router will instead have some probability of dropping the packet based on the size of the queue, going up to 100% if the queue is full." - that seems to be universally true for any queue with limited capacity. If it's full, it's going to start dropping packets - whether it's those on the queue, or the new ones doesn't matter. Any queue which is full will require dropping as many packets as the number coming in.

Is there some reason this was described as a specific behaviour here?


I believe the focus is more on the first part of that statement. A "standard" queue would only drop packets when it is in actuality full. RED drops packets when the queue is non-full based on some calculated probability. The likelihood of drop simply goes up until the queue is full (and 100% of new packets drop).

Apparently.
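A minimal sketch of the difference being described, assuming the textbook shape of RED: no drops below a minimum threshold, a linear ramp up to `max_p` at a maximum threshold, and certain drop beyond that. The threshold values are made up, and real RED works on a smoothed average queue length and adjusts for the number of packets since the last drop, both of which are left out here.

```python
def tail_drop_probability(queue_len, capacity):
    """A plain queue: drop only when there is no room left."""
    return 1.0 if queue_len >= capacity else 0.0


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Simplified RED: drop probability rises before the queue is full."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    # Linear ramp from 0 at min_th up to max_p just below max_th.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


for q in (10, 50, 79, 80):
    print(q, tail_drop_probability(q, 80), red_drop_probability(q, 20, 80, 0.1))
```

With these hypothetical thresholds, a half-full queue already drops a few percent of arrivals under RED, while the plain queue drops nothing until it hits capacity.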


> When did that turn into some bizarre conflict and why?

Man, you weren't kidding. I tried looking up uTP on Wikipedia hoping to come away with some technical understanding. It's full of passive-aggressive statements that cite forum posts as their support, with no information on how it actually works. Maybe some of the folks in this thread could go fix that.


There are some detailed writeups of the algorithms in the LEDBAT documentation. As for commentary, that's distinctly lacking, and my post is filling in some of the gap.
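For readers who want the gist without digging through the documents, here is a rough sketch of a delay-based window update in the style of the LEDBAT specification (RFC 6817). The constants and the one-segment floor are assumptions chosen for illustration, the function name is made up, and this is not the uTP implementation: real code also filters delay samples, tracks the base delay over time, and reacts to loss.

```python
TARGET = 0.100   # target queuing delay in seconds (assumed value)
GAIN = 1.0       # scaling factor for window changes (assumed value)
MSS = 1452       # segment size in bytes (hypothetical)


def ledbat_on_ack(cwnd, base_delay, current_delay, bytes_acked):
    """Return the new congestion window (bytes) after one acknowledgment."""
    # Delay above the lowest delay ever seen is attributed to queuing.
    queuing_delay = current_delay - base_delay
    # Positive when below target (grow), negative when above (shrink).
    off_target = (TARGET - queuing_delay) / TARGET
    cwnd += GAIN * off_target * bytes_acked * MSS / cwnd
    # Floor of one segment is an assumption made for this sketch.
    return max(cwnd, MSS)


cwnd = 10 * MSS
print(ledbat_on_ack(cwnd, 0.050, 0.050, MSS))   # empty queue: window grows
print(ledbat_on_ack(cwnd, 0.050, 0.250, MSS))   # 200 ms of queue: window shrinks
```

The point of the design is that the sender backs off as soon as it sees delay building up, rather than waiting for the router to drop a packet.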


  The solution is for the end user to intervene, and tell all
  their applications to not be such pigs, and use uTP instead
  of TCP. Then they’ll have the same transfer rates they
  started out with, plus have low latency when browsing the
  web and teleconferencing, and not screw up their ISP when
  they’re doing bulk data transfers. 
That still doesn't address the problem when you have many users behind the same queue, some of whom care only about throughput and not latency. You need a scheme which will work when all of those users are acting selfishly.


My thought was more fundamental than that: any solution which involves asking users to request different transport protocols is not going to solve the problem. There are far more users who have no idea what a "transport protocol" is than those who do.

With that said, I enjoyed the post. It's an interesting problem, and I do find the base idea attractive: allowing applications to opt to be background traffic.


It wouldn't (at least in my understanding) be the user that would choose, it would be the application. WoW for example would optimize for latency, whereas BitTorrent would optimize for throughput.


Correct, but Bram's argument (as I understand it) was that the users would put pressure on the application developers to opt to be background traffic.


Two nearly-universal truths about users that suggest solutions: they're both rare, and usually wrong. :)

It's more likely that users will state the problem -- such as, "I want to be able to run WoW and BitTorrent at the same time." From there, the software developers would determine the solution (optimization for latency vs. throughput).


They can have their throughput without running a denial of service on their net connection. If you assume that they'll DOS themselves for the fun of it anyway, then I can't help you.


For a user that cares only about throughput and not about latency, it's not a DOS.


Backing a queue up to seconds of depth doesn't increase throughput.
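A quick illustration of that point, with made-up numbers: the link drains at a fixed rate no matter how much is queued in front of it, so a deeper backlog only adds waiting time. The buffer size and uplink rate below are hypothetical.

```python
def queue_delay_seconds(queued_bytes, link_bps):
    """Time a newly arriving packet waits behind the existing backlog."""
    return queued_bytes * 8 / link_bps


def throughput_bps(queued_bytes, link_bps):
    """Once there is any backlog, the link runs at line rate and no faster."""
    return link_bps if queued_bytes > 0 else 0.0


uplink = 1e6  # hypothetical 1 Mbit/s DSL uplink
for backlog in (16 * 1024, 64 * 1024, 256 * 1024):
    delay = queue_delay_seconds(backlog, uplink)
    rate = throughput_bps(backlog, uplink)
    print(f"{backlog // 1024:>4} KiB queued: {delay:.2f} s delay, {rate / 1e6:.1f} Mbit/s")
```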

Most of the bandwidth contention is at the edge, right at your DSL connection, so any battle among network connections is a battle among your own usage at any given moment.

Separately, BitTorrent is not a latency-sensitive app, it's throughput-sensitive, and uTP was designed for BitTorrent.


If anyone is curious what uTP is, you can find the protocol defined here: http://bittorrent.org/beps/bep_0029.html


The author seems to claim that it is implausible for a router vendor to sell a router that drops more packets.

  The marketing plan is that because router
  vendors are unwilling to say ‘has less memory!’ as a
  marketing tactic, maybe they’d be willing to say
  ‘drops more packets!’ instead. That seems implausible.
Yet he concludes by suggesting the router should drop all the packets.

  The best way to solve that is for a router to notice
  when the queue has too much data in it for too long,
  and respond by summarily dropping all data in the
  queue. /snip/ Of course, I’ve never seen that proposed 
  anywhere…
Based on his earlier reasoning, that would also be implausible.


That's what you would do IF you were going to be serious about making the router drop packets in a way which actually helps. I don't expect it to happen any time soon.


It seems to me that on the IP level the net has been in a technological paralysis for some time.

We can't get RED or IPv6 deployed, and the IETF doesn't seem to make anything useful happen these days.

edit: anyone else remember when layer 3 had a bright future ahead of it, IPv6 and end-to-end IPSec (with keys in the DNS) were just around the corner...


uTP carries the bulk of all BitTorrent transfers at this point. This would seem to imply a certain level of success.


Too big to overhaul?


Having uTP running against UTP as an alternative means of network connection is rather unfortunate naming. I imagine a lot of people getting confused to the max, especially as it's pronounced the same. So pretty pretty please: Give the protocol a GOOD name first, then we're talking business! ;-)


I think the catchy title was meant to grab attention to an important present day issue.

But TCP actually does not suck, it's been there for longer than I have and served us pretty darned well up until now.

Never forget that when the TCP protocol was designed, the biggest concern we had was that a nuke would land on top of our heads at any minute and the network should keep working. Also, the "Internet" was thought to be a small niche network of networks among the military and academics.

I guess this is all well known, it's just my reaction to the editorialized title.


Just to be clear, "TCP Sucks", despite successfully running the majority of the global Internet traffic. "TCP Sucks" so bad we're basically going to copy a lot of it: window based congestion control, SACK, timestamps, ability to add new options, etc. "TCP Sucks" because it is not perfect and has an issue, an issue that requires router / switch upgrades. We're going to fix that by breaking backward compatibility with _tons_ of applications and requiring an OS update on _every_ client and/or application. All this assuming our relatively new and unproven thing is as good as TCP in all other ways and fixes this issue of TCP perfectly.

Hmmm. Methinks that TCP does not suck so much.


Sounds like "it kind of worked so far, so let's use it forever, with multiple layers of band-aids if necessary". Besides, unless I'm missing something, the two protocols can be used side-by-side, i.e. you can slowly phase one out in favor of the other where necessary.


Did you read the article or just the headline?


I did read it, guess I just got too caught up in the title to brush it off (looks like some others here also wondered about that).

Anyway, uTP looks cool, LEDBAT sounds very interesting and BitTorrent is of course, completely awesome. I just don't think TCP sucks. I'm constantly surprised at how well it works for how simple most of it is and how complicated and intractable the rest is.


Shameless plug:

Extremetcp.com is the solution to the congestion problems of TCP. The best part of ExtremeTCP is that it is not a new protocol. It is TCP. It just uses clever algorithms at the sender side to send data while avoiding congestion. (Since TCP does not actually specify which algorithms one should use as long as one avoids congestion, ExtremeTCP is a perfectly legal version of TCP).

Yes, I am involved with this. If interested in testing, please send an email to the contact address in the website.


It's hard to evaluate your claim that ExtremeTCP is "the" solution, given that your website (1) does not compare your solution to the significant amount of literature and prior art in this space (e.g., TCP Vegas, which is already implemented in the Linux kernel), and (2) doesn't make any claims about friendliness to TCP Reno, which is the hardest part of retrofitting a new congestion control strategy onto the public Internet.


Good points. We do have comparisons with most existing TCP solutions out there (at least the ones we were able to get a hold of). I did not put them up because I did not want to drown people in information. But I should put up a white paper that has more detailed test results.

By the way, we do beat the Linux TCP stack (which now uses Cubic TCP, afaik). The data shown on the site is Compound TCP, which is the current Windows implementation.

Let me know if you would like to see the detailed test results.


I would wager that the kind of people who are reading a website about TCP replacements are very much looking to be drowned in information.

Unless I'm mistaken, you're not yet at the stage where you're selling turnkey solutions to pointy-haired bosses.


If you're going to claim "revolutionary new software," you will need a huge amount of information to convince people. Specifically, many experiments with detailed explanations of your methodology.


There is also http://www.fastsoft.com/home/ which professes to do the same. Personally, I worry about any undocumented, non-public congestion control protocols; many years have been spent in academia researching this subject. It is easy to be "fastest": just disable congestion control altogether. The trouble is that tons will break. In order for me to use a different congestion control algorithm in production, I would need some experts to review the protocol to ensure I am being a good web citizen and not breaking the internet.


That is the initial reaction of most people. But it is not the case actually. If you disable congestion control, you will be the fastest for a little bit (maybe a hundred milliseconds) and then you will get drowned in dropped packets. In other words, you will be sending data fast but packets will start dropping on the way, so the data will not be received fast.

When we say we are the fastest, we mean we are the fastest at sending packets all the way through. This is not easy at all, and it is not as simple as sending packets as fast as possible.

We are very good at modulating our speed in order to have full speed while avoiding dropped packets. Thus, in many situations, we send data slower than other TCP versions but the other versions get into trouble and start dropping packets.

There is one universal rule for TCP congestion avoidance algorithms and that is that as soon as you notice a dropped packet, you have to stop and wait for the congestion to clear up. If you do not do that, you will break the internet. But we do follow that rule; furthermore, we avoid large numbers of dropped packets in the first place.

We have tested our software with other standard connections and it does play well with others.

As someone else noted, Fastsoft are respected by the industry and it is well established that they do not break the internet. We are about 30% faster than Fastsoft.


Fastsoft have a good reputation amongst web performance people


There are many implementations of improved congestion control for TCP, several of which are implemented in the Linux kernel.

It's trivially easy to modify congestion control to get arbitrarily fast performance in high bandwidth-delay environments. I can tell you from experience that implementing fast performance in extremely lossy environments is harder.

And hardest of all is to come up with a solution that works on a network that is shared with the common congestion control implementations, and that works with billions of end nodes.

I spent years working on this. Feel free to reach out to me on my email address, I'd love to share with you my experience of commercializing such a product.


Is the shout-out the mere mentioning of BitTorrent? Or does Nick Weaver allude to Bram some other way in the full article?


Who do you think made the development he's referring to happen?


I obviously know who is behind BitTorrent; that is why I mentioned it as a possible alternative. But you yourself state that Stanislav Shalunov is the man behind uTP. More importantly, shout-outs traditionally reference the person/thing in question explicitly. I was not sure if there was a longer/muddier/more-controversial back story that would not be appropriate to discuss in an ACM article.


I'm surprised he doesn't mention CurveCP. He's taken ideas from that author before (e.g. netstrings, which you'll find in the .torrent file format).

TCP does suck if you try to use it for lots of short-lived connections. And that pretty much sums up how it's being used nowadays, most of the time.

For single, long term connections, TCP is fine.


uTP isn't based on CurveCP, and CurveCP is nowhere near as mature as uTP is.


That said, CurveCP is a quite interesting protocol! It just needs more support and testing.


Perhaps I wasn't clear. What I meant was that I would have expected you to have looked at CCP and evaluated it for your purposes, as you did with RED.


There's also UDT...


Is "from that author" some sort of passive aggressive jab at djb?


Nope. Quite the contrary. I was trying to avoid bringing the strange sentiment to which you are referring to the fore. Mere mention of those initials causes an adverse reaction in a lot of developers and admins. I voluntarily use his software every day. Not perfect (nothing is) but the best there is available. I find his code to be wonderfully lucid if I read it slowly and carefully. My opinion.


I also have a lot of respect for djb. That is why I thought it was strange you did not use his name when you were clearly well acquainted with who the author of netstrings and CurveCP is.


> Of course, I’ve always used TCP using exactly the API it provides, and even before I understood how TCP worked under the hood gone through great pains to use the minimum number of TCP connections used to the number which will reliably saturate the net connection and provide good piece diffusion.

BitTorrent must not have any books on copyediting.



