Is your satellite link oscillating? Improving goodput using network coding (apnic.net)
69 points by gunkaaa on March 15, 2015 | 14 comments



One thing I wish they had talked about is applying Random Early Detection (RED) to this problem. It seems to me that RED, while an older technique, was created for exactly this purpose, and it is supported in every IP router I've ever worked with. Perhaps there are challenges applying RED here, but I suspect that introducing it would also bring considerable improvements to internet performance for those islands.
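
For anyone who hasn't looked at it recently, RED's drop decision is roughly the sketch below. The thresholds, max probability and EWMA weight are made-up illustrative values, and this skips the count-based adjustment the full algorithm uses:

    import random

    # Illustrative RED parameters, not tuned for any real link
    MIN_TH, MAX_TH, MAX_P, WEIGHT = 5, 15, 0.1, 0.002

    avg_q = 0.0  # exponentially weighted moving average of the queue length

    def red_should_drop(current_queue_len):
        """Return True if RED decides to drop (or ECN-mark) the arriving packet."""
        global avg_q
        avg_q = (1 - WEIGHT) * avg_q + WEIGHT * current_queue_len
        if avg_q < MIN_TH:
            return False               # queue is short: never drop
        if avg_q >= MAX_TH:
            return True                # queue is persistently long: always drop
        # In between, drop probability ramps linearly from 0 up to MAX_P
        p = MAX_P * (avg_q - MIN_TH) / (MAX_TH - MIN_TH)
        return random.random() < p

The point being that drops start early and are spread randomly across flows, so senders back off at different times instead of all at once.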

Now, the solution presented could be superior to RED, but based on what they've shown, I'm not entirely convinced. It is an intriguing solution though.

Also, an important takeaway they presented, one that translates to other software tasks, is that increasing the buffer often makes the situation worse. One thing you can look into is the ongoing research into bufferbloat and the impact it is suspected of having on consumer internet service. My understanding is that it appears to be caused by this exact phenomenon, where engineers at equipment manufacturers and telecom operators reacted to the problem by drastically increasing buffer sizes.
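
To make the "bigger buffers make it worse" point concrete, here is the back-of-the-envelope queuing-delay arithmetic for a hypothetical link (numbers invented for illustration, not taken from the article):

    # A full buffer adds roughly buffer_size / link_rate of queuing delay.
    buffer_bytes = 1_000_000        # hypothetical 1 MB buffer
    link_rate_bps = 2_000_000       # hypothetical 2 Mbit/s satellite link

    delay_s = buffer_bytes * 8 / link_rate_bps
    print(f"worst-case added queuing delay: {delay_s:.1f} s")   # -> 4.0 s

Several extra seconds of queuing on top of an already long satellite round trip is exactly the kind of thing the bufferbloat work is about.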

*Please be aware, I work for a large telecom operator in Canada, but the views are my own and do not reflect any position of my employer.


There are challenges to deploying RED anywhere. That's why we now have a new generation of queue management strategies stemming from CoDel. Still, even with the best queue management, there's obviously a benefit to using extra forward error correction like this to further reduce packet drops if the latency is high enough, though I'm not sure it is in this case.
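
For anyone curious what CoDel does differently, here is a very stripped-down sketch of the idea: drop once packets have been sitting in the queue longer than a target for a whole interval, then space further drops by an inverse-square-root law. The 5 ms / 100 ms constants are the commonly quoted defaults, and this omits much of the real algorithm:

    import time
    from math import sqrt

    TARGET = 0.005      # 5 ms: acceptable standing queue delay
    INTERVAL = 0.100    # 100 ms: how long delay must stay high before acting

    class CoDelSketch:
        def __init__(self):
            self.first_above = None   # when sojourn time first exceeded TARGET
            self.drop_next = 0.0      # when the next drop is scheduled
            self.count = 0            # drops in the current dropping episode
            self.dropping = False

        def on_dequeue(self, sojourn_time, now=None):
            """Return True if the packet being dequeued should be dropped."""
            now = time.monotonic() if now is None else now
            if sojourn_time < TARGET:
                # Queue delay is acceptable again: leave the dropping state
                self.first_above = None
                self.dropping = False
                return False
            if self.first_above is None:
                self.first_above = now + INTERVAL
                return False
            if not self.dropping and now >= self.first_above:
                # Delay stayed above TARGET for a whole INTERVAL: start dropping
                self.dropping = True
                self.count = 1
                self.drop_next = now + INTERVAL / sqrt(self.count)
                return True
            if self.dropping and now >= self.drop_next:
                # Still too slow: drop again, with drops getting closer together
                self.count += 1
                self.drop_next = now + INTERVAL / sqrt(self.count)
                return True
            return False

The key difference from RED is that it reacts to queuing delay rather than queue length, so it doesn't need per-link tuning.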


A few random thoughts/questions:

- This sounds similar to the incast problem which occurs in datacenters, but this happens on consumer Internet - cool.

- After reading both this post and the TCP/NC paper, it seems to me that TCP/NC is unfair to vanilla TCP. If all TCP/NC did was send the same number of packets, just as "more sophisticated" encodings of the original data, link utilization would be the same. So apparently TCP/NC is more aggressive than vanilla TCP, and that's fine, but I think it should be acknowledged (haha); see the rough arithmetic in the sketch after this list. When they say things like "TCP doesn’t see the packet loss, and as a result there’s no need for the TCP senders to reduce their sending rates", it's a bit unclear what they mean: you could just as well modify the TCP stack to ignore the packet loss and not reduce the sending rate, without network coding.

- Why not use TCP termination? You could install a performance-enhancing proxy at the Sat gate, and make sure the link is always 100% utilized.

- "Let’s increase the queue memory" - I thought this should theoretically work. See for example http://yuba.stanford.edu/~nickm/papers/sigcomm2004.pdf. If folks familiar with the apnic effort are reading, I would love to know if they tried such measures and what happened.

- Could CoDel improve the situation here?
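
On the fairness point in the second bullet (the rough arithmetic promised above): if the sender emits n = k + r coded packets for every k packets of application data, the on-wire rate exceeds the goodput by n/k, so at the same goodput the coded flow occupies more of the bottleneck than a vanilla TCP flow would. The k and r below are hypothetical, just to put a number on it:

    k = 10                      # hypothetical: original packets per generation
    r = 2                       # hypothetical: spare "combination" packets
    overhead = (k + r) / k
    print(f"on-wire rate = {overhead:.2f} x goodput "
          f"({(overhead - 1) * 100:.0f}% redundancy)")   # -> 1.20 x, 20%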


This is great. I spent a lot of time reading about this kind of stuff (from books like this: http://www.amazon.com/Satellite-Technology-Principles-Applic... ) before I went to work on a ship with the task of maintaining the satellite connection. It turned out I didn't have much control over our traffic on the satellite side, but it was interesting nonetheless to think about what was happening to our traffic once it got there.


Clever solution to a unique problem!

> So how does this help with queue oscillation? Simple: We generate a few extra "spare" combination packets, so that we now have more equations than variables. This means we can afford to lose a few combination packets in overflowing queues or elsewhere – and still get all of our original packets back.

If I understand correctly, this would also work with a more traditional coding scheme (block coding or convolutional coding). I'm curious if there are plans to take advantage of the properties of network codes in the future.
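
To make the quoted "more equations than variables" idea concrete, here is a toy version over GF(2): each coded packet is a random XOR combination of the k originals, a couple of spares are added, and the receiver decodes whatever subset arrives by Gaussian elimination. This is only a sketch under my own assumptions (not Steinwurf's actual code); real implementations, RLNC included, work over GF(2^8), where a random draw is almost never rank-deficient, and they do much smarter bookkeeping:

    import random

    def encode(packets, n_coded):
        """Emit n_coded random XOR combinations of the original packets.

        Each coded packet is [coefficient_bitmask, payload]."""
        k, size = len(packets), len(packets[0])
        out = []
        for _ in range(n_coded):
            coeffs = random.getrandbits(k) or 1    # random non-zero GF(2) vector
            payload = bytearray(size)
            for i in range(k):
                if coeffs >> i & 1:
                    payload = bytearray(a ^ b for a, b in zip(payload, packets[i]))
            out.append([coeffs, payload])
        return out

    def decode(received, k):
        """Gauss-Jordan elimination over GF(2): the originals, or None if rank < k."""
        rows = [[c, bytearray(p)] for c, p in received]
        for col in range(k):
            # Find a row at index >= col with bit `col` set and swap it into place
            pivot = next((i for i in range(col, len(rows)) if rows[i][0] >> col & 1), None)
            if pivot is None:
                return None   # not enough independent combinations survived
            rows[col], rows[pivot] = rows[pivot], rows[col]
            # Eliminate bit `col` from every other row
            for i in range(len(rows)):
                if i != col and rows[i][0] >> col & 1:
                    rows[i][0] ^= rows[col][0]
                    rows[i][1] = bytearray(a ^ b for a, b in zip(rows[i][1], rows[col][1]))
        return [bytes(rows[i][1]) for i in range(k)]

    random.seed(1)
    originals = [bytes([i]) * 8 for i in range(4)]     # k = 4 small packets
    coded = encode(originals, n_coded=6)               # 4 needed + 2 spares
    random.shuffle(coded)
    survivors = coded[:5]                              # pretend one coded packet was lost
    recovered = decode(survivors, k=4)
    if recovered == originals:
        print("recovered all 4 originals from 5 of the 6 coded packets")
    else:
        print("unlucky coefficient draw (a hazard of GF(2); GF(2^8) makes this negligible)")

With the spare combinations, losing a coded packet or two in an overflowing queue costs nothing, which is what lets TCP on top stay oblivious to the drops.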


"Network code" as described here isn't the academic definition, I think he's just describing colloquially "i'm using a code in my network". The technical definition is a Packet Erasure Channel [1], which any erasure code [2] can deal with. Network coding is a more unexplored technique where routers in the network combine packets in various ways that are better than regular routing: see the example at [3].

[1] http://en.wikipedia.org/wiki/Packet_erasure_channel

[2] http://en.wikipedia.org/wiki/Erasure_code

[3] http://en.wikipedia.org/wiki/Linear_network_coding#The_butte...
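
A tiny illustration of the butterfly example in [3], where the middle router forwards the XOR of the two packets instead of picking one, and each receiver recovers the packet it didn't hear directly (payloads invented for the demo):

    a = b"\x01\x02\x03"
    b = b"\x10\x20\x30"

    combo = bytes(x ^ y for x, y in zip(a, b))    # what the middle router sends

    # Receiver 1 hears `a` directly plus the combination -> recovers b
    # Receiver 2 hears `b` directly plus the combination -> recovers a
    assert bytes(x ^ y for x, y in zip(a, combo)) == b
    assert bytes(x ^ y for x, y in zip(b, combo)) == a

Both receivers get both packets per time slot, which plain store-and-forward routing on that topology can't achieve.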


The vendor mentioned in the article, Steinwurf ApS, says on their website that "Our products are based on Random Linear Network Coding[...]" [1]. In this case, it looks like they're using it as an erasure code, though.

[1] http://steinwurf.com/technology/


Their phrasing appears to state they use TCP/NC, a network coding-based variant of TCP introduced in an academic paper by (some of) the same people behind this initiative:

http://arxiv.org/pdf/0809.5022.pdf


Congratulations, you have rediscovered a problem described in a 14-year-old RFC: http://tools.ietf.org/html/rfc3135


From my experience (~10 years ago), sat links suffer high latency and VERY low packet throughput. When I played with one in Europe, it was delivered with special-sauce Windows proxy software that had one purpose: cut packets/s and keep retransmissions local.

I get the feeling that installing traffic-shaping routers with an explicit packets-per-second limit would also do wonders for the links mentioned in the article.


Reminds me of Forward Error Correction [1], a technique used by satellite providers and even WAN optimization vendors like Silver Peak to "erase" packet loss events by injecting parity packets into the flow, which can be used to rebuild lost packets at the receiving end if needed (minimal sketch after the link). This prevents TCP synchronization, aka the throughput see-saw described in the article. The problem isn't limited to high-latency / satellite links; it exists on any path with packet loss, like your internet connection, or even MPLS.

[1] http://en.m.wikipedia.org/wiki/Forward_error_correction
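
A deliberately minimal version of that idea: one XOR parity packet per block of k data packets lets the receiver rebuild any single lost packet in the block without a retransmission. Real FEC schemes (Reed-Solomon, RLNC, etc.) tolerate more losses per block, but the principle is the same; the packet contents here are invented:

    from functools import reduce

    def xor_parity(block):
        """One parity packet: the byte-wise XOR of every packet in the block."""
        return reduce(lambda x, y: bytes(a ^ b for a, b in zip(x, y)), block)

    block = [b"pkt1", b"pkt2", b"pkt3", b"pkt4"]
    parity = xor_parity(block)

    survivors = [block[0], block[1], block[3]]     # pretend pkt3 was lost
    rebuilt = xor_parity(survivors + [parity])     # XOR cancels the survivors
    assert rebuilt == block[2]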


This technique is known as RLNC. It works quite well, but it's not the only optimal erasure code; there are others.

Unfortunately, RLNC implementations are patent-encumbered in the US, so good luck using this "simple" linear algebra.


Yes, it's really neat. Then the more you dig into it, the more you realize literally everything and the kitchen sink is patented to all hell and back.

Best of luck using any of it without a large team of expensive lawyers.


Thank you for the interesting questions and thoughts on this topic! We tried to answer some of these in a longer comment on the APNIC blog: https://blog.apnic.net/2015/03/13/is-your-satellite-link-osc...

Disclaimer: I am one of the developers of the RLNC kernel module at Steinwurf ApS.





