Loxilb: eBPF based cloud-native service load-balancer (github.com/loxilb-io)
94 points by InitEnabler on Feb 22, 2023 | 29 comments



It talks a lot about performance, but the actual cloud load balancers such as AWS ELB or Azure Load Balancer are implemented in the software-defined network (SDN) and are accelerated in hardware.

For example in Azure, only the first two packets in a VM->LB->VM flow will traverse the LB. Subsequent packets are direct from VM-to-VM and are rewritten in the host NICs to merely appear to go via the LB address. This enables staggering throughputs that no “software in a VM” can possibly hope to match.

Personally I wish people would stop with the unnecessary middle boxes. It’s 2023 and there are cloud VMs now that can put out 200 Gbps! Anything in the path of a few dozen such VMs will melt into slag.

This is especially important for Kubernetes and microservices in general, which are already very chatty and, in surprisingly common configurations, have layers of reverse proxies five deep.


> only the first two packets in a VM->LB->VM flow will traverse the LB. Subsequent packets are direct from VM-to-VM and are rewritten in the host NICs to merely appear to go via the LB address

Do you have more details how that's done?


Generally you can do this if the network is software defined. The boundaries between networks aren’t actually real, which means you can load balance by simply determining whether the packets should be allowed to route, then letting them route directly for the rest of the flow.


So, basically magic, just as I expected. Got it.


+1, a more detailed explanation or links out to reading material would be great.

I assume there are some limitations if you actually skip the LB? How do the host NICs rewrite the LB address - does this imply there is hardware support for this kind of bypass routing?


I’m not convinced it offers better COGS, actually. You can do line rate on a CPU with DPDK these days, and even relatively beefy CPUs are probably cheaper than specialized hardware like a Xilinx card.


I spent a lot of time thinking about this, and the conclusion I reached is that while the CPU may be cheaper, it can generate revenue where the FPGA cannot. So even though the TCO may seem upside down, the opportunity cost of using a few cores makes up for the additional cost.

GCP, for example, has the potential for ~$1k of revenue per core over a system's lifespan. A SmartNIC is probably ~$1.5k, so saving 2 cores (~$2k of potential revenue) puts you in the black, and it has other security advantages.


Not sure if you mean this purely in a "make the accounting look good" sense, but I’m not following - usually you get a fixed $$ budget, not a CPU-count budget. Therefore buying all CPUs will actually make the workload cores cheaper due to a higher volume discount.


Just curious about "only the first two packets in a VM->LB->VM flow will traverse the LB. Subsequent packets are direct from VM-to-VM and are rewritten in the host NICs to merely appear to go via the LB address" - how is it possible to change the load balancer IP (VIP) to the VM IP within a session? Are you talking about DSR (Direct Server Return) here?


Cloud networking is basically Magic(tm). The packet headers are a mere formality to keep legacy operating systems happy.

In typical data centres the "network" is really just a handful of Cisco boxes. In the cloud, the network extends to the FPGAs or ASICs in the servers themselves, including the hypervisors.

When a packet leaves a VM, the hypervisor host rewrites it, typically in hardware, and then when the remote hypervisor receives it, the packet is rewritten back to what the destination VM accepts.

This allows thousands of overlapping 10.0.0.0/24 subnets, and "tricks" like direct VM-to-VM traffic that appears to go via a load balancer.

The actual load balancer VMs just "set up" the flow, while instructing the hosts to take over the direct traffic in their stead.
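
To make that concrete, here is a very hand-wavy sketch in Go - not any vendor's actual implementation, and all names and addresses are made up - of what the host-side datapath conceptually does:

    package main

    import "fmt"

    // FiveTuple identifies a flow; Rewrite is the NAT rule the LB pushes down to the host.
    type FiveTuple struct {
        SrcIP, DstIP     string
        SrcPort, DstPort int
        Proto            string
    }

    type Rewrite struct {
        NewDstIP string // the real backend VM hiding behind the VIP
    }

    // HostDatapath stands in for the per-host vswitch/SmartNIC flow table.
    type HostDatapath struct {
        offloaded map[FiveTuple]Rewrite
    }

    func (h *HostDatapath) Egress(pkt FiveTuple) {
        if rw, ok := h.offloaded[pkt]; ok {
            // Fast path: rewrite VIP -> backend locally; the LB never sees this packet.
            pkt.DstIP = rw.NewDstIP
            fmt.Println("direct to backend:", pkt)
            return
        }
        // Slow path: the first packet(s) go via the LB, which picks a backend
        // and has the SDN control plane install an offload rule on this host.
        backend := pickBackendViaLB(pkt)
        h.offloaded[pkt] = Rewrite{NewDstIP: backend}
        fmt.Println("via LB, offload rule installed:", pkt)
    }

    func pickBackendViaLB(pkt FiveTuple) string { return "10.0.1.7" } // made-up backend

    func main() {
        h := &HostDatapath{offloaded: map[FiveTuple]Rewrite{}}
        flow := FiveTuple{SrcIP: "10.0.0.4", DstIP: "10.0.0.100", SrcPort: 54321, DstPort: 443, Proto: "tcp"}
        h.Egress(flow) // first packet: traverses the LB
        h.Egress(flow) // subsequent packets: rewritten on the host, LB bypassed
    }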


Ok, got it - something along the lines of OpenFlow. Is there any documentation/links on this being used by AWS / Azure / GCP? I would like to read more on this.


Don't have time to look, but if you check GitLab's (the company's) infrastructure issue tracker (it's open source), they have some details on how GCP cloud networking works, with quotes from GCP support staff.

I guess they've seen high amounts of out-of-order packets, and there are some detailed write-ups on why that happens with GCP's SDN implementation.



It sounds like this is supposed to be a competitor/alternative to MetalLB, which you'd generally use outside a cloud environment.


In case someone else is looking for performance benchmarks: https://loxilb-io.github.io/loxilbdocs/perf/


The details for the bare-metal benchmarks are sparse. I would have expected an eBPF solution to outperform the "aging" IPVS by a significant margin. Moreover, the peak performance of IPVS is far better (115 vs 57 reqs/s). It would be interesting to know whether that is an outlier. A benchmark with a workload that increases over time would give a more precise comparison of the two solutions.


Is this a blind load balancer similar to the iptables statistics module or are there health checks? Are they active or passive health checks? Asking because I saw a comparison to HAProxy.


On Kubernetes, which they mainly target, the Kubernetes control plane would update the list of active service endpoints according to health checks.

Standalone, you could do it with the API and a small daemon, but out of the box there is no support for health checks (yet).
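
That standalone daemon really is just "probe the backends, then tell the LB about the result". A rough sketch in Go (the backend addresses are made up, and the actual call into the LB's management API is stubbed out since its surface isn't covered in this thread):

    package main

    import (
        "fmt"
        "net/http"
        "time"
    )

    var backends = []string{"10.0.1.7:8080", "10.0.1.8:8080"} // hypothetical endpoints

    // healthy does a simple active HTTP probe against a backend.
    func healthy(addr string) bool {
        c := http.Client{Timeout: 2 * time.Second}
        resp, err := c.Get("http://" + addr + "/healthz")
        if err != nil {
            return false
        }
        defer resp.Body.Close()
        return resp.StatusCode == http.StatusOK
    }

    func main() {
        for {
            for _, b := range backends {
                if !healthy(b) {
                    // Here you'd call the load balancer's API to pull the endpoint
                    // out of rotation (and re-add it once it recovers).
                    fmt.Println("would remove unhealthy endpoint:", b)
                }
            }
            time.Sleep(10 * time.Second)
        }
    }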



The name sounds like a pharmaceutical.

Cloudrizi (loxilb-io)



> Optimized SRv6 implementation in eBPF

Does anyone else do Segment Routing in kube? This particularly caught my eye. I wonder how much other software & setup users need to take advantage of this with Loxilb. It's such a different paradigm, specifying much more of the route packets take!


This looks interesting, especially the support for GTP/SCTP; although it seems quite new - the first commit on GitHub is from last year - I wonder if anyone has used this in production?


The build starts with "Install GoLang > v1.17".

For an eBPF-based application? Not good.


Is it because it's Go, because of the version, or what?


As a security vulnerability researcher, I still consider eBPF problematic to this day.

I instruct everyone to disable eBPF at kernel compile time.

Unfortunately, one cannot completely compile eBPF out of their kernel.


Any benchmarks?



My imaginary kingdom for native language libs that let me interact with eBPF to load balance logic forks.

Then I can put k8s and containers behind me.
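
Something like github.com/cilium/ebpf gets part of the way there, at least for loading and attaching programs from Go. Rough sketch (the object file and program name are hypothetical, not loxilb's):

    package main

    import (
        "log"
        "net"

        "github.com/cilium/ebpf"
        "github.com/cilium/ebpf/link"
    )

    func main() {
        // Load a compiled eBPF object (built separately with clang).
        coll, err := ebpf.LoadCollection("lb_kern.o") // hypothetical object file
        if err != nil {
            log.Fatal(err)
        }
        defer coll.Close()

        prog, ok := coll.Programs["xdp_lb"] // hypothetical program name
        if !ok {
            log.Fatal("program not found in object")
        }

        iface, err := net.InterfaceByName("eth0")
        if err != nil {
            log.Fatal(err)
        }

        // Attach at the XDP hook; from here the logic runs in the kernel datapath.
        l, err := link.AttachXDP(link.XDPOptions{Program: prog, Interface: iface.Index})
        if err != nil {
            log.Fatal(err)
        }
        defer l.Close()

        select {} // keep the program attached until killed
    }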



