
Why are NATs getting costly? Don't quite understand.



To do NAT, you need to map (external) port numbers to (internal) IP addresses. This is done using connection tracking: tracking the state of the connection and the appropriate mapping.

And connection tracking gets expensive at scale.
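To make "connection tracking" concrete, here's a toy sketch of the state a NAT has to keep per connection: an outbound map from the internal 5-tuple to an allocated public port, plus the reverse map for return traffic. (Names and structure are hypothetical; real conntrack also tracks timeouts, TCP state, and much more, which is exactly why it gets expensive at scale.)

```python
# Toy sketch of per-connection NAT state (hypothetical, simplified).
# Real implementations also track timeouts, TCP state machines, etc.
from dataclasses import dataclass

@dataclass(frozen=True)
class Flow:
    src_ip: str      # internal address
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str       # "tcp" or "udp"

class Nat:
    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self.next_port = 1024
        self.out = {}    # Flow -> (public_ip, public_port)
        self.back = {}   # (public_port, proto) -> Flow

    def translate_out(self, flow: Flow):
        # Allocate a public port on first sight; reuse the mapping after.
        if flow not in self.out:
            port = self.next_port
            self.next_port += 1
            self.out[flow] = (self.public_ip, port)
            self.back[(port, flow.proto)] = flow
        return self.out[flow]

    def translate_in(self, public_port: int, proto: str):
        # Reverse lookup for return traffic; None if no mapping exists.
        return self.back.get((public_port, proto))
```

Every packet in both directions needs one of these lookups, and every new connection mutates both tables, which is what a CGNAT box must do for tens of thousands of customers at line rate.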


Yes, I understand how NAT works.

But CPU/memory is going down in cost way faster than bandwidth demand is increasing.

Regardless, it's clearly way, way cheaper than buying IPv4 blocks, otherwise people wouldn't be doing it.

Edit: ok, the problem isn't hardware, it's comedy license fees. https://itprice.com/juniper-price-list/cgn.html

$470k for a license to do CGNAT at 100gbit/sec. Surely these guys are opening themselves up to be replaced with some cheaper open source based software solution?


> $470k for a license to do CGNAT at 100gbit/sec. Surely these guys are opening themselves up to be replaced with some cheaper open source based software solution?

Good. CGNAT needs to die. Addressing is fundamental and customers deserve not just an address, but their own RANGE, especially now that it's feasible.


i'd love to be able to also ask for the reverse-dns zone to be delegated... if i've got a public subnet, it would be lovely to be able to use it properly.

a man can dream.


Software-defined networking is slowly becoming more popular, but it’s always going to be more resource intensive than these enterprise-grade routers that are typically implemented using FPGA / ASICs.

Having said that, I’m often equally baffled at just how expensive modern networking hardware is, but as it’s pretty much all of these carrier grade networking solutions being this expensive, I’m assuming it’s somewhat justified.

That doesn’t take away from the fact that NAT adds an expensive layer of complexity on top, and I can imagine that in the long term, IPv6 is starting to become much more attractive.


> I’m assuming it’s somewhat justified.

in a sense, yes. People claiming software-based solutions can match the performance of hardware ASICs are simply not thinking about the scale and speeds of modern core routers and switches.

For instance, taken from the blog of Ivan Pepelnjak[0]:

> It’s hard to imagine how fast switching ASICs have to work – a modern data center switching ASIC can forward billions of packets per second. For example, the throughput of Broadcom Tomahawk 3 is 12.8 Tbps, and it can switch 8 billion packets per second, or 8 packets every nanosecond.

Another thing that makes routing at large scale with large traffic flows expensive is the separation of the control and data planes. Most modern datacenter routers can continue forwarding traffic inside the ASIC while the control plane encounters a failure, but usually only for a few hundred milliseconds to a second; after that the forwarding table becomes stale, and it cannot be refreshed without a control plane.

Having a redundant control plane isn't that expensive, but it becomes harder and harder to keep this failover fast enough if your forwarding plane is pushing more and more individual traffic flows.

Then there are still other items one can add to a modern router to make it do more, but also cost more (think accelerated IPsec encryption, MACsec at line rate, or DWDM functionality).

[0]: https://blog.ipspace.net/2022/06/data-center-switching-asic-...


Probably a bit of a cartel for "enterprise grade" networking equipment, is my guess. Was similar in the late 90s/early 00s for web/database servers.


My (uneducated) guess would be to look at the way patents last too long. So society ends up suffering, rather than benefiting from IP protection.


I’m not sure the price is justified, however the ISP market is extremely difficult/impossible to break through for startups or any company capable of building their own. It’s a self-fulfilling prophecy, the market is hard to break into (for other reasons besides networking equipment cost) so nobody who can actually do something about it is able to get in.


the cheaper solution is IPv6. if an organization is too resistant to change to implement IPv6, they're going to find themselves subject to exorbitant licensing fees in order to keep using the technology they are stuck on.


CGNAT also needs IP-port-user logging to support disclosure requests by law enforcement.


Not if you only allow each user 100 connections. That's 1200 bytes of RAM per customer paying $100 a month.

And you can charge them an extra $10 per month for 'pro' internet and let them have 1000 connections for 'all the family'.
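The arithmetic behind that 1200-byte figure works out to 12 bytes per tracked connection, which is a very optimistic assumption; a sketch of the back-of-envelope:

```python
# Back-of-envelope memory cost per customer under a connection cap.
# 12 bytes/entry is the figure implied by the parent comment; for
# comparison, a Linux conntrack entry is on the order of hundreds
# of bytes once timers and protocol state are included.
BYTES_PER_ENTRY = 12

def ram_per_customer(conn_cap: int) -> int:
    return conn_cap * BYTES_PER_ENTRY

assert ram_per_customer(100) == 1200    # base plan
assert ram_per_customer(1000) == 12000  # "pro" plan
```

Even at realistic entry sizes, raw memory is cheap; the thread's later point is that memory capacity isn't the binding constraint here.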


That's a laughably low limit. Even the "pro" plan would be marginal for a single person, who would run into the limit from time to time. And never mind power users who might do something with p2p or have a couple more devices connected to the network.

But that's beside the point. Your home router can easily have millions of connections open (if they didn't skimp on the RAM, anyway), but CGNAT boxes that do the same for tens of thousands of customers also have to move a lot of traffic. This means routing and doing NAT in software won't cut it anymore; you need dedicated hardware coupled with very fast specialized memory to handle that traffic.


You can still do hardware NAT for the few thousand connections with the most packets and software NAT for everything else.

I bet across even an ISP network of a million users, 80% of the traffic at any point in time is within 10,000 connections.
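One way to check a claim like that is to rank flows by byte count and see how many of the heaviest ones cover 80% of total traffic. A sketch with made-up flow data (the traffic distribution here is synthetic, purely to illustrate the measurement):

```python
# Sketch: how many of the heaviest flows carry a given fraction
# of total bytes? Flow byte counts are synthetic toy data.
def flows_for_fraction(byte_counts, fraction=0.8):
    total = sum(byte_counts)
    covered = 0
    for i, b in enumerate(sorted(byte_counts, reverse=True), start=1):
        covered += b
        if covered >= fraction * total:
            return i
    return len(byte_counts)

# A heavy-tailed toy distribution: a few elephants, many mice.
counts = [100_000] * 10 + [100] * 1000
print(flows_for_fraction(counts))  # 9 of 1010 flows cover 80% of bytes
```

With a genuinely heavy-tailed distribution the top flows dominate, which is the intuition behind the proposal; the next reply's objection is that which flows are in the top set changes constantly.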


You do realise that almost all connections are long-lived, and burst up and down in throughput? So the 10,000 “heaviest” connections right now are not the same as in, say, 3 seconds from now?

So you propose constantly swapping in and out connections from “hardware NAT” to “software NAT”? What heuristic will you use to decide which connections go where?

Such a heuristic will probably look a lot like QoS, which is even more (much more!) resource hungry than NAT.

At which point will the obvious conclusion be, “maybe the carriers who actually deal with these problems have a point, NAT is indeed a significant amount of complexity, and let’s be happy IPv6 starts to make actual economic sense?”


How do you ensure each user is capped at 100 connections without that check incurring additional resources?


You have a per user counter. So instead of 1200 bytes it's 1201 bytes per user.


Memory isn’t the only dimension we care about.

You’ve basically proposed an absolutely horrible solution, for both the end-user and the ISP. Something tells me you haven’t actually done any actual low level network engineering, and just brush all this off as “how hard can it be”.


how are you going to keep this counter? Do you identify the bytes processed in individual flows? Which system will keep track of this? The control plane of the router, maybe? Great... you just added additional complexity instead of just pushing packets through a forwarding plane.


When an unrecognized flow shows up, punt it to software. Handle the counter there, and if it overflows then you drop the packets. No need to add anything to the control plane.
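A sketch of the scheme being proposed (names and structure are hypothetical): the fast path only matches flows already installed, and anything unrecognized is punted to a slow path that checks the per-user cap before installing the flow.

```python
# Sketch of the punt-to-software proposal (hypothetical API).
# The fast path matches only installed flows; unrecognized flows
# land here, where the per-user cap is enforced.
class SlowPath:
    def __init__(self, per_user_cap: int):
        self.cap = per_user_cap
        self.user_counts = {}   # user -> active flow count
        self.installed = set()  # flows programmed into the fast path

    def handle_new_flow(self, user: str, flow) -> bool:
        """Return True if the flow was installed, False if dropped."""
        if self.user_counts.get(user, 0) >= self.cap:
            return False  # over the cap: drop packets for this flow
        self.user_counts[user] = self.user_counts.get(user, 0) + 1
        self.installed.add(flow)
        return True
```

The counter lives entirely in the slow path; the unanswered part of the proposal, which the replies pick at, is who expires stale entries and decrements the counters when connections end.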


"Punting it to software", from the perspective of a router with separate control and forwarding planes, means forwarding it to the control plane instead of relying on the logic programmed inside the ASIC to forward traffic.


Sorry, I meant no need to add anything to the forwarding plane, or interfere with its efficiency at all.

The point is, the really fast part doesn't need to be more complex.

The part that handles new connections needs to be marginally more complicated, but not enough that it should really matter.


Please don’t give Comcast ideas


because maintaining state for CGNAT tables is far more complex than just forwarding packets. Routers doing NAT are thus more expensive than those just doing simple forwarding.

Also, in some countries ISPs need to map the use of a specific IP address to a specific subscriber for law enforcement purposes. CGNAT is no exception, and it creates a large amount of overhead because the public IPv4 prefix space is shared between multiple customers.


> because maintaining state for CGNAT tables is far more complex than just forwarding packets.

But it’s a solved problem with mature solutions, decades old. Is it really financially expensive?


compared to rolling out IPv6? definitely, especially in the longer term.

For instance, on most core/edge routers (my experience is mainly with the Juniper MX series, but I assume the model is roughly the same for other vendors), you need specific licenses or interface cards to do stateful services like NAT.

Compared to doing IPv6, which is "just forwarding packets" and doesn't require the hardware to track state in nearly the same manner.

Most serious core/edge hardware vendors also do not put IPv6 behind licenses, unlike CGNAT and other NAT-like features, because packet forwarding is the most basic functionality a router should provide.

Routers that track less state are also frequently far less expensive.


You’re presenting a false dichotomy. The choice for an ISP today is not “v4 or v6”, it’s either “v4 or v4+v6”. A v6 only connection in the US is unusable.


The v4 fallback can operate on slower equipment if needed. The majority of bandwidth-heavy services support IPv6 (and the slowness will encourage outliers to migrate).


The more traffic you can get onto ipv6 the less stress is on the v4 infrastructure. Each v6 connection is one your CGNAT doesn't have to provide an ipv4 port for.


So what? That’s still just a scaling factor at that point and still requires you to have v4 cgnat infrastructure + ipv6.


You don't have to beef up your v4 infra as much though. Think 4 powerful v4 routers instead of 5 or something. If the traffic to the big streaming providers doesn't have to run through these routers, you can save a lot. Same goes for the ipv4 address space you have to rent/buy. The more connections are on ipv6, the less public ipv4 addresses you need to have.

So ipv6 support might be saving you costs already in a dual stack setting.


Take a step back to the wider context of a brand new ISP though. If you’re rushing to market like Starlink appears to be, you either implement just v4 and scale later or implement both v4/v6 up front.

Until there is a bunch of exclusive v6 stuff customers will be up in arms over missing, the answer of which thing to prioritize is obvious.


Yeah I guess it's the same as with the inter satellite communication which is promised for later, but not implemented yet so that they get at least some product out to customers. I don't think dual stack is that hard to do for entirely new networks though.

Also, one of the selling points of satellite internet is lower latency, which is hurt a bit by CGNAT infrastructure.

Lastly, brand new ISPs generally have a hard time getting IPv4 address space. The incumbents, especially the older ones, were around when IPv4 addresses were still plentiful, so they usually have far fewer problems with IPv4 address space. Starlink only has 166k IPv4 addresses according to https://ipinfo.io/AS14593 . Compare this to AT&T, which has over a hundred million for their AS 7018 https://ipinfo.io/AS7018 alone, and there are other AS numbers they hold, like AS20057 with 7 million IPv4s. That roughly matches the number of AT&T customers, while Starlink has more than double the number of subscribers as it has public IPs, with growth ahead.


Having your core as v6 only lets you push NAT to limited places (one of the many options for 4x6x4 NAT, including stateless options if you're willing to cut certain corners off v4).

And v6 connections help drop the pressure on NAT resources - and sites that are optimizing for mobile connections are already going to be on IPv6 where possible (due to mobile networks prioritizing v6 traffic for various reasons, including licensing - and NAT resource costs)


CGNAT is just a slightly more fiddly version of DS-Lite (and frankly at this stage your internal network is either v6 or an ad-hoc informally-specified bug-ridden implementation of half of it). You're always going to have to do messy connection tracking stuff with connections going to v4-only sites, the only question is whether you want to do it for connections to v6-enabled sites as well or not.


All apps on iOS support DNS64 on IPv6-only networks.


That doesn’t help for servers that are only reachable via ipv4 (see GitHub).


NAT64+DNS64 is specifically for IPv6-only clients to access IPv4-only servers.
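The DNS64 half of that is simple address synthesis: the resolver fabricates an AAAA record by embedding the server's 32-bit IPv4 address into the well-known 64:ff9b::/96 prefix (RFC 6052), and the NAT64 box translates traffic sent to that prefix. A sketch of the synthesis:

```python
import ipaddress

# DNS64-style address synthesis per RFC 6052: embed a 32-bit IPv4
# address in the low 32 bits of the well-known 64:ff9b::/96 prefix.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")

def synthesize_aaaa(ipv4: str) -> str:
    v4 = ipaddress.IPv4Address(ipv4)
    v6 = ipaddress.IPv6Address(int(NAT64_PREFIX.network_address) | int(v4))
    return str(v6)

print(synthesize_aaaa("192.0.2.1"))  # 64:ff9b::c000:201
```

The synthesis itself is stateless; the NAT64 translator doing the actual v6-to-v4 session mapping is where the connection-tracking cost discussed upthread comes back in.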


A problem being solved doesn't mean the current solution is inexpensive or optimal.


The same could be said about IPv6. I think the point is that IPv6 scales better with traffic increases, to the point where switching from CGNAT to IPv6 becomes financially attractive.


What do you mean, solved problem?





