Faster Linux 2.5G Networking with Realtek RTL8125B (jeffgeerling.com)
121 points by geerlingguy on Dec 21, 2021 | 84 comments



Speaking of which, what's a reasonably cheap solution for a little 10 Gbit/s home LAN (Linux)? A 4-port Mikrotik switch and which cards? Nothing fancy, just something that won't badly bottleneck the PCIe 3.0/4.0 x4 NVMe SSDs my machines have. I'd rather go to 10 Gbit/s directly instead of 2.5, but I'll go 2.5 if it's really easier/cheaper.


Mikrotik makes a 4xSFP+ switch and Ubiquiti makes a 4x10GBase-T switch. The Mikrotik one can take SFP+ copper transceivers, so it allows you to mix and match twisted pair and fiber optics if you need it. Asus makes a pair of Aquantia (now Marvell) 10G PCIe x4 cards, one with SFP+ and the other with 1/2.5/5/10G Base-T. Other vendors have similar cards in PCIe x4 and Thunderbolt 2/3 formats.


I just picked up a Mikrotik CRS305 from Amazon and a pair of used Intel X520T dual SFP+ cards from eBay, plus a couple of DAC (direct attach copper) cables, for maybe $300 total. I'm very happy with the performance of this setup.

One thing to keep in mind about the CRS305 is that it can't handle more than one or maybe two SFP+ copper modules. They use a lot of power and put out a lot of heat. I think the CRS309 has fewer restrictions.


I find having a 120mm PC fan sitting on top, blowing down through the case's top holes, keeps everything sane. A 120mm fan matches the size almost perfectly, and there are even some USB-powered ones available online. Running the fan slow, with rubber standoffs so air can move freely, keeps it nearly silent. Without that, even a single SFP+ to RJ45 copper module can overheat and become unstable under load. Other than that the switches work great: a perfect option for small-scale, cheapish 10G networking.


The transceivers get up to 90-95°C (enough to leave a little burn!) if you plug them next to each other.

They can work in that configuration (with up to 4 in that enclosure) but I'd only do it if you have a fan on the thing. It relies entirely on passive ventilation, and the enclosure gets hot, too, like 60°C or more!
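If you want to keep an eye on how hot your modules actually get, most SFP+ modules expose digital optical monitoring (DOM) data that `ethtool` can read. A minimal sketch (the interface name is a placeholder):

```shell
# Dump the module EEPROM; DOM-capable modules report temperature
# alongside voltage, TX bias, and optical power.
sfp_temp() {
  ethtool -m "$1" | grep -i 'temperature'
}

# sfp_temp enp1s0f0   # hypothetical interface name
```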


Oh hey I just read the post about transceivers being slow only in one direction. I have the same problem on a different scale. Two X520T cards, one in a Dell T20 and the other in a Dell T30. Both connected with DACs to the CRS305. 9.something Gbps one direction, 7.something Gbps the other.

I haven't bothered to do much actual diagnosis on this because I'm satisfied with the speed as-is, but it's another illustration of weirdness where you'd least expect it.
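For anyone who does want to chase this down, iperf3's reverse flag lets you measure each direction separately from one end; a quick sketch, assuming iperf3 is installed and `iperf3 -s` is already running on the far machine (the hostname is a placeholder):

```shell
# Measure throughput in both directions from the client side.
bidir_test() {
  iperf3 -c "$1" -t 10        # client -> server
  iperf3 -c "$1" -t 10 -R     # server -> client (reverse mode)
}

# bidir_test t30.lan   # hypothetical hostname
```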


Yeah; I've noticed asymmetric speeds in certain situations with the CRS305 as well... I thought it was just me, but I'm glad (sad?) to know someone else sees the same thing too :)


Could you recommend a PCIe adapter to receive a Nokia 3fe4960ac SFP+ transceiver (AT&T fiber) and maybe one or two 2.5G or 10G modules?


You can’t bypass AT&T’s absolutely craptastic fiber modem, even if you got the bidi transceiver you needed. The connection is encrypted with keys hardcoded into the modem.


For their craptastic router you could downgrade the firmware, use an old exploit to root, extract the keys, and then use those keys on your own Linux box to authenticate as the router…

For the fiber ONT (what I realize now you may be calling their fiber modem) I would love to see a similar attempt to get their hardware out of the loop.


Yeah, I’m referring to the ONT - which is now additionally bundled into their router + wifi box. There’s similar exploit to bypass the router part of the equation but you still need the ONT to negotiate the connection.


Do you have a link to this exploit?


Is it actually encrypted? Or is it just 802.1x like it used to be? (I moved away from the service area when they were still doing the separate ONT and crappy 'internet gateway' that turned Ethernet into Ethernet.)


The latter. The session negotiation is encrypted. You can technically shelve the unit after that’s been negotiated but it would need to be plugged back in each time your connection reset for some reason.


You can also keep the unit online, but only pass it the 802.1x frames. You'll be out pennies of electricity, of course.
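For reference, on a Linux box bridging the ONT to the WAN port, the kernel normally drops link-local frames like EAPOL; the bridge's `group_fwd_mask` knob can be told to pass them through. A sketch, assuming a bridge named `br0` already exists:

```shell
# EAPOL frames go to 01-80-C2-00-00-03; bit 3 (value 8) of
# group_fwd_mask tells the bridge to forward them instead of dropping.
enable_eapol_fwd() {
  echo 8 > "/sys/class/net/$1/bridge/group_fwd_mask"
}

# enable_eapol_fwd br0   # hypothetical bridge name
```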


Yeah, I saw some setups doing that. It seems like too much hackery involving a very core piece of network hardware, which is something I’d prefer to avoid. Fortunately it wasn’t my connection so I could live easier with deciding to keep the ONT.


I dunno, if the new ONTs are as braindead as the old Residential Gateways, it's probably worthwhile, but sure, it's a little hacky. I also wonder if you can get a higher data rate using the transceiver directly; GPON is 2.5G down/1.25G up IIRC, where the ONT I used was limited to 1G ethernet (although maybe you could do bonding, but obviously AT&T wouldn't set that up)


I get a symmetrical 1G on the GPON while I thought I had the best plan... maybe I need to pay even more?


And this thing doesn't have a JTAG?


Oh I have no idea. You'd need to do some digging. I picked the Intel card because it was decently priced and I knew it would be reasonably compatible with what I was doing. Sounds like maybe you want a quad SFP+ card?


PCIe 2.0 cards (such as ConnectX 3) are $30 used on ebay.

The modern versions (ie: ConnectX 6) are much more expensive of course, but the older hardware might do better with FreeBSD / Linux anyway. Of course, your mileage may vary so do some research before buying (or buy for $30 and hope for the best).

You'll also either want Optical Transceivers + Fiber, or Active Fiber (aka: Fiber that has transceivers soldered onto them already), or DAC (direct-attach copper, the cheapest but only works over small distances).

Optical Transceivers + Fiber is the most flexible of course, different transceivers are spec'd for different wires and different costs though, so it gets a bit complicated.

Active Fiber and DACs are probably easiest, but if you buy a 10m cable and then need 12m, you have to buy a whole new cable + transceiver combo (rather than just purchasing a bunch of different fiber lengths).

Length and cost are the big differences between Active Fiber and DAC, but otherwise they're very similar in use. You plug the modules into the SFP+ port, hope they're compatible, and let them rip.


ConnectX3 and Intel SFP cards are pretty reliable and easy enough to get working on Linux. On my Windows 10 Pro machine, all the cards I plugged in just worked immediately, no driver install required.

I'd start there and if you find you need other features, think about spending more.


My ConnectX-3 on a Windows 10/11 machine takes 15 seconds to reconnect on resume from sleep. It's not a serious issue, but a bit annoying. Has anyone had a similar issue?


ConnectX-3 are PCIe3 x4/x8 cards, even the SFP+ ones. The older connectx-2 cards are PCIe2.

If you want to go all-out fast and not need a switch you can use a QSFP+ <-> 4xSFP+ cable or splitter by getting a QSFP+ card on the server end.


The major decision to make is, what kind of connector do you want? Ethernet or SFP?

Ethernet is more costly; SFP is the cheaper but less familiar way. The lengths of the runs and how many clients you have change the decision making a bit, so keep that in mind.

I didn't mind cost so much but I wanted to stick with typical Ethernet. Mainly a comfort thing, I don't know the limits of SFP well.

I went with this switch: Mikrotik CRS312-4C+8XG

... and these cards: ASUS XG-C100C

I upgraded my home lab to 10GbE around a year ago mainly to speed up my SSD-based backups. All told I think it was about $800 to upgrade my storage cluster and a couple clients.


Ethernet is more costly?! Over what distance? And surely that's ignoring the 'endpoint hardware' (switches or whatever), or equivalently considering only gear high-end enough that it has SFP 'for free' anyway, whatever the distance?


Ethernet is quite a bit more costly for 10G. In our deployment we use SFP+ with 3m DACs which are downright cheap, plus long fiber runs which are more feasible and less expensive for connecting sub-sites.

We support 10G-BaseT as well for devices that are 10G-PoE-BaseT but otherwise standard here is SFP+. Gets even cheaper when you consider power consumption, heat dissipation, and multi-port cards (2xSFP+ NICs are extremely affordable and nice when you have segregated networks or VM hosts).

EDIT: 10G-BaseT also has ~2 orders of magnitude higher latency compared to SFP+ Fiber, and ~1 order of magnitude higher latency than SFP+ DAC. The numbers are small across the board, but it's relevant. 5-10W draw for 10GbE compared to 1-2W draw for SFP+ Fiber, for power consumption numbers.


You can get a lot of cheap, used SFP+ stuff on eBay and the like. For short range (up to ~3m) you can go with DACs, which are 5-10€ a piece; for longer ranges you can go to fiber, where you'll pay 20-30€ per cable. Ethernet cables are a bit cheaper, but you need high-quality ones (Cat6 shielded and up) and the transceivers are a lot more expensive, plus they get hotter and use more power. Unless you get a really great deal on Ethernet hardware or need backwards compatibility with 2.5/1 Gbit, SFP+ will be a lot cheaper.


In the case of 10G sure, over any distance. Servers use cheap and simple passive copper DACs in racks and switches use optical transceivers for uplinks (usually of a higher speed RJ45 doesn't support anyways). 10G RJ45 is more complicated to implement for both the NIC and switch and requires more power, meanwhile it's not mass produced because of the aforementioned DACs and optical uplinks and lack of consumer penetration.


I recently discovered /r/homelab and I found it extremely confusing. People are throwing thousands of dollars and thousands of watts at servers that could power a medium-sized business and then putting them in their basements and running local DNS servers, a NAS and maybe a mail server handling personal mail. Most of the compute power is spent reinstalling various virtualization technologies over and over again and repeatedly backing up a bunch of torrented movies and music without compression or dedup to flex their 30T NAS.

What am I missing?


Couldn't say for sure, I've noticed similar. I feel like I have to be overlooking something... even though I may appear to be in the same wasteful-spending boat.

Redundancy is an easy way to spend a ton of cash poorly, but if done well I recommend it for anyone!

The virtualization aspect for example, that's just a bullet-point for how I use KVM and some of the fancier gear to prove out certain implementations... then automate using them. Lately, SR-IOV.

For those like me, it's often fairly practical but the trivial stuff may be what gets shared


Note: SFP/SFP+ is just the transceiver format. LR/SR modules are fiber and -T modules are twisted-pair copper.


I'm running a bunch of old Mellanox ConnectX-3 cards, they've been plug and play in all of my Linux systems. Cheap too.


I have a server that I'm already running a bunch of VMs on, running OPNsense with an Intel 82599 passed through. I'd use SR-IOV, but the FreeBSD driver is buggy when using VLANs. There are some "cheap" ($250-400) switches for 10G now. HW routers are still spendy.


As was mentioned: used Mellanox cards off eBay, then hit fs.com for optics and cables.


I've always been confused by the situation with Realtek Linux drivers.

Does anyone know why they are not contributing to the kernel directly? (Just a guess: their drivers are from the same codebase as the Windows driver and use a bunch of abstraction layers that wouldn't be acceptable in Linux)


Realtek does contribute directly to the kernel, but I think it depends on which internal team is handling the device support. For their PCIe 802.11ac and 802.11ax wifi cards Realtek devs wrote and upstreamed rtw88[1] and rtw89[2] respectively.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...


Wasn't rtw88 supposed to abstract away the transport so it could be used with PCIe, USB, and SDIO? [0] The latter two never materialized (there is the out-of-tree rtw88-usb [1], but it's not very active and not maintained by Realtek).

[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

[1] https://github.com/ulli-kroll/rtw88-usb


I'm familiar with the Realtek RTL vendor driver codebase. It's a combination of two classics: a Realtek-specific hardware abstraction layer across a large number of their products (which won't fly in the Linux kernel), and abominable code quality (ditto).


Do they publish source code or binary drivers? If it's the latter, the article points at why they might not want their competitors to read their source code: the drivers matter a lot, and starting from theirs might well cut the development costs for a competing product in half.


Realtek drivers are GPL’d, but have a reputation (especially Wi-Fi dongles) of not following the coding standard of the Linux kernel and of low code quality, making them unacceptable for the upstream kernel. A driver often goes through a lengthy cleanup process by the community, first by existing as a “staging” driver in the kernel with incomplete features and unstable code, and later the code is gradually refactored before it’s finally official.


Source code and no BLOBs.


Another front for improvement is USB wireless support in Linux. Yes, I'm hijacking this to complain, but currently (almost 2022) there isn't a single 5 GHz-capable (802.11ac) USB dongle with in-tree drivers or support by Ubuntu, AFAIK. Only sketchy GitHub scripts exist. Sure, this is primarily the vendors' fault, and you cannot compromise security and the GPL for convenience, but come on.


I'm guessing the story is overall similar to how gigabit ethernet was ~ten years ago: Use Intel if you want working hardware and drivers that actually achieve line-rate throughput. Back then the RTL chips were a trashfire and couldn't do gigabit even on Windows. Nowadays the RTL chips can do that. And here we are again with bad / proprietary network drivers for RTL chips.


For 2.5GbE, this became very much not the case.

> Use Intel if you want working hardware

Sadly Intel totally dropped the ball on this.

i225-V (Foxville) NIC early steppings were awful. And no driver can fix that, it required Intel to produce a new stepping... countless motherboards are affected by that one...

Realtek NICs are fine at this point in time.


Yeah, Intel managed to implement the 2.5GbE standard incorrectly. If you connected older steppings of their NIC to a standards-compliant device, you got massive packet loss, with effective transfer speeds apparently as low as kilobytes/sec in some cases. The only workaround was to force the link down to gigabit. They took an age to release a stepping that fixed this (and I think there may have been other issues too), so a whole bunch of motherboards ended up shipping with 2.5Gb support that was effectively unusable.
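The forced-gigabit workaround looked something like this with ethtool; the interface name is a placeholder, and 0x020 is the standard bitmask for 1000baseT/Full:

```shell
# Restrict advertised modes to 1000baseT/Full so the link never
# negotiates the broken 2.5G mode on early i225-V steppings.
force_gigabit() {
  ethtool -s "$1" advertise 0x020
}

# force_gigabit eth0   # hypothetical interface name
```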


To be honest, multiple vendors have really screwed up 2.5/5/10G implementations—I tried cheaping out with FlyProFiber copper transceivers, and would get a solid 2.3 Gbps one way, with 400-800 Mbps the other way. Turns out there was tons of packet loss, but only in one direction.

They worked fine for 10 Gbps, but I had to switch to Mikrotik transceivers for 2.5G connections.


Not all Realtek NICs are fine at this point in time. For some models of Realtek NICs the Linux driver has to memcpy() data for every single packet in order to work around a hardware bug. It's entirely possible that the Linux driver has that kind of workaround enabled for newer hardware that doesn't need it anymore, but getting that kind of driver change merged tends to be a bit of an uphill battle without the hardware manufacturer being actively involved in maintenance of the mainline open source driver.


Interesting. I'm having a problem right now with a motherboard I just bought that has an Intel 2.5Gb NIC. Running TrueNAS connected to my gigabit switch I get a solid 110MB/s transfer, but connected to my multi-gig switch it's down to around 45MB/s. It's a brand new board, though.


Intel’s 2.5G is just an afterthought. Realtek pushes it because it’s more cost-effective than 10G, but 10G is where the actual party is at, and where you’ll find good hardware and software support to match... but at a price.


Isn't it the same for wifi? On nearly every laptop I've owned I've ended up swapping for an Intel wireless card eventually.


Yes, and even the generational changes can be quite worth it - I recently swapped a 2016 Intel 8265 wifi card with a newer AX210 (- which costs basically nothing) and even without utilizing the updated standards I got 50-80 % more bandwidth under exactly the same conditions.


The driver from Realtek is GPL2. Maybe the code quality is not good enough for the kernel.

https://www.realtek.com/en/component/zoo/category/network-in...


They are upstream nowadays.

Supported mainline by the r8169 driver.


Yeah, but according to the blog post there must be some difference: "supported" but not "fully supported". The mainline driver has 8k lines; Realtek's driver has 15k lines in a single .c file, and there are more files besides.
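An easy way to see which of the two you're actually running is ethtool's driver query (the interface name is a placeholder):

```shell
# Shows the bound driver: "r8169" for mainline, "r8125" if the
# out-of-tree Realtek vendor driver is loaded instead.
nic_driver() {
  ethtool -i "$1" | grep -E '^(driver|version)'
}

# nic_driver enp3s0   # hypothetical interface name
```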


Realtek was always the "budget" choice.


Ah, consumer networking, stuck at single-gigabit speeds. 1 GigE was introduced decades ago, in the last millennium... I wonder if this is a permanent plateau in human development, or if there will be some bump when we get the modern 100+G Ethernet that servers are currently using.

Meanwhile there are notable recent perf speedups in the modern ethernet too: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-n...


Can you do wired >1GBit/s with low power consumption?


I've typically seen most 10gig kit need a lot of cooling (either heatsinks or active fans), not something I'd be too happy about dealing with. Also, it's really only the homelab enthusiast that needs 10gig. The rest of us can potter along fine on 1/2.5gig, as we're not slinging ZFS snapshots about the place.

TL;DR: 1/2.5gig works for consumers' needs; 10gig is not price-competitive for our needs.


Bandwidth stopped growing for consumer stuff when everyone moved to slow and flaky wireless. If it was available, there would be apps to use it.


Bandwidth stopped growing because needs were met.

> If it was available, there would be apps to use it.

To an extent, since the inverse is also true: there is as of yet no market demand for such large amounts of bandwidth either.

Once popular apps/usecases with heavy bandwidth requirements see wide adoption (at first limited to specialty high bandwidth networks), we will start seeing growth in the consumer bandwidth space.


PCI-E cards with this chip cost about $15 today, and 2.5G Ethernet switches just over $100. Post-gigabit networking has become truly affordable.


> just over $100

This is pretty expensive considering that you can get a 10G switch instead for around $40 more.


This assumes you're not also paying the cost for a bunch of copper transceivers. One benefit of 2.5G is it will work full speed through existing Cat5e cabling, and RJ45 connections.

10G requires upgraded cabling, and lower cost 10G devices may use SFP+, but most consumer gear uses RJ45.


Upgraded cabling if you haven't gotten cat6 at some point in the last 15+ years.

And as far as I understand it 10G on cat5e will work most of the time as long as you're not going super far.


Believe it or not, the fact that 2.5G switches are cheaper than 10G switches at all is an improvement. There seems to be a weird lack of interest in shipping it as a consumer technology.


Why do Ethernet standards mostly come in 10x orders of magnitude, anyway?

10 -> 100 -> 1000 -> 10G -> 100G

I know there are some oddball 25G, 40G, and 2.5G out there too, but always wondered this.


Think about it this way: what's the minimum improvement you'll pay a whole heap of money for?

Each generation gets introduced with new switches, new NICs, and new cabling. Since 100Mb it has been possible to use link aggregation groups, where two or more cables between the same pair of NICs are bundled together into one higher-bandwidth link. At some point you need to decide that the transition is worth doing, and 10x is a pretty good number for that.

The recent oddballs are 50G, 40G, 25G and 2.5G. All of them come from the idea of taking a higher speed NIC and adding a little more hardware to get multiple PHY transceivers -- a 100G becomes 2x50 or 4x25, and a 10G becomes 4x2.5.

Oddly, 40G comes from 4x10G in the other direction, and is more expensive to produce than 50G equipment. It's not very popular.


40G is quite common, and probably the most cost-effective high-speed option on the eBay-old-enterprise-gear market, or at least it was for a long time until maybe just now.

I started to write that there are $30 single-fiber-pair CWDM pluggables for 40G, making it the fastest speed you can do cheaply at distances longer than realistic for DAC cables... but checking eBay I see that there are now 100G pluggables for $39, so maybe it's time to update the last of my 40G hosts to 100G.


Only 2.5G is an oddball. (I guess consumers are scared of fiber or too lazy to pull it)

40G QSFP is just 4 lanes of 10G and can usually be broken out into separate ports

Same for 100G QSFP28 (4 lanes of 25G SFP28)

You see this all the time with interfaces. Like with PCIe bifurcation. Or how 56G InfiniBand is 4 lanes of 14G, though you can't split it, etc.


Physical. Look at what is behind the PHY.

https://en.m.wikipedia.org/wiki/SerDes

https://en.m.wikipedia.org/wiki/XAUI

First 10G was 4x3.125G SerDes, so 4 traces to route on the board to the switch chip. There's some overhead in the signaling, so you get 10G of payload. If you plugged in a 1G module, it just used one lane. Moving to 40G, each lane's speed went up (to ~10.3G per lane). At that point there was room in the MAC/PHY to break each port out into 4x10G. That's 256 traces to route to the switch ASIC, and board routing is black magic: going PHYless (no separate physical PHY chip) required each set of 4 traces to be the same length down to the nm. The Arista 7050 was the first switch that shipped this. The first 100G was 10 lanes, but that's a lot of traces, so fewer ports. Then lanes got up to 28G, so you got a 100G port from 4 lanes again, or 4x25G. Paired with the MAC/PHY you could get 4x25G, 2x50G, etc., and so on as the SerDes speeds go up. This is a simplified write-up, but mostly correct.
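The arithmetic above is easy to sanity-check: payload rate = lanes x line rate x encoding efficiency (8b/10b carries 8 payload bits per 10 on the wire; 64b/66b carries 64 per 66). A quick sketch:

```shell
# payload_gbps LANES LINE_RATE_GBD EFFICIENCY
payload_gbps() {
  awk -v n="$1" -v r="$2" -v e="$3" 'BEGIN { printf "%.1f", n * r * e }'
}

payload_gbps 4 3.125 0.8; echo          # XAUI-era 10G: 10.0
payload_gbps 4 25.78125 0.969697; echo  # 4x25G with 64b/66b: 100.0
```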


https://en.wikipedia.org/wiki/IEEE_802.3

I suspect it's about us having 10 fingers and using arabic numerals.


For $150 you will get a switch with multiple GbE ports and one or two 10G uplinks. It's not quite the same thing as having 2.5G on all of the common ports.


Not a copper one though.


What, where?


QNAP for the 2.5G switches, generic/noname for the $15 8125 PCI-E cards. Branded such - mostly using the exact same reference schematic - cost a few dollars more.


Thanks.


The RTL8156 is one of the few NCM USB NICs that has gone mainstream.

The NCM protocol is very lightweight, but has support for DMA and some offloading.


TL;DR: Vendor's driver performs far better than what's upstream.

This is what happens when the kernel community is confrontational and generally a pain in the neck to deal with. After enough friction caused by the netdev folks, the vendor just gives up on upstreaming and self-publishes their driver.


FreeBSD dev dishing out Linux processes in 2021. Some things never change I guess ;)

But honestly, Drew, all serious vendors of networking hardware have fully functional upstream drivers and good relationships with Jakub and Dave. However, one can't just throw a mess of code at netdev and expect it to be accepted as-is because they are a big vendor or something. Which I believe is good for users!


I'm still peeved at what happened to my company around ~2006 or so which resulted in the exact same situation in the article.

In my case: Company A submits a driver with a feature in their driver that basically triples performance for our class of device. It was implemented in a horrific, unreadable way. Companies B and C (mine) each independently re-implement that feature in a less crazy, far more readable and maintainable way. Company B gets their driver accepted with this feature (I think it was part of the initial submission). We get ours NACKed and are told to implement it for the entire kernel. This would be fine, I suppose, if companies A and B were also required to remove the feature and help implement it, but they weren't.

This might not have been such a big deal if we had been a big company. However, we had 1 dev (me) doing drivers and support for Linux, Solaris, OSX, FreeBSD, ESX, etc. So in the short term, management directed that we have customers ignore the driver in the kernel and use the one from our website so we didn't have to stop development on other OSes to implement this feature for our competition. This feature was eventually implemented in a generic way (with my help), but for several kernel releases the driver on our website outperformed the in-kernel driver by a factor of 3 or more.

FWIW, FreeBSD had no problem with the feature, and another driver author eventually ported it out of my driver into a general layer where it still exists to this day, and is used by almost every driver in the OS.


Sorry this happened to you. I'm not a maintainer and wasn't around 15 years ago, but I think these days it would be much harder to sneak a feature into an unreadable piece of code with the initial driver submission. The whole driver would probably be NACKed. Don't know if that makes netdev more or less confrontational from your point of view, though :)


I haven't looked at what kernel people are doing in a while, but Realtek doesn't seem to have changed their policy here in the last 25 years(!), so it doesn't seem quite fair to blame the kernel community in this instance. I remember fighting with Realtek's "thrown over the fence" open source drivers around the turn of the century. Their self-produced code quality doesn't seem to have improved in that time span at all, either.


It's a shame you're getting down voted for this.



