
The data rates are impressive but that doesn't really excuse anything. The data rate is more or less isolated from everything else in PCIe. The protocols that are glitching out don't care if it's 40Gbps or 480Mbps.



The faster the speed, the more critical signal integrity becomes; and I suspect at least some of the flakiness is due to how tight the tolerances are at Gbps speeds.


It's packet-based, and you can delay a packet for a long time without causing any problems. So at higher levels there shouldn't be any tight tolerances.

Signal integrity and timings are really important at the physical layer, but I don't think that's what's breaking.


PCIe is packet-based and will retry, but drivers for PCIe devices are typically not written to expect that, e.g., accesses to their MMIO space can take arbitrarily long to complete, and the software above and around the driver (including the OS) can also be affected by such delays. Likewise, a device that's processing a continuous stream of data is going to overrun or underrun if its interrupts or DMA requests get delayed by retries caused by physical layer errors.
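To make the overrun point concrete, here's a toy model, not real PCIe: a device FIFO fills at a fixed rate and is drained by DMA, except during a stall window standing in for link-level retries. The function name, the parameters, and every number are made up for illustration.

```python
def simulate_fifo(fifo_depth, fill_per_tick, drain_per_tick,
                  stall_ticks, total_ticks, stall_start=10):
    """Return the first tick at which the FIFO overruns, or None.

    Hypothetical model: the device produces `fill_per_tick` units each
    tick; DMA removes up to `drain_per_tick` units unless the link is
    stalled (a stand-in for retries after physical layer errors).
    """
    level = 0
    for tick in range(total_ticks):
        level += fill_per_tick
        stalled = stall_start <= tick < stall_start + stall_ticks
        if not stalled:
            level = max(0, level - drain_per_tick)
        if level > fifo_depth:
            return tick  # data would be dropped here
    return None


# A short stall is absorbed by the buffer; a longer one is not.
print(simulate_fifo(64, 4, 8, stall_ticks=10, total_ticks=100))
print(simulate_fifo(64, 4, 8, stall_ticks=20, total_ticks=100))
```

In this sketch the link has twice the bandwidth the device needs, yet a long enough stall still overruns the buffer. What matters is the worst-case delay relative to buffer depth, not the average throughput.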


But as far as I know an x1 link works fine with real devices, and very much should work fine with real devices. And at the same time, to slow down a normal connection to less than an x1 link you'd need something like a very broken cable that just barely doesn't lose everything, and also doesn't get rejected by the system. So I still doubt physical-layer errors are the cause of weird flakiness except in very rare cases.


That's true, but I've also seen PCIe devices survive dropping to kb/s bandwidth before failing over like you're saying.



