
How do they keep pulling this off?



I have the same question! They double the performance over the same physical interface generation after generation.

Why haven't we seen the need for a PCI-X or VLB-style PCIe interface expansion?


They explain on the website:

>To achieve its impressive data transfer rates, PCIe 7.0 doubles the bus frequency at the physical layer compared to PCIe 5.0 and 6.0. Otherwise, the standard retains pulse amplitude modulation with four level signaling (PAM4), 1b/1b FLIT mode encoding, and the forward error correction (FEC) technologies that are already used for PCIe 6.0. In addition, PCI-SIG says that the PCIe 7.0 specification also focuses on enhanced channel parameters and reach as well as improved power efficiency.

So it sounds like they doubled the frequency and kept the encoding the same. PCIe 6.0 tops out at 256 GB/s (x16, both directions combined), and 2 x 256 = 512.
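
Back-of-the-envelope, in Python (x16 link; overheads from encoding/FEC/headers ignored, so treat the figures as approximate):

    # Per-lane transfer rate doubles each generation (GT/s); x16 link,
    # 8 bits per byte, and "x2" for both directions combined.
    rates_gts = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0, 6: 64.0, 7: 128.0}
    for gen, gt in rates_gts.items():
        one_way = gt * 16 / 8  # GB/s, one direction
        print(f"PCIe {gen}.0: ~{one_way:.0f} GB/s per direction, "
              f"~{2 * one_way:.0f} GB/s bidirectional")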

In any case, it'll be a long time before the standard is finished, and far longer before any real hardware is around that actually uses PCIe 7.


This answers the question in the literal sense, but I'm nonetheless surprised (as a relative outsider to the world of high-speed signalling).

"just double the frequency" isn't something we're used to seeing elsewhere these days (e.g. CPU clock speeds for the last couple of decades). What are the fundamental technological advances that allow them to do so? Or in other words, what stopped from from achieving this in the previous generation?


I think a quick 2-minute read on the changes across generations (a gen1 -> gen4 overview from 2016, for example) will make it a bit clearer [0].

Things like packet encoding, etc. Then take a quick look at the signalling change from NRZ to PAM4 in later generations.

Gen1 -> Gen5 used NRZ; PCIe 6.0 switched to PAM4.

[0] Understanding Bandwidth: Back to Basics, Richard Solomon, 2016: https://www.synopsys.com/blogs/chip-design/pcie-gen1-speed-b...


I don't think that makes the answer to the question clearer at all. The slight differences in encoding are interesting but they don't answer the big question:

They made a significant signalling change once, with 6. How did they manage to take the baud rate from 5 to 8 to 16 to 32 GBd?


To go a bit deeper, while still keeping it at a very high level, here's what changed between generations:

PCIe 1.0 & PCIe 2.0:

Encoding: 8b/10b

PCIe 2.0 -> PCIe 3.0 transition:

Encoding changed from 8b/10b to 128b/130b, reducing encoding overhead from 20% to about 1.54%. There were also changes in the actual PCB materials to allow for higher frequencies, like moving away from standard FR-4 to lower-loss laminates [2].
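
The overhead numbers fall out directly (quick Python check):

    # Encoding overhead: non-payload bits as a fraction of what's on the wire.
    def overhead(payload_bits, line_bits):
        return (line_bits - payload_bits) / line_bits

    print(f"8b/10b:    {overhead(8, 10):.2%}")     # 20.00%
    print(f"128b/130b: {overhead(128, 130):.2%}")  # 1.54%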

PCIe 3.0, PCIe 4.0, PCIe 5.0:

Encoding: 128b/130b

There is plenty to dive deep on, things like:

- PCB Material for high-frequency signals (FR4 vs others?)

- Signal integrity

- Link Equalization

- Link Negotiation

Then decide which layer of PCIe to look at:

- Physical

- Data Link

- Transaction

A good place to read more is the PCI-SIG FAQ section for each generation's spec, which explains how they managed to change the baud rate as you mentioned.

PCI-SIG is the community responsible for developing and maintaining the standardized approach to peripheral component I/O data transfers.

PCIe 1.0 : https://pcisig.com/faq?field_category_value%5B%5D=pci_expres...

PCIe 2.0 : https://pcisig.com/faq?field_category_value%5B%5D=pci_expres...

PCIe 3.0 : https://pcisig.com/faq?field_category_value%5B%5D=pci_expres...

PCIe 4.0 : https://pcisig.com/faq?field_category_value%5B%5D=pci_expres...

PCIe 5.0 : https://pcisig.com/faq?field_category_value%5B%5D=pci_expres...

PCIe 6.0 : https://pcisig.com/faq?field_category_value%5B%5D=pci_expres...

PCIe 7.0 : https://pcisig.com/faq?field_category_value%5B%5D=pci_expres...

[0] Optimizing PCIe High-Speed Signal Transmission — Dynamic Link Equalization https://www.graniteriverlabs.com/en-us/technical-blog/pcie-d...

[1] PCIe Link Training Overview, Texas Instruments

[2] PCIe Layout and Signal Routing https://electronics.stackexchange.com/questions/327902/pcie-...


PCIe doesn't have to do anything as complex as a general-purpose CPU. Increasing the frequency is a lot easier when you don't need to worry about things like heat, pipelining, caching, branch prediction, multithreading, etc. It's just encoding data and sending it back and forth. We've gotten very, very good at that.

It's not that it was impossible before now - it's more that it wasn't in demand. With the proliferation of SSDs transferring data over PCIe, it's become much more important - so the extra cost of better signaling hardware is worth it.

Not to dismiss it completely, it's still a hard problem. But it's far easier than doubling the frequency of a CPU.


Why would pipelining, caching, branch prediction make increasing the frequency difficult? Why would heat be less of a problem for a pcie controller than for a cpu?


The short, stupid answer is: transistor count and size.

Explaining it in detail requires more background in electronics, but that's ultimately what it boils down to.

High-end analog front ends can reach the three-digit GHz (non-silicon processes admittedly, but still).


> Explaining it in detail

Please do


I see you don't have any responses, so I'll give it a shot. I don't remember the actual math from my coursework, and have no experience with real-world custom ASIC design (let alone anything at all in the "three-digit GHz" range), but the major variables look something like this (in no particular order):

1) transistor size (smaller = faster, but process is more expensive)

2) gate dielectric thickness (smaller = faster, but easier to damage the gate; there's a corollary of this with electric fields across various PCB layer counts, microstrips, etc. that I'm nowhere near prepared to articulate)

3) logic voltage swing (smaller = faster, but less noise immunity)

4) number of logic/buffer gates traversed on the critical path (fewer = faster, but requires wider internal buses or simpler logic)

5) output drive strength (higher = faster, but usually larger, less efficient, and more prone to generating EMI)

6) fanout (how many inputs must be driven by a single output) on the critical path (lower = faster)

Most of these have significant ties to the manufacturing process and gate library, and so aren't necessarily different between a link like PCIe and a CPU or GPU. Some things can be tweaked for I/O pads, but broadly speaking a lot of these vary together across a chip. The biggest exceptions are points #4 and #6. Doing general-purpose math, having flexible program control flow constructs, and being able to optimize legacy instruction sequences on-the-fly (but I repeat myself) unavoidably requires somewhat complex logic with state that needs to be observed in multiple places. Modern processors mitigate this with pipelining, which splits the processing into smaller stages separated by registers that hold on to the intermediate state (pipeline registers). This increases the maximum frequency of the circuit at the cost of requiring multiple clock cycles for an operation to proceed from initiation to completion (but allowing multiple operations to be "in flight" at once).
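
Roughly, f_max = 1 / (t_clk-to-q + t_logic + t_setup), and pipelining divides t_logic across stages while the per-register overhead stays fixed. A toy sketch with made-up delay numbers (not from any real process), just to show the shape of the tradeoff:

    # Made-up delays in ns -- illustrative only.
    t_clk_q, t_setup = 0.05, 0.05  # per-register overhead
    t_logic = 2.0                  # combinational critical path
    for stages in (1, 2, 4, 8):
        period = t_clk_q + t_logic / stages + t_setup  # ns per cycle
        print(f"{stages} stage(s): f_max ~ {1 / period:.2f} GHz")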

That being said, what's the simplest possible example of a pipeline stage? Conceptually, it's just a pair of 1-bit registers with no logic between the output of one register and the input of the next. When the clock ticks, a bit is moved from one register to the next. Chain a bunch of these stages together, and you have something called a shift register. Add some extra wires to read or write the stages in parallel, and a shift register lets you convert between serial and parallel connections.

The big advantage that PCIe (and SATA/SAS, HDMI, DisplayPort, etc.) has over CPUs is that the actual part hooked up to the pairs that needs to run at the link rate is "just" a pair of big shift registers and some analog voodoo to get synchronization between two ends that are probably not running from the same physical oscillator (aka a SERDES block). In some sense it's the absolute optimal case of the "make my CPU go faster" design strategy. Actually designing one of these to reliably run at multi-gigabit rates is a considerable task, but any given foundry will generally have a validated block that can be licensed and pasted into your chip for a given process node.
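
As a purely functional toy model (Python standing in for hardware, with all the analog and clock-recovery hard parts hand-waved away):

    # Serializer: shift a parallel word out one bit per "clock tick".
    def serialize(word, width=8):
        return [(word >> i) & 1 for i in range(width)]  # LSB-first

    # Deserializer: shift the bits back in to rebuild the parallel word.
    def deserialize(bits):
        word = 0
        for i, bit in enumerate(bits):
            word |= bit << i
        return word

    lane = serialize(0xA5)            # this is what toggles at the link rate
    assert deserialize(lane) == 0xA5  # the wide side runs 8x slower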

Does that make sense?


It makes sense why links can reach higher speeds than CPU cores, but it's not enough explanation for how symbol frequencies got 25x faster while CPU frequencies got 2x faster.


The simple answer to this is that signaling rates were far behind the Pareto frontier of where they could be, and CPU clock rates are pretty much there. CPU clocks are also heat-limited far more than I/O data rates. CPUs are big and burn a lot of power when you clock them faster, while I/O circuits are comparatively small.

Transmitting and receiving high-speed data is actually mostly an analog circuits problem, and the circuits involved are very different from those involved in doubling CPU speed.


The rest, sure, but PCIe does still have to worry about heat (and energy consumption).


A lot of energy is used (and heat generated) at high frequencies. For something like PCIe, you don't need to double the frequency of the whole circuit to double the frequency of the link: you can double the clock of just the front end/back end and then double the parallelization of the rest of the circuit. Most of the circuit can still run at a more energy-efficient frequency. It was potentially possible earlier, but the circuit size made it cost prohibitive.
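
The arithmetic, with toy numbers (made up, just to illustrate the idea):

    # Keep the core clock fixed and widen the internal datapath; only the
    # serializer has to run at the full link rate.
    core_clock_ghz = 1.0
    for width_bits in (32, 64):
        link_rate_gbps = core_clock_ghz * width_bits
        print(f"{width_bits}-bit datapath @ {core_clock_ghz} GHz core clock "
              f"-> {link_rate_gbps:.0f} Gb/s on the lane")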


PAM4 allows the frequencies involved to stay close to the same, at the expense of higher required precision in other parts of the transmit/receive chains, so they didn’t “just double the frequency”


But they literally did. FTA:

>> To achieve its impressive data transfer rates, PCIe 7.0 doubles the bus frequency at the physical layer compared to PCIe 5.0 and 6.0. Otherwise, the standard retains pulse amplitude modulation with four level signaling (PAM4), 1b/1b FLIT mode encoding, and the forward error correction (FEC) technologies that are already used for PCIe 6.0.

Nothing else changed, they didn't move to a different encoding scheme. PCIe 6.0 already uses PAM4. Unless they moved to a higher density PAM scheme (which they didn't), the only way to increase bandwidth is to increase speed.
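
The numbers, for what it's worth (GT/s = data transfers per second, GBd = symbols on the wire; quick sketch):

    # PAM4 carries 2 bits per symbol, NRZ carries 1, so the symbol (baud)
    # rate on the wire is data rate / bits-per-symbol.
    gens = [
        ("PCIe 5.0 (NRZ)",  32,  1),
        ("PCIe 6.0 (PAM4)", 64,  2),   # double the data rate, same baud
        ("PCIe 7.0 (PAM4)", 128, 2),   # this time the baud rate doubles
    ]
    for name, gts, bits_per_symbol in gens:
        print(f"{name}: {gts} GT/s -> {gts // bits_per_symbol} GBd")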


That was gen 6 and only gen 6. Every other generation doubled the frequency.


Some EEs I know speculate that PCIe 7.0 will require cables, such as the next generation of these [1].

That is, they reckon long traces on a motherboard just won't cut it for the strict tolerances needed to make PCIe 7.0 work.

[1]: https://www.amphenol-cs.com/connect/news/amphenol-released-p...


I'm thinking the huge recent progress in both ADC tech and compute power near the rx/tx source is getting widespread adoption, Ethernet included. The latest raw lane speed being industrialized there is 224 Gbps (200 Gbps useful), and you bundle lanes together (e.g. quad- or octo-lane QSFP/QSFP-DD modules - not sure OSFP-224 is already available...) to get to 800G or 1.6 Tbps, if you have the switch or controller for it. To my knowledge there's no Ethernet controller at that speed yet - NVIDIA ConnectX-7 stops at QSFP-112 - but that's mostly because of PCIe?

The future NVIDIA B100 might be PCIe 6.0, but hopefully it will support 7.0, and maybe NVIDIA (or someone) will get a NIC working at those speeds by then...


>In any case, it'll be a long time before the standard is finished, and far longer before any real hardware is around that actually uses PCIe 7.

The standard is expected to be finished in 2025, and hardware/IP for PCIe 7 is already in the works, since it resembles PCIe 6 so closely. Schedule-wise, it may have the shortest lead time of any recent PCIe generation from finalized (1.0) standard to available IP. The HPC industry is really pushing for this ASAP.


> the same physical interface

It's not really the same physical interface. The connector is the same, but quality requirements for the traces have gotten much more strict over time.

> Why haven't we seen the need for a PCI-X or VLB-style PCIe interface expansion?

x32 PCIe does exist; it's just rarely used.


My impression is they use the standards process as a kind of objective-setting function to ensure the industry continues to move forward. They seem to figure out what will be just possible in a few years with foreseeable commercialization of known innovations, and write it down. It seems to have worked for > 20 years.


Yeah, I always see these announcements from the standards body, "we've doubled PCIe bandwidth again", but I never see any articles actually digging into the engineering challenges involved, and what innovations have been made to enable such progress.



