random rant not really related to op but spurred by it:
I know it will probably never happen, but it would be 'truly epic' if hard PCIe endpoints became as ubiquitous as, say, SPI peripheral blocks within MCUs/MPUs and the general silicon jelly-bean world.

Just a very high-speed, very low-latency, very widespread, very standard, very boring communication channel. Of course an ATmega328P can't keep up with the full bandwidth, but even something like an RP2040 could take serious advantage of such a fast port.

It would mean that fast data interchange would be available to "anyone", and most importantly, could be interfaced with a bog-standard, run-of-the-mill desktop PC.

It feels like, right now, if you want to do PCIe stuff you have to either go FPGA, and those seem quite far behind: Artix parts are PCIe Gen 2, and I think Gowin announced some PCIe Gen 4 products but I'm not sure they ever became real. FPGAs are of course expensive and drag in a bunch of other considerations that you may not want or that may be detrimental to your project. Or you go for a limited selection of (often closed-source) MCUs/MPUs. Thankfully the Raspberry Pi isn't that closed, but if the RPi doesn't have what you want then you have to go for, say, a Rockchip or whatever Asian SoC that often has a very ambivalent relationship with datasheet availability. And even then they might not support operating the PCIe controller in endpoint/device mode (I might be misremembering).

Whatever, maybe I'm talking rubbish. I remember wanting to make some kind of USB3 gadget and being incredibly frustrated that the only option among the ocean of MCUs was the Cypress EZ-USB kit bridged over a parallel bus to an FPGA. So lame! I think Xilinx offered a USB3 peripheral IP block, but it was behind an expensive license. Since then some USB3-capable RISC-V MCUs have appeared, thankfully, but I'm still surprised that nobody seems to really care about fast, low-latency data transfer. I guess it's just not very important for "real" applications out in the wild.
I'm hoping UCIe starts to fix this. It's touted as something that can combine low power with high speed. As it stands, PCIe is relatively power-hungry for anything that doesn't need the speed; you need to be moving well past normal MCU interfaces (SPI, UART, etc.) before the power trade-off pays off.
As it is, PCIe would add sizable die area, raise power consumption by an order of magnitude, and likely raise part cost dramatically. Couple this with cheap board designs that often can't even get USB 1.0-2.0 right (recall the number of boards that get the pull-up resistors or power supply scheme wrong), and it sounds like a nightmare to manage.
I think gigabit Ethernet is where I'd prefer the effort spent. Most micros top out at 100 Mbit; NXP's i.MX RT line has gigabit, but that's the only one I know of.
PCIe is extremely scalable, from Gen1 to Gen5. There is no reason that you couldn't extend one of the GenX specs down to Gen0 speeds.
We are already drowning in transistors. I'm not convinced by your area argument, especially in the face of shrinking nodes; we have more area than we know what to do with.
Yeah, uh, if you think microcontrollers are on 3 nm... no. Most microcontrollers are on 28-40 nm if not larger. Smaller gate lengths can sometimes increase logic density and sometimes fmax, but they dramatically increase static power, hurt yields, and by extension raise cost (along with the other factors outlined by your IEEE source).
This rough look shows what I believe are some PCIe block die areas.
This is roughly half the size of some of the smallest micros I've seen, and those are most likely not on the same lithographies. When you add in the additional RAM/ROM needed to run a PCIe stack (maybe divide by 4 for one lane instead of four, to be generous), you're still likely going to double the die area used. On top of that, something I didn't mention earlier: PCIe transceivers are going to need more power-supply circuitry. ST has been doing a good job of integrating things for stuff like DSI, but those app notes show more complexity than the aforementioned ATmega328P, which is so simple it can sometimes be run without decoupling caps on a breadboard.
If you look at die shots for micros, a lot of the time you'll see most of the area is consumed by large arrays of SRAM or flash, and SRAM is known not to scale well to smaller lithographies. So I don't expect to see this changing, really. To be PCIe compliant you'll likely need the ability to store a full TLP; on a 328P, that could double the SRAM needed, if not more. So the chip would get, I don't know, ballpark 3x bigger with all the stuff included? I just don't see manufacturers wanting to do this, which is likely why they haven't. There are very few Cortex-M chips with PCIe that come to mind, and a few more Cortex-Rs. I figure if it were really the silver bullet, somebody would've done it by this point.
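To put rough numbers on the TLP-storage point, here's my own back-of-envelope sketch in C. The figures are the usual PCIe ones (3-4 DW header, 128 byte to 4 KB Max_Payload_Size, optional 4-byte ECRC), not taken from any particular device's datasheet:

    /* Back-of-envelope illustration of the "store a full TLP" point.
     * Assumed values: a 4 DW (16-byte) TLP header, a commonly negotiated
     * 256-byte Max_Payload_Size, and an optional 4-byte ECRC. */
    #include <stdint.h>
    #include <stdio.h>

    #define TLP_HEADER_BYTES  16u
    #define PAYLOAD_BYTES     256u
    #define ECRC_BYTES        4u

    struct tlp_buffer {
        uint8_t header[TLP_HEADER_BYTES];
        uint8_t payload[PAYLOAD_BYTES];
        uint8_t ecrc[ECRC_BYTES];
    };

    int main(void)
    {
        /* One buffered TLP at a modest 256-byte payload is ~276 bytes.
         * An ATmega328P has 2048 bytes of SRAM total, so a couple of
         * these buffers (say one per direction) already eat roughly a
         * quarter of it, and a spec-maximum 4 KB payload wouldn't fit
         * in the chip's SRAM at all. */
        printf("one TLP buffer: %zu bytes\n", sizeof(struct tlp_buffer));
        printf("fraction of ATmega328P SRAM: %.0f%%\n",
               100.0 * (double)sizeof(struct tlp_buffer) / 2048.0);
        return 0;
    }

Even before counting the descriptor rings and flow-control credit buffers a real endpoint would want, the memory footprint alone supports the "roughly 3x bigger" guess.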
That's on me for confusing you with the original commenter who suggested adding PCIe onto that chip.
Also, Efinix hasn't proven itself yet. I've been following their Titanium line for years. Every year they say they're getting transceivers, and every part that comes out that was supposed to have them (first the Ti180, now the 375) is missing transceivers. Sidetracking here, but I honestly don't know what's wrong with them; they've taped out a handful of these parts since I started using them, yet they evidently cannot get high-speed serial working.
I may be way off here, but doesn't USB do what you want? Like if I have a Teensy and I want to plug a USB device into it, isn't that relatively simple? And fast?
These are on-board communication channels. I guess you could run USB traces, but you would end up using I2C or SPI (via a USB chip) for devices that don't have built-in USB, which is many of them.
Hadn’t expected to see this here! I helped put this standard together while working at Opal Kelly.
I think we had some nifty ideas in there (using an analog voltage for peripheral addressing was one of my favorites). It has slowly seen more adoption as time passes. Would always love to see more!
With many of these types of standards (FMC and Syzygy) it becomes helpful to include some form of memory on the peripheral cards. With Syzygy it is required, in order to store some helpful data about the card (product name, manufacturer) along with supported voltage ranges for an adjustable voltage rail from the carrier board. FMC used standard off-the-shelf I2C EEPROMs for this task, with some pins on the FMC connector used to indicate the address of the particular slot the peripheral was attached to on the motherboard.

We wanted to lose as few pins as possible to address lines on Syzygy, so we opted instead to use a microcontroller with an ADC as the peripheral memory. A single pin was used to indicate the peripheral address: it had a fixed resistor to a power rail on the peripheral, and the carrier had a port-specific resistor to ground on the same pin. By reading the voltage across this divider we defined 16 addresses with just a single pin. IIRC the microcontrollers we selected cost almost the same as the I2C flash they replaced!
It was a pretty neat hack.
Placing small microcontrollers on the peripherals also opened up other functions like using them for basic power supply sequencing.
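For anyone curious what the decode side of that single-pin scheme might look like, here's a rough sketch in C. It's my own illustration, not the actual Syzygy firmware; the 10k pull-up, the carrier resistor values, and the 12-bit ADC are all assumptions:

    /* Sketch of single-pin analog peripheral addressing.
     * Divider: V_pin = V_rail * R_carrier / (R_peripheral + R_carrier).
     * The peripheral carries a fixed pull-up to the rail, each carrier
     * port populates a different pull-down, and the peripheral MCU only
     * ever sees the resulting ADC code. */
    #include <stdint.h>
    #include <stdio.h>

    #define ADC_FULL_SCALE 4096u   /* assumed 12-bit ADC */
    #define NUM_ADDRESSES  16u

    static uint16_t expected_adc_code(double r_peripheral, double r_carrier)
    {
        double v_frac = r_carrier / (r_peripheral + r_carrier); /* fraction of rail */
        return (uint16_t)(v_frac * (ADC_FULL_SCALE - 1u));
    }

    static uint8_t decode_port_address(uint16_t adc_code)
    {
        /* Quantize into 16 evenly spaced bins; a real design picks resistor
         * values that leave margin for tolerance and ADC noise. */
        uint8_t addr = (uint8_t)((uint32_t)adc_code * NUM_ADDRESSES / ADC_FULL_SCALE);
        return addr < NUM_ADDRESSES ? addr : (uint8_t)(NUM_ADDRESSES - 1u);
    }

    int main(void)
    {
        /* Hypothetical 10k pull-up on the peripheral and a few carrier
         * pull-down values, just to show distinct codes from one pin. */
        const double r_carrier[] = { 1e3, 3.3e3, 10e3, 33e3 };
        for (unsigned i = 0; i < sizeof r_carrier / sizeof r_carrier[0]; i++) {
            uint16_t code = expected_adc_code(10e3, r_carrier[i]);
            printf("R_carrier=%5.0f ohm -> ADC %4u -> address %u\n",
                   r_carrier[i], (unsigned)code,
                   (unsigned)decode_port_address(code));
        }
        return 0;
    }

The nice part of the trick is that the address resolution is limited only by resistor tolerance and ADC noise, not by pin count.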
Oh wow, I forgot about this. Syzygy was supposed to be like PMOD but with higher-speed capabilities. I actually put it into a design but never got around to fabricating it.
(Warning, hyperlinks are definitely gonna be broken from that post). I wonder how popular the interface has been over the last few years - I haven't really paid attention.
I remember watching Andy Barry's MIT demo of the pushbroom stereo approach on a flying-wing drone; it used stereo 120 FPS cameras to dodge trees and branches. It's almost a decade later now, and such hardware is still not cheaply available for Pis/SBCs.
Syzygy, IMO, largely failed to be adopted because microcontrollers (and app processors) haven't improved all that much. We are barely seeing a lane or two of PCIe 2 appear.

I am just aghast at how slowly so much of the computing world moves. The Cortex-A53 is twelve years old and still rampantly abundant, still the go-to option, or else it's the barely improved A55. We still see lots of A7s too for lower power, only barely beginning to be replaced by the A32. The M7 has been a significant boost, but the M3/M4 endure. ADCs/DACs have gotten a bit better on microcontrollers, but we are so far from bandwidth and connectivity being competed on, being competitive.
Recalibrate your weights toward more recent developments (post '21) and you will see a clearer picture, I think. The M7 is quite popular now for TinyML applications (600 MHz for an MCU!!). It's somewhere between an RPi 2 and 3 based on CoreMark score and what I've seen in terms of CV capabilities.

Things are moving very fast, IMO. MCUs can do 1080p at 30 fps now, and not just from faster hardware but with some impressive additions to instruction sets!
Edit: After re-reading I see that you're talking about adoption rates, in which case, fair points.
The real reason seems to be that FMC has basically won this socket, not that MCUs/MPUs aren't powerful enough. FMC is bigger, but it's also a lot more flexible.
Two primary reasons come to mind:

1. PCIe is mostly transceivers. Syzygy can also be low-speed, single-ended, etc.

2. PCIe's card-edge connectors can add manufacturing cost (gold plating, chamfered edges, board thickness tricks, etc.) and are also relatively large.
Your question was why to use this over PCIe. Maybe I'm misunderstanding, but since Syzygy is a connector standard, your question must have been about the connector. I think colloquially the "PCIe connector" refers to the card-edge finger-and-slot standard.
As for why to use single-ended, low-speed I/O, I guess it's about perspective. For some, an FPGA is something that sits on a PCIe bus or a network. For a lot of people, it's a lot more than that: something that talks to countless chips that may need SPI, I2C, UART, or something custom.
For example, think about all those single-wire LED strips and how easy it would be to drive them in HDL compared to the other approaches. Being able to work easily with bits rather than bytes has huge advantages for custom interfaces. Another example: one time a group I was working with wanted to talk to battery management modules. Each one had a UART and they had to be daisy-chained together, complicating things a lot. An FPGA with more than a few UARTs is trivial and could've talked to each one independently. Even the highest-end micros have maybe at most 20; you could fit 16 on one of these connectors. The UART block from Xilinx is on the order of 100 flops, so you could have literally hundreds of UARTs. Not that I have seen a need for that, but who knows, maybe someone does.

Getting rambly, but another option is test equipment. Let's say you have a device with a low-speed interface and you want to test a lot of them at a time. You could use a bunch of multiplexers and time-division control logic, or slap down an FPGA and talk to them all at once.
I think “low speed” is quite a relative term here. PCIe SerDes lanes are for very high data rate communication (>1 Gbps per lane); that's the realm of the Syzygy XCVR standard.

The lower-speed Syzygy standard, while not operating at those rates, is capable of much higher rates than a typical microcontroller. There are many peripherals with I/O requirements beyond a simple LED or SPI device but below that of a PCIe or other high-rate transceiver, such as:
- moderate to high end ADCs and DACs (LVDS and parallel)
- image sensors (LVDS, parallel, and HiSPI)
- various networking PHYs
The lower-end Syzygy connector has pinouts to support LVDS differential pairs, which can easily achieve data rates of hundreds of Mbps.
Piggybacking on this: image sensors using CSI are quite common. I don't know if anyone has this application, but theoretically if you wanted more than a few video streams (I think even higher-end processors cap out at 4), in come... FPGAs. Maybe the newer Qualcomm XR chipsets can deal with that, but an FPGA seems more attainable.
Qualcomm has supported CSI virtual channels for ages - you can get a little 4:1 bridge IC to renumber streams. IIRC, the 855 had 4 CSI IFEs, and would happily handle a 16x configuration if you bought the wide memory bus.