That's freaking awesome. For those unaware, the RP2040 microcontroller has several Programmable IO (PIO) cores, which are essentially standalone state machines that move data directly between memory and the GPIO pins. Compared with bitbanging the pins from the CPU, they let you implement much more timing-sensitive protocols while reducing CPU usage at the same time. There are demos of people implementing VGA out, digital audio, all sorts of interesting protocols.
I love the PIOs, it's such a great idea. I've only used one to emulate a 4021 shift register being clocked at 100 kHz, which is light work compared to what they're capable of, but still something an Arduino struggles to keep up with.
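For anyone who hasn't met the 4021: it's an 8-bit parallel-in/serial-out shift register, and at 100 kHz the emulator has 10 µs per bit to get the next value onto the output. Here's a rough behavioural sketch in plain Python of what has to be emulated. This is my own illustration, not the PIO program from the comment above; the class and function names are made up, and it ignores the chip's extra Q6/Q7 taps.

```python
class ShiftRegister4021:
    """Behavioural model of an 8-stage parallel-in/serial-out register."""

    def __init__(self):
        # stages[0] is the first stage, stages[7] drives the serial output (Q8).
        self.stages = [0] * 8

    def load(self, parallel):
        """Parallel/serial control high: latch all eight inputs P1..P8."""
        if len(parallel) != 8:
            raise ValueError("expected exactly 8 parallel inputs")
        self.stages = [1 if bit else 0 for bit in parallel]

    @property
    def q8(self):
        """Current level of the serial output pin."""
        return self.stages[7]

    def clock(self, serial_in=0):
        """Rising clock edge in serial mode: shift one stage toward Q8."""
        self.stages = [serial_in & 1] + self.stages[:7]


def read_out(reg, count=8):
    """Sample Q8, then clock, the way a host polling the chip would."""
    bits = []
    for _ in range(count):
        bits.append(reg.q8)
        reg.clock()
    return bits
```

Loading `[1, 0, 1, 1, 0, 0, 1, 0]` and calling `read_out` gives the bits back in reverse order (P8 first), which is the part that's easy to get backwards when you emulate one.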
I used them (PRU in Beaglebone) to implement e-fuse burning with JTAG (the chip stops working if you take a little too long) and MDIO (the interface for controlling a network PHY).
My interesting battle story around this is that I first implemented the MDIO as bit banging in the kernel. This used quite a bit of CPU, which I wanted to use for other things. I switched it over to the PRU and CPU usage dropped to 1%. Great! But the data rate was much slower, with huge latency spikes. It turned out that the CPU usage was so low that the CPU governor thought the system was idle, so it was throttling the CPU to just a few hundred MHz. I had to change the governor to keep the CPU clock at 100%, then everything was about 10x faster.
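If anyone hits the same thing, the usual fix on Linux is to switch the cpufreq governor through sysfs. A minimal sketch, assuming the standard `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor` layout and that the `performance` governor is what's wanted (the comment above doesn't say which governor was used, so that part is my guess). The `set_governor` helper is hypothetical, and it needs root on a real system.

```python
from pathlib import Path

# Standard sysfs location for per-CPU cpufreq controls on Linux.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def set_governor(governor="performance", root=SYSFS_CPU):
    """Write `governor` to every cpuN/cpufreq/scaling_governor under `root`.

    Returns the list of files that were written. `root` is a parameter
    only so the function can be tried against a scratch directory.
    """
    written = []
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for node in sorted(Path(root).glob(pattern)):
        node.write_text(governor + "\n")
        written.append(node)
    return written
```

Calling `set_governor()` as root should be equivalent to echoing `performance` into each `scaling_governor` file by hand. It doesn't persist across reboots.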
It is very humbling to know I’m not the only person to have had to implement bit banged MDIO recently.
If your PHY’s data sheet says clause 45 register access only and you don’t have GPIO access as the bus master, run away.
Unfortunately I was tasked with emulating a clause 45 PHY on a clause 22 bus. Without creative use of the microcontroller peripherals, it is next to impossible to implement a 2.5 MHz slave.
I had to use an edge interrupt to jump into a while-loop state machine, whose state was determined by a timer counter driven by the MDIO bus, with all GPIO accessed via hard-coded bit-banded GPIO.
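For readers who haven't stared at MDIO on a scope: at 2.5 MHz each bit lasts 400 ns, and a slave has to decode the frame header on the fly. Below is a rough Python sketch of the frame layouts as I understand them from IEEE 802.3 (32-bit preamble, start, opcode, two 5-bit address fields, turnaround, 16 data bits, MSB first). The function names are made up for illustration, and this is not the code described above.

```python
PREAMBLE = [1] * 32  # 32 consecutive ones before every frame


def _bits(value, width):
    """Split `value` into `width` bits, most significant bit first."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


def _check(value, width, name):
    if not 0 <= value < (1 << width):
        raise ValueError(f"{name} does not fit in {width} bits")


def c22_write_frame(phy, reg, data):
    """Clause 22 write: ST=01, OP=01, PHYAD, REGAD, TA=10, DATA."""
    _check(phy, 5, "phy")
    _check(reg, 5, "reg")
    _check(data, 16, "data")
    return (PREAMBLE + [0, 1] + [0, 1] + _bits(phy, 5) + _bits(reg, 5)
            + [1, 0] + _bits(data, 16))


def c22_read_header(phy, reg):
    """Clause 22 read: bits the master drives before releasing the line.

    After these 46 bits the master tristates MDIO for the turnaround
    and the PHY drives the 16 data bits.
    """
    _check(phy, 5, "phy")
    _check(reg, 5, "reg")
    return PREAMBLE + [0, 1] + [1, 0] + _bits(phy, 5) + _bits(reg, 5)


def c45_address_frame(prtad, devad, address):
    """Clause 45 address cycle: ST=00, OP=00, PRTAD, DEVAD, TA=10, ADDR."""
    _check(prtad, 5, "prtad")
    _check(devad, 5, "devad")
    _check(address, 16, "address")
    return (PREAMBLE + [0, 0] + [0, 0] + _bits(prtad, 5) + _bits(devad, 5)
            + [1, 0] + _bits(address, 16))
```

The only thing that tells the two clauses apart on the wire is the start field (01 versus 00), and on a read the slave has to take over the line a couple of bit times after the last address bit, which is why the timing is so unforgiving.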
The PRUs in the beaglebone are way more flexible to be honest. Most of my effort with the Pico PIOs is squeezing everything to fit inside the 32 instruction limit.
If you're short on PIO memory, the OUT EXEC instruction will execute an instruction directly from the FIFO. Feed the FIFO with a DMA channel and it'll keep up with the system clock. From the RP2040 datasheet section 3.4.5.2:
> OUT EXEC allows instructions to be included inline in the FIFO datastream. The OUT itself executes on one cycle, and the instruction from the OSR is executed on the next cycle. There are no restrictions on the types of instructions which can be executed by this mechanism. Delay cycles on the initial OUT are ignored, but the executee may insert delay cycles as normal.
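To make that a bit more concrete, here's a small Python sketch that assembles the 16-bit `OUT` instruction word, going by my reading of the instruction encoding in the datasheet (opcode in bits 15:13, delay/side-set in 12:8, destination in 7:5, bit count in 4:0, with a count of 32 encoded as 0). `encode_out` is a made-up helper, not part of any SDK, so check the values against real pioasm output before relying on them.

```python
OUT_OPCODE = 0b011

# Destination field values for OUT, as I read them from the datasheet.
OUT_DEST = {
    "pins": 0b000,
    "x": 0b001,
    "y": 0b010,
    "null": 0b011,
    "pindirs": 0b100,
    "pc": 0b101,
    "isr": 0b110,
    "exec": 0b111,
}


def encode_out(dest, bit_count, delay_sideset=0):
    """Return the 16-bit PIO instruction word for `out <dest>, <bit_count>`."""
    if not 1 <= bit_count <= 32:
        raise ValueError("bit_count must be between 1 and 32")
    if not 0 <= delay_sideset < 32:
        raise ValueError("delay/side-set field is 5 bits wide")
    return ((OUT_OPCODE << 13)
            | (delay_sideset << 8)
            | (OUT_DEST[dest] << 5)
            | (bit_count & 0x1F))  # 32 wraps to 0, which encodes 32
```

With this, `hex(encode_out("exec", 16))` comes out as `0x60f0`. A state machine looping on that single instruction spends one cycle on the OUT and one on the inlined instruction, so the actual program lives in whatever the DMA channel pushes into the FIFO rather than in the 32-slot instruction memory.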
No, sorry. I also don't have documentation that I can give. When I left the company, the Beaglebone community was pretty well into a C compiler, which is probably complete by now. There was also some coprocessor library being developed, to provide a nice interface to the host. It was all asm and kernel modules back then. I'm sure things are much easier now.
Noob question: can it be said that the microcontroller contains a very simple FPGA? Or are there some crucial differences between the two technologies?
Much more limited capability-wise, but much tighter guarantees about latency. More like an 80 MHz 8-bit AVR than an ARM coprocessor.
The real shame of modern microcontrollers is the decoupling of peripherals and GPIO from the main processor. It prevents these sorts of hacks as all access has to be through the memory bus, effectively capping bitbanging at sub 10MHz speeds.
By that I mean how IO is coupled to the processor. In AVR8 (and other older processors I presume) GPIO access was via a CPU register, thus a single instruction (one clock RISC) can directly modify the GPIO state.
Every GPIO implementation I’ve seen on modern processors accesses it via a memory mapped peripheral. The difference being accessing a memory bus is not a single cycle operation. You have to wait for the bus to be free, then wait to fetch or write the data.
The most extreme analogous example of this is that modern CPUs are effectively infinitely fast but bounded by cache misses that necessitate memory access.
This is fundamentally why every "toggle a GPIO pin" benchmark is flawed. What is really being measured is memory bus latency.
This misunderstanding is why people have trouble reconciling why a multi-GHz processor cannot also bitbang GPIO at GHz speeds, although if such a processor existed it would be amazing.
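Back-of-the-envelope version of that argument, as a tiny Python sketch. The model is deliberately crude: one output period needs two writes (high, then low), and it ignores loop overhead entirely. The cycle counts in the usage note are made-up illustrative numbers, not measurements of any real chip.

```python
def max_toggle_hz(clock_hz, cycles_per_write):
    """Upper bound on square-wave frequency from back-to-back GPIO writes.

    One full period takes two writes, each costing `cycles_per_write`
    core clock cycles (including any time spent waiting on the bus).
    """
    if cycles_per_write <= 0:
        raise ValueError("cycles_per_write must be positive")
    return clock_hz / (2 * cycles_per_write)
```

So a 16 MHz part with single-cycle port writes tops out at `max_toggle_hz(16_000_000, 1)`, i.e. 8 MHz, while a hypothetical 1 GHz core that stalls 100 cycles per peripheral write only reaches `max_toggle_hz(1_000_000_000, 100)`, i.e. 5 MHz. The clock went up more than 60x and the pin got slower.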
Nice. RP2040 is cheap, capable (2 cores running at 120 MHz+, PIO, etc.) and most importantly available. It can't be overstated just how much the global chip shortage affects even hobbyists or those making small runs of some breakout board.
My own product [1] has been out of stock for a long time now, because its main chip (AND the aux. one) has been unavailable for a year and still has many weeks lead time.
Seeing this, I learnt that if it can be done in software, it should be. RP2040 is perfect for this and I've started writing a minimal USB-PD stack in Micropython to run on one of the cores and so far so good!
I've spent a few hundred bucks over the last couple of months snapping up components I want for projects whenever I notice them in stock. FPGAs, gate drivers, etc. Anything and everything seems to be randomly affected.
Been doing a lot of retro computing projects because the eBay parts are still there for the most part.
This is a pretty big deal, considering that the RP2040's USB hardware is purportedly buggy in a few places (IIRC you cannot plug it into a hub with a device that reports or is polled often, like a USB/Serial device).
Damn. I wonder if I can emulate a USB host on the other end and put a USB device such as a Yubikey "in the cloud" so that multiple devices can use it as a U2F device.
Yes, but the problem is many websites have horribly shitty implementations of 2FA, notably AWS, PayPal, Kraken, and Gusto among others. All of these websites only permit registering one key instead of multiple.
You're supposed to allow multiple keys so that if one key gets lost or stolen, you can log in with another one and deactivate the lost key, among other reasons.
I also generally leave a key in each of my frequently-used devices and don't like to move keys around or travel with keys, especially on the streets of San Francisco. I have a computer at work, a computer at home, and don't want to commute with a laptop considering 3 friends have been mugged in just the past week.
So for these I'd rather circumvent the 2FA by putting the key in the cloud until they hire some better engineers who can implement multi-key 2FA.
One of my favorite innovations from sekigon-gonnoc is his BLE Micro Pro, a Pro Micro compatible nrf52840-based board that brings wireless BLE capability to QMK keyboards. Newer projects like ZMK have taken over that space, but this was very cool a few years ago when few other solutions were available.
That's cool. There is also USB HID host firmware in TMK Keyboard Firmware [1] for 8-bit AVR, but those µC are quite small and the USB part in the firmware is quite limited.
Isn't a standalone PIO literally just a microcontroller?
You may have fun playing around with a Parallax Propeller MCU. Rather than have a bunch of peripherals, the Propeller has 8 CPU cores. The idea is that you implement whatever peripherals you want by dedicating cores to the purpose and bit-banging it. It's a similar idea to a PIO, but rather than having a main core and a bunch of tiny cores, you just have a bunch of equally capable cores.