Learning about PCI-e: Driver and DMA (davidv.dev)
229 points by todsacerdoti 3 months ago | hide | past | favorite | 23 comments



The end goal of this series is to use an FPGA to build a display adapter. I've gotten a Tang Mega 138k [0] to start the process, but there is not a lot of documentation, so it is taking a while.

If you have recommendations for other (cheap) FPGA boards with PCI-e hard IP, do let me know.

[0]: https://wiki.sipeed.com/hardware/en/tang/tang-mega-138k/mega...



These seem to be ready-made products as opposed to devkits. Can the FPGAs be easily reprogrammed?


In general, the SRAM-based units have an ASIC boot loader that can often be reconfigured with external pin logic, i.e. the chip configuration may be pulled from internal flash memory, an SD card, or external SPI flash. Most designs I see place external SPI flash chips next to the FPGA, and the JTAG header in some convenient location.

While some modern high-performance FPGA chips may have internal flash and an ASIC CPU... the configuration can generally still be updated with relative ease.

Notably, there are some exceptions where chips may have had fuses blown to lock them down, or were substituted with a One Time Programmable chip variant in production.

All that aside, the IP license to make it work will likely be unavailable for free, and high-speed/LVDS layout is a black art requiring some pricey equipment to get clean performance.

One is often better off getting an overpriced Xilinx PCIe development kit from a vendor, as most of the basic IP issues are solved... and one may focus on one's core application.

Reverse engineering high-speed boards is way more work than most folks could imagine.

Best regards, =3


Thanks for the elaborate answer. I appreciate it.


The signal-to-noise ratio on the modern web makes these kinds of summaries hard to find unless you are reading chip app notes regularly.

Cheers, =3


FPGAs usually need their bitstream (analogous to firmware) loaded after power-on, so either there is a flash chip on the device that contains it, or it is loaded over the PCI bus from the host machine via a software driver. Either way you'll likely need to reverse-engineer and modify the firmware and host-side device driver. There are often better choices, such as dev boards, to play around with FPGAs.


Screamer PCIe Squirrel for 159 Euro (w/o tax) using a Xilinx Artix XC7A35T (according to the photo). But it has only one high-speed external interface: USB 3.1 Gen 1.

https://shop.lambdaconcept.com/home/50-screamer-pcie-squirre...

Litefury, a Xilinx Artix FPGA kit in "NVMe SSD" form factor (2280 Key M) for 102 Euro using a Xilinx XC7A100T. It has only a few external high-speed LVDS I/Os.

https://rhsresearch.com/collections/rhs-public/products/lite...


The Litefury/Nitefury/Acorn design has enough I/O for an HDMI output. I've designed a little board with a buffer:

https://github.com/mng2/AcornHDMI

Even with the fastest speed grade Artix, 1080p output is technically out of spec, but it seems to work OK.
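For anyone wondering why 1080p is a stretch here, a rough sketch of the arithmetic follows. It assumes the standard CEA-861 1080p60 timing (2200 x 1125 total pixels including blanking) and 10 bits per TMDS symbol. The 1250 Mb/s serializer ceiling is only an assumed figure for illustration, so check the datasheet for your part and speed grade.

```python
def tmds_bit_rate_mbps(h_total, v_total, refresh_hz, bits_per_symbol=10):
    """Per-lane TMDS bit rate in Mb/s for a given total (active + blanking) timing."""
    pixel_clock_hz = h_total * v_total * refresh_hz
    # Each pixel clock carries one 10-bit TMDS symbol on each data lane.
    return pixel_clock_hz * bits_per_symbol / 1e6


# ASSUMPTION: illustrative serializer limit, not taken from the thread.
ASSUMED_SERDES_LIMIT_MBPS = 1250.0

# Total timings including blanking (CEA-861).
rate_1080p60 = tmds_bit_rate_mbps(2200, 1125, 60)
rate_720p60 = tmds_bit_rate_mbps(1650, 750, 60)

for name, rate in [("1080p60", rate_1080p60), ("720p60", rate_720p60)]:
    verdict = "within" if rate <= ASSUMED_SERDES_LIMIT_MBPS else "over"
    print(f"{name}: {rate:.1f} Mb/s per lane ({verdict} the assumed limit)")
```

Under that assumption 720p60 fits comfortably and 1080p60 lands above the limit, which matches the "out of spec but seems to work" experience.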

As you might guess I had/have my own ambitions of making a video card. The software side has been a source of dread for me, but with OP's tutorials I may have enough guidance to get back to it.


The Screamer looks interesting; using DisplayPort Alt Mode could work!


The ULX4M boards are not available yet but seem to support both PCIe as well as digital video:

https://kitspace.org/boards/github.com/intergalaktik/ulx4m/

Done by the same team who did the ULX3S.


Actually not, it's wired via a FT601. However, that's not saying it can't be useful.

E.g. you could make it behave like a UVC-type USB3 webcam for the video output.


I would definitely use Xilinx over anything else here - Vivado isn't "great" by any pro-SWE's standard of tooling, but is absolutely best-in-class for FPGA development and implementation.

The Xilinx PCIe device path is pretty well-trodden.


I would also strongly suggest using a modern higher level HDL like clash or amaranth over raw-dogging it with VHDL or (System)Verilog.


SpinalHDL or Chisel would be my recommendation - type-checked metaprogramming is really nice.


Any dev kit you can buy in the low hundreds? I thought the UltraScale was in the low _thousands_


Do you need UltraScale(+)? The Tang Mega 138k has way fewer LUTs than most US+ chips, it's a completely different class of FPGA.

You can purchase a SQRL FK33 (VU33P on PCIE cards) for $250 used in the Dream escrow Discord. Here's one on eBay for $400 (a little steep) - this is the cheapest you will get into UltraScale+: https://www.ebay.com/itm/387229340513 This is also the cheapest way to obtain 8GB of HBM, which you will want for a GPU eventually.

For $1000ish, you can get a C1100 (VU35P, 2x the area of VU33P, 8GB HBM) - https://www.ebay.com/itm/186414684753 - these do not require a Vivado license.

For $600-1300 you can get VCU1525 (VU9P - no HBM) boards used: https://www.ebay.com/itm/326206486963

For new stuff, you could get 7 Series Kintex devboards from China: https://www.aliexpress.us/item/3256805175295035.html


I definitely do not need a very large FPGA, though navigating this space as a newbie is quite hard -- I've built some small projects (serial multiplexer and a WASM CPU) on the Tang Nano 9k/20k.

I think the difference between a $220 board and a ~$1000 board is quite large for a hobby project.


> I definitely do not need a very large FPGA

Bigger is better here - even if you don't use the LUTs/FFs, the tools have a much easier time placing/routing a relatively small design into a huge chip.

If your budget is $220ish, I would buy a used FK33 without a doubt for $250. You can use the Vivado trial (and keep renewing it.)

https://discord.gg/MynAgXK is Dream Escrow - lots of good/safe listings in there.

439680 LUTs, 879360 FFs, 672 BRAM, 2880 DSP, 320 URAM, 8GB HBM is plenty to work with.

The FK33 also has an FTDI chip onboard for convenient JTAG flashing, comms, and management.

> though navigating this space as a newbie is quite hard

This space is all about the tools: Xilinx is head and shoulders above everyone else. Quartus might be second, Lattice's tools are absolutely terrible (but open-source workflows exist for some of their chips.)

If you're designing PCIe devices and not wanting to build the PCIe IP, Xilinx's is absolutely the best in that department as well. There are more PCIe examples and boards for Xilinx chips than probably every other manufacturer.


For making a product with UltraScale+, you do not need a dev kit.

You may use e.g. the KRIA K24 SOM ($250) or the KRIA K26 SOM ($325) (the sizes of the FPGAs differ for these variants).

For designing a product with these SOMs, you can buy a few cheap development boards (the development boards including a SOM are cheaper than the SOM bought separately, so obviously they cannot be bought in big quantities) like the Kria KD240 Drives Starter Kit or the Kria KV260 Vision AI Starter Kit or the Kria KR260 Robotics Starter Kit.

You may buy these SOMs as single units at higher prices or as hundreds at discounted prices at many distributors.

These SOMs include 2 GB or 4 GB DRAM memory, QSPI or eMMC flash memory and some interfaces, like USB 3, Ethernet, PCIe Gen2 x4, transceivers suitable for HDMI 2.0 and DisplayPort 1.4 etc. and the FPGAs include quadruple Cortex-A53 + double Cortex-R5 cores, which can initialize the FPGA or reconfigure it remotely whenever desired and which can run a Linux OS. The FPGA also includes a Video Codec Unit, but that is limited to an aggregated resolution of 4kp60 (i.e. either a single 4k stream or up to 32 lower resolution streams), and also a Mali GPU and a (slow) DisplayPort controller that can be used as the video output for the operating system run on the Arm cores.
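To make the "aggregated 4kp60" figure concrete, here is a minimal sketch that treats it as a simple pixel-rate budget shared between streams. That model is my simplification, not the codec's documented behavior; the 32-stream cap is the one quoted above, and the example resolutions are hypothetical.

```python
# ASSUMPTION: the aggregate limit is modeled as a plain pixels-per-second budget.
BUDGET_PIXELS_PER_S = 3840 * 2160 * 60
MAX_STREAMS = 32  # stream cap quoted in the comment above


def max_streams(width, height, fps):
    """How many identical streams fit in the budget, capped at MAX_STREAMS."""
    per_stream = width * height * fps
    return min(MAX_STREAMS, BUDGET_PIXELS_PER_S // per_stream)


# Hypothetical stream formats, chosen only for illustration.
examples = {
    "2160p60": (3840, 2160, 60),
    "1080p60": (1920, 1080, 60),
    "720p30": (1280, 720, 30),
    "480p30": (720, 480, 30),
}

for name, (w, h, fps) in examples.items():
    print(f"{name}: up to {max_streams(w, h, fps)} stream(s)")
```

In this model a single 4k60 stream uses the whole budget, four 1080p60 streams fit, and small enough streams run into the 32-stream cap before they exhaust the pixel budget.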


This seems like a great intro to Linux PCIe device drivers. I've never worked with Linux device drivers but have worked with multiple PCIe drivers on a different operating system years ago, and the concepts look very familiar. Love to see more of this type of content in the world.


I really, really like the way these articles flow, with just enough code to illustrate their point, and a gradual build. This is good stuff. I have never in my life wanted to make a new PCI device, but now I kind of do, which is like the acid test for good technical writing, right?


thanks so much for writing these. they are really informative and practical, which seems really rare in this area. so handy! exactly what i needed to create a dev/playtest environment for my project, without even knowing how to search for it.

i also like the other 2 parts a lot. many practical things, like how to use some of your bootsvc driver stuff after you exit them, busmastering, msi-x stuff. tons of nice little details and super useful



