CAPI lets an FPGA attached over PCIe act as a coherent peer of the CPU cores: it can hold cache lines and use address translation. Among other things, this means that from the application programmer's perspective the CAPI accelerator can be treated much like another thread, since it works in the application's virtual address space. The application can set up data structures in main memory and pass unmodified pointers to the CAPI card.
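To make the "just another thread" model concrete, here is a minimal sketch in C. It does not talk to real CAPI hardware: an ordinary pthread stands in for the accelerator, and the names `struct work`, `fake_accelerator` and `offload_sum_demo` are made up for illustration. The point is only that the host builds a pointer-based structure in its own memory and hands over the raw pointer, with no copying or address rewriting.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Ordinary pointer-based structure living in the application's own memory. */
struct node {
    long value;
    struct node *next;   /* plain virtual address, no marshalling */
};

/* Hypothetical work descriptor: what the host hands to the "accelerator". */
struct work {
    struct node *head;   /* input: unmodified pointer into host memory */
    long sum;            /* output: written back by the "accelerator" */
};

/* Stand-in for the accelerator (assumption: a plain thread, not an FPGA).
 * It follows the same pointers the host uses. */
static void *fake_accelerator(void *arg)
{
    struct work *w = arg;
    assert(w != NULL);

    long sum = 0;
    for (struct node *n = w->head; n != NULL; n = n->next)
        sum += n->value;
    w->sum = sum;
    return NULL;
}

/* Builds a three-node list, hands it off, and returns the computed sum,
 * or -1 if the thread could not be created or joined. */
long offload_sum_demo(void)
{
    struct node c = { 3, NULL };
    struct node b = { 2, &c };
    struct node a = { 1, &b };
    struct work w = { &a, 0 };
    pthread_t t;

    if (pthread_create(&t, NULL, fake_accelerator, &w) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return w.sum;
}
```

Calling `offload_sum_demo()` returns 6 (1 + 2 + 3); build with `-pthread`. On actual hardware the thread would be replaced by whatever attach/start mechanism the platform's CAPI library provides, but the data structure and the pointer handed over would stay the same.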
Thanks for the link. The paper mentions key/value stores. Would a valid use case for CAPI be something similar to a "flash cache", where the FPGA is not as fast as DRAM but still faster than NAND flash?