That's honestly not true at all; it all just depends on your platform. On the Pocket, the FPGA _is_ the processor (there are actually two FPGAs, one for the actual emulation core and one for scaling video, and there's technically a PIC microcontroller for uploading bitstreams and managing the UI). The FPGAs still don't draw much power compared to the display itself. With the built-in current sensor on the dev kits, the highest draw we've measured from the main FPGA is ~300mA. Now this sensor isn't going to be the best measurement, but it's something to go off of.
Personally I think this is the biggest selling feature of FPGA based emulation.
The reality is that both software and FPGA emulation can be done very well and with very low latency; however, to achieve this in software you generally need high-end, power-hungry hardware.
A Steam Deck can run a highly accurate Sega Genesis emulator with read-ahead rollback, screen scaling, shaders and all the fixings no problem, but in theory the Pocket can provide the exact same experience with an order of magnitude less power.
It's not quite apples to oranges of course, but the comfortable battery life does make the Pocket much more practical.
FPGAs truly shine when you get nitpicky about latency. You lose a good bit of that advantage by connecting over HDMI (I think the Pocket docked adds 1/4 of a frame, and MiSTer has a similar mode) (EDIT: MiSTer can do 4 scanlines, but it's not compatible with some displays), but when we're talking about analog display methods or inputs, you can achieve accurate timings with much less effort than on a modern-day computer.
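To put those HDMI figures in wall-clock terms, here's a rough back-of-the-envelope sketch. The 60Hz refresh rate and the 262 lines per frame (NTSC-style 240p) are my assumptions, not numbers from the Pocket or MiSTer docs, and I'm assuming the "4 scanlines" are counted in source-console lines.

```python
# Rough latency arithmetic. All constants are assumptions, not measurements.
REFRESH_HZ = 60.0        # assumed nominal refresh rate
LINES_PER_FRAME = 262    # assumed NTSC-style 240p line count, blanking included


def frame_time_ms(refresh_hz: float = REFRESH_HZ) -> float:
    """Duration of one full frame in milliseconds."""
    return 1000.0 / refresh_hz


def fraction_of_frame_ms(fraction: float, refresh_hz: float = REFRESH_HZ) -> float:
    """Latency of a buffer that holds `fraction` of a frame."""
    return fraction * frame_time_ms(refresh_hz)


def scanlines_ms(lines: int,
                 lines_per_frame: int = LINES_PER_FRAME,
                 refresh_hz: float = REFRESH_HZ) -> float:
    """Latency of a buffer that holds `lines` scanlines."""
    return lines * frame_time_ms(refresh_hz) / lines_per_frame


if __name__ == "__main__":
    print(f"1/4 frame:   {fraction_of_frame_ms(0.25):.2f} ms")
    print(f"4 scanlines: {scanlines_ms(4):.3f} ms")
```

Under those assumptions, a quarter frame is a bit over 4 ms and four scanlines is roughly a quarter of a millisecond, so both sit well below a single 16.7 ms frame.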
For a full computer like the Steam Deck, you have to deal with preemption, display buffers, and more, which _will_ add latency. Now if you went bare metal, you could definitely drive a display with super low latency, hardware accurate emulation, but obviously that's not what most people are doing.
Gate for gate, an FPGA consumes more power than a dedicated chip, but the power dissipation depends heavily on the programming. Careful programming can reduce power dissipation.
A potential advantage of an FPGA over a dedicated chip is that any unused functions can just be left out, saving power dissipation and logic resources. This is the (largely unrealised) promise of Reconfigurable Computing [1].
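As a rough illustration of why the design itself matters so much, here's the usual first-order CMOS switching-power estimate, P = alpha * C * V^2 * f. This is the generic textbook model, not anything vendor-specific; it ignores static (leakage) power, and every number in the example is a made-up placeholder rather than a figure for a real device.

```python
def dynamic_power_w(activity: float, capacitance_f: float,
                    voltage_v: float, freq_hz: float) -> float:
    """First-order switching power: P = alpha * C * V^2 * f.

    activity      -- average fraction of nodes toggling per clock (alpha)
    capacitance_f -- total switched capacitance in farads
    voltage_v     -- core supply voltage in volts
    freq_hz       -- clock frequency in hertz
    """
    return activity * capacitance_f * voltage_v ** 2 * freq_hz


if __name__ == "__main__":
    # Hypothetical baseline design (placeholder values only).
    base = dynamic_power_w(0.10, 1e-9, 1.1, 50e6)
    # Same design with clock enables halving the toggle rate.
    gated = dynamic_power_w(0.05, 1e-9, 1.1, 50e6)
    # Same design with unused blocks left out (less switched capacitance).
    trimmed = dynamic_power_w(0.10, 0.5e-9, 1.1, 50e6)
    print(f"baseline: {base * 1e3:.2f} mW")
    print(f"gated:    {gated * 1e3:.2f} mW")
    print(f"trimmed:  {trimmed * 1e3:.2f} mW")
```

In this model, cutting unnecessary toggling and leaving out unused logic both reduce power linearly, which is the sense in which careful design pays off.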
They are generally much more power hungry than the equivalent ASIC built on the same process. However, if your aim is to emulate a sufficiently old ASIC, then you can probably beat it with a modern FPGA. And there are FPGAs aimed at low power consumption: they are just also slower.
You are entirely correct, but I would like to point out that there are Cyclone V cores running logic at ~140MHz, not just RAM clocks, and the power consumption is nowhere near that.
Getting a large design that passes timing at that frequency with the Cyclone V fabric is unlikely, however.
----
The distinction here is that more capable FPGAs can reach the 600MHz+ range and actually run a full design at that speed.