Capture cards need the bandwidth. Whether they need low latency is arguable, but they need a lot more latency determinism than USB tends to offer out of the box.
Introduced latency on the capture side makes latency-tuning your entire production pretty difficult. For non-real-time usage, sure, latency in the 100-200ms range is more than acceptable (assuming it's deterministic, as you pointed out), but in the real-time world? Keeping things within a frame is pretty much required, and with the popularity of software-driven studio workflows across both amateur streaming and professional production, it's been real hard to get reliable performance out of USB hardware that didn't add frustrating amounts of latency due to pre-ingest compression, or seemingly random amounts of delay due to protocol or CPU time starvation.
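To put rough numbers on "within a frame": a quick sketch that converts a latency figure into frame periods. The 30 and 60 fps rates and the helper name `frames_of_delay` are assumptions picked for illustration, not anything specific to a particular capture setup.

```python
def frames_of_delay(latency_ms: float, fps: float) -> float:
    """Return how many frame periods a given latency spans at a given frame rate."""
    frame_period_ms = 1000.0 / fps  # e.g. about 16.7 ms at 60 fps
    return latency_ms / frame_period_ms


# Assumed frame rates and the 100-200 ms range mentioned above.
for fps in (30, 60):
    for latency_ms in (100, 200):
        frames = frames_of_delay(latency_ms, fps)
        print(f"{latency_ms} ms at {fps} fps is about {frames:.1f} frames")
```

At 60 fps a frame lasts about 16.7 ms, so the 100-200ms range works out to roughly 6 to 12 frames of delay, which is why it's fine for offline ingest and a problem for live switching.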
Yep, there's a reason I have Blackmagic quad 4K capture cards in my workstation. Syncing multiple video streams with USB capture cards would be nigh impossible even if you put them on separate USB controllers. Ingest over USB is fine (though slow) but pretty much every USB capture card does its own internal compression, as you point out, and then involves the CPU to decompress it and get it into VRAM or DRAM.
Realtime video production is definitely an outlier. You probably want a workstation-class system anyway, with a full TB of RAM so you know that's never an issue.
This is only true for small transfers (not bulk/asynchronous transfers), and only up to about the 99th percentile. Large transfers have some buffer management and handshaking, so they tend to have highly variable latency with a very fat tail.
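A toy illustration of why the 99th percentile can look fine while the tail is still ugly. The sample values here are made up for the example (they are not USB measurements), and `percentile` is a hypothetical nearest-rank helper, not a library call.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    # round() guards against float noise such as 9990.000000000002 before ceil().
    rank = math.ceil(round(p * len(ordered) / 100.0, 9))
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in ms (assumed): 9,985 fast transfers, 15 slow stragglers.
samples = [1.0] * 9985 + [50.0] * 15

for p in (50, 99, 99.9):
    print(f"p{p}: {percentile(samples, p)} ms")
print(f"max: {max(samples)} ms")
```

With this synthetic data the median and p99 both come out at 1 ms, while p99.9 and the max are 50 ms. A device that only quotes typical or p99 latency can hide exactly the stragglers that cause a dropped or late frame.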
The latency degradation for large transfers is so noticeable that most audio DACs (not just the ones for gullible audiophiles, also the ones for the pro market) use custom drivers and USB protocols. And that's at roughly a thousandth of the data rate a capture card would need, or less.