No. There was an internal Nvidia presentation from a few years ago that stated that moving data was the hard part. (I can't find the presentation any more, but if anyone can find it, please post it below.)
Previously, graphics cards were essentially designed with one area of the card handling computation and another area holding memory. Data had to be moved from the memory to the computation area, then back again if there were changes that needed to be stored.
As computation and memory demands grew, those areas had to handle more, and so did the bus between them. What had been a negligible overhead for the bus became more pronounced as it had to carry more data.
Eventually the energy overhead of transporting that much data across that distance started constraining what was possible with graphics cards.
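To put rough numbers on that, here is a small back-of-the-envelope sketch using the ballpark per-operation energies often quoted from Mark Horowitz's ISSCC 2014 keynote (roughly 45nm-era figures; treat them as illustrative, since the exact values shift with process node and with how far the data travels):

    // Ballpark per-operation energy in picojoules (Horowitz, ISSCC 2014, ~45nm).
    // Illustrative only -- the point is the ratio, not the exact values.
    #include <cstdio>

    int main() {
        const double fp32_add_pj  = 0.9;    // 32-bit floating-point add
        const double fp32_mul_pj  = 3.7;    // 32-bit floating-point multiply
        const double sram_read_pj = 5.0;    // 32-bit read from small on-chip SRAM
        const double dram_read_pj = 640.0;  // 32-bit read from off-chip DRAM

        // Fetching an operand from off-chip DRAM dwarfs the arithmetic done on it.
        printf("DRAM read vs FP32 mul+add: %.0fx\n",
               dram_read_pj / (fp32_mul_pj + fp32_add_pj));   // ~139x
        printf("DRAM read vs SRAM read   : %.0fx\n",
               dram_read_pj / sram_read_pj);                   // ~128x
        return 0;
    }

Once the working set no longer sits next to the ALUs, the wires, not the math, set the power budget.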
That is why graphics card architectures have shifted over time to place memory cache units next to the computation units: the less distance the data needs to travel, the smaller the power requirement. It has also driven investment in, and adoption of, stacked memory dies (why fetch data from 5cm away in the x-y plane when we can stack memory and fetch it from 5mm away in the z-direction?).
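As a concrete illustration of keeping data next to the compute, here is a minimal CUDA sketch (a toy 1D averaging stencil of my own, not anything from that presentation): each block stages its slice of the input in on-chip shared memory once, so every element makes the expensive trip from DRAM a single time instead of once per use.

    // Toy 1D stencil: each output averages 2*R+1 neighbouring inputs.
    // The block stages its tile (plus halo) in on-chip shared memory once,
    // so each input element is fetched from off-chip DRAM one time instead
    // of 2*R+1 times. Illustrative sketch, not a tuned kernel.
    #include <cstdio>
    #include <cuda_runtime.h>

    constexpr int R = 4;          // stencil radius
    constexpr int BLOCK = 256;    // threads per block
    constexpr int N = 1 << 20;    // number of elements

    __global__ void stencil(const float* in, float* out, int n) {
        __shared__ float tile[BLOCK + 2 * R];

        int gid = blockIdx.x * blockDim.x + threadIdx.x;
        int lid = threadIdx.x + R;

        // One global (DRAM) read per thread, plus halo reads at the block edges.
        tile[lid] = (gid < n) ? in[gid] : 0.0f;
        if (threadIdx.x < R) {
            int left  = gid - R;
            int right = gid + BLOCK;
            tile[lid - R]     = (left  >= 0) ? in[left]  : 0.0f;
            tile[lid + BLOCK] = (right < n)  ? in[right] : 0.0f;
        }
        __syncthreads();

        // All 2*R+1 reads below hit fast on-chip memory, not DRAM.
        if (gid < n) {
            float sum = 0.0f;
            for (int k = -R; k <= R; ++k) sum += tile[lid + k];
            out[gid] = sum / (2 * R + 1);
        }
    }

    int main() {
        float *in, *out;
        cudaMallocManaged(&in, N * sizeof(float));
        cudaMallocManaged(&out, N * sizeof(float));
        for (int i = 0; i < N; ++i) in[i] = 1.0f;

        stencil<<<(N + BLOCK - 1) / BLOCK, BLOCK>>>(in, out, N);
        cudaDeviceSynchronize();
        printf("out[123] = %f\n", out[123]);  // expect 1.0 away from the edges

        cudaFree(in);
        cudaFree(out);
        return 0;
    }

The same idea, scaled up, is what the large on-die caches on recent GPUs do for you automatically.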
Moving data around is indeed a major issue for any throughput-oriented device. But for a gaming GPU, PCIe bandwidth has never been an issue in any of the benchmarks that I’ve seen. (The benchmarks that test it artificially reduce the number of PCIe lanes.)
In fact, the 4000 series still uses PCIe 4.0.
Moving data around on a GPU is about the memory system feeding the shader cores. PCIe is far too slow for that, which is why a GPU has gigabytes of local RAM.
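Some rough public numbers for a 4090-class card show the gap (a quick sketch with ballpark figures, one direction only):

    // Rough bandwidth comparison, assuming an RTX 4090-class card:
    // PCIe 4.0 x16 into the card vs the card's own GDDR6X. Ballpark only.
    #include <cstdio>

    int main() {
        // PCIe 4.0: 16 GT/s per lane, 128b/130b encoding, 16 lanes.
        double pcie_gbs = 16.0 * (128.0 / 130.0) * 16 / 8;   // ~31.5 GB/s

        // GDDR6X on a 4090: 21 Gbit/s per pin over a 384-bit bus.
        double vram_gbs = 21.0 * 384 / 8;                    // ~1008 GB/s

        printf("PCIe 4.0 x16 : %6.1f GB/s\n", pcie_gbs);
        printf("GDDR6X (4090): %6.1f GB/s (~%.0fx faster)\n",
               vram_gbs, vram_gbs / pcie_gbs);
        return 0;
    }

With a roughly 30x difference, assets get copied into VRAM once and the shader cores are fed from local memory after that, not over the bus.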