Curious.
This is somewhat reminiscent of SGI's ccNUMA and CrayLink/NUMAlink architectures.
If memory serves, IRIX (SGI's UNIX OS) had both metrics for seeing memory access latency and the ability to migrate the data and/or the compute closer to each other.
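For anyone who wants to poke at the same idea on current hardware, here is a minimal sketch of the Linux analogue (not the IRIX tooling, which I can't reproduce from memory). It assumes the Linux sysfs layout under /sys/devices/system/node, where each node's `distance` file holds a row of relative distances with 10 meaning local. The function names are my own, and it assumes node numbers are contiguous from 0.

```python
from pathlib import Path


def parse_distance_row(text):
    """Parse one sysfs-style distance row, e.g. '10 21' -> [10, 21]."""
    return [int(tok) for tok in text.split()]


def nearest_remote_node(distances, local):
    """Return the index of the closest node other than `local`, or None."""
    candidates = [(d, n) for n, d in enumerate(distances[local]) if n != local]
    return min(candidates)[1] if candidates else None


def read_distance_matrix(root="/sys/devices/system/node"):
    """Read each node's distance row from sysfs; empty list if unavailable.

    Assumption: nodes are numbered contiguously from 0, so the row index
    in the returned matrix matches the node number.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    node_dirs = [
        p for p in base.glob("node[0-9]*")
        if p.name[4:].isdigit() and (p / "distance").is_file()
    ]
    node_dirs.sort(key=lambda p: int(p.name[4:]))
    return [parse_distance_row((p / "distance").read_text()) for p in node_dirs]


if __name__ == "__main__":
    matrix = read_distance_matrix()
    for node, row in enumerate(matrix):
        print(f"node{node}: {row} nearest remote: {nearest_remote_node(matrix, node)}")
```

These are static relative distances reported by the firmware, not measured latencies, so this only covers the "see how far away the memory is" half. Actually moving pages or threads is a separate job for something like numactl.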
As I recall, ccNUMA was open-sourced, and AMD uses it on their multi-core/multi-socket systems, though usually within the motherboard rather than leaving the case and interlinking systems SGI Origin style (which is what the CrayLink/NUMAlink tech did).
The sad thing is that HyperTransport was supposed to offer this exact feature and implement it just like SGI did with NUMAlink. There were a few boards produced with HTX slots; I have an older Tyan dual-socket Opteron board with an HTX slot kicking around.