IBM was locked in a brutal fight with HP and Dell to take share in the early x86 server market. Their chipsets were neat but they looked silly next to an HP Opteron box. Integrated memory control ended the chipset race.
Still their chipsets had neat features worth remembering, such as the ability to use local main memory as a last-level cache of memory read from remote nodes. And of course they went to 64 sockets which was respectable.
Opteron had vastly better architecture than the Intel chipsets of the day, but it topped out at four or eight sockets, I forget.
IBM, at the time, could offer you Opteron-like architecture and performance, with up to 32 sockets, using Intel chips. That was worthwhile to some customers. "Intel" wasn't the selling position. It was x86 or x86-64, with "big" as the selling position.
I'm not here to apologize for Intel. I'm just saying, those IBM proprietary chipsets had their nice bits.
Early Opteron (with Socket 940) topped out at 8 sockets glueless; the same chipset could also drive a Socket 754 chip.
However, with custom glue logic, one could expand it much further. The AMD offering in that space was the "Horus" chipset, which connected 4-socket nodes over an external fabric (InfiniBand, iirc) to create 64-socket systems. A similar tactic was (and still is) offered by SGI in their UltraViolet systems, which use the same principle with NUMAlink fabric and Xeon CPUs.
But in the end, Xeon systems with more than 4 sockets (8, 16 or 32 CPUs) were a rare, weird niche market. They were squeezed between things like zSeries mainframes and big Sun and SGI machines on one hand and, on the other, people who learned to write software that distributes workloads across a couple of dozen $2000 to $4000 1RU dual-socket servers rather than buying one beastly proprietary thing with a costly support contract.
As I recall, HP also used ServerWorks for a lot of dual-socket 2RU systems, both for people who didn't want to buy Opteron and from 2000 to 2003, before the release of Opteron.