Well, there are other reasons: you want to write code that operates on the data, and neither the code nor the data fits on a single machine, so you have to target an abstraction that spans machines. Block storage is too low-level an abstraction for that.
That isn't to say high-performance block storage stops being a win when redundancy is also layered on top of it. Higher-level redundancy is about more than integrity: replicating data colocates it with code, i.e. the extra copies increase the probability that one of them is close to the code that reads it.
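
To make the locality point concrete, here's a minimal sketch of locality-aware replica selection. The topology model and every name in it are hypothetical; rack-aware stores like HDFS and Cassandra do the same thing against much richer topologies:

    # Sketch: pick the nearest replica. Names and topology are made up.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Node:
        host: str
        rack: str

    def distance(a: Node, b: Node) -> int:
        """Crude network distance: 0 same host, 1 same rack, 2 cross-rack."""
        if a.host == b.host:
            return 0
        if a.rack == b.rack:
            return 1
        return 2

    def pick_replica(reader: Node, replicas: list[Node]) -> Node:
        """Read from the closest copy; more replicas raise the odds one is near."""
        return min(replicas, key=lambda r: distance(reader, r))

    # 3-way replication: the reader on rack r1 finds a rack-local copy even
    # though the other two copies live on r2 and r3.
    reader = Node("app-7", "r1")
    replicas = [Node("store-1", "r2"), Node("store-4", "r1"), Node("store-9", "r3")]
    assert pick_replica(reader, replicas).host == "store-4"

The integrity story only needs the three copies to exist; the locality win comes from getting to choose which one you read.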
Of course. Most production monoliths are deployed on networked block storage - aka SAN - and NUMA is already structurally distributed memory, even on a single box. But it's not the right paradigm for scaling, any more than chatty RPC that pretends the network doesn't exist is the right way to design a distributed system.
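
The RPC point is easy to see with a toy model. Everything below is simulated - an in-memory dict and a fake 50 ms round trip - not any real RPC framework:

    import time

    STORE = {"a": 1, "b": 2, "c": 3}   # stand-in for the remote service's data
    RTT = 0.05                         # pretend each round trip costs 50 ms

    def remote_get(key):
        time.sleep(RTT)                # one round trip per key
        return STORE[key]

    def remote_get_many(keys):
        time.sleep(RTT)                # one round trip for the whole batch
        return {k: STORE[k] for k in keys}

    # Chatty: N keys cost N round trips. Invisible in-process, pathological
    # once a network sits between the caller and the data.
    def total_chatty(keys):
        return sum(remote_get(k) for k in keys)

    # Network-aware: the same work in ~one round trip.
    def total_batched(keys):
        return sum(remote_get_many(keys).values())

    assert total_chatty(["a", "b", "c"]) == total_batched(["a", "b", "c"]) == 6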