You'd have to consider what actually happens when 50% of your infrastructure goes down. Can the remaining 50% cope with 100% of the load? If not, you still get a complete failure. So the question becomes whether you can reprovision between stack A and stack B very rapidly, with both running from the same hardware pool, while 50% of your infrastructure is down. But now you've introduced the potential for correlated failures (a single hardware pool and network), plus added complexity and extra load from reprovisioning just when things are already overloaded. That's not easy to get right, so it might not actually increase reliability: you now have two different stacks that can each fail independently and get you into this mess.
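To put rough numbers on the "can the remaining 50% cope with 100% of the load" question, here is a minimal sketch. The function name, the requests-per-second units, and the example figures are all made up for illustration, and it assumes capacity is split evenly and load fails over instantly with no reprovisioning overhead, which is the optimistic case.

```python
def survives_stack_failure(total_capacity_rps, peak_load_rps, surviving_fraction=0.5):
    """Check whether the surviving share of capacity can carry the full load.

    All parameters are hypothetical illustration values:
      total_capacity_rps  - combined capacity of both stacks
      peak_load_rps       - load that must still be served after the failure
      surviving_fraction  - share of capacity left standing (0.5 = one of two stacks)
    Returns (ok, utilization) where utilization is load / surviving capacity.
    """
    surviving_capacity = total_capacity_rps * surviving_fraction
    if surviving_capacity <= 0:
        # Nothing left to serve traffic: treat as a total failure.
        return False, float("inf")
    utilization = peak_load_rps / surviving_capacity
    return utilization <= 1.0, utilization


if __name__ == "__main__":
    # 1000 rps total, 600 rps peak: the surviving half (500 rps) is overloaded.
    print(survives_stack_failure(1000, 600))
    # 1000 rps total, 400 rps peak: the surviving half has headroom.
    print(survives_stack_failure(1000, 400))
```

Under these assumptions, the whole system has to run below 50% utilization in normal operation for one half to absorb the other's traffic, and that is before counting any extra load from reprovisioning.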