I'm actually more confused after reading that. I assumed you meant that they tested in production on purpose, but it sounds, at a skim, like they do have non-prod testing environments. In fact, it looks like they've gone to having multiple beta environments for every service?
My understanding is that every service call carries a "tenancy" variable, and the code can take a different path depending on its value. They seem to have only one environment for everything and run tests/experiments at the code level based on this variable.
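
To make that concrete, here's a rough sketch of how I picture it. All the names (`Request`, `handle_payment`, the `"production"`/`"test"` values) are made up by me for illustration and aren't taken from their write-up, so treat it as a guess at the shape rather than their actual implementation:

```python
from dataclasses import dataclass

# Hypothetical tenancy values; the real set isn't given in this thread.
PRODUCTION = "production"
TEST = "test"


@dataclass
class Request:
    user_id: str
    tenancy: str = PRODUCTION  # assumed to be carried on every service call


def charge_card(user_id: str, amount_cents: int) -> str:
    """Stand-in for the real side effect (hypothetical)."""
    return f"charged {user_id} {amount_cents}"


def handle_payment(req: Request, amount_cents: int) -> str:
    # Same deployed service, same environment; only the code path differs.
    if req.tenancy == TEST:
        # Test traffic skips the real side effect.
        return f"simulated charge {req.user_id} {amount_cents}"
    return charge_card(req.user_id, amount_cents)
```

So `handle_payment(Request("u1"), 500)` goes down the real path, while `handle_payment(Request("u1", tenancy=TEST), 500)` hits the same service but takes the test branch.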