Yeah, I think that's where experience of how failure happens in machine learning projects is helpful. I've seen wrong explanations for root causes drag on for months and sometimes years, as we tried different attacks that just weren't working. ML experiments are often quite cheap compared to things like clinical trials or field studies, so we can easily afford to run another experiment to confirm or refute our hypothesis.
Humans are extremely good at inventing explanations for things. When the world doesn't do what we expect, we then have a choice of whether to believe we had the wrong explanation or just got the experimental details wrong... And epistemic hubris is a hell of a drug.