Ego depletion has been badly bruised by the replication crisis: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-132383. At best the effect size is considerably weaker than first claimed, and it probably doesn't exist at all.
On that note, it'd also be unwise to bet on the "ego depletion happens only to those who believe in it" finding from the paper I cited being replicable either -- it may be that, more broadly, there is no simple ego depletion effect at all.
Psychology studies tend to be marred by the "piranha problem": you can't have a bunch of large effects all determining behavior without them eating each other.[1]
We're immensely complicated, and we make highly individual and nuanced decisions. There's just too much noise to try to boil things down to small patterns from simple studies.
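A crude way to see the piranha argument (the 30% share and the count of ten effects below are numbers I made up purely for illustration): if each of many supposedly large, independent effects really explained a big chunk of behavior, the shares would add up to more variance than exists.

    # Toy illustration of the "piranha problem" -- all numbers are invented.
    claimed_share_per_effect = 0.30   # each effect supposedly explains 30% of behavior
    n_large_effects = 10              # ego depletion, priming, and so on
    total = claimed_share_per_effect * n_large_effects
    print(f"Total variance 'explained': {total:.0%}")   # 300% -- impossible

Real effects can of course overlap and interact rather than add cleanly, but the arithmetic gives the flavor of why a whole zoo of large effects can't coexist.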
I believe (though this is grounded weakly, largely in anecdotal evidence) that people need to be able to fit their own behavior into some sort of personal narrative, or justify it in some way. It follows that believing they "can't help" but do something they want to do (whether that's procrastinating or eating unhealthy food) makes the behavior easier to justify, and therefore more likely.
But there are as many different internal narratives, built on different life experiences, as there are people, which makes it hard to generalize surface-level consequences from one person to another.
This isn't about ego depletion, but about his mistakes on priming (which also failed to replicate). Kahneman writes:
"""
My position when I wrote “Thinking, Fast and Slow” was that if a large body of evidence published in reputable journals supports an initially implausible conclusion, then scientific norms require us to believe that conclusion. Implausibility is not sufficient to justify disbelief, and belief in well-supported scientific conclusions is not optional. This position still seems reasonable to me – it is why I think people should believe in climate change. But the argument only holds when all relevant results are published.
I knew, of course, that the results of priming studies were based on small samples, that the effect sizes were perhaps implausibly large, and that no single study was conclusive on its own. What impressed me was the unanimity and coherence of the results reported by many laboratories. I concluded that priming effects are easy for skilled experimenters to induce, and that they are robust. However, I now understand that my reasoning was flawed and that I should have known better. Unanimity of underpowered studies provides compelling evidence for the existence of a severe file-drawer problem (and/or p-hacking). The argument is inescapable: Studies that are underpowered for the detection of plausible effects must occasionally return non-significant results even when the research hypothesis is true – the absence of these results is evidence that something is amiss in the published record. Furthermore, the existence of a substantial file-drawer effect undermines the two main tools that psychologists use to accumulate evidence for a broad hypotheses: meta-analysis and conceptual replication. Clearly, the experimental evidence for the ideas I presented in that chapter was significantly weaker than I believed when I wrote it. This was simply an error: I knew all I needed to know to moderate my enthusiasm for the surprising and elegant findings that I cited, but I did not think it through. When questions were later raised about the robustness of priming results I hoped that the authors of this research would rally to bolster their case by stronger evidence, but this did not happen.
"""
https://statmodeling.stat.columbia.edu/2017/02/18/pizzagate-...
So he has made mistakes (we all do!), but tries his best to learn from them (which not all victims of the replication crisis do).
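To put rough numbers on Kahneman's "unanimity of underpowered studies" argument (the 40% power and the 20-study count below are my own illustrative assumptions, not figures from his note): even if the effect were real, an unbroken run of significant results from underpowered studies would be astronomically unlikely, so the published record must be missing the misses.

    # Illustrative only: per-study power and study count are assumed, not sourced.
    power = 0.4      # chance an underpowered study detects a real effect
    n_studies = 20   # number of published studies, all reporting significance
    p_unanimous = power ** n_studies   # probability every single one comes up significant
    print(f"P(all {n_studies} significant | effect is real) = {p_unanimous:.1e}")
    # ~1.1e-08 -- so unanimity itself points to a file drawer and/or p-hacking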