In my case I don't have a huge amount of time to chase down every rabbit hole, but I'd love to accelerate my intuition for LLMs. Multiple points of intuition or comparison really help. I'm also not a Python expert - what you see and what I see from a line of code will be quite different.
The author is attempting to build an explicit mental model of what a bunch of weights are "doing". But that's not really what training produces: the weights are just whatever minimizes the loss function.
People try to (and often do) build intuition for which architectures will work given the layout of the data. But the reason models are so big now is that trying to understand what the model is "doing" in a way humans understand didn't work out so well.
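To make that concrete, here's a minimal sketch (plain PyTorch, not the author's code, toy data I made up) of what "minimizing the loss function" means in practice: the update rule only ever consults the gradient of the loss, never any human-readable notion of what a weight represents.

    # Minimal sketch: training asks "which direction reduces the loss?",
    # never "what does this weight mean?"
    import torch

    torch.manual_seed(0)

    # Toy data: noisy samples of y = 3x + 1.
    x = torch.randn(100, 1)
    y = 3 * x + 1 + 0.1 * torch.randn(100, 1)

    model = torch.nn.Linear(1, 1)  # two weights, no "meaning" attached
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    for step in range(200):
        loss = torch.nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()  # gradient of the loss w.r.t. the weights
        opt.step()       # nudge weights downhill; that's all training is

    print(model.weight.item(), model.bias.item())  # ends up near 3 and 1

The weights land near 3 and 1 not because anything "understood" the line, but because those values happen to be where the loss is smallest.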