I disagree. Even things that seem obvious in retrospect take some time to figure out.
ResNets, batch norm, dropout, etc. are examples of this.
And I don't think transformers are obvious.