
Yes, but you need to take care of the indices of the arguments when calculating the derivative expression. This is what we do.

For example, if you have f_{i,j} = (x_i)^2, then the derivative (of some loss) w.r.t. x will be: dx_i = \sum_j df_{i,j} 2 x_i, where df_{i,j} and dx_i denote the derivatives of the loss w.r.t. f_{i,j} and x_i. The sum is needed because the argument x does not depend on the index j, and thus x_i is affected by every element of df_{i,j} along j.
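A quick way to convince yourself of the sum over j is a finite-difference check. This is only a sketch in plain Python, not our implementation: the loss is assumed to be a weighted sum of f, so df_{i,j} is just the weight w[i][j], and the names forward, loss, backward and numeric_grad are made up for illustration.

```python
# f[i][j] = x[i]**2 for every j. The (assumed) loss is sum_{i,j} w[i][j] * f[i][j],
# so the upstream derivative df[i][j] is simply w[i][j].

def forward(x, num_j):
    return [[xi ** 2 for _ in range(num_j)] for xi in x]

def loss(x, w):
    f = forward(x, len(w[0]))
    return sum(w[i][j] * f[i][j] for i in range(len(x)) for j in range(len(w[0])))

def backward(x, df):
    # dx_i = sum_j df_{i,j} * 2 * x_i
    return [sum(df_i) * 2.0 * xi for xi, df_i in zip(x, df)]

def numeric_grad(x, w, eps=1e-6):
    # Central differences on the loss, one component of x at a time.
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += eps
        xm[i] -= eps
        grad.append((loss(xp, w) - loss(xm, w)) / (2 * eps))
    return grad

x = [1.0, -2.0, 0.5]
w = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # plays the role of df
analytic = backward(x, w)
numeric = numeric_grad(x, w)
print(analytic)
print(numeric)
```

If you drop the sum and use only one j, the two printed gradients stop agreeing, which is exactly the index mistake described above.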

Another example would be: f_i = (x_{i,i})^2, i.e. taking the squares of the diagonal of the matrix x. Here the derivative is dx_{i,j} = \delta_{i,j} 2 df_i x_{i,i}, where \delta_{i,j} is the Kronecker delta, because off-diagonal elements have zero derivatives.
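The diagonal case can be checked the same way. Again a plain-Python sketch under the assumption that the loss is a weighted sum of f_i (so df_i is the weight w_vec[i]); the helper names are hypothetical and only serve the illustration.

```python
# f[i] = x[i][i]**2. The (assumed) loss is sum_i w[i] * f[i], so df[i] = w[i].

def diag_squares(x):
    return [x[i][i] ** 2 for i in range(len(x))]

def diag_loss(x, w):
    return sum(wi * fi for wi, fi in zip(w, diag_squares(x)))

def diag_backward(x, df):
    # dx_{i,j} = delta_{i,j} * 2 * df_i * x_{i,i}; off-diagonal entries are zero.
    n = len(x)
    return [[2.0 * df[i] * x[i][i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def diag_numeric_grad(x, w, eps=1e-6):
    # Central differences on the loss, one matrix entry at a time.
    n = len(x)
    grad = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            xp = [row[:] for row in x]
            xm = [row[:] for row in x]
            xp[i][j] += eps
            xm[i][j] -= eps
            grad[i][j] = (diag_loss(xp, w) - diag_loss(xm, w)) / (2 * eps)
    return grad

x_mat = [[1.0, 2.0], [3.0, -4.0]]
w_vec = [0.5, 0.25]  # plays the role of df
dx_analytic = diag_backward(x_mat, w_vec)
dx_numeric = diag_numeric_grad(x_mat, w_vec)
print(dx_analytic)
print(dx_numeric)
```

The off-diagonal entries of the numeric gradient come out as zero because perturbing x_{i,j} with i != j never changes the loss.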

For such simple expressions it's trivial, but for complex expressions it's error-prone when you do it by hand.



