
The "missing 1" is a waste category that is implicitly rescaled.

The formulation with an explicit 1 is used in binary softmax, and the one with an implicit (unseen) 1 is used in multinomial softmax. I suspect this is the old "notation B looks silly by notation A's standards" problem.
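
To make that concrete, here is a minimal sketch of how I read the two notations. The function names are my own, and `softmax_explicit_one` is just an illustrative name for the variant with a 1 in the denominator. The explicit 1 is exp(0), i.e. an extra category whose logit is pinned at 0. Plain multinomial softmax is shift-invariant, so one logit can always be shifted to 0, which is where the 1 hides.

```python
import math

def softmax(logits):
    # Standard multinomial softmax; subtracting the max is for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    # Binary softmax with the explicit 1 in the denominator.
    return 1.0 / (1.0 + math.exp(-x))

def softmax_explicit_one(logits):
    # exp(z_i) / (1 + sum_j exp(z_j)); the 1 is exp(0) for an extra
    # "waste" category whose logit is fixed at 0 (illustrative name).
    denom = 1.0 + sum(math.exp(z) for z in logits)
    return [math.exp(z) / denom for z in logits]

x = 1.7
# Binary case: the sigmoid is a two-way softmax over [x, 0].
binary_matches = math.isclose(sigmoid(x), softmax([x, 0.0])[0])

z = [0.3, -1.2, 2.0]
# Explicit-1 form equals ordinary softmax with a zero logit appended.
explicit = softmax_explicit_one(z)
appended = softmax(z + [0.0])[:-1]

# Implicit form: shift the logits so the last one is 0, and the
# ordinary softmax turns into the explicit-1 form over the rest.
shifted = [zi - z[-1] for zi in z[:-1]]
implicit = softmax_explicit_one(shifted)
plain = softmax(z)[:-1]
```

In both checks the probabilities agree, so under these assumptions the two forms differ only in whether the zero-logit category is written out.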



