
Curious, why do you need to construct these as class instances, like operation = new Exp() ? Seems like a lot of extra overhead constructing those objects. Why not just have Exp contain static methods for forwards and backwards?

[edit] Never mind, I missed the cache step. I'm still not sure it wouldn't be more performant to keep the caches as plain objects in one central place rather than calling new on every op...?




I built the entire backpropagation around the Operation objects. Each one stores data from the forward pass in its cache and serves as the connection between tensors in the graph. Each tensor has a “parent”, a “child” and an “operation”, which record which tensors generated it, which tensors it generated, and how it was generated (which operation). I could store the backward function inside each tensor instead of an Operation object, but I chose the slightly more verbose option because I think it is a little more interpretable and makes adding new operations simpler.
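To make that concrete, here is a rough sketch of the pattern, not the library's actual code. The `Tensor` class, its field names (`parents`, `children`, `operation`) and the scalar-only values are assumptions made up for illustration; only the idea of an `Exp` operation instance holding a cache comes from the discussion above.

```javascript
// Hypothetical scalar "tensor": just enough to show the graph links.
class Tensor {
  constructor(value, parents = [], operation = null) {
    this.value = value;
    this.grad = 0;
    this.parents = parents;     // who generated this tensor
    this.children = [];         // what tensors it generated
    this.operation = operation; // how it was generated
  }

  // Accumulate the incoming gradient, then hand it to the operation
  // that produced this tensor (if any) so it can reach the parents.
  backward(grad = 1) {
    this.grad += grad;
    if (this.operation) this.operation.backward(grad);
  }
}

// One instance per call, so each node in the graph owns its own cache.
class Exp {
  forward(a) {
    const out = new Tensor(Math.exp(a.value), [a], this);
    a.children.push(out);
    // Cache what the backward pass will need.
    this.cache = { input: a, output: out.value };
    return out;
  }

  backward(grad) {
    const { input, output } = this.cache;
    // d/dx exp(x) = exp(x), which is the cached forward output.
    input.backward(grad * output);
  }
}

const x = new Tensor(2);
const y = new Exp().forward(x);
y.backward();
console.log(x.grad); // gradient of exp(x) at x = 2
```

With static methods the cache would have to live somewhere else, keyed by the output tensor. That may well be cheaper, but adding a new operation would then touch two places instead of one class.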



