Curious, why do you need to construct these as class instances, like operation =...

eduardoleao052 · 2024-03-29T03:51:58

I centralized the entire backpropagation around the Operation objects. They store the data from the forward pass in a cache and serve as the connections between tensors in the graph. Each tensor has a “parent”, a “child” and an “operation”. These store which tensors generated it, which tensors it generated, and how it was generated (which operation). I could store the backward function inside each tensor instead of using an Operation object, but I chose the slightly more verbose option because I think it is a little more interpretable and makes it simpler to add new operations.
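To make the design concrete, here is a minimal sketch in Python of what an Operation-centered autograd could look like. It is not the library's actual code: the names `Tensor`, `Operation`, `Add`, `Mul`, `_link`, `parents` and `children` are hypothetical, and it uses plain scalars to stay short.

```python
class Operation:
    """Base class: one instance per forward call, holding its own cache."""

    def forward(self, *inputs):
        raise NotImplementedError

    def backward(self, grad):
        raise NotImplementedError

    def _link(self, out, *inputs):
        # Wire the graph: the output records its parents and the operation
        # that produced it; each input records the output as a child.
        out.parents = list(inputs)
        out.operation = self
        for t in inputs:
            t.children.append(out)
        return out


class Tensor:
    def __init__(self, data):
        self.data = float(data)
        self.grad = 0.0
        self.parents = []      # tensors that generated this one
        self.children = []     # tensors this one generated
        self.operation = None  # Operation instance that produced it

    def __add__(self, other):
        return Add().forward(self, other)

    def __mul__(self, other):
        return Mul().forward(self, other)

    def backward(self):
        # Topologically order the graph so every tensor receives its full
        # gradient before passing it on to its parents.
        order, seen = [], set()

        def visit(t):
            if id(t) in seen:
                return
            seen.add(id(t))
            for p in t.parents:
                visit(p)
            order.append(t)

        visit(self)
        self.grad = 1.0
        for t in reversed(order):
            if t.operation is not None:
                t.operation.backward(t.grad)


class Add(Operation):
    def forward(self, a, b):
        self.cache = (a, b)  # saved for the backward pass
        return self._link(Tensor(a.data + b.data), a, b)

    def backward(self, grad):
        a, b = self.cache
        a.grad += grad
        b.grad += grad


class Mul(Operation):
    def forward(self, a, b):
        self.cache = (a, b)
        return self._link(Tensor(a.data * b.data), a, b)

    def backward(self, grad):
        a, b = self.cache
        a.grad += grad * b.data
        b.grad += grad * a.data


x, y = Tensor(2.0), Tensor(3.0)
z = x * y + x
z.backward()
print(z.data, x.grad, y.grad)  # 8.0 4.0 2.0
```

In this sketch, adding a new operation means writing one more `Operation` subclass with a `forward` and a `backward`, without touching `Tensor.backward`. The alternative mentioned above, storing a backward closure on each tensor, would be shorter but would keep the cached forward data implicit inside the closure.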