Hacker News

Forward AD is the pushforward of a tangent vector (an element of the tangent space); reverse AD is the pullback of a cotangent vector (an element of the cotangent space). The duality between tangent and cotangent spaces is the same duality notion that shows up in optimization. I'm only passingly familiar with discrete optimization, but I would suspect the notion carries over there too. That's not to say that they are fundamentally the same, or that writing this down helps anybody in any way, but a lot of these "dual" notions do have some sort of dual vector space under the hood.
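To make that pairing concrete: if f has Jacobian J at a point, forward mode pushes a tangent vector v forward as Jv, reverse mode pulls a cotangent vector w back as Jᵀw, and the two maps are adjoint: ⟨Jv, w⟩ = ⟨v, Jᵀw⟩. A minimal hand-written sketch (the toy function and all names here are my own, not from any AD library):

```python
import math

def f(x, y):
    """Toy function R^2 -> R^2: f(x, y) = (x*y, sin(x))."""
    return (x * y, math.sin(x))

def jvp(x, y, vx, vy):
    """Forward mode: pushforward of the tangent (vx, vy), i.e. J @ v."""
    # Jacobian rows at (x, y): d(x*y) = (y, x), d(sin x) = (cos x, 0)
    return (y * vx + x * vy, math.cos(x) * vx)

def vjp(x, y, wa, wb):
    """Reverse mode: pullback of the cotangent (wa, wb), i.e. J^T @ w."""
    return (y * wa + math.cos(x) * wb, x * wa)

# The duality in question is the adjoint identity <J v, w> == <v, J^T w>.
x, y = 1.3, -0.7
v, w = (0.5, 2.0), (-1.0, 0.25)
jv = jvp(x, y, *v)
jtw = vjp(x, y, *w)
lhs = jv[0] * w[0] + jv[1] * w[1]
rhs = v[0] * jtw[0] + v[1] * jtw[1]
assert abs(lhs - rhs) < 1e-12
```

In a real system the Jacobian is never materialized; forward and reverse mode compute these two actions directly, which is exactly why they are dual rather than merely related.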



Yeah, but all you're really describing here is linear algebra. Vector spaces and linearity are a significant part of every single discipline the grandparent commenter mentioned, but they picked out duality.

I would agree with the critique: I don't think highlighting duality here is particularly useful. For example, the way dual numbers are used to extend the reals for automatic differentiation doesn't have a deep connection to duality in vector spaces. It's just a very general semantic concept that describes pairs of things. But it doesn't say that any given pair of dual things is related to another pair of dual things.
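For reference, this is how dual numbers mechanize forward-mode AD: carrying an ε with ε² = 0 through ordinary arithmetic makes the ε-coefficient track the derivative, with no appeal to dual vector spaces at all. A minimal sketch (the class and helper names are my own):

```python
class Dual:
    """a + b*eps with eps**2 == 0; the eps part carries the derivative."""
    def __init__(self, real, eps=0.0):
        self.real, self.eps = real, eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps; the eps^2 term vanishes
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at x + eps; the eps coefficient of the result is f'(x)."""
    return f(Dual(x, 1.0)).eps

# d/dx (x^2 + 3x) at x = 2 is 2*2 + 3 = 7
print(derivative(lambda x: x * x + 3 * x, 2.0))  # → 7.0
```

Nothing in the mechanics above pairs a vector with a covector, which supports the point: the "dual" in dual numbers is a naming coincidence, not vector-space duality.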


> For example, the way dual numbers are used to extend the reals for automatic differentiation doesn't have a deep connection to duality in vector spaces.

They don't, because certain operations, such as optimization, are hard to reason about in linear spaces.

Don't get me wrong, I'm not shitting on vector spaces. All I'm saying is that some problems that are hard in vector spaces are easy in smooth spaces, and vice versa. Having these two APIs to the same space is much more powerful, because, again, you generalize over the conversions between the two spaces. You use whichever API is more appropriate in the particular context.

In some sense the linear spaces deal with things like infinity, while the smooth spaces deal with cyclical things (signals, wavelets, modular arithmetic).


Quite a bit of optimization is easy to reason about in linear algebra. Take linear and mixed-integer programming, for example; and convex optimization subsumes linear optimization in general. There is a lot of nonlinear optimization, but I can assure you with extremely high confidence that the common thread you're seeing here isn't duality but, more abstractly, linearity.

Likewise cyclic things show up all the time in purely algebraic (read: discrete, non-smooth) contexts. We have that in vector spaces, group theory, rings, modules, etc.


They show up separately but not in tandem.

The canonical example is robotic motion and the reason why Lie theory is used there. You have very discrete states (positions) that you want to interpolate between smoothly.
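A toy version of that idea on SO(2) (real robotics uses SO(3)/SE(3), and all the helper names here are my own): interpolate between two rotations by taking the log of the relative rotation, scaling it by t, and exponentiating back, so every intermediate pose stays on the manifold.

```python
import math

def rot(theta):
    """Exponential map for SO(2): angle -> 2x2 rotation matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def log_so2(r):
    """Log map for SO(2): rotation matrix -> angle."""
    return math.atan2(r[1][0], r[0][0])

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(r):
    return [[r[0][0], r[1][0]], [r[0][1], r[1][1]]]

def interp(r0, r1, t):
    """Geodesic interpolation: r0 * exp(t * log(r0^-1 * r1))."""
    rel = matmul(transpose(r0), r1)  # for rotations, transpose == inverse
    return matmul(r0, rot(t * log_so2(rel)))

# Halfway between a 0.2 rad and a 1.0 rad rotation is a 0.6 rad rotation.
mid = interp(rot(0.2), rot(1.0), 0.5)
assert all(abs(mid[i][j] - rot(0.6)[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```

The discrete data (two poses) and the smooth structure (the exp/log maps) cooperate here, which is the "in tandem" part: naive per-entry matrix interpolation would leave the group entirely.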


> For example, the way dual numbers are used to extend the reals for automatic differentiation doesn't have a deep connection to duality in vector spaces.

Yes, the right way to think about dual numbers (especially once you generalize them beyond a single ε with ε² = 0) is as tangent vectors (sections of the tangent bundle). I've never really liked the "dual number" terminology here. That's why I deliberately chose the duality of forward- and reverse-mode AD, because that notion of duality agrees with the underlying linear algebra (or, more generally, differential geometry). I do agree it's a mess of terminology.



