Hacker News

Forward AD is the pushforward of a tangent vector (an element of the tangent space); reverse AD is the pullback of a cotangent vector (an element of the cotangent space). The duality between tangent and cotangent spaces is the same duality notion that shows up in optimization. I'm only passingly familiar with discrete optimization, but I would suspect the notion carries over there too. That's not to say that they are fundamentally the same, or that writing this down helps anybody in any way, but a lot of these "dual" notions do have some sort of dual vector space under the hood.
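To make that pairing concrete: if f has Jacobian J at a point, forward mode pushes a tangent vector v forward as Jv, reverse mode pulls a cotangent vector w back as Jᵀw, and the two maps are adjoint: ⟨Jv, w⟩ = ⟨v, Jᵀw⟩. A minimal hand-written sketch (the toy function and all names here are my own, not from any AD library):

```python
import math

def f(x, y):
    """Toy function R^2 -> R^2: f(x, y) = (x*y, sin(x))."""
    return (x * y, math.sin(x))

def jvp(x, y, vx, vy):
    """Forward mode: pushforward of the tangent (vx, vy), i.e. J @ v."""
    # Jacobian rows at (x, y): d(x*y) = (y, x), d(sin x) = (cos x, 0)
    return (y * vx + x * vy, math.cos(x) * vx)

def vjp(x, y, wa, wb):
    """Reverse mode: pullback of the cotangent (wa, wb), i.e. J^T @ w."""
    return (y * wa + math.cos(x) * wb, x * wa)

# The duality in question is the adjoint identity <J v, w> == <v, J^T w>.
x, y = 1.3, -0.7
v, w = (0.5, 2.0), (-1.0, 0.25)
jv = jvp(x, y, *v)
jtw = vjp(x, y, *w)
lhs = jv[0] * w[0] + jv[1] * w[1]
rhs = v[0] * jtw[0] + v[1] * jtw[1]
assert abs(lhs - rhs) < 1e-12
```

In a real system the Jacobian is never materialized; forward and reverse mode compute these two actions directly, which is exactly why they are dual rather than merely related.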



Yeah, but all you're really describing here is linear algebra. Vector spaces and linearity are a significant part of every single discipline the grandparent commenter mentioned, but they picked out duality.

I would agree with the critique: I don't think highlighting duality here is particularly useful. For example, the way dual numbers are used to extend the reals for automatic differentiation doesn't have a deep connection to duality in vector spaces. It's just a very general semantic concept that describes pairs of things. But it doesn't say that any given pair of dual things is related to another pair of dual things.
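For reference, this is how dual numbers mechanize forward-mode AD: carrying an ε with ε² = 0 through ordinary arithmetic makes the ε-coefficient track the derivative, with no appeal to dual vector spaces at all. A minimal sketch (the class and helper names are my own):

```python
class Dual:
    """a + b*eps with eps**2 == 0; the eps part carries the derivative."""
    def __init__(self, real, eps=0.0):
        self.real, self.eps = real, eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps; the eps^2 term vanishes
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at x + eps; the eps coefficient of the result is f'(x)."""
    return f(Dual(x, 1.0)).eps

# d/dx (x^2 + 3x) at x = 2 is 2*2 + 3 = 7
print(derivative(lambda x: x * x + 3 * x, 2.0))  # → 7.0
```

Nothing in the mechanics above pairs a vector with a covector, which supports the point: the "dual" in dual numbers is a naming coincidence, not vector-space duality.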


> For example, the way dual numbers are used to extend the reals for automatic differentiation doesn't have a deep connection to duality in vector spaces.

They don't, because certain operations, such as optimization, are hard to reason about in linear spaces.

Don't get me wrong, I'm not shitting on vector spaces. All I'm saying is that some problems that are hard in vector spaces are easy in smooth spaces, and vice versa. Having these two APIs to the same space is much more powerful, because, again, you generalize over the conversions between the two spaces. You use whichever API is more appropriate in the particular context.

In some sense the linear spaces deal with things like infinity, while the smooth spaces deal with cyclical things (signals, wavelets, modular arithmetic).


Quite a bit of optimization is easy to reason about in linear algebra. Take linear and mixed-integer programming, for example; and convex optimization subsumes linear optimization in general. There is a lot of nonlinear optimization, but I can assure you with extremely high confidence that the common thread you're seeing here isn't duality but, more abstractly, linearity.

Likewise cyclic things show up all the time in purely algebraic (read: discrete, non-smooth) contexts. We have that in vector spaces, group theory, rings, modules, etc.


They show up separately but not in tandem.

The canonical example is robotic motion and the reason why Lie theory is used there. You have very discrete states (positions) that you want to interpolate between smoothly.
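A toy version of that idea on SO(2) (real robotics uses SO(3)/SE(3), and all the helper names here are my own): interpolate between two rotations by taking the log of the relative rotation, scaling it by t, and exponentiating back, so every intermediate pose stays on the manifold.

```python
import math

def rot(theta):
    """Exponential map for SO(2): angle -> 2x2 rotation matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def log_so2(r):
    """Log map for SO(2): rotation matrix -> angle."""
    return math.atan2(r[1][0], r[0][0])

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(r):
    return [[r[0][0], r[1][0]], [r[0][1], r[1][1]]]

def interp(r0, r1, t):
    """Geodesic interpolation: r0 * exp(t * log(r0^-1 * r1))."""
    rel = matmul(transpose(r0), r1)  # for rotations, transpose == inverse
    return matmul(r0, rot(t * log_so2(rel)))

# Halfway between a 0.2 rad and a 1.0 rad rotation is a 0.6 rad rotation.
mid = interp(rot(0.2), rot(1.0), 0.5)
assert all(abs(mid[i][j] - rot(0.6)[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```

The discrete data (two poses) and the smooth structure (the exp/log maps) cooperate here, which is the "in tandem" part: naive per-entry matrix interpolation would leave the group entirely.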


> For example, the way dual numbers are used to extend the reals for automatic differentiation doesn't have a deep connection to duality in vector spaces.

Yes, the right way to think about dual numbers (especially once you generalize them beyond a single ε with ε² = 0) is as tangent vectors (sections of the tangent bundle). I've never really liked the "dual number" terminology here. That's why I deliberately chose the duality of forward- and reverse-mode AD, because that notion of duality agrees with the underlying linear algebra (or, more generally, differential geometry). I do agree it's a mess of terminology.



