ynik 8 months ago | on: Transformers Can Do Arithmetic with the Right Embe...

I think that's far from the only problem. To me the most obvious problem is that we use right-to-left numbers (think about the order in which you write digits when doing long addition) in a left-to-right language. Without a special number-flipping step, the transformer is forced to produce the output token by token, i.e. from left to right, most significant digit first. Without the ability to store additional internal state, this turns addition into an O(N²) problem purely because of the suboptimal output ordering!
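
To make the O(N²) argument concrete, here is a rough sketch in Python. It is only an illustration of the counting argument, not a model of what a transformer actually computes internally. The function names `carry_into` and `add_msd_first` are made up for this example. The assumption is that no carry can be remembered between output digits, so each digit recomputes its incoming carry from all lower digits.

```python
def carry_into(a: str, b: str, pos: int) -> tuple[int, int]:
    """Recompute, from scratch, the carry arriving at index `pos` (0 = leftmost).

    Returns (carry, number of digit pairs read to get it).
    `a` and `b` must be equal-length digit strings.
    """
    carry = 0
    reads = 0
    # Walk from the least significant digit up to, but not including, `pos`.
    for i in range(len(a) - 1, pos, -1):
        carry = (int(a[i]) + int(b[i]) + carry) // 10
        reads += 1
    return carry, reads


def add_msd_first(a: str, b: str) -> tuple[str, int]:
    """Add two non-negative decimal strings, emitting the most significant digit first.

    No state is kept between output digits (the assumption being illustrated),
    so the total number of digit-pair reads grows quadratically with length.
    """
    n = max(len(a), len(b))
    # Pad to equal length, plus one leading zero for a possible final carry.
    a, b = a.zfill(n + 1), b.zfill(n + 1)
    out = []
    total_reads = 0
    for pos in range(n + 1):
        carry, reads = carry_into(a, b, pos)
        total_reads += reads + 1  # +1 for reading the digit pair at `pos` itself
        out.append(str((int(a[pos]) + int(b[pos]) + carry) % 10))
    return "".join(out).lstrip("0") or "0", total_reads
```

With padded length m = N + 1, the read count is m(m + 1)/2, so doubling the number of digits roughly quadruples the work.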

Shrezzing 8 months ago

The paper discusses this, and its approach includes a number-flip stage, so numbers are formatted with their least significant digit first.
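
For anyone wondering why the flipped format helps, here is a minimal sketch in Python. It is my own illustration and not code from the paper, and the names `add_lsd_first` and `add_flipped` are hypothetical. With the least significant digit first, each output digit depends only on the current digit pair and a single carry, so the sum can be emitted in one left-to-right pass.

```python
def add_lsd_first(a: str, b: str) -> str:
    """Add two non-negative integers given as digit strings with the
    least significant digit first; the result uses the same ordering."""
    out = []
    carry = 0
    for i in range(max(len(a), len(b))):
        da = int(a[i]) if i < len(a) else 0
        db = int(b[i]) if i < len(b) else 0
        # One digit pair plus one carry is all the state needed per step.
        carry, digit = divmod(da + db + carry, 10)
        out.append(str(digit))
    if carry:
        out.append(str(carry))
    return "".join(out)


def add_flipped(x: int, y: int) -> int:
    """Flip both operands, add in a single pass, then flip the result back."""
    flipped_sum = add_lsd_first(str(x)[::-1], str(y)[::-1])
    return int(flipped_sum[::-1])
```

For example, `add_lsd_first("99", "1")` yields `"001"`, which is 100 written least significant digit first.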