Hacker News new | past | comments | ask | show | jobs | submit login

I thinks that's far from the only problem. To me the most obvious problem is that we use right-to-left numbers (think about the order you're writing digits when doing long addition) in a left-to-right language. Without a special number-flipping step; the transformer is forced to produce the output token-by-token, i.e. from left-to-right. Without the ability to store additional internal state, this turns addition into an O(N²) problem purely due to the suboptimal output ordering!



The paper discusses this, and the approach taken in the paper implements a number-flip stage, so numbers are formatted with their least significant figure first.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: