
That requires converting from a weird, unhelpful form into a more helpful one first. So yes, but the tokenisation makes things harder because it adds an extra step - they need to learn how these things relate while having a significant amount of the structure hidden from them.



This conversion is inherent in the problem of language and maths though - Two, too (misspelt), 2, duo, dos, $0.02, one apple next to another apple, 0b10 and 二 can all represent the (fairly abstract) concept of two.

The conversion to a helpful form is required anyway. (Also, let's remember that computers don't work in base 10, and there isn't really a reason to believe that base 10 is inherently great for LLMs either.)
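To make the "conversion to a helpful form" point concrete, here is a minimal Python sketch that maps several of the surface forms listed above onto the same integer. The `WORDS` table and the `to_number` helper are made up for illustration; "$0.02" and the two apples are left out because they need more context than a lookup can give.

```python
# Hypothetical lookup table; the entries are the surface forms from the comment above.
WORDS = {"two": 2, "too": 2, "duo": 2, "dos": 2, "二": 2}

def to_number(text):
    """Map one surface form to the integer it denotes (toy normaliser)."""
    s = text.strip().lower()
    if s in WORDS:
        return WORDS[s]
    if s.startswith("0b"):
        # Binary literal: int() accepts the 0b prefix when base=2.
        return int(s, 2)
    # Fall back to plain base-10 digits.
    return int(s)

forms = ["Two", "too", "2", "duo", "dos", "0b10", "二"]
values = [to_number(f) for f in forms]
print(values)  # every entry is the same integer
```

Every form ends up as the same value, which is the step a model has to learn implicitly before it can do any arithmetic at all.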


It is, but there's a reason I teach my son addition like this:

    hundreds | tens | ones

        1        2      3
    +   2        1      5
    -----------------------
        3        3      8
Rather than:

unoDOOOOS(third) {}{}{} [512354]_ = three"ate

* replace {}{}{} with addition, {}{} is subtraction unless followed by three spaces in which case it's also addition
* translate and correct any misspellings
* [512354] look up in your tables
* _ is 15
* dotted lines indicate repeated numbers

Technically they're doing the same thing, but we would assume one of them is harder to learn the fundamental concepts from.
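For what it's worth, the hundreds/tens/ones layout maps almost directly onto code. A rough Python sketch of column-wise addition with carry, assuming non-negative integers (the name `add_columns` is just for illustration):

```python
def add_columns(a, b):
    """Add two non-negative integers digit by digit, ones column first."""
    xs = [int(d) for d in reversed(str(a))]  # ones, tens, hundreds, ...
    ys = [int(d) for d in reversed(str(b))]
    result, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        x = xs[i] if i < len(xs) else 0  # pad the shorter number with zeros
        y = ys[i] if i < len(ys) else 0
        carry, digit = divmod(x + y + carry, 10)
        result.append(digit)
    if carry:
        result.append(carry)
    # Digits were collected ones-first, so reverse them before joining.
    return int("".join(str(d) for d in reversed(result)))

total = add_columns(123, 215)  # the example from the table above
print(total)  # 338
```

The whole procedure only works because each digit sits in a known column. The second notation hides exactly that alignment, which is the point of the comparison.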


Right, which is why testing arithmetic is a good way to test how well LLMs generalize their capabilities to non-text tasks. LLMs can in theory be excellent at it, but they aren't, due to how they are trained.



