Isn't it more that they don't have ready access to the much-more-fundamental concept of decimal numbers?

My understanding was that numbers are tokenized into chunks, and the model learns associations between the chunks, the same as if it were breaking apart English words.

So "2+2=4" isn't being treated that differently from "all's well that ends well." This might lead to a kind of Benny's Rules [0] situation, where sufficient brute-force can make a collection of overfitted non-arithmetic rules appear to work.

[0] https://blog.mathed.net/2011/07/rysk-erlwangers-bennys-conce...
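
To make the analogy concrete, here's a quick sketch using OpenAI's tiktoken library (assuming the o200k_base encoding used by GPT-4o; the exact splits vary by tokenizer) showing that an equation and an idiom both reduce to the same kind of token-chunk sequence:

    # pip install tiktoken
    import tiktoken

    enc = tiktoken.get_encoding("o200k_base")  # the GPT-4o encoding

    for text in ["2+2=4", "all's well that ends well"]:
        ids = enc.encode(text)
        # decode each token id back into its text chunk
        chunks = [enc.decode([i]) for i in ids]
        print(repr(text), "->", chunks)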




Current-gen LLMs tokenize numbers digit by digit, unlike earlier LLMs.


They don't, which you can easily check with any of the dozen web apps currently implementing the GPT-4o tokenizer.
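
Or check it locally in a couple of lines with tiktoken (a sketch assuming the o200k_base encoding; other vocabularies split differently):

    import tiktoken

    enc = tiktoken.get_encoding("o200k_base")  # GPT-4o's encoding
    ids = enc.encode("1234567890")
    print([enc.decode([i]) for i in ids])
    # On this vocabulary the digits come out in multi-digit
    # chunks (groups of up to three), not one token per digit.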


No, it doesn't help. Bloomberg tried digit-by-digit tokenization and it didn't seem to make much difference.


If anyone else is interested in the Bloomberg tokenizer:

https://medium.com/generative-ai-insights-for-business-leade...


Fascinating article!


It looks like the math-notation formatting didn't survive; for that you might want to see a PDF, e.g. https://people.wou.edu/~girodm/library/benny.pdf



