
Letting it decide how much time to spend computing an answer, instead of forcing it to spend the same amount per text token, could definitely help a lot. Right now it spends the same amount of compute on every token, which doesn't make sense: "hello, how are you" should take basically no processing, while a hard logical problem should take a lot.

Maybe these language models could be way smaller and cheaper if we added a recurse symbol that let them iterate many times when needed. Hard tasks would still cost a lot, but most banal conversation would be very cheap.




This does indeed improve efficiency significantly, see "Confident Adaptive Language Modeling": https://arxiv.org/abs/2207.07061
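The core idea of that line of work (early exiting) can be sketched in a few lines: run the layers one at a time and stop as soon as an intermediate prediction looks confident enough, so easy tokens use fewer layers than hard ones. This is a toy illustration with random weights, not the paper's actual implementation; the layer shapes, the residual update, and the confidence rule are all assumptions made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, VOCAB, N_LAYERS = 16, 50, 12

# Toy stand-ins for trained transformer layers and an unembedding matrix.
layers = [rng.normal(scale=0.1, size=(DIM, DIM)) for _ in range(N_LAYERS)]
unembed = rng.normal(scale=0.1, size=(DIM, VOCAB))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def forward_with_early_exit(h, threshold=0.9):
    """Return (predicted token, layers used).

    After each layer, decode an intermediate prediction and stop as soon
    as its top-1 probability clears `threshold` -- per-token adaptive
    compute instead of a fixed cost for every token.
    """
    for used in range(1, N_LAYERS + 1):
        h = np.tanh(h @ layers[used - 1]) + h   # toy residual "layer"
        probs = softmax(h @ unembed)            # intermediate prediction
        if probs.max() >= threshold:            # confident enough: exit
            break
    return int(probs.argmax()), used

token, used = forward_with_early_exit(rng.normal(size=DIM))
print(token, used)  # how deep this particular input had to go
```

Lowering the threshold trades accuracy for compute: a strict threshold forces most tokens through the full stack, a loose one lets most of them exit after a layer or two.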



