delusional on May 6, 2023 | on: It looks like GPT-4-32k is rolling out

I think he's talking about computational efficiency. If you're loading in 29k tokens and you're expecting to use those again, you wouldn't need to do the whole matrix multiplication song and dance again if you just kept the old buffers around for the next prompt.
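To make the "keep the old buffers around" idea concrete, here is a rough sketch of what is usually called key/value caching, in plain Python. Everything in it is a made-up toy: a single attention layer, random weights, and hypothetical names (`KVCache`, `extend`, `fork`). It says nothing about how GPT-4 is actually served. It only shows that the keys and values computed for a shared prefix can be stored once and reused for a later prompt, with the same result as recomputing everything.

```python
import math
import random

D = 4  # toy embedding width, purely illustrative
rng = random.Random(0)

def rand_matrix():
    return [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(D)]

def rand_tokens(n):
    # Stand-ins for token embeddings; a real model would look these up.
    return [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(n)]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Hypothetical query/key/value projections for one attention head.
WQ, WK, WV = rand_matrix(), rand_matrix(), rand_matrix()

def attend(q, keys, values):
    # Scaled dot-product attention of one query over all stored keys/values.
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(D) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(D)]

class KVCache:
    """The 'old buffers': keys and values of tokens already processed."""
    def __init__(self, keys=None, values=None):
        self.keys = keys if keys is not None else []
        self.values = values if values is not None else []

    def fork(self):
        # Shallow copy so several follow-up prompts can share one prefix.
        return KVCache(list(self.keys), list(self.values))

def extend(cache, tokens):
    # Causal attention over the new tokens only; earlier tokens are
    # represented by whatever is already in the cache.
    outputs = []
    for x in tokens:
        cache.keys.append(matvec(WK, x))
        cache.values.append(matvec(WV, x))
        outputs.append(attend(matvec(WQ, x), cache.keys, cache.values))
    return outputs

prefix = rand_tokens(6)   # the big shared context (29k tokens in the comment)
suffix = rand_tokens(3)   # the next prompt appended to it

# Without caching: process prefix and suffix together from scratch.
full = extend(KVCache(), prefix + suffix)[-len(suffix):]

# With caching: process the prefix once, then reuse it for the suffix.
prefix_cache = KVCache()
extend(prefix_cache, prefix)
cached = extend(prefix_cache.fork(), suffix)
```

Note that this only skips recomputing the prefix's projections. Each new token still has to attend over every cached entry, and a real multi-layer model would need one such cache per layer, so the memory cost of holding the buffers between requests is not trivial.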

weird-eye-issue on May 6, 2023

I don't think this can necessarily be optimized, at least with how the models work right now.
