
Thank you for the reference to the CUDA file [1]. It's always nice to see how complex data structures are handled on GPUs. Does anyone know what the bit patterns starting at line 1529 are for?

[1] https://github.com/ggerganov/llama.cpp/blob/master/ggml-cuda...




Those are used for dequantization, which involves table lookups plus some arithmetic to adjust the looked-up values.
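
As a rough illustration of the general idea (table lookup followed by an adjustment), here is a minimal Python sketch. It is not the llama.cpp code: the codebook values, the two-codes-per-byte packing order, the per-block scale, and the name `dequantize_block` are all assumptions made up for this example.

```python
# Hypothetical 16-entry codebook mapping each 4-bit code to a signed level.
# These values are illustrative and are NOT taken from llama.cpp.
CODEBOOK = [-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]


def dequantize_block(packed: bytes, scale: float) -> list[float]:
    """Unpack two 4-bit codes per byte, look each one up, then rescale."""
    out = []
    for byte in packed:
        lo = byte & 0x0F          # low nibble: first code (assumed order)
        hi = (byte >> 4) & 0x0F   # high nibble: second code (assumed order)
        # Table lookup, then the "adjusting math": multiply by the block scale.
        out.append(CODEBOOK[lo] * scale)
        out.append(CODEBOOK[hi] * scale)
    return out


# Two packed bytes hold four quantized weights.
print(dequantize_block(bytes([0x80, 0xF1]), 0.5))
```

Real formats tend to pack more into each block (several scales, offsets, sign bits), which is likely where the longer constant tables come from, but the lookup-then-rescale shape is the same.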



