What are the limitations on which LLMs (specific transformer variants, etc.) llama.cpp can run? Does it require the input model/weights to be in some self-describing format like ONNX, which supports different model architectures as long as they are built out of specific module/layer types, or does it more narrowly support only transformer models parameterized by depth, width, etc.?
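For context on what "self-describing" means in llama.cpp's case: its GGUF files store the architecture name as a metadata key ("general.architecture"), and the loader dispatches on that to one of the architectures implemented in C++, rather than composing an arbitrary graph the way ONNX does. A minimal sketch in Python of reading that key straight out of a GGUF file, following the published GGUF layout (the filename model.gguf is a placeholder; this is not llama.cpp's own loader code):

    import struct

    def read_str(f):
        # GGUF strings: uint64 length followed by UTF-8 bytes.
        (n,) = struct.unpack("<Q", f.read(8))
        return f.read(n).decode("utf-8")

    def skip_value(f, vtype):
        # Fixed-size scalars, keyed by GGUF value-type id.
        sizes = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4,
                 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
        if vtype in sizes:
            f.read(sizes[vtype])
        elif vtype == 8:  # string
            read_str(f)
        elif vtype == 9:  # array: element type + count + elements
            (etype,) = struct.unpack("<I", f.read(4))
            (count,) = struct.unpack("<Q", f.read(8))
            for _ in range(count):
                skip_value(f, etype)

    with open("model.gguf", "rb") as f:
        assert f.read(4) == b"GGUF"  # magic
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
        for _ in range(n_kv):
            key = read_str(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            if key == "general.architecture" and vtype == 8:
                print("architecture:", read_str(f))  # e.g. "llama", "falcon"
                break
            skip_value(f, vtype)

So the format is self-describing in the sense that the file names its architecture and hyperparameters in metadata, but llama.cpp will only run it if that architecture string matches one it has hand-written support for.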


