It's complicated, but basically because *most* are llama architecture. Meta all ... | Hacker News

brucethemoose2 9 months ago | on: Gemma.cpp: lightweight, standalone C++ inference e...

It's complicated, but basically because most are llama architecture. Meta all but set the standard for open source LLMs when they released llama1, and anyone trying to deviate from it has run into trouble because their models don't work with the hyper-optimized llama runtimes.

Also, there's a lot of magic going on behind the scenes with the configs stored in GGUF/Hugging Face format models, and in the libraries that use them. There are different tokenizers, but they mostly follow the same standards.

null_point 9 months ago

I found the magic! https://github.com/search?q=repo%3Aggerganov%2Fggml%20magic&...

null_point 9 months ago

Hey, c'mon now. Just being playful about the "magic" string used in GGUF files to detect that a file is in fact a GGUF file.
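
For anyone curious what that check amounts to, here is a rough sketch in Python rather than the actual ggml code. It assumes the file starts with the four ASCII bytes "GGUF" followed by a 32-bit version number, and that the file is little-endian (the common case). The function names `looks_like_gguf` and `read_gguf_version` are made up for illustration.

```python
import struct

# The four ASCII bytes expected at the very start of a GGUF file.
GGUF_MAGIC = b"GGUF"


def looks_like_gguf(path):
    """Return True if the file at `path` begins with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC


def read_gguf_version(path):
    """Return the format version stored right after the magic, or None.

    Assumption: the version is an unsigned 32-bit little-endian integer.
    Big-endian files would need ">I" instead.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

A loader would typically run a check like this first and bail out early on anything that fails it, before trying to parse the metadata and tensors.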