What are the limitations on which LLMs (specific transformer variants, etc.) llama.cpp can run? Does it require the input model/weights to be in some self-describing format like ONNX, which supports different model architectures as long as they are built out of specific module/layer types, or does it more narrowly support only transformer models parameterized by depth, width, etc.?
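For context on what "self-describing" means in llama.cpp's case: its GGUF files store the architecture name as a metadata key ("general.architecture"), and the loader dispatches on that to one of the architectures implemented in C++, rather than composing an arbitrary graph the way ONNX does. A minimal sketch in Python of reading that key straight out of a GGUF file, following the published GGUF layout (the filename model.gguf is a placeholder; this is not llama.cpp's own loader code):

    import struct

    def read_str(f):
        # GGUF strings: uint64 length followed by UTF-8 bytes.
        (n,) = struct.unpack("<Q", f.read(8))
        return f.read(n).decode("utf-8")

    def skip_value(f, vtype):
        # Fixed-size scalars, keyed by GGUF value-type id.
        sizes = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4,
                 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
        if vtype in sizes:
            f.read(sizes[vtype])
        elif vtype == 8:  # string
            read_str(f)
        elif vtype == 9:  # array: element type + count + elements
            (etype,) = struct.unpack("<I", f.read(4))
            (count,) = struct.unpack("<Q", f.read(8))
            for _ in range(count):
                skip_value(f, etype)

    with open("model.gguf", "rb") as f:
        assert f.read(4) == b"GGUF"  # magic
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
        for _ in range(n_kv):
            key = read_str(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            if key == "general.architecture" and vtype == 8:
                print("architecture:", read_str(f))  # e.g. "llama", "falcon"
                break
            skip_value(f, vtype)

So the format is self-describing in the sense that the file names its architecture and hyperparameters in metadata, but llama.cpp will only run it if that architecture string matches one it has hand-written support for.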


