llm is Simon's command-line front-end to a lot of the LLM APIs, local and cloud-based. Along with aider-chat, it's my main interface to any LLM work -- it works well for chat with a model, for one-off queries, and for piping text or output into an LLM chain. For people who live on the command line, or are just put off by web interfaces, it's a godsend.
About the only thing I need to look further abroad for is when I'm working multi-modally -- I know Simon and the community are mainly noodling over the best command line UX for that: https://github.com/simonw/llm/issues/331
I use a fair amount of aider - what does Simon's solution offer that aider doesn't? I'm usually using a mix of aider and the ChatGPT window. I use ChatGPT for one-off queries that aren't super context-heavy for my codebase, since API pricing can still add up and a lot of the time the questions I ask don't really need deep context about what I'm doing in the terminal. But when I'm in flow state and need deep integration with the files I'm changing, I switch over to aider with Sonnet - my subjective experience is that Anthropic's models are significantly better for that use case. Curious whether Simon's solution is more geared toward the first use case or the second.
The llm command is a general-purpose tool for writing shell scripts that use an LLM somehow. For example, generating some LLM output and sending it through a Unix pipeline. You can also use it interactively if you like working on the command line.
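For a concrete flavour of that, here's a minimal sketch (the file name and prompts are made up; `-s` sets a system prompt and `-m` would pick a specific model):

```bash
# Pipe a file into llm, then post-process the result with ordinary Unix tools
cat server.log | llm -s "Summarize the errors in this log as a bullet list" | tee error-summary.txt

# Or a one-off question straight from the shell
llm "Write a one-line jq filter that pulls the .id field out of each array element"
```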
It’s not specifically about chatting or helping you write code, though you could use it for that if you like.
Ollama can’t talk to OpenAI / Anthropic / etc. LLM gives you a single interface that can talk to both hosted and local models.
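To make "single interface" concrete, the same command shape works against either kind of backend (the model IDs below are just examples and depend on which plugins and API keys you have set up):

```bash
# Hosted model, via your OpenAI API key
llm -m gpt-4o-mini "Explain what a bloom filter is in two sentences"

# Local model served by Ollama, via the llm-ollama plugin
llm -m llama3.2:latest "Explain what a bloom filter is in two sentences"
```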
It also logs everything you do to a SQLite database, which is great for further analysis.
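If you want to poke at that database directly, it looks roughly like this (exact flags from memory, `llm logs --help` is the source of truth):

```bash
# Where the SQLite log database lives
llm logs path

# Show the most recent logged prompts and responses
llm logs -n 3

# Or browse the whole thing with Datasette, if you have it installed
datasette "$(llm logs path)"
```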
I use LLM and Ollama together quite a bit, because Ollama are really good at getting new models working and their server keeps those models in memory between requests.
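The glue between the two is the llm-ollama plugin (the model name below is just an example; `llm models` lists whatever you've actually pulled):

```bash
# Install the Ollama plugin for llm
llm install llm-ollama

# Models you've already pulled with `ollama pull` now show up here
llm models

# ...and can be prompted like any other llm model
llm -m llama3.2:latest "Give me three good names for a pet pelican"
```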
You can run llamafile as a server too, right? You still need to download GGUF files if you don't use one of their premade binaries, but even if llm isn't already set up to hit a running llamafile server, I'm sure that's easy to do.
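For what it's worth, llamafile exposes the same OpenAI-compatible endpoint as llama.cpp's server (port 8080 by default), so pointing llm at it should just be a matter of registering an extra OpenAI-style model. Rough sketch, with the config file location and field names from memory, so check the llm docs on OpenAI-compatible models:

```bash
# Register a running llamafile server with llm (config dir location may vary)
cat >> "$(dirname "$(llm logs path)")/extra-openai-models.yaml" <<'EOF'
- model_id: llamafile
  model_name: llamafile
  api_base: "http://localhost:8080/v1"
EOF

llm -m llamafile "Hello from llamafile"
```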
I haven't used Ollama, but from what I've seen, it seems to operate at a different level of abstraction compared to `llm`. I use `llm` to access both remote and local models through its plugin ecosystem[1]. One of the plugins allows you to use Ollama-served local models. This means you can use the same CLI interface with Ollama[2], as well as with OpenAI, Gemini, Anthropic, llamafile, llamacpp, mlc, and others. I select different models for different purposes. Recently, I've switched my default from OpenAI to Anthropic quite seamlessly.
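That default switch is only a couple of commands (the plugin, key, and model names below are my best guess at the current ones, so verify with `llm models` after installing):

```bash
# Install the Anthropic plugin and store an API key
llm install llm-anthropic
llm keys set anthropic

# Make a Claude model the default instead of an OpenAI one
llm models default claude-3-5-sonnet-latest

# Plain `llm "..."` now goes to Anthropic
llm "Explain the difference between a mutex and a semaphore"
```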
It looks like a multi-purpose terminal utility for bridging your shell, scripts, and programs to both local and remote LLM providers.
And it looks very handy! I'll use this myself, because I want to invoke OpenAI and other cloud providers the same way I do Ollama, piping things around, and this accomplishes that and more.
I guess you can also accomplish similar results, if all you're after is `/chat/completions` and the like, by configuring something like LiteLLM and connecting it to Ollama and any other service.
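Something like this, for instance - the config keys and default port are from memory, so treat it as a sketch and check the LiteLLM docs:

```bash
# Point a LiteLLM proxy at an Ollama model, then talk to it with any OpenAI-style client
cat > litellm_config.yaml <<'EOF'
model_list:
  - model_name: local-llama
    litellm_params:
      model: ollama/llama3
EOF

litellm --config litellm_config.yaml   # OpenAI-compatible proxy, port 4000 by default

curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "local-llama", "messages": [{"role": "user", "content": "hello"}]}'
```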