I just gave this a shot on my laptop and it works reasonably well considering it has no discrete GPU.
One thing I’m unsure of is how to pick a model. I downloaded the 7B one from Hugging Face, but how is anyone supposed to know what these models are for, or if they’re any good?
Read the README of the model.
You'll probably find some benchmark metrics there that can tell you more or less how "good" the model is, but keep in mind that it's not that hard to artificially boost those scores, so don't reject every model that isn't at the top of the benchmark.
I've listed some good starting models at the end of the post.
Most LLMs, like Qwen or Llama, are general-purpose; some are fine-tuned for specific tasks, like CodeQwen for programming.