Sure, except those smaller models are only useful for a handful of novelty and creative tasks. Give them a programming or logic problem and they fall flat on their faces.
As I mentioned in a comment upthread, I find ~30B models to be the minimum for getting somewhat reliable output. Though even ~70B models pale in comparison with popular cloud LLMs. Local LLMs just can't compete with the quality of cloud services, so they're not worth using for most professional tasks.