It's not necessarily an either-or. Your local LLM could offload hard problems to a hosted service by encoding your request, its context, and relevant information about you into a vector, sending that off for analysis, and then decoding the returned vector locally to act on the result. It'd be like asking a friend when one is available.
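A minimal sketch of that round trip, assuming a made-up split: everything here (`encode_request`, `remote_analyze`, `decode_result`, the toy 8-dim vector) is hypothetical and just stands in for whatever the local model and the remote service would actually do.

```python
# Hypothetical hybrid flow: encode locally, analyze remotely, decode locally.
import json
import random

def encode_request(prompt: str, context: dict) -> list[float]:
    """Local step: pack the request plus user context into a fixed-size
    vector (a toy deterministic embedding, not a real one)."""
    random.seed(hash((prompt, json.dumps(context, sort_keys=True))))
    return [random.uniform(-1, 1) for _ in range(8)]

def remote_analyze(vector: list[float]) -> list[float]:
    """Stand-in for the network hop to the big hosted model; a real
    version would POST the vector to a service endpoint."""
    return [v * 0.5 for v in vector]  # placeholder "analysis"

def decode_result(vector: list[float]) -> str:
    """Local step: turn the returned vector back into something the
    local model can act on."""
    return f"action derived from {len(vector)}-dim result vector"

if __name__ == "__main__":
    ctx = {"user": "alice", "device": "laptop"}
    v = encode_request("plan my week", ctx)   # local
    result = remote_analyze(v)                # "ask a friend"
    print(decode_result(result))              # local again
```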