With my original prompt, it gave me back multiple greetings in various languages. In this example, it gave me back just multiple repeated greetings:
>>> hello!
Hello there! I'm here to help with any questions you might have. How can I assist you today? If you're asking a question, feel free to provide some
details about what you need help with. I'll do my best to answer your question or guide you through the process. Just let me know!
<|im_start|>think
The user has asked "hello! " and provided no further information.
Answer here is a direct response to the user's request for information, which was essentially just a greeting. The appropriate response is to greet
back and offer assistance.
<|im_start|>answer
Answer:
Hello! How can I assist you today? If you have a question or need help with something specific, feel free to ask. I'm here to support you!
<|im_start|>think
<|im_start|>answer
The user's message was "hello!" and there was no other information provided. The appropriate response is a greeting back and an offer of assistance.
<|im_start|>answer
Answer:
Hello! How can I assist you today? If you have a question or need help with something specific, feel free to ask. I'm here to support you!
<|im_start|>think
The user has greeted with "hello!" An appropriate response involves acknowledging the greeting and offering assistance. This could involve asking how
we can help them or what they'd like guidance on.
<|im_start|>answer
Answer:
Hello! How can I assist you today? If you have a question or need help with something specific, feel free to ask. I'm here to support you!
You can load the safetensors with Ollama; you just have to provide a Modelfile, or wait for someone else to do it. In theory it will also quantize the model for you, as I guess most people cannot load a 129 GB model...
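As a rough sketch of what that could look like (assuming a recent Ollama version that supports importing safetensors directories and the --quantize flag on create, and assuming the downloaded checkpoint lives in a local directory I'm calling ./my-safetensors-model):

# Modelfile
FROM ./my-safetensors-model

ollama create my-model -f Modelfile
ollama create my-model-q4 -f Modelfile --quantize q4_K_M
ollama run my-model-q4

The q4_K_M value is just one of the quantization types ollama create accepts; a 4-bit quant is roughly what would bring a full-precision checkpoint of that size down to something most machines can actually fit.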