Yeah... So I did that, which is how I got it to begin correctly. This is what I mean, though.
I'll say: "Get a list of Blah from the following document in JSON format like this:
Example"
Then I feed in the document and add a spot for the answer.
The model begins correctly, but usually in the middle of generating the JSON list it will veer off and start hallucinating, as if it forgot the document and the task. I'm happy to share specifics and datasets, but this is a cross-cutting problem.
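For concreteness, here's a minimal sketch of the prompt layout described above. The task wording, example JSON, and document text are placeholders, and `build_prompt` is a hypothetical helper for illustration, not part of any RWKV tooling:

```python
# Hypothetical example JSON shown to the model as the target format.
EXAMPLE_JSON = '[{"blah": "first item"}, {"blah": "second item"}]'

def build_prompt(document: str) -> str:
    """Assemble the instruction, format example, document, and answer slot."""
    return (
        "Get a list of Blah from the following document "
        "in JSON format like this:\n"
        f"{EXAMPLE_JSON}\n\n"
        f"Document:\n{document}\n\n"
        "Answer:\n"  # the "spot for the answer" the model completes from
    )

prompt = build_prompt("Some source text mentioning several Blahs...")
print(prompt)
```

The failure mode is then that the completion after `Answer:` starts as valid JSON but drifts mid-list.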
RWKV is able to answer my questions when I ask for a simple yes/no or classification. It's the listing that throws it for a loop. Transformers don't have the same problem: both LLaMA and GPT are able to maintain focus.
Also, do you know where I'd find information on how the current weights were trained?