
Yeah... so I did that, which is how I got it to begin correctly. This is what I mean, though.

I'll say "get a list of Blah from the following document in Json format like this:

Example"

Then I feed the document and add a spot for the answer.
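
Roughly this shape, as a minimal sketch ("Blah" and the example fields are placeholders, not my real dataset):

    # Sketch of the prompt layout described above; the task wording and
    # field names are placeholders, not my actual task.
    def build_prompt(document: str) -> str:
        return (
            'Get a list of Blah from the following document, '
            'in JSON format like this:\n'
            '[{"name": "...", "value": "..."}]\n\n'
            'Document:\n' + document + '\n\n'
            'Answer:\n'  # the "spot for the answer" where generation starts
        )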

The model begins correctly. But usually in the middle of the JSON list generation, it will veer off and start hallucinating, as if it had forgotten the document and the task. I'm happy to share specifics and datasets, but this is a cross-cutting problem.

RWKV is able to answer my questions when I ask simple yes/no or classification questions. It's the listing that throws it for a loop. Transformers do not have the same problem: both LLaMA and GPT are able to maintain focus.

Also, do you know where I'd find information on how the current weights were trained?




Hmm, we might need to look into the instruct training data, which is mostly based on GPT4All, filtered and mixed with other datasets.

(You are using Raven, right? That's the instruct-trained variant.)

Btw, ping the Discord if you're looking into fine-tuning for your use case.


Yeah, I'm using Raven. Raven does work better. And I'm on the Discord.

Unfortunately, I really would like machine-readable responses, and Raven is a bit too verbose.

Looking at fine-tuning right now.
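
In the meantime, a crude workaround is just pulling the first JSON array out of the verbose reply. A rough sketch, assuming the model emits one well-formed array somewhere in the text:

    import json
    import re

    def extract_json_array(reply: str):
        # Grab everything from the first '[' to the last ']' and try to
        # parse it; returns None if no well-formed array is present.
        match = re.search(r"\[.*\]", reply, re.DOTALL)
        if match is None:
            return None
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            return None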



