
Hey team, nice work. Can you help me understand this better? How does the process work in terms of the human agent evaluations? Is it real time, so that the right (maybe a better word is best) answers go to users as they are needed, or is it done asynchronously/batch style, so that the humans are training the models to be better? Once the best answers are selected, are they fed back into an LLM / AI agent model? Thanks



Following up on my own question. I re-read the GitHub repo, so I can clarify my question. So the AI agent responses are saved to a database where a human SME can classify responses as good/bad, right? Do you intend for the result of this analysis to retrain the AI agent, or is it purely to get a baseline on the as-is AI agent quality?


Indeed, the evaluations are saved to a DB. Right now it's possible to use this for regression testing with the help of the Optimize tab. In the Optimize tab you can experiment with changing an input parameter (such as the prompt, temperature, etc.), then rerun the recordings and check whether the LLM responses still match all previously accepted recordings - you get a similarity score that tells you whether your change introduced regressions.
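
To make that concrete, here is a rough sketch of how such a regression check can work (the function names and the similarity metric below are illustrative assumptions, not fini's actual code):

    # Illustrative sketch only: rerun previously accepted recordings with the
    # changed settings and flag responses whose similarity drops too far.
    from difflib import SequenceMatcher

    def similarity(a: str, b: str) -> float:
        """Score two responses between 0.0 (different) and 1.0 (identical)."""
        return SequenceMatcher(None, a, b).ratio()

    def check_regressions(accepted_recordings, rerun_llm, threshold=0.8):
        """accepted_recordings: list of {"input": ..., "accepted_response": ...}
        rerun_llm: callable that produces a fresh response using the changed
        parameters (e.g. a new prompt or temperature)."""
        results = []
        for rec in accepted_recordings:
            new_response = rerun_llm(rec["input"])
            score = similarity(rec["accepted_response"], new_response)
            results.append({"input": rec["input"],
                            "score": score,
                            "regression": score < threshold})
        return results

In practice you would swap the string matcher for whatever similarity measure you prefer (embedding cosine similarity, an LLM judge, etc.) and tune the threshold to your tolerance for drift.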

In the future we are planning to enable a retraining pipeline - most likely we will do this in our core offering at usefini.com


Thank you for explaining


Hi smarri, I hope you're well.

Sorry, this might not be the best place to ask, but a few months ago you posted on my thread about a part-time, remote Masters Degree in Software Development in the UK that businesses recruit directly from - my apologies for the late reply, but could I get in touch with you for more details?



