
If true, it makes me wonder what kind of regression testing OpenAI does on these models. It can't be easy to write a unit test for hallucinations.



At a high level, ask it to produce a ToC of information about something that you know will exist in the future but does not yet exist, and also tell it to decline the request if it doesn't verifiably know the answer.
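
A minimal sketch of how that could become an automated regression test, using the openai Python client. The model name, the exact refusal marker, and the future topic are all illustrative assumptions, not anything OpenAI has published:

    # Hallucination regression test sketch: ask for a ToC about a topic
    # that cannot exist yet, and assert the model declines instead of
    # inventing content.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    FUTURE_TOPIC = "iPhone 30 review"  # hypothetical: does not exist yet

    PROMPT = (
        f"Produce a table of contents of information about: {FUTURE_TOPIC}. "
        "If you do not verifiably know the answer, decline the request by "
        "replying exactly: CANNOT ANSWER."
    )

    def test_declines_unknowable_topic():
        response = client.chat.completions.create(
            model="gpt-4-turbo",  # substitute the model under test
            messages=[{"role": "user", "content": PROMPT}],
            temperature=0,  # keep the check as deterministic as possible
        )
        answer = response.choices[0].message.content
        assert "CANNOT ANSWER" in answer, f"possible hallucination: {answer!r}"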


How do you generalize that for all inputs though?


I am not sure I understand the question. I sampled various topics. I used this prompt: https://raw.githubusercontent.com/impredicative/podgenai/mas...

In the prompt, substitute {topic} with something from the near future. As I noted, turbo behaves correctly (rejecting the request), while o behaves very badly (hallucinating nonsense).
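
A rough harness for the comparison being described; "prompt.txt" stands in for the linked podgenai prompt file, and the model names are my guesses at what "turbo" and "o" refer to here:

    # Run the same future-topic prompt against both models and eyeball
    # whether each declines or hallucinates a ToC.
    from openai import OpenAI

    client = OpenAI()
    prompt = open("prompt.txt").read().replace(
        "{topic}", "2030 Winter Olympics highlights"  # near-future topic
    )

    for model in ("gpt-4-turbo", "gpt-4o"):
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        ).choices[0].message.content
        print(f"{model}: {reply[:200]}")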



