
These are actually interesting results, in the sense that they show the limits of an LLM's ability to memorize complicated information correctly. Gemini Ultra also reported around 50% accuracy.



I think the SOTA is GPT-4 with tool use? I heard it's near 80%.


Yes, tools help get past LLM limitations. GPT-4 without tools is at about 50% too.



