Hacker News
on: DeepSeekMath 7B achieved 51.7% on MATH benchmark

riku_iki · 10 months ago
These are actually interesting results, in the sense that we see the limits of an LLM's ability to memorize complicated information correctly. Gemini Ultra also reported around 50% accuracy.
Davidzheng · 10 months ago
I think the SOTA is GPT-4 plus tool use? I heard near 80%.
riku_iki · 10 months ago
Yes, tools help push past LLM limitations. GPT-4 without tools is at about 50%, too.