Hacker News
on: DeepSeekMath 7B achieved 51.7% on MATH benchmark

riku_iki · 10 months ago
These are actually interesting results, in the sense that we see the limits of an LLM's ability to memorize complicated information correctly. Gemini Ultra also reported around 50% accuracy.
Davidzheng · 10 months ago
I think the SOTA is GPT-4 plus tool use? I heard near 80%.
riku_iki · 10 months ago
Yes, tools help push past LLM limitations. GPT-4 without tools is at about 50%, too.