
> there's no evidence yet that the deepseek architecture would even yield a substantially more performant model with more compute.

It's supposed to. There were reports that a longer 'thinking' phase is what makes the o3 model better than o1. I.e., at least at inference time, compute still matters.




> It's supposed to. There were reports that a longer 'thinking' phase is what makes the o3 model better than o1. I.e., at least at inference time, compute still matters.

compute matters, but from what I've heard about o3 vs o1, performance doesn't scale proportionally with compute.

you shouldn't take my word for it - go to the leaderboards, compare today's top models with the top models from 2023, and look at the compute involved for both. there's obviously a huge increase in compute, but the performance gain isn't proportional.



