Nietzsche believed that a philosopher's philosophy boils down to a reflection of his personality and his daily life, and this article seems to assume the same.
I don't buy this for a second; if anything, the very opposite is easier to swallow.
> 1 is plainly false. Enormous resources have been poured into models since the end of 2023 and the "intelligence" (for lack of a better term) has stayed roughly the same, around the level of GPT-4. Nothing has happened since then.
3 is a philosophical opinion, not based on any falsifiable evidence.
How is 1 false? Log improvement means that for 10x the cost the model is 2x as good; for 100x the cost, it is 3x as good.
Not a curve to be happy about, TBH. You need to simultaneously find big efficiency wins and drive up costs substantially to get 4-5x improvements, and it is probably impossible to maintain good year-on-year improvements after the first 2-3 years, once all the low-hanging fruit is gone.
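The arithmetic in the exchange above can be sketched numerically. This is a toy model, not a measured scaling law: it simply assumes quality grows by one fixed unit per 10x of cost, which reproduces the "10x cost → 2x, 100x cost → 3x" figures quoted.

```python
import math

def quality(cost, base_cost=1.0):
    """Toy logarithmic-returns model (illustrative assumption only):
    each 10x increase in cost over the baseline adds one 'unit'
    of quality on top of the baseline quality of 1."""
    return 1.0 + math.log10(cost / base_cost)

print(quality(1))    # baseline: 1.0
print(quality(10))   # 10x the cost -> 2.0 (2x the baseline)
print(quality(100))  # 100x the cost -> 3.0 (3x the baseline)
```

Under this assumption, each further doubling of quality requires another factor of 10 in cost, which is the pessimistic point the reply is making.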
> 1 is plainly false. Enormous resources have been poured into models since the end of 2023 and the "intelligence" (for lack of a better term) has stayed roughly the same, around the level of GPT-4. Nothing has happened since then.
You need to spend some quality time with o1-pro and/or Gemini Pro 2.0 Experimental. It is not the case that there has been no progress since GPT-4. CoT reasoning is a BFD.
> 1 is plainly false. Enormous resources have been poured into models since the end of 2023 and the "intelligence" (for lack of a better term) has stayed roughly the same, around the level of GPT-4. Nothing has happened since then.
This would be true only for people who have used the same model since 2023 :) Jesus!
I think the final point illustrates this one pretty succinctly: 'what will be left will no longer give us the joy of hacking.' Personally I build my own version of almost every software tool for which I regularly (like, daily) use a UI. So for e.g. personal note-taking, continuous integration, server orchestration, even for an IDE: I could use Apple Notes, CircleCI, Chef, VSCode, but I instead build my own versions of these.
I'm not a masochist; they're often built on top of components, e.g. my IDE uses the Monaco editor. But working on these tools lets me hack them into exactly the thing I want, rather than e.g. the thing that Microsoft's (talented! well-paid! numerous!) designers want me to use. Hacking on them brings me joy and gives me a sense of ownership.
Like an idealised traditional carpenter, I make (within reason) my own tools. Is this the most rational, engineering-minded approach? I mean, obviously not. But it brings me joy -- and it also helps me get shit done.
If I had to guess, antirez was describing engineering managers and tech leads that have (mis)read “clean code” or similar works, and take them as commandments from on high rather than tools and approaches that may be helpful or applicable in some circumstances.
Or, more generally, the fact that most of what the software industry produces is much more in line with "art" than "engineering" (especially when looked at from the perspective of a Mechanical or Civil Engineer). We have so much implementation flexibility to achieve very similar results that it can be dizzying from the standpoint of other engineering fields.
In my view it is about design that requires taste and creativity. Engineering is about function, design is about form. If I build something that solves a problem but isn't well designed, no one may actually use it, even if it is a good piece of engineering.
Anyone else want more articles on how those benchmarks are created and how they work?
Those models can be trained in ways tailored to score well on specific benchmarks, making them far less general than they seem. No accusation from me, but I'm skeptical of all the recent so-called 'breakthroughs'.