Completely agree. I find it fails miserably at business logic, which is where we spend most of our time. But it does great at generic stuff, which is already trivial to find on stack overflow.
Might be that they work in a much more complex codebase, or a language/framework/religion that has less text written on it. Might also be that they (are required to) hold code to a higher standard than you and can't just push half-baked slop to prod.
I've spent a good amount of time in my career reading high quality code and slop. The difference is not some level of intelligence that sonnet does not possess. It's well thought out design, good naming, and rigor. Sonnet is as good as, if not better than, the average dev at most of this, and with some good prompting and a little editing it can write code as good as most high quality open source projects.
Which is usually a far higher bar than most commercial apps the vast majority of us devs work on.
> with some good prompting and a little editing it can write code as good as most high quality open source projects
I agree that with a good developer "baby-sitting" the model, it's capable of producing good code. Although this is mostly because the developer is skilled at producing good code themselves, so they can tell the AI where it should refactor and how (or just do it themselves). If you've spent significant time refitting AI code, it's not really AI code anymore, it's yours.
Blindly following an AI's lead is where the problem lies, and it's where bad-to-mediocre developers get stuck using an AI, since the effort/skill required to take the AI off its path and get something good out is largely not practised. This is because they don't have to fix their own code, and what the AI spits out is largely functional - why would anyone spend time rethinking a working solution when they don't understand how they arrived at it in the first place?
I've spent so much time in my life reviewing bad or mediocre code from mediocre devs, and 95% of the time the code sonnet 3.5 generates is at least as correct as, and 99% of the time more legible than, what a mediocre dev produces.
It's well commented, the naming is great, it rarely tries to get overly clever, it usually does some amount of error handling, it'll at least try to read the documentation, and it finds most of the edge cases.
It's easy to forget one major problem with this: we all have been mediocre devs at some point in our lives -- and there will always be times when we're mediocre, even with all our experience, because we can't be experienced in everything.
If these tools replace mediocre devs, leaving only the great devs to produce the code, what are we going to do when the great devs of today age out, and there's no one to replace them with, because all those mediocre devs went on to do something else, instead of hone their craft until they became great devs?
Or maybe we'll luck out, and by the time that happens, our AIs will be good enough that they can program everything, and do it even better than the best of us.
If you can call that "lucking out" -- some of us might disagree.
Which is deeply sad, because this both tarnishes good output and gives a free pass to the competitor - shit "content" generated by humans. AI models only recently started to match that in quantity.
But then again, most of the software industry exists to create and support the creation of human slop - advertising, content marketing, all that - so there's bound to be some double standards and salary-blindness present.
Without knowing much about my prompts and work, you've just assumed that's why AI gives me bad results. You're wrong. (Can you see why this is a bad argument?)
Don't get me wrong, I love sloppy code as much as the next cowboy, but don't delude yourself or others when the emperor doesn't have clothes on.
> But does great at generic stuff, which is already trivial to find on stack overflow.
The major difference is that with Cursor you just hit "tab", and that thing is done. Vs breaking focus to open up a browser, searching SO, finding an applicable answer (hopefully), translating it into your editor, then reloading context in your head to keep moving.
The benefit of exploring is finding alternatives and learning about the gotchas. And knowing more about both the problem space and how the language/library/framework solves it.
There was a thread about that, and the gist was: a calculator is a great tool because it's deterministic and the failures are known (mostly related to precision); it eliminates the entire need to do computation by hand, and you don't have to babysit it through the computation process with "you're an expert mathematician..."; also, it's just a tool, and you still need to learn basic mathematics to use it.
The equivalent to that is a good IDE that offers good navigation (project and dependencies), great feedback (highlighting, static code analysis,..), semantic manipulation, integration with external tooling, and the build/test/deploy process.
Yes, I think I agree. And when you use a calculator and get a result that doesn't make sense to you, you step in as the human and try to figure out what went wrong.
With the calculator, it's typically a human error that causes the issue. With AI, it's an AI error. But in practice it's not a different workflow.
Give inputs -> some machine does work much faster than you could -> use your human knowledge to verify outputs -> move forward or go back to step 1.
My experience has been different. My major use case for AI tools these days is writing tests. I've found that the generated test cases are very much in line with the domain. It might be because we've been strictly using domain driven design principles. It even generates test cases that fail, showing us what we've missed.
Yesterday, I got into an argument on the internet (shocking, I know), so I pulled out an old gravitation simulator that I had built for a game.
I had chatGPT give me the solar system parameters, which worked fine, but my simulation had an issue that I actually never resolved. So, working with the AI, I asked it to convert the simulation to a constant time step (it was locked to the render path -- it's over a decade old). Needless to say, it wrote code that set the simulation to run in real time ... in other words, we'd be waiting one year to see the planets go around the sun. After I pointed that out, it figured out what to do but still got things wrong or made some terrible readability decisions. I ended up using it as inspiration instead, and was then able to have the simulation step at one-second resolution (which was required for a stable orbit) but render at 60fps and compress a year into a second.
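For anyone curious, the shape of that fix is the classic fixed-timestep pattern: integrate the physics at a constant simulated dt, map wall-clock time to simulated time with a scale factor, and render at whatever frame rate you like. Here's a rough TypeScript sketch of the idea -- not the actual simulator; the bodies, the semi-implicit Euler integrator, and the exact constants are just illustrative, and only the 1-second sim step, 60fps render, and year-per-second compression come from the description above:

```typescript
// Fixed simulation timestep decoupled from the render rate (illustrative sketch).

const G = 6.674e-11;                 // gravitational constant (m^3 kg^-1 s^-2)
const SIM_DT = 1;                    // 1 second of simulated time per physics step
const TIME_SCALE = 365.25 * 86400;   // simulated seconds per wall-clock second (one year -> one second)
const FRAME_DT = 1 / 60;             // wall-clock seconds per rendered frame (60 fps)

interface Body { mass: number; x: number; y: number; vx: number; vy: number; }

// Placeholder two-body system (Sun + Earth-like planet), not real ephemerides.
const bodies: Body[] = [
  { mass: 1.989e30, x: 0,        y: 0, vx: 0, vy: 0 },
  { mass: 5.972e24, x: 1.496e11, y: 0, vx: 0, vy: 29780 },
];

// One semi-implicit Euler step: update all velocities from the current positions,
// then advance positions with the new velocities.
function step(dt: number): void {
  for (const a of bodies) {
    let ax = 0, ay = 0;
    for (const b of bodies) {
      if (a === b) continue;
      const dx = b.x - a.x, dy = b.y - a.y;
      const r2 = dx * dx + dy * dy;
      const r = Math.sqrt(r2);
      ax += (G * b.mass * dx) / (r2 * r);
      ay += (G * b.mass * dy) / (r2 * r);
    }
    a.vx += ax * dt;
    a.vy += ay * dt;
  }
  for (const a of bodies) {
    a.x += a.vx * dt;
    a.y += a.vy * dt;
  }
}

// Per-frame driver: each rendered frame advances FRAME_DT * TIME_SCALE simulated seconds,
// consumed in fixed SIM_DT increments, so the render rate never changes the integration.
let accumulator = 0;
function frame(): void {
  accumulator += FRAME_DT * TIME_SCALE;
  while (accumulator >= SIM_DT) {
    step(SIM_DT);
    accumulator -= SIM_DT;
  }
  // drawing the bodies at their current positions would happen here
}
```

In a real loop you'd call frame() from requestAnimationFrame or your engine's tick, and at these numbers that's roughly 525,600 sim steps per frame, so in practice you'd probably want a larger SIM_DT or a better integrator.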
This sums up my experience as well. You can get an idea or just a direction from it, but the AI itself instantly trips over its own feet in any non-tutorial task. Sometimes I envy, and at the same time feel sorry for, successful AI-enabled devs, because it feels like they do boilerplate and textbook features all day. What a relief if something can write that for you.