
Although these things are impressive at writing hello world in a bunch of ways, I've found them pretty useless inside any sufficiently complex codebase. Am I just using them incompetently?



Canvas (and Claude Artifacts) aren't things you use inside a large codebase; they're more for one-off tools you may want to run without the hassle of setting up a stack for each one, copying code between the LLM and your local editor, etc.

Think "here's a schema for some internal API, write me a tool that lets me put in a product ID and see the number of purchases, chargebacks and returns for that product over time, on graphs, also provide sortable by-country tables for these stats".

With artifacts, software goes from something you make once and use for a long time, to something you make on demand to do exactly what you need and no more, then throw away once it has served its purpose.


Yeah, I think this is a shift that will be very interesting once commercialized. I already use ChatGPT to write odd plugins for WordPress and programs for image and text manipulation. Now I can get an applet to do almost anything simple after an hour of messing around.


Not incompetence at all; it's just a very open-ended UI for a tool that is best used for specific use cases. It can be tempting to try to get it to solve very broad problems since the UI lets you do that, but where it excels is in saving you time by doing the grunt work within constraints you set.

Don't ask it how to solve a problem for you, instead write out the solution in plain English, provide an API for it to fill in, and let it write out the solution with unit tests.
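
A hypothetical illustration of that workflow (the slugify function and its rules are mine, not the parent's): write the spec in plain English, hand the model the signature, and let it fill in the body plus the tests.

    import { describe, expect, it } from "vitest";

    /**
     * Spec written in plain English for the model to fill in:
     * Convert a title to a URL slug:
     * - lowercase ASCII
     * - runs of spaces/punctuation become single hyphens
     * - no leading or trailing hyphens
     */
    export function slugify(title: string): string {
      return title
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, "-")
        .replace(/^-+|-+$/g, "");
    }

    // ...and unit tests generated against the same spec:
    describe("slugify", () => {
      it("hyphenates spaces and punctuation", () => {
        expect(slugify("Hello, World!")).toBe("hello-world");
      });
      it("strips leading and trailing hyphens", () => {
        expect(slugify("--Already Slugged--")).toBe("already-slugged");
      });
    });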


+1 for Cursor.

This guy [https://x.com/PrajwalTomar_] has been exploring workflows that use ChatGPT to draft a Product Requirements Document (PRD), V0 by Vercel for mockups, and Cursor to bring it all together, keeping markdown (.md) copies of the PRD, the relevant database schemas, etc. inside the project to maintain continuity.


Using the web front ends of these services to work in your complex codebase is very challenging. I use the chatbots for higher-level questions about libraries and integrations, not for specific implementation details in my codebase. But without a data agreement in place, you shouldn't (or maybe can't) paste in code, and even if you could, it's an inferior way of providing context compared to better-integrated tools.

However, I do use Copilot + VS Code with Claude 3.5, and the "Edit with Copilot" feature, which takes my open files plus any other context I want to give it and drives changes to my files, has been surprisingly good. It's not really a time saver, in that the time I spend verifying, fixing, or enhancing the result isn't much less than writing it myself, but I still find benefits: brainstorming, quickly iterating on alternate ideas and smaller refactors, and overcoming the "getting started" hesitation on a seemingly complex change. It's at the point where it can absolutely add well-done tests to files using my existing patterns, rarely needing feedback from me. I've been surprised to see the progress, because for most of the history of LLMs I didn't find them useful.

It also helps that I work in a Node.js/React codebase, where the models have a ton of information and examples to work with.


> But without a data agreement in place, you shouldn't (or maybe can't) paste in code

There's a checkbox you can toggle so that OpenAI doesn't use your code to train their models.

And I find the "chatbot" experience different from and better than aider/Copilot. It forces me to focus on the really useful interfaces instead of just sending everything, and it pushes me to verify everything instead of accepting a bunch of changes that might even be correct, but aren't exactly what I want. For me, the time spent verifying is actually a bonus, because I read faster than I can type. I think of it as a peer programmer who just happens to type much, much faster and doesn't mind writing unit tests or rewriting the same thing over and over.


The problem with reading vs. writing is building "true understanding". If you're reading every bit of the code and building an accurate mental model, you're doing it right. But many folks see finished code, get "LGTM brain", and don't fully think through every statement on every line, leading to a poor understanding of the code. This is a huge drawback of LLM-assisted coding: folks re-read code they "wrote" and have no memory of it at all.

In the edit experience I'm using, the LLM provides a git-style changelog where I can easily compare before/after with a really detailed diff. I find that much more useful than giant "blobs" of code where minor differences crop up that I don't notice.
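
The kind of small, focused diff that makes this reviewable (an illustrative fragment, not from an actual session); both the added spread and the flipped comparison below would be easy to miss in a pasted blob:

    - const sorted = rows.sort((a, b) => a.count - b.count);
    + const sorted = [...rows].sort((a, b) => b.count - a.count);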

The other massive drawback of the out-of-codebase chatbot experience (and the Edit with Copilot experience IS a chatbot; it's just integrated into the editor, changes files via diffs, and has a UI for managing file context) is context. I can effortlessly load all my open files into the LLM context with one click. The out-of-editor chatbot requires either a totally custom stack with various layers to handle your codebase context, or manually pasting in a lot of context. It's nonsense to waste time pasting proprietary code into OpenAI (with no business agreement beyond a privacy policy and a checkbox) when I can get Copilot to sign a BA with strict rules about privacy and storage, and then one-click add my open files to my context.

Folks should give these new experiences a try. Having a Claude chatbot integrated into your editor, with the ability to see and modify your open files in a collaborative chat experience, is very nice.


I use Claude every day to help me write code. I doubt it's incompetence; maybe it's a lack of trust. Cursor + Claude does great, and I use it to do real work for my business.


How do you use Cursor? I was scratching my head at its pricing model. What's with the 2,000 completions a month? Is that enough? What are the "slow premium requests"? Do I need a course just to understand what I get for my money? I'm confused.


Honestly, I've never thought about it; for me it just works, and I'm able to use all the features without complication.

I use the tab autocomplete and the Composer functionality very often.


> How do you use Cursor?

I'm using Windsurf, which is similar; they have a two-week trial that can give you an idea of whether it would be useful to you.


I find the output is directly proportional to the input you give it. So either break the requests into smaller pieces, or use a tool like Cursor or, my preference, Aider. Both automatically bring in more context about your codebase (Aider, for example, brings in a map of your whole git repo) to help the model integrate better.


My biggest problem in coding is when I'm looking at something spanning 10 directories and 100 files with totally useless names like controller/viewer.js, viewer/controller.js, viewercontrols.js, etc., and trying to find where, 15 function calls and 80 statements ago, some bug occurred. I was hoping AI could help untangle these messes.



