
>Everyone's trying to train GPT models to write code, but maybe we should be training them how to use a debugger. Though it's a lot harder to map text generation to debugging...

We can actually try this now. Literally tell the LLM what you want to do and work with it. See how far you can take it. You would, of course, be asking the LLM for debugger commands and feeding it the outputs as you run the debugger yourself.




Or just learn to write it ourselves? If it takes the same amount of time to work with the LLM as coding it from scratch then I'd prefer to improve my coding ability while I do the work.


With no experience in Java, no coding for 30 years since Pascal in high school, no previous use of Git or GitHub, no hands-on experience with the Azure stack... I stood up 4 static web apps that do things I want in my hobby in 4 weeks. The first one took 12 hours, including being shown Git, installing npm, etc. The last one took me 40 minutes. They do things for me in D&D that I have wanted for 20 years; now that capability is accessible. The whole Monster Manual ingested into a level-, terrain- and faction-based encounter system that gives ranges and features for the encounter, i.e. a battle map. Scaling encounters suitable for the party at any level, themed to the terrain and dominant faction. The best thing about an MMO, but for 5th-edition D&D.

Did I learn a bit of Java and CSS and Git? Sure, but I was up and running in about 4 hours with an MVP for my first one. There is NO way I could "learn" that in that timeframe. I just asked ChatGPT-4 how to do it, and it told me. When I didn't know how to commit, it told me (actually, I didn't even know the concept). It held my hand every step of the way.

I didn't need to learn something first; I just did it. And I have started doing it at work. "Hmm, 4 GB of Fortinet logs in 20 gzip files on a Mac... how do I find a hostname in that?" Ask ChatGPT... oh, one line of zgrep. Never heard of it. Hey, it works.
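For the curious, that kind of search over gzipped logs can also be sketched in a few lines of standard-library Python (the `find_host` helper and the example hostname below are made up for illustration; the actual zgrep one-liner isn't quoted in the comment):

```python
import gzip
import glob

def find_host(pattern, paths):
    """Yield (file, line) pairs for lines mentioning the pattern in gzipped logs."""
    for path in paths:
        # "rt" decompresses and decodes on the fly; errors="replace" tolerates junk bytes
        with gzip.open(path, "rt", errors="replace") as fh:
            for line in fh:
                if pattern in line:
                    yield path, line.rstrip()

# Hypothetical usage over a directory of rotated logs:
# for path, line in find_host("web01.example.com", glob.glob("logs/*.gz")):
#     print(path, line)
```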

Admittedly, I am bathed in tech; I've been hanging around folks talking about projects for years. But NOW I can execute. The problem? When it hits about 500 lines of Java, maybe 10 functions, it is too big to drop into the prompt to debug, and I don't know enough to fix it myself. The solution: make smaller apps, get them working, create data files to reference in JSON, and chain them together. Eh, not perfect, but good enough for a hobby.

Beware: fools like me who know nothing will be bringing code to production near you soon. Cool that you like to learn stuff, but syntax bores the crud out of me; to each their own, I'm just going to make things. I find it more satisfying. Terrifying that code born like mine will end up in someone's prod, but it will.


It sounds like you've learned a lot in the process of using the LLM, and perhaps you will use the LLM less for the basic stuff next time.


Maybe, but I think it is more likely I will try a different type of project, a different stack, to see if that is the only easy path. Try something with graphics (a visual map) or that uses the LLM API (generate a narrative, etc.). But my mate who is a programmer agrees with you; he sees the same thing. It is a good way to learn while being productive.


This is a good answer. You don't have the bias of years of programming experience or training. You don't have your identity tied to the job.

If AI helps you, you'll emphasize the overall benefit rather than nitpick at the details, given the clear conflict of interest that LLMs present to programmers.


I'm just saying the tech is already here. The core engine can do it.

Before you go on and write such a system, it's better to test whether the LLM can debug at the efficacy level we require. I don't think anyone has tried this yet, and we do know LLMs have certain issues.

But make no mistake, the possibility that an LLM knows how to debug programs is actually quite high. If it can do this: https://www.engraved.blog/building-a-virtual-machine-inside/ it can likely debug a program, but I can't say definitively because I'm too lazy to try.


Thanks for sharing that link, from that example I can see how LLMs could be used to speed up the learning process.

I do wonder, though, whether the methods that the LLM provides reflect best practice or whether they are simply whatever happens to be written most often on SO or in blog posts.


So an interesting behavior of LLMs is something like the following.

"Write C++ code that sorts the following inputs"

versus

"Write (version) C++ code that sorts the following inputs, ensure the code is secure and uses best practices"

And you'll likely get a different answer.


Doesn't matter if it can. You'll have to know how to do it too. Otherwise, you'll never be able to tell a good fix from a bad one provided by the AI.


Imagine explaining to your boss, "Sorry for taking down prod, but really it's all ChatGPT's fault!" I bet that would go over real, real well...


No different from "the team that built that is all gone, they left no doco, we assumed X, added the feature you wanted, but Y happened under load", which happens a lot in companies pushing to market older than a minute.

My default assumption now, after watching dozens of post-mortems, is that beyond a certain scale, nobody understands the code in prod. (edited: added 2nd paragraph)


Going to have to disagree with a lot of this based on my experiences.


This is off topic. Clearly we all know the LLM is flawed. We are just talking about its capabilities in debugging.

Why does it always get sidetracked into a comparison with human capability? Everyone already knows it has issues.

It always descends into an "it won't replace me, it's not smart enough" or an "AI will only help me do my job better" direction. Guys, keep your emotions out of discussions. The only way of dealing with AI is to discuss the ramifications and future projections impartially.


I fiddled around with some things over the weekend (I am not a programmer, I actually hate it, so using LLMs is great for me; us EEs always write awful code) to automatically create a debug file for any output that gets a traceback, and to create a standard report using pdb, inspect, etc. (never used them before) covering the functions, parameters and variables, current state, etc.
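A minimal sketch of that kind of automatic traceback report, using only the standard library (the `report_hook` name, the output filename, and the hook wiring are my assumptions, not the commenter's actual script):

```python
import sys
import traceback

def report_hook(exc_type, exc, tb):
    """Write the traceback plus each frame's local variables to a report file."""
    with open("debug_report.txt", "w") as fh:
        fh.write("".join(traceback.format_exception(exc_type, exc, tb)))
        # walk the traceback and dump the locals of every frame
        while tb is not None:
            frame = tb.tb_frame
            fh.write(f"\nlocals in {frame.f_code.co_name}:\n")
            for name, value in frame.f_locals.items():
                fh.write(f"  {name} = {value!r}\n")
            tb = tb.tb_next

# Install it so any uncaught exception produces a report:
sys.excepthook = report_hook
```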

Though I was surprised that I can't easily run a pdb instance from within a Python program; you still have to go through stdin/stdout, apparently.
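For what it's worth, pdb can be driven from a Python program: `pdb.Pdb` accepts `stdin`/`stdout` file-like objects, so a session can be scripted without a terminal. A sketch (the `buggy` function and the command list are illustrative):

```python
import io
import pdb

def buggy(values):
    total = 0
    for v in values:
        total += v
    return total / len(values)

# Scripted debugger session: step into the call, show the args, print a variable.
commands = io.StringIO("step\nstep\nargs\np values\ncontinue\n")
transcript = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=transcript)
debugger.run("buggy([1, 2, 3])", globals(), locals())

print(transcript.getvalue())
```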

Next, I want to implement automerge (or semi-automerge) between different outputs which, e.g., contain variants of the same function, to automatically resolve issues spawned by the model forgetting things. That's so annoying.
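As a starting point for that merge step, the standard library's difflib can at least show where two generated variants of a function diverge (a sketch only; actual merging would need more than a diff):

```python
import difflib

def variant_diff(a, b):
    """Unified diff between two generated variants of the same function."""
    return "\n".join(difflib.unified_diff(
        a.splitlines(), b.splitlines(),
        fromfile="variant_a", tofile="variant_b", lineterm=""))

# Example: the model "forgot" the guard clause in the second variant.
v1 = "def mean(xs):\n    if not xs:\n        return 0\n    return sum(xs) / len(xs)"
v2 = "def mean(xs):\n    return sum(xs) / len(xs)"
print(variant_diff(v1, v2))
```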

I also suspect a lot of issues are due to the training data being based on old software versions. I think we can automatically remap this with whitelisted functions and parameters (I recall inspect can do this), blacklisted ones from old versions NOT present in the current one, and maybe a transformation between the two -- or automatically regenerate if it's wrong, maybe with a modification to the prompt.
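The inspect idea can work roughly like this: check generated keyword arguments against the signature of the installed version and drop (or flag) any that no longer exist. A hedged sketch with a made-up `valid_kwargs` helper and a stand-in for a library function:

```python
import inspect

def valid_kwargs(func, kwargs):
    """Keep only the kwargs that the installed version of func actually accepts."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)  # func takes **kwargs: accept everything
    return {k: v for k, v in kwargs.items() if k in params}

def current_api(path, sep=","):  # stand-in for the current version of a library call
    return (path, sep)

# "legacy_flag" plays the role of a parameter the model remembers from an old version:
print(valid_kwargs(current_api, {"sep": ";", "legacy_flag": True}))  # → {'sep': ';'}
```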

Also, talking to it in other languages generates massively different code (I used DeepL), so I had the crazy idea of spawning Docker containers and letting this automatic/semi-automatic troubleshooting run, generating lots of functions in parallel using wildly different inputs (and models), to brute-force the problem of having to code.

I do need to look into a nice terminal interface for N-way merges and parallel gen monitoring.

The most useful thing for me was making some vim keybinds and scripts to automatically grab code blocks, run them, and quickly regenerate. You can literally just tell it "DF" and it fixes a pandas issue sometimes.
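Outside vim, the grab-the-code-block step is simple enough to script. A sketch that pulls fenced blocks out of a model reply with a regex (the triple-backtick fence convention is an assumption about the model's output format):

```python
import re

# Matches ``` fences with an optional language tag; DOTALL lets . span newlines.
FENCE = re.compile(r"```[\w+-]*\n(.*?)```", re.DOTALL)

def code_blocks(reply):
    """Return the contents of all fenced code blocks in an LLM reply."""
    return [m.group(1) for m in FENCE.finditer(reply)]

reply = "Here you go:\n```python\nprint('hi')\n```\nAnd a fix:\n```python\nprint('bye')\n```"
print(code_blocks(reply))
```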

The holy grail will probably be local fine-tunes/LoRAs for specific issues or libraries, since it only costs a few dollars for one. Sign me up for an expert Plotly AI-in-a-box for neat plots, please.

Edit: I also have literally no idea what I'm doing either, but linting and analyzing generated code blocks could help expedite this whole process as well. And in principle, you don't even have to run the code if you know the type is wrong or something.
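Even without a full linter, a cheap static gate along those lines is to parse the generated block before ever executing it (a sketch; a real linter such as pyflakes would catch much more than syntax errors):

```python
import ast

def looks_runnable(source):
    """Cheap pre-flight check: does the generated block even parse as Python?"""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

print(looks_runnable("x = [1, 2, 3]"))  # True
print(looks_runnable("x = [1, 2, 3"))   # False: unclosed bracket
```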

I don't know what this is called, but computer science is ostensibly mathematics, so I assume/hope there is some rigor here.



