
One of the most important papers in software engineering, which I believe everyone in this profession should read and internalize.

Every time I see another startup trying to use LLMs for code generation, I sigh in despair. As AI technology improves and becomes better at producing code, what looks like a win in the short term will end up creating more and more code built without a human going through the thought processes and problem-solving steps needed to build the theory of the software, as described in this paper.

It's also why it's critically important for companies to do what they can to retain the people who built the software in the first place, or at least ensure there's enough continuity as new people join the team so they can build their mental model by working alongside the original developers.




> without a human going through the necessary thought processes and problem solving steps to build the theory of the software as described in this paper

We might not be there yet (well, we definitely are not), but it does not seem out of the question that within a generous 10 years we will have systems which can leverage graphs, descriptive language, interpreters, and so on to plan out, document, iterate on, and refine the structure of a problem and its architectural solution in tandem with developing the solution itself, and do so at a very effective level, given a sufficient explanation of the goals/problem - or, perhaps more importantly, phrased another way: following the initial theory of the problem formulated by the human. The kind of documentation produced by such systems can also be more easily ingested by other non-human systems, potentially remedying some of the challenges humans have with outlining, documenting, and transferring the theory of the problem.

And what prevents a human from doing code review on such a system's outputs? Now maybe your point was that the simple expense of a human's time is the barrier, especially given that you were talking about companies using LLMs to speed up their code production (read: eliminate cost centers). But in that case, the errors that come from poorly designed, procedurally generated codebases just read like bad project management to me, for which the chickens will ultimately come home to roost; the companies which can successfully integrate such procedural codegen engines while still maintaining strong design principles, maintainability, simplicity, etc. ought to outcompete their competitors' slop in the long run, right?

Having said all that, I think the more important loss is that the human fails to build as much intuition for the problem space themselves by not being on the ground, in the weeds, solving the problems with their own solutions, and thus will struggle to develop their own effective theories of the problem (as indicated by the title of the article in the first place).


What you're describing is the siren call of No Code, which has been tempting manager-types for decades and which has so far failed every single time.

The trouble with No Code is that your first paragraph is already my job description: I plan out and document and refine the structure of a problem and its architectural solution while simultaneously developing the system itself. The "sufficient explanation of the goals/problem" is the code—anything less is totally insufficient. And once I have the code, it is both the fully-documented problem and the spec for the solution.

I won't pretend to know the final end state for these tools, but it's definitely not that engineers will write natural-language specs and the LLMs will translate them, because code (in varying degrees of high- and low-level languages) is the preferred language for solution specification for a reason. It's precise, unambiguous, and well understood by all engineers on a project. There is no need that would be filled by swapping that out for natural language unless you're taking engineers out of the loop entirely.


> The "sufficient explanation of the goals/problem" is the code—anything less is totally insufficient.

somewhat in that spirit, I like Gerald Sussman's interpretation of software development as "problem solving by debugging almost-right plans", in e.g. https://www.youtube.com/watch?v=2MYzvQ1v8Ww


The point is also brought up a few times in SICP:

> First, we want to establish the idea that a computer language is not just a way of getting a computer to perform operations, but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute.


I mostly agree with what you were saying, but I don’t think I was advocating for “no code” entirely, and certainly not the elimination of engineers entirely.

I was trying to articulate the idea that code generation tools will become increasingly sophisticated and capable, but still be tools that require operation by engineers for maximal effect. I see them as just another abstraction mechanism that will exist within the various layers that separate a dev from the metal. That doesn’t mean the capabilities of such tools are limited to where they are today, and it doesn’t mean that programmers won’t need to learn new ways of operating their tools.

I also hinted at it, but there's nothing to say that our orchestration of such systems needs to be done in natural language. We are already skilled at representing procedures and systems in code, like you said; there's no reason to think we wouldn't be adept at learning new languages specialized for specifying higher-order designs in a more compact but still rigorous form for codegen systems. It seems reasonable to think that we will start developing DSLs and the like for communicating program and system design to codegen systems in a precise manner. One obvious way of thinking about that is specifying interfaces and test cases rigorously and letting the details be filled in - obviously, attempts at that now exhibit lots of poor implementation decisions inside the methods, but that is not a universal phenomenon that will always hold.
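To make that concrete, here is a minimal sketch (the RateLimiter name and the 3-requests-per-60-seconds policy are invented for illustration) of what such a spec could look like in Python: the human fixes an interface and a handful of executable test cases, and the codegen system is only asked to supply an implementation that satisfies them.

    from typing import Callable, Protocol

    class RateLimiter(Protocol):
        """Interface half of the spec: the contract is written by a human."""
        def allow(self, key: str, now: float) -> bool: ...

    def rate_limiter_spec(make: Callable[[], RateLimiter]) -> None:
        """Test half of the spec: at most 3 calls per key in a 60-second window."""
        rl = make()
        assert all(rl.allow("alice", t) for t in (0.0, 1.0, 2.0))
        assert not rl.allow("alice", 3.0)   # fourth call inside the window
        assert rl.allow("bob", 3.0)         # other keys are unaffected
        assert rl.allow("alice", 61.0)      # enough of the window has passed

The human-authored parts stay precise and reviewable; the generated body is the replaceable part.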


The DSL paradigm is generally how I go about using LLMs on new projects, i.e., use the LLM to design a language that best represents the abstractions and concepts of the project - and once the language is defined, the LLM can express use cases with the DSL and ultimately convert them into an existing high-level language like Python.
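As a rough sketch of that workflow (the DSL syntax and names below are invented for illustration, not taken from any real project): the use case is written in the project's own declarative vocabulary, and the LLM's job becomes the comparatively mechanical translation into ordinary Python.

    # A made-up project DSL: terse, declarative, close to the domain vocabulary.
    USE_CASE = """
    when order.placed
      if order.total > 100 then apply discount 10%
      notify customer via email
    """

    # What an LLM's rendering of that use case into plain Python might look like.
    def on_order_placed(order, notifier) -> None:
        if order.total > 100:
            order.total *= 0.90          # apply discount 10%
        notifier.email(order.customer)   # notify customer via email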


That is a great idea. I've used ChatGPT to help me define the names of the functions of an API. Next time I face a problem that calls for a DSL, I will give it a try.


Do you have any repos or examples you can share? Would love to see an example of that in action!


I am not the person you asked the question of;

Earlier an HN user had given an example of using Prolog as an intermediate DSL in the prompt to an LLM, so as to transform declarative English -> imperative code - https://news.ycombinator.com/item?id=41549823


Yep, this makes a lot of sense.

In general, we already have plenty of mechanisms for specifying interfaces/API specs, tests, relationships, etc. in a declarative but more formal manner than natural language, which would probably all work, and I can only imagine we will continue to see the development of more options tailored to this use case.


In general, I subscribe to your thoughts, but also the ones you are replying to.

But: “And what prevents a human from doing code review on such a system’s outputs?” One word: cost.

At least in my experience, at least right now, it is more effort to review and correct than to do it from scratch.


Unfortunately, the book this was included in, _Computing: A Human Activity_

https://www.goodreads.com/book/show/4594604-computing

is out of print, as is _Concise Survey of Computer Methods_, and both are rather pricey.

Oddly, _Knowing and the Mystique of Logic and Rules_ (which has an even lengthier title after a colon...) has four entries at Goodreads, is listed under "P. Naur", and is even pricier; it's quite expensive on Amazon:

https://www.amazon.com/Knowing-Mystique-Logic-Rules-Statemen...

even as an ebook.

It would be more influential if it were affordably in print...


It was reprinted elsewhere, in an agile book (which one?), which this (more readable than the linked) copy [1] is from. I think the other one might be from another edition of the same book. I ordered _Computing: A Human Activity_ a few weeks ago; it's still in shipping, and I probably got the cheapest remaining copy.

[1] https://pablo.rauzy.name/dev/naur1985programming.pdf


Paging Stripe Press


I don't think using AI to write code precludes learning deeply about the problem domain and even the solution. However, it could lead to those problems depending on how it's done. But done well, you can still have a very knowledgeable team that understands the domain and large portions of the code, I believe anyway.

I think software engineers will drift towards only understanding the domain and creating tasks and then reviewing code written by AI. But the reviews will be necessary and will matter, at least for a while.


Respectfully, this seems upside down to me. Tools incorporating LLMs will be the knowledge repository for s/w projects of the future, and will capture and then summarize ideas, create mocks and finally render code (on command with guidance and iterations involving teams). My point being that the LLM era will be a deeper realization of code as theory building.


I thought it was about solving Leetcode problems.





