Says more about how useless BCG consultants are.

I’m starting to think there’s an LLM equivalent to the old saying about how everything the media writes is accurate except on the topics you’re an expert in. All LLM output looks to be good quality except when it’s output you’re an expert in.

People who have no background in writing or editing think LLMs will revolutionize those fields. Actual writers and editors take one look at LLM output and can see it’s basically valueless because the time taken to fix it would be equivalent to the time taken to write it in the first place.

Similarly people who are poor programmers or have only a surface level understanding of a topic (especially management types who are trying to appear technical) look at LLM output and think it’s ready to ship but good programmers recognize that the output is broken in so many ways large and small that it’s not worth the time it would take to fix compared to just writing from scratch.

LLMs are not worthless for programming. You just cannot expect it to ship a full program for you, but for generating functions with limited scope, I found it very useful. How to make use of a new and common library, for example. But of course you have to check and test.

And for text I know people who use it successfully (professionally) to generate texts for them as a summary from some data. They still have to proofread, but it saves them time, so it is valuable.

I've been using it for code review. I just paste some of my code in and ask the AI to critique it, suggest ideas and improvements. Makes for a less lonely coding experience. Wish I could point it to my git repositories and have it review the entire projects.

I've had mixed experiences with getting it to generate new code. It produced good node.js command line application code. It didn't do so well at writing a program that creates a 16-bit PCM audio file. I asked it to explain the WAV file format and things like lengths of structures got so confusing I had to research the stuff to figure out the truth.
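
For reference, the structure lengths in question are small enough to write down. Below is a minimal sketch of the canonical 44-byte header for uncompressed 16-bit PCM, assuming a plain RIFF/WAVE file with only a `fmt ` and a `data` chunk. The function names `build_pcm16_wav` and `sine_samples` are made up for illustration, and real files may carry extra chunks that this ignores.

```python
import math
import struct


def build_pcm16_wav(samples, sample_rate=44100, channels=1):
    """Return the bytes of a minimal 16-bit PCM WAV file.

    `samples` is a flat sequence of ints in [-32768, 32767], interleaved
    by channel when there is more than one channel.
    """
    bits_per_sample = 16
    block_align = channels * bits_per_sample // 8  # bytes per sample frame
    byte_rate = sample_rate * block_align          # bytes per second
    data = struct.pack("<%dh" % len(samples), *samples)

    # Little-endian, no padding: 4+4+4 + 4+4+2+2+4+4+2+2 + 4+4 = 44 bytes.
    header = struct.pack(
        "<4sI4s4sIHHIIHH4sI",
        b"RIFF",
        36 + len(data),   # size of everything after this field
        b"WAVE",
        b"fmt ",
        16,               # size of the PCM fmt chunk body
        1,                # audio format 1 = uncompressed PCM
        channels,
        sample_rate,
        byte_rate,
        block_align,
        bits_per_sample,
        b"data",
        len(data),        # size of the sample data in bytes
    )
    return header + data


def sine_samples(freq_hz, seconds, sample_rate=44100, amplitude=0.5):
    """Generate mono 16-bit samples for a sine tone (illustrative helper)."""
    count = round(seconds * sample_rate)
    peak = int(amplitude * 32767)
    return [
        int(peak * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(count)
    ]


if __name__ == "__main__":
    # Hypothetical output file name; writes one second of a 440 Hz tone.
    with open("tone.wav", "wb") as f:
        f.write(build_pcm16_wav(sine_samples(440, 1.0)))
```

Reading the result back with Python's standard `wave` module is a quick way to check that the field sizes line up.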

This mirrors my experience. Very helpful writing node.js application code, but struggles to walk through simple operations in assembly. My hunch is that the tokenization process really hurts keeping the 1s and 0s straight.

It's been hit or miss with rust. It's super helpful in decrypting compilation errors, decent with "core rust" and less helpful with 3rd party libraries like the cursive TUI crate

Which comes as no surprise, really, as there's certainly less training data on the cursive crate than, say, expressjs

Also FWIW I have actually pointed it at entire git repos with the WebPilot plugin within ChatGPT and it could explain what the repo did, but getting it to actually incorporate the source files as it wrote new code didn't work quite so well (I pointed it to https://github.com/kean/Get and it would frequently fall back to writing native Swift code for HTTP requests instead of using the library)

>LLMs are not worthless for programming.

They can be worse than worthless. They can sabotage your work if you let them, making you spend even more time fixing it afterwards.

For example, I've used GPT-4 as a sort of Google on steroids with prompts like "do subnets in gcloud span azs" and "in gcloud secret manager can you access secrets across regions". I very quickly learned to ask "is it true" after every answer and to never rely on a given answer too much (verify it quickly, don't let misinformation get you too far down the wrong route). So is it useful? Yes, but can it lead you down the wrong path? It very well can. The less experience you have in the field, the easier it will happen.

>You just cannot expect it to ship a full program for you, but for generating functions with limited scope, I found it very useful

Entire functions? Wow. I found it useful for generating skeletons I then have to fill by hand or tweak. I don't think I ever got anything out of GPT-4 that is useful as is (except maybe short snippets 3 lines long).

However, I found it extremely useful in parsing emails received from people or writing nice sounding replies. For that it is really good (in English).

"They can be worse than worthless."

But that is the same, when you blindly follow some stackoverflow answer.

And yes, I always have to tweak and I use it only rarely. But when I did, it was faster than googling and parsing the results.

Nobody ever made a code editor plugin that reads random SO answers and automatically pastes them over your code.

The amount of fighting I needed against MS development tools mangling my code recently is absurd. (Also, who the fuck decided that autocomplete on space and enter was a reasonable thing? Was that person high?)

>"I found it useful for generating skeletons I then have to fill by hand or tweak".

Even this can be a big time saver, that increases productivity.

Just like others have said, it isn't going to write a Pynchon novel, but it does do a great job at the other 99% of general writing that is done.

Same for computers, the average programmer isn't creating some new Dijkstra Algorithm every day, they are really just cranking out connecting things together and doing the equivalent of 'generic boiler plate'.

> They can be worse than worthless. They can sabotage your work if you let them making you spend even more time fixing it afterwards.

I basically gave up on llms because i was spending more time figuring out what it did wrong than actually getting value.

People without programming skill are still impressed by them. But they have yet to learn or deliver anything of value even with the help of chat bots.

I have twenty years of programming experience and LLMs give me a significant productivity boost for programming on a daily basis: https://simonwillison.net/2023/Sep/29/llms-podcast/#code-int...

I have met my share of folks with decades of experience that was not of quality. The most hilarious range from those who open tar gz files using notepad, wondering where the code is, to those who work on the web but don’t know what xsrf is. Experience, however long, doesn’t count if it’s of the not-so-great type. Not saying this is the case.

LLMs do produce impressive code. Even if they were indeed just procedural generators it would still be impressive. The code has structure and appears useful.

But the issue is that you can tell it makes no sense, there is no thought process behind it. It fits in no greater picture.

Even if you add more context it still has no purpose.

People that find this useful are the same type that copy stackoverflow code that they don’t understand. It kinda works when it does but again it doesn’t fit in the bigger picture.

Code isn’t about spelling instructions - an…ai can do that - code is about what goes where in a way that the what changes as often as the where. It’s the bigger picture. So yes it can help and replace those that spell instructions but it will be hard to replace those that are required to deliver more.

"But the issue is that you can tell it makes no sense, there is no thought process behind it. It fits in no greater picture."

Completely agree with you. That's my job. The LLM is effectively my typing assistant.

Sorry, I may have gotten something wrong by skimming over your link. Is this the "significant project" you have been assisted by LLMs?


That's one of about a dozen at this point - but yeah, that's the one that I used LLMs to research the initial triggers and schema design for.

Here's the transcript (it pre-dates the ChatGPT share feature): https://gist.github.com/simonw/1aa4050f3f7d92b048ae414a40cdd...

I wrote more about it here: https://simonwillison.net/2023/Apr/15/sqlite-history/

Here's another one I built using AppleScript: https://github.com/dogsheep/apple-notes-to-sqlite - I wrote about that here: https://til.simonwillison.net/gpt3/chatgpt-applescript

While it is impressive that an ai can generate all this, the code is anything but significant. Using triggers for history is one sure way to bring a scalable system down fast and one of the first lessons a junior will learn.

Are you sure that holds with SQLite? My benchmarks so far have shown it to add a pretty inconsequential overhead.

Also: not every system has to be a scalable system. That's another lesson junior engineers (should) learn.
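
For anyone who wants to measure that overhead themselves, here is a minimal sketch of trigger-based history tracking in SQLite using Python's built-in `sqlite3` module. The table names `notes` and `notes_history` and the function `create_tracked_db` are hypothetical, and this is not the schema from the linked project, just the general pattern of copying each change into a shadow table.

```python
import sqlite3

SCHEMA = """
CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT);

-- Shadow table: one row per change made to notes.
CREATE TABLE notes_history (
    id INTEGER,
    body TEXT,
    operation TEXT,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER notes_insert AFTER INSERT ON notes
BEGIN
    INSERT INTO notes_history (id, body, operation)
    VALUES (new.id, new.body, 'insert');
END;

CREATE TRIGGER notes_update AFTER UPDATE ON notes
BEGIN
    INSERT INTO notes_history (id, body, operation)
    VALUES (new.id, new.body, 'update');
END;

CREATE TRIGGER notes_delete AFTER DELETE ON notes
BEGIN
    INSERT INTO notes_history (id, body, operation)
    VALUES (old.id, old.body, 'delete');
END;
"""


def create_tracked_db(path=":memory:"):
    """Open a database and install the table plus its history triggers."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    conn = create_tracked_db()
    conn.execute("INSERT INTO notes (id, body) VALUES (1, 'first')")
    conn.execute("UPDATE notes SET body = 'second' WHERE id = 1")
    conn.execute("DELETE FROM notes WHERE id = 1")
    # Each write above should have left one row in the history table.
    for row in conn.execute(
        "SELECT operation, id, body FROM notes_history ORDER BY rowid"
    ):
        print(row)
```

Timing a batch of inserts with and without the triggers installed gives a rough number for a given workload; whether that cost matters depends on the system.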

I honestly don’t understand how people can say LLMs are useless for coding. Have you tried ChatGPT 4, or are you basing this take on the obsolete 3.5? I’m a professional programmer and I think LLMs are extremely useful.

I’ve used GPT 4. It’s not helpful in any domain in which I’m already proficient. If I’m having to use a new language or platform for whatever reason it’s mildly quicker than alt-tabbing to stack overflow, but probably not worth the subscription.

For graphics tasks GenAI is absurdly helpful for me. I can code but I can’t draw. Getting icons and logos without having to pay a designer is great.

Programmers don't think that, though, or at least not all the time.

You could say similar things about Stack Overflow, and yet we use it.

Stack Overflow responses are well known to be misranked. I’ve heard a rule of thumb that the actual correct answer is typically about #3.

And #1 is usually broken or wrong, due to its (typically) old age. The longer it has to accumulate upvotes the less relevant it becomes.

For any managers reading: Chat GPT and Stack Overflow are not the same kind of thing.

Indeed they're not. And GPT-4 tends to outperform SO in my experience.

Yep. ChatGPT is like having a junior engineer confidently asking to merge broken garbage into your codebase all the time. Adds negative value for anyone that knows what they’re doing.

But with one crucial difference: it's a junior programmer that can make changes based on your feedback in a few seconds, not a few hours. And it never gets tired or frustrated.

I find treating it like an intern is amazingly productive: https://simonwillison.net/2023/Sep/29/llms-podcast/#code-int...

hahahah. A friend of mine has a problem with a contractor at his workplace that tries to PR in shell scripts written with Copilot. My friend spends an hour to explain why a script generated in 5 minutes is horrifically awful and will likely take down the company. He's legitimately angry about it.

It seems like the only ways to subordinate programming tasks are to write tests for your subordinate's code, or to review it tediously yourself, or to just trust the hell out of them.

> I’m starting to think there’s an LLM equivalent to the old saying about how everything the media writes is accurate except on the topics you’re an expert in.

This is true for media articles but for LLMs I feel like it's the opposite. Like people who aren't specialists don't fully appreciate how great it is at those tasks.

everyone you described share something in common.

they aren’t good at using language models.

Nor are 99.9% of humanity. I think that's the point.

Gell-Mann Amnesia!

Gell-Bott Amnesia.



Most management consultants are useless. But there are some realities you must accept.

Number 1. In a team of 20-30 engineers there is only one extremely good "why is he with us" engineer who is great at technical stuff and at being a people person. However, no matter how nice he is, his approach to his job is that it is a job, and he will only drop hints about how the management should be done. He doesn't care about where the company is headed because he plays video games, has a family and has a literal life. He doesn't care about management and taking on undue responsibilities. Moreover, the people up top have labeled him an "engineer" and do not see him as a "manager".

The rest of the engineers and managers have also adopted the approach of "not my problem", so you see a bizarre communication gap. Engineers working closely with the product don't want to talk to their managers, because the conversation goes like "if you know this so much, why don't you.... <a description of something that results in more work that goes outside their JD>" and managers don't want to talk with engineers because "if you are so interested, why don't you.... <a description of something that results in more work that goes outside their JD>"

From this progressive distance between managers and engineers comes the "management consultant". Management consultants have the upper-management-given flexibility of going back and forth between engineers and managers. They can have conversations with full flexibility, but they are not bound to "why don't you...." phrases. They can talk with anyone, submit a report, and take home 1 year's worth of a manager's or engineer's salary in 1 month.

The conversation gap between product and business is where management consultants come in. And the funny thing is that management consultants target those "I don't want to but I should" work things and report them to upper management. They can do this so well because they are not burdened with the "work" part.

Seriously, if you do some introspection, you will see there are plenty of things you know your company should do, but you don't want to voice them because it results in more work and in fact more risk. There comes a "good" management consultant who will discover those things and report them to upper management, who will create the system to get those jobs done.

That is my pitch. If anyone wants a management consultant, hire me. I am going to tell them why their company sucks in 20 different ways, with 18 of those points being generated by ChatGPT.

Man, write drunk, but edit sober.

Off topic, but I had to follow up. It seems that "Write drunk, edit sober" is mis-attributed to Hemingway, who advised writing sober [1].

[1] https://www.writersdigest.com/be-inspired/did-hemingway-say-...

My apologies. I agree. How long does it take for Ketamine to wear off? I only had a little bit after breakfast. I hope the edit button stays.

Your horse tranquilizer-addled coworker seems to be expressing a few points about the workplace dynamics between engineers and managers. First, he believes that while there may be exceptional engineers who are also good with people, these engineers are generally not interested in managerial responsibilities. Second, he observes a communication gap between engineers and managers, where both parties avoid taking on additional tasks outside their job descriptions. Lastly, he argues that management consultants bridge this gap by identifying issues neither party wants to handle but should. He concludes by saying that he'd make a good management consultant because he can spot numerous ways a company could improve.

If all else fails, the LLM revolution will at least allow us to make sense of ketamine-induced rants on management.

Maybe it was done on Ketamine, but the points are valid. Have seen it, consultants don't really bring 'new' or 'creative' solutions, they just help move the ideas around the calcified layers in the organization.

just said in 100x words...

Is the post that bad?

What does it say about me if I didn’t think that it was that bad?

It is an honest and unfiltered take.

My theory is that honest takes should be written on first take without revisions and without edits. The moment I massage a statement to be more coherent I am compromising on my honesty.

It’s a bit long for the TLDR crowd, and it was passionate.

But no, it was a good post and the cultural expectation to keep things shorter and more buttoned up has some real downsides.

I would have written this with more punch, but, well, see above.

The new tasks people get from talking to each other are usually well within their job description. They are just new tasks, and neither developers nor middle managers are allowed to drop useless tasks just because something valuable appeared.

Either way, in my experience management consultants just add new useless tasks for everybody on that set. I have never seen them actually decreasing the number of tasks.

Needs an /s.

Somewhat agree, I know LLMs have boosted my programming output, mostly in writing jsdocs and pr descriptions. The things I don't really like doing.

If your docs and PR descriptions can be generated off file diffs everyone's time could be better spent scanning the diff to come to the same conclusions.

Consider using your PRs and docs to capture the answers to the usual why questions which LLM won't be able to do.

Ah yes, but that would require actual effort, and in the end is only going to serve to improve someone else’s model.

The why is largely in the ticket and the what in the pr.

I've seen code bases survive three different ticket management systems. Meanwhile, the tickets never made it between the different systems, so if the 'why' isn't in the commit message, then it got lost to time.

I will admit that a lot of the really old decisions don't have much relevance to the current business, but the historical insight is sometimes nice.

Huh, your tickets aren't just a single vague title sentence and no description body?

Sometimes this is the case but most tickets have detailed info on the bug or links to a confluence page of design specs.

Agreed: the study only shows that BCG consultants' work is 40% noise without real added value... I guess that customers should now ask for a 40% rebate!!! ;-)

Says more about how people will parrot the same phrase over and over for anything at all. It's just funny how you can predict a comment like this in every thread regardless of what it does.

"It says more about [insert]" anytime GPT does something just makes the phrase lose all meaning. Surely you have something meaningful to say?

Often effortposts aren’t worth it because someone will come along and Gish Gallop the post with opaquely nonsensical bad-faith counterarguments that are a lot of work to refute.

I agree with you in an ideal world, but sadly this isn’t one.

As I understand it, they have a very specific purpose. The customer needs someone to blame in making difficult decisions. The difficult decision process itself is secondary.

Yeah, that was my thought too... alternative headline: "ChatGPT-4 significantly decreases the need for business consultants".

So when AI is better than humans at everything, the takeaway will be that humans weren't so great after all?

perfect tool for a consultancy: take a fresh graduate, pair them with an LLM tool and charge big bucks. not much different from current but the client will get a much more confident consultant and will be happy to fork over more money.

And how even more useless they will be in the near future.

