ChatGPT-4 significantly increased performance of business consultants (d3.harvard.edu)
309 points by bx376 9 months ago | 272 comments



Well, these sound like perfect tasks for GPT:

"Participants responded to a total of 18 tasks (or as many as they could within the given time frame). These tasks spanned various domains. Specifically, they can be categorized into four types: creativity (e.g., “Propose at least 10 ideas for a new shoe targeting an underserved market or sport.”), analytical thinking (e.g., “Segment the footwear industry market based on users.”), writing proficiency (e.g., “Draft a press release marketing copy for your product.”), and persuasiveness (e.g., “Pen an inspirational memo to employees detailing why your product would outshine competitors.”)."

Here is the GPT response to the first task: https://chat.openai.com/share/db7556f7-6036-4b3d-a61a-9cd253...

A confident GPT hallucination is almost indistinguishable from typical management consulting material...


1) Your ideas are bad.

2) Spreadsheets exist.

3) No-one cares about your marketing copy.

4) No-one finds your c-suite babble inspirational.

This is almost perfect input to an LLM exactly because of how low value it is in the first place.


Ha, it's fun to dunk on management consultants, but I think the magic is they are like pop music producers.

Somehow they're able to make the C-suite hoover up LLM shovelware the same way top producers can take super obvious music and sell I-V-vi-IV, but when we try the same chords it's uninspired and no one wants to listen to it.


It’s insane that people think producing pop music is easy. Competing in the pop music market is a cutthroat business with a lot of competition, and the good ones are worth their price in gold.

When Scorpions wanted to be resurrected they hired Desmond Child to produce them and he absolutely crushed it. These people are very good at what they do and there are very few of them.


The perception is that success in the field is driven largely by factors other than the quality of the music. It'd be extremely interesting to see a Richard Bachman / Stephen King [1] type experiment with a Desmond Child, Max Martin, or whoever else.

Keep their existence completely out of the picture, and have them scout and produce a talented no-name, but require the no-name to use only the sort of avenues that would be openly available to anybody/everybody: YouTube, TuneCore, social media, etc. Would the new party now be meaningfully likely to have a real breakthrough?

[1] - https://en.wikipedia.org/wiki/Richard_Bachman


A lot of the public believes that what you're talking about already happens, for what it's worth. "Industry plants."

Something can be extremely catchy yet widely panned as low quality in music, so even within "just the music" there are several dimensions at play regardless of marketing, etc. Such as whether it's timed right - are there enough people ready for that song at that time?

The idea that "most people will just listen to and be fans of whatever the big media companies put out there" doesn't stand up to much examination, or to conversation with "most people."

People do often make breakthroughs on SoundCloud, TikTok, whatever. Do you think having the invisible support of a Max Martin would lower their chances? You'd need to run your experiment a hundred or a thousand times before you could really compare the success rate of your plants to the rest of the crowd, but it's hard for me to believe that they wouldn't have an advantage. The music industry isn't known for its charity; if labels could get away with not paying those people without another label beating them in the market, why would they?


> other than the quality of the music

The ‘quality’ of a pop song is how much popular appeal it has. That’s the basis of the genre, even reflected in the name.


But the qualities of the song are not what make it popular. With "pop" music there are far more important forces at play, namely the quantity and quality of the song's publicity. One of the big things you get with a big time producer is big time connections and a lot of "airplay" in mixes, commercials, TV shows, etc. You also open the door for more collaborations with other popular artists.

Occasionally you'll have a song that breaks through due to sheer catchiness, but this is the exception rather than the rule.


In practice you need both. Max Martin himself has produced and written songs for plenty of no-names, and even for an artist with the requisite marketing support, bad or uncatchy pop songs can absolutely ruin someone who would otherwise make it big.


There is a third factor: the quality of the mix. People like Serban Ghenea[0] get hired to make the sound world-class.

[0] https://en.wikipedia.org/wiki/Serban_Ghenea


Which is why Michael Jackson hired Quincy Jones to fix his music.


“Just Blaaaaazzzee”

This is admittedly very niche but I see your point.

Also: I remember a time in the early 00s when almost every song on MTV began with Rodney Jerkins whispering “DARKCHILD” over the music.

Ultimately isn’t it just branding though? Would you buy Coca-Cola if it had some other label on the bottle? Or watch Mission Impossible 14 starring Some Dude? I’m not sure there’s a lot of fields where things are really competing on their own merits rather than the accumulation of their past successes.


Define quality


It’s also insane people think running a large business is easy, or that management consultants aren’t worth what they’re paid.


Worth here is the key word. I did an MBA which was specifically designed to get people into consulting; we did many consulting projects for real multinationals, and the key lesson was: "What does the hidden client want?"

i.e., someone has been tasked with getting some consultants to come up with a suggestion, but the key question is what THEIR boss wants to hear. If you can work that out and give it to them, then you've earned your 'worth'.


Well, there's also the case where some internal engineering and product management think they have the right answer. But it may be, quite literally, a bet-the-company sort of thing. (Especially outside the software realm, where once the bus has left the station it's not turning around.)

Now I know there are people here whose reaction is that executive management should just shut up and listen to what the worker bees say. But it actually doesn't seem unreasonable to me (and I've been on the product management end of things in a case like this) to have some outside perspective from some people who are mostly pretty smart to nod their heads and say this seems sensible.

As a bonus they create big spreadsheets that make the business planning types happy and keep them out from underfoot.


Wow I’m sure someone that paid some money for an MBA and then stopped at the gate of an entry level position has expertise in this matter.


I hope someone builds a management consultant gpt to find out.

Until then, our leaders, experts and institutions were few of those things during the pandemic.

How large businesses are built and run has changed faster and more in the past few years than the principled predictions of a business's future direction that are based on lagging indicators.

It also depends on how cookie-cutter the management consultant frameworks and “toolkits” are.

It’s no coincidence that it’s mostly juniors doing so much of the work and billing. New or average talent is more profitable per hour to bill than experience.

Financially, if $7-8 of every $10 spent on improvement went to the management consulting undertaking, the other $2-3 is all that's left over for everyone else, without them even knowing. This would be the coup, if it were true.

The fun part to watch is whether tech people will be able to learn business more easily than business people will be able to learn and apply tech when they can't understand its capabilities or possibilities beyond talking points.

The technical analyst will M&A the business analyst. Maybe they learn to extract Management Consultant type value too.


Management consultants get comparatively little money from (large) firms. The total revenues going to McK, BCG, and Bain are only about $30B annually.


Well, if it were that hard, ChatGPT wouldn't be able to help them.


Check out the movie "The Wrecking Crew". It's about studio musicians fixing, rewriting, playing, and singing the music created by the "bands" you've all heard of, so the albums were good.

Then, the bands belatedly had to learn how to sing and play in order to go on tour.

I think about The Wrecking Crew whenever I hear the sob stories about bands being underpaid and the producers reaping the lion's share of the profits.


There's the story of David Cassidy. He got cast in the Partridge Family (the rock band family) largely because of his looks and his mother (Shirley Jones). His voice was set to be dubbed for the songs, but it turned out he had a golden throat. (The rest of the cast, besides Jones, could neither play nor sing.)

The producers hired top shelf songwriters to write the songs, and several hit albums were produced. (It really is good music, despite being bubblegum.)

Cassidy, however, decided that he had songwriting talent and chafed. He eventually left the show, and with the megabucks he earned on the show, produced albums. They're terrible.

The same thing happened to the Monkees.


Is there a HN for management consultants?

Depending on who you ask, pop music is a formula, and that one dude from Sweden has the formula perfected.


fishbowl?


The magic is in:

A) Forming cross-company cliques (a lot of the C-suite is ex-consultant and they scratch each other's backs).

B) Ego stroking and typical sales ("this executive is a visionary who must be furnished with top-quality steak and strippers").

C) Letting you know on the sly what their other customers are doing that seems to be working.

D) Providing industrial grade ass cover for decisions that the C suite want to make but are afraid to make by themselves (like layoffs).


So much of it is about information awareness. Like it or not, these consultants and analysts talk to hundreds of C-levels all the time. They become excellent information sources about what is working, what is not working, and about business risks that a particular executive may not be aware of. Yes, there is the potential for group-think, and the bad ones shill for a particular technology or process without any basis in success. But the good ones provide guidance to the executives that might be working in information-free areas, making them aware of concepts, technologies, and processes that either present risks to their businesses or represent good practices they really should adopt. It's easy to be cynical about this, but there are many good business leaders who are not analytical, and are in need of this kind of guidance.


>It's easy to be cynical about this

It's probably easier to assume that their job is to provide objective expert advice since that's what they say they do.

I'm being realistic here, not cynical.


>It's probably easier to assume that their job is to provide objective expert advice since thats what they say they do.

You are mistaking "expertise" for fashion and a good voice.


The comment might be spot-on for some companies and industries, but really... not all business culture is the same. By painting "all management" in this light you are showing the same one-dimensional thinking that is being criticized here.


Did you think I was making some comment about the millions of businesses and many governments that don't use the services of these consultants?

I can assure you I wasn't.


Don’t underestimate the ass covering.

I reckon 60% of management consulting work is just ass covering for a director with no conviction.


The more expensive the cover the better.

“I paid a world-class consulting company top dollar to vet this idea and they produced a ton of documents about how great it was. And yet it failed. But I’m not at fault here. What more would you have had me do?”

There was an article on HN years ago about top graduates from Harvard-like schools being sucked into consulting companies and discovering their job was to be paid tons of money writing reports that support whatever the exec of the moment wanted to hear.


> their job was to be paid tons of money writing reports that support whatever the exec of the moment wanted to hear

Sounds like an extremely nice job.


Not really.


Well, depends on how much money is "tons of money".


Do you have a link? I can't find it.


Yeah, I think that's the majority of the grunt work done by junior consultants.

I don't doubt that if the high-level decision agreed upon is "more layoffs because AI" and they were asked for a 60-page report to justify it, ChatGPT would help inordinately in fleshing it out with something that sounds fairly plausible.


There's a lot of boilerplate that actually takes quite a while to write from scratch. If the people involved have a pretty good idea in their heads of what fundamentals are fairly sensible and which are probably sort of irrelevant or even wrong, something like ChatGPT is actually pretty good at churning out at least a decent pre-draft that can save quite a bit of time. I've used it a fair bit for introductory background that I can certainly clean up faster than I could put together from scratch.


Or, you have a break out hit like the Spice Girls.

If it was so bad, then why do people listen?

There is still a market.

Does the market suck? Full of idiots?

Your argument ends up being that successful things are bad, because humans are just idiots and thus if something is successful it is because it is just liked by idiots.

As much as I might agree generally, it doesn't get you far.


Back then, payola meant that people listened to whatever the labels wanted to make popular. It's very much an intentional, manufactured factory.


Honey, everything is an "intentional, manufactured factory". Or do you just wake up every morning and say to yourself "let me type some random code and see what kind of software comes out"??


> Your argument ends up being that successful things are bad, because humans are just idiots and thus if something is successful it is because it is just liked by idiots.

I mean I'm not sure that this is that far from the truth in some domains, though it depends on how you define idiocy. There is, for example, a market for demolition derbies. Of course all of us are idiots in some ways so we should be careful about whom we disparage.


Sounds like you are tunnel visioning in your analysis of the music? There are so many more things to it than the chord progression.

On the songwriting side, there's the lyrics which have both a phonetic and a semantic component. There's also the fact that many people will mishear the lyrics and their evaluation of them will be based on the mishearing. There's the melody. Does it work together with the chords to highlight the key parts of the lyrics?

Then there's the performance where there are a million ways to stand out or flop. Loudness, timbre, timing and even detuning can all be used for expression.


Are you trying to tell me that "starbucks lovers" was intentional?


Often the benefit of management consultants is they help the company feel better about firing or laying off people.

Actually a robot would be perfect at that. No one likes doing layoffs but chat gpt won't mind


> Actually a robot would be perfect at that. No one likes doing layoffs but chat gpt won't mind

Maybe chatgpt should be trained to care.


> A confident GPT hallucination is almost indistinguishable from typical management consulting material...

If you're measuring based on output, sure, but... the value of any knowledge worker is primarily driven by the input, that is, a client doesn't want "10 ideas" they want "10 [valuable] ideas [informed by the understanding of the business and the market they're operating in]". If a management consultant said "boat shoes" in response to this question they would not have a client much longer.

You could apply this same nonsense task to software engineering, i.e: ask ChatGPT to "write 10 lines of code" and it'll be indistinguishable from the code we churn out day after day.


So you ask for 20 ideas and filter. Even if you throw away 19, it's still useful.


Even if you throw all of them away, the point is the human is simply more effective in their role. Just breaking the ice, the writer's block; anything to get out of the creative rut that we all fall in, is worth its weight in GPUs.


Thank you, you express it much better than I can. "Diabetic-Friendly Shoes" from the above list immediately had me thinking in half a dozen different directions at once.


Not necessarily if the remaining one idea is still garbage.


It's not. That's the purpose of filtering them.


How many turds should I filter through until I suddenly find gold?


How bad do you want the gold?


Probably should have chosen a valuable but inedible element.


They’re all bad ideas, though.


> Target: Dog Walkers

> Features: Built-in waste bag dispenser

I'm not yet sure whether I hate it or love it.

> Target: Visually Impaired Individuals

> Features: Haptic feedback

Haptic shoes, how revolutionary!


I’m just imagining stepping in dog poop and my only way to help myself is also covered in dog poop.

The shoes seem to give you two options for cleaning up your dog's poop: (1) Bend down twice, once for the bag, once for the poop. (2) Bend down once, nice and close to the poop, and get a bigger whiff than otherwise.

I’m just not looking for ways to interact manually with my shoes more than I have to…


I saw a built-in waste bag dispenser recently while helping walk a friend's dog, and honestly it was pretty convenient.


Yes but how much more convenient would it be with haptic feedback


Yeah some of those have potential lol


> A confident GPT hallucination is almost indistinguishable from typical management consulting material...

Sounds perfect for both Harvard and those linked to the institution.

My employer has hired McKinsey a few times, known to recruit from HYP, and their output has been subpar to say the least. My entire experience with these institutions has been fairly uniform in that regard.

I know it’s anecdotal. But it feels like there’s a lot of confirmation bias with these sorts of studies.


The GPT responses read like they were lifted from MAD magazine.


Several of those aren't even new.


Makes sense. The bullshit generator can replace professional bullshitters.


does that say more about gpt or management consulting


Well, they buried the lede with this one. Using LLMs was better for some tasks and actually made performance worse for others.

The first task was a generalist task ("inside the frontier" as they refer to it), which I'm not surprised has improved performance, as it was purposely made to fall into an LLM's areas of strength: research into well-defined areas where you might not have strong domain knowledge. This is also the mainstay of early consultants' work, in which they are generalists in their early careers – usually as business analysts or similar – until they become more valuable and specialise later on.

LLMs are strong in this area of general research because they have generalised a lot of information. But this generalisation is also their weakness. A good way to think about it: an LLM is like a journalist doing research. If you've ever read a newspaper, you often think you're getting a lot of insight. However, as soon as you read an article on an area of your specialisation, you realise the analysis is full of flaws; they don't understand your subject anywhere near the level you would.

The second task (outside the frontier) required analysis of a spreadsheet, interviews and a more deeply analytical take with evidence to back it up. These are all tasks that LLMs aren't strong at currently. Unsurprisingly, the non-LLM group scored 84.5%, and between 60% and 70.6% for LLM users.

The takeaway should be that LLMs are great for generalised research but less good for specialist analytical tasks.


I was thinking about this last night. It’s a new version of Gell-Mann amnesia. I call it LLM-Mann amnesia.

When I ask a programming question, ChatGPT hallucinates something about 20% of the time, and I can only tell because I’m skilled enough to see it. For all the other domains I ask it questions in, I should assume at least as much hallucination and incorrect information.


I see it this way: for drill-down thinking from a broad to a specific concept, AI seems to be helpful when supplementing specialist work. However, as you both mentioned, when more focused and integrated answers are needed, AI tends to hinder performance.

However, as the paper noted, when working within AI's areas of strength it improved not only efficiency but the quality of the work as well (accounting for the hallucinations). As you mentioned:

> When I ask a programming question, ChatGPT hallucinates something about 20% of the time, and I can only tell because I’m skilled enough to see it

This matches their Centaur approach, delineating between AI and one's own skills for a task, which, with generalized work, seems to fare better than not using AI at all.


LLMs are broadly good at things that average knowledge workers are good at or can be trained to be good at reasonably quickly.


Comparing LLMs to journalists is a good insight.


This is hilarious. As impressive as GPT-3/4 has been at writing, what's more shocking is just how bullshit-y human writing is. And a "business consultant" is the epitome of a role requiring bullshit writing. ChatGPT could certainly out-business-consultant the very best business consultants.

Sometimes to be taken seriously at work, you need to take some concise idea or data and fluff it up into multiple pages or a slide deck JUST so that others can immediately see how much work you put in.

The ideal role for ChatGPT at this moment is probably to take concise writing and expand it into something way larger and full of filler. On the receiving end, people will endure your long-winded document or slide deck, recognize you "put in the work", and then feed it back into ChatGPT to get the original key points summarized.


> As impressive as GPT-3/4 has been at writing, what's more shocking is just how bullshit-y human writing is.

Yeah. Most people have focused on what LLMs can do, but I think it’s equally if not more interesting what they can’t do, and why.

When we say LLMs can generate text we’re painting brush strokes as broad as a 10-lane highway. Apparently we have quite limited vocabulary about what writing actually is, and specifically what categories and levels exist.

For instance, it’s fun (and in my view completely expected) to see that at courteous emails, LinkedIn inspirational spam, corp-speak, etc., GPT outperforms humans with flying colors, on the first attempt too! Whereas if you’re asking for the next book of Game of Thrones or any well-written literature, it falls flat: incredibly boring, generic, full of platitudes and empty arcs and characters.

We have to start mapping the field of writing to a better conceptual space. Currently it seems like we can’t even differentiate between the equivalent of arithmetic and abstract algebra.


To me it looks very analogous to AI-generated "art", it's very easy to generate some generally esthetically pleasing visuals, but the depth of the art stays in proportion with the input effort... Which is often not much. All of this shouldn't be very surprising really, and there's still a lot of usefulness to it, if only for depreciating the low-quality copy-paste productions and making the really unique and novel ones even more valuable.


What is “depth of the art”?


Hah. Good question. You could write a whole grad level thesis on it.


Yeah, I couldn't say, it's just "vibes" I guess, just like the filler text produced by business consultants it's just not something that I feel would be missed if lost. It all looks the same at some level even when it's superficially different. Midjourney especially is very uniform in this regard, everything looks great, but it's kind of flat at the same time.


storytelling and perceived value


LLMs are stunningly good at language tasks: almost all of what us old-timers called NLP is just crushed these days. Summarization, Q&A, sentiment, the list goes on and on. Truly remarkable stuff.

And where there isn’t a bright line around “fact”, and where it doesn’t need to come together like a Pynchon novel, the generative stuff is smoking hot: short-form fiction, opinion pieces, product copy? Massive productivity booster, you can prototype 20 ideas in one minute.

But that’s about where we are: lift natural language into a latent space with some clear notion of separability, do some affine (ish) transformations, lower back down.

Fucking impressive for a computer. But if it can really carry water for an expensive Penn grad?

You’re paying for something other than blindingly insightful product strategy.


I wonder how long it takes AI to get good at law. Right now the verbal tasks it excels at are similar to the artistic ones: namely, solving problems with enormous solution spaces that are robust to small perturbations. That is, change a good picture of an angry tree man slightly and it's still probably a good picture of an angry tree man.


I've tried using it a lot for writing motions. It can actually do a pretty decent job of writing motions, and it can come up with some arguments that you might not have thought of. You just have to ignore all its citations and look everything up yourself, otherwise this:

https://www.reuters.com/legal/new-york-lawyers-sanctioned-us...


Depends on what you would classify as good, but IBM Watson is already used in law firms today: https://www.ibm.com/case-studies/legalmation

LLMs are most often best at helping humans do their tasks more effectively, not replacing them completely.


Isn't ChatGPT getting progressively better scores on medical and law exams? It will probably pass the USMLE and the bar one day. If it doesn't already.

It's gonna be interesting.


Yes, but we should expect that; the answers are in its training data.

The problem is that passing tests is an okay proxy for competence in humans, but if you think of LLMs as a giant library search engine, the thing it is competent at is identifying and regurgitating compiled phrases from its records.

Which is awesome. It can't be a doctor.


Relatedly, a lot of the cap R fanfic crowd points at stuff pretty clearly lifted out of the Metamorphosis of Prime Intellect or whatever.

And it’s dope that it can do that!

But let’s keep our heads about what it is.


Yes and that's amazing -- but law exams resemble programming exams. In the wild, both labors require you to keep a mountain of project-specific context in your head, something that tests like the LSAT cannot evaluate.


I don't buy it. LLMs cannot do anything reliably, no matter how constrained the domain. Their outputs are of acceptable quality when handed back to a person who will use their human brain to paper over the cracks. People can recognize when the output is garbage, figure out minor ambiguities, and subconsciously correct minor factual or logical errors. But I would never feed LLM results directly into another computer program. This rules out most traditional NLP tasks.


I’m sympathetic to the instinct to push back on the absurd boosterism ("these things are an existential threat to humanity this year"); it’s fucking annoying.

But they can do plenty of useful stuff reliably. It’s not “be generally intelligent”, which they are nothing even remotely close to, but we know you don’t dig the LLM hype from that comment. Yeah, they get that every time.


I have tried to use LLMs, namely GPT4 and Llama-2, for sentiment analysis. They did quite poorly. I asked them to identify which sentiments from a list are found in a given text, with output formatted as a comma-separated list. In response I usually got just that. But sometimes I got a prose explanation of the sentiment, a list containing different keywords than requested, a list formatted differently than I wanted, or nothing useful at all.

Sentiment analysis is easy. It wasn't the end goal, just a "hello world" example to verify my tools were set up correctly. I ran into unsolvable problems in the tutorial.
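
A minimal sketch of that kind of setup (complete() here is a hypothetical stand-in for whichever client call is used, not the actual code from the experiment): even with a constrained prompt, you end up validating the reply rather than trusting its format.

    # Sketch only: complete() is a hypothetical stand-in for whatever
    # LLM client (OpenAI, llama.cpp bindings, etc.) is in play.
    ALLOWED = ["anger", "joy", "sadness", "fear", "surprise"]

    def build_prompt(text):
        return (
            "Which of these sentiments appear in the text below? "
            "Choose only from: " + ", ".join(ALLOWED) + ". "
            "Reply with a comma-separated list and nothing else.\n\n"
            "Text: " + text
        )

    def parse_reply(reply):
        # Guard against the format drift described above: keep only
        # the labels we asked for, discard prose or stray keywords.
        parts = [p.strip().lower() for p in reply.split(",")]
        return [p for p in parts if p in ALLOWED]

    # reply = complete(build_prompt("The refund took months. Never again."))
    # print(parse_reply(reply))  # e.g. ['anger', 'sadness']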

I have no use for tools which do amazing things sometimes but which cannot be reasoned about and cannot be prevented from producing garbage. Maybe other people will find uses for them, though. I'll keep an open mind and check back in five years.


Would you feed human output directly into a computer program?

I'm just saying, we invented backspace for a reason. LLMs have no backspace. It's insane they work as well as they do.


So your other reply got flagged which I thought was a little harsh (I mean you were pushing it but who am I to talk).

If you’re not convinced about sentiment analysis on e.g. LLaMA 2, I think you’re wrong, but maybe I’m wrong.

If you’re up for it, let’s run an experiment; I’ve got a GPU or two in my living room. This thread seems like a pretty great test set, actually.

Maybe we both learn something richer than some benchmark stat?


I too thought flagging the comment was a bit too harsh so I tried vouching for it, but it didn't get resurrected.

I note that the comment is [dead] not [flagged] [dead], so maybe its state has to do with something other than the content of the comment? Just [dead] is, I think, a shadowban.

I checked the poster's comments, but since it's a new account there's very few of them and I can't determine the reason for the [dead] from them.


> almost all of what us old-timers called NLP is just crushed these days

For this to be true for most production service use cases, LLMs would need to be at least ~10X faster. I generally agree they can be quite good at these tasks, but the performance is not there to do them on large datasets.


Try asking it for specifically NLP-ized text and it does it very well...

(but then tells you not to use it as it is "unethical")


You aren't talking about the NLP the GP is talking about.


You're right. That'll teach me to reply to comments before I had my morning coffee.

For others arriving here: I suspect OP meant Natural Language Processing and I was talking about Neuro-linguistic Programming.

I've had my caffeine now.


Ok now I’m /r/TooAfraidToAsk… NLP means something else relevant to probabilistic language models?


Natural language processing.

I actually don't know what you thought it meant.


Says more about how useless BCG consultants are.


I’m starting to think there’s an LLM equivalent to the old saying about how everything the media writes is accurate except on the topics you’re an expert in. All LLM output looks to be good quality except when it’s output you’re an expert in.

People who have no background in writing or editing think LLMs will revolutionize those fields. Actual writers and editors take one look at LLM output and can see it’s basically valueless because the time taken to fix it would be equivalent to the time taken to write it in the first place.

Similarly people who are poor programmers or have only a surface level understanding of a topic (especially management types who are trying to appear technical) look at LLM output and think it’s ready to ship but good programmers recognize that the output is broken in so many ways large and small that it’s not worth the time it would take to fix compared to just writing from scratch.


LLMs are not worthless for programming. You just cannot expect them to ship a full program for you, but for generating functions with limited scope, I found them very useful. How to make use of a new and common library, for example. But of course you have to check and test.

And for text, I know people who use it successfully (professionally) to generate texts for them as a summary of some data. They still have to proofread, but it saves them time, so it is valuable.


I've been using it for code review. I just paste some of my code in and ask the AI to critique it, suggest ideas and improvements. Makes for a less lonely coding experience. Wish I could point it to my git repositories and have it review the entire projects.

I've had mixed experiences with getting it to generate new code. It produced good node.js command line application code. It didn't do so well at writing a program that creates a 16-bit PCM audio file. I asked it to explain the WAV file format, and things like the lengths of structures got so confusing I had to research the stuff to figure out the truth.
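
For what it's worth, the standard library sidesteps the header arithmetic entirely; a minimal sketch of writing one second of 16-bit PCM with Python's wave module:

    import math
    import struct
    import wave

    RATE = 44100  # samples per second

    # The wave module fills in the RIFF/WAV header lengths that are
    # easy to get wrong when writing the format by hand.
    with wave.open("tone.wav", "wb") as wav:
        wav.setnchannels(1)    # mono
        wav.setsampwidth(2)    # 2 bytes per sample = 16-bit PCM
        wav.setframerate(RATE)
        frames = b"".join(
            struct.pack("<h", int(32767 * math.sin(2 * math.pi * 440 * i / RATE)))
            for i in range(RATE)  # one second of a 440 Hz sine tone
        )
        wav.writeframes(frames)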


This mirrors my experience. Very helpful writing node.js application code, but struggles to walk through simple operations in assembly. My hunch is that the tokenization process really hurts keeping the 1s and 0s straight.

It's been hit or miss with Rust. It's super helpful in deciphering compilation errors, decent with "core Rust", and less helpful with 3rd-party libraries like the cursive TUI crate.

Which comes as no surprise, really, as there's certainly less training data on the cursive crate than, say, expressjs

Also FWIW I have actually pointed it at entire git repos with the WebPilot plugin within ChatGPT and it could explain what the repo did, but getting it to actually incorporate the source files as it wrote new code didn't work quite so well (I pointed it to https://github.com/kean/Get and it would frequently fall back to writing native Swift code for HTTP requests instead of using the library)


>LLMs are not worthless for programming.

They can be worse than worthless. They can sabotage your work if you let them, making you spend even more time fixing it afterwards.

For example, I've used GPT-4 as a sort of Google on steroids with prompts like "do subnets in gcloud span azs" and "in gcloud secret manager can you access secrets across regions". I very quickly learned to ask "is it true" after every answer and to never rely on a given answer too much (verify it quickly; don't let misinformation get you too far down the wrong route). So is it useful? Yes, but can it lead you down the wrong path? It very well can. The less experience you have in the field, the more easily it will happen.

>You just cannot expect them to ship a full program for you, but for generating functions with limited scope, I found them very useful

Entire functions? Wow. I found it useful for generating skeletons I then have to fill by hand or tweak. I don't think I ever got anything out of GPT-4 that is useful as-is (except maybe short snippets 3 lines long).

However, I found it extremely useful in parsing emails received from people or writing nice-sounding replies. For that it is really good (in English).


"They can be worse than worthless."

But it is the same when you blindly follow some Stack Overflow answer.

And yes, I always have to tweak and I use it only rarely. But when I did, it was faster than googling and parsing the results.


Nobody ever made a code editor plugin that reads random SO answers and automatically pastes them over your code.

The amount of fighting I needed to do against MS development tools mangling my code recently is absurd. (Also, who the fuck decided that autocomplete on space and enter was a reasonable thing? Was that person high?)


>"I found it useful for generating skeletons I then have to fill by hand or tweak".

Even this can be a big time saver that increases productivity.

Just like others have said, it isn't going to write a Pynchon novel, but it does do a great job at the other 99% of general writing that is done.

Same for computers: the average programmer isn't creating some new Dijkstra algorithm every day; they are really just connecting things together and cranking out the equivalent of generic boilerplate.


> They can be worse than worthless. They can sabotage your work if you let them, making you spend even more time fixing it afterwards.

I basically gave up on LLMs because I was spending more time figuring out what they did wrong than actually getting value.

People without programming skill are still impressed by them. But they have yet to learn or deliver anything of value, even with the help of chatbots.


I have twenty years of programming experience and LLMs give me a significant productivity boost for programming on a daily basis: https://simonwillison.net/2023/Sep/29/llms-podcast/#code-int...


I have met my share of folks with decades of experience that was not of quality. The most hilarious range from those who open tar.gz files in Notepad, wondering where the code is, to those who work on the web but don't know what XSRF is. Experience, however long, doesn't count if it's of the not-so-great type. Not saying this is the case.

LLMs do produce impressive code. Even if they were indeed just procedural generators it would still be impressive. The code has structure and appears useful.

But the issue is that you can tell it makes no sense, there is no thought process behind it. It fits in no greater picture.

Even if you add more context it still has no purpose.

People who find this useful are the same type who copy Stack Overflow code that they don't understand. It kinda works when it does, but again, it doesn't fit in the bigger picture.

Code isn't about spelling out instructions (an... AI can do that); code is about what goes where, in a way where the what changes as often as the where. It's the bigger picture. So yes, it can help and replace those who spell out instructions, but it will be hard to replace those who are required to deliver more.


"But the issue is that you can tell it makes no sense, there is no thought process behind it. It fits in no greater picture."

Completely agree with you. That's my job. The LLM is effectively my typing assistant.


Sorry, I may have gotten something wrong by skimming your link. Is this the "significant project" LLMs assisted you with?

https://github.com/simonw/sqlite-history


That's one of about a dozen at this point - but yeah, that's the one that I used LLMs to research the initial triggers and schema design for.

Here's the transcript (it pre-dates the ChatGPT share feature): https://gist.github.com/simonw/1aa4050f3f7d92b048ae414a40cdd...

I wrote more about it here: https://simonwillison.net/2023/Apr/15/sqlite-history/

Here's another one I built using AppleScript: https://github.com/dogsheep/apple-notes-to-sqlite - I wrote about that here: https://til.simonwillison.net/gpt3/chatgpt-applescript


While it is impressive that an AI can generate all this, the code is anything but significant. Using triggers for history is one sure way to bring a scalable system down fast, and avoiding them is one of the first lessons a junior will learn.
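
For anyone following along, a toy illustration of the trigger pattern in question (a sketch, not the actual sqlite-history schema):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE docs_history (
        doc_id INTEGER,
        body TEXT,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Every UPDATE on docs copies the old row into docs_history.
    CREATE TRIGGER docs_track_updates AFTER UPDATE ON docs
    BEGIN
        INSERT INTO docs_history (doc_id, body) VALUES (old.id, old.body);
    END;
    """)
    conn.execute("INSERT INTO docs (body) VALUES ('v1')")
    conn.execute("UPDATE docs SET body = 'v2' WHERE id = 1")
    print(conn.execute("SELECT doc_id, body FROM docs_history").fetchall())
    # [(1, 'v1')]

The write amplification (every update doing an extra insert) is exactly the overhead being debated here.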


Are you sure that holds with SQLite? My benchmarks so far have shown it to add a pretty inconsequential overhead.

Also: not every system has to be a scalable system. That's another lesson junior engineers (should) learn.


I honestly don’t understand how people can say LLMs are useless for coding. Have you tried ChatGPT 4, or are you basing this take on the obsolete 3.5? I’m a professional programmer and I think LLMs are extremely useful.


I’ve used GPT 4. It’s not helpful in any domain in which I’m already proficient. If I’m having to use a new language or platform for whatever reason it’s mildly quicker than alt-tabbing to stack overflow, but probably not worth the subscription.

For graphics tasks GenAI is absurdly helpful for me. I can code but I can’t draw. Getting icons and logos without having to pay a designer is great.


Programmers don't think that, though, or at least not all the time.

You could say similar things about Stack Overflow, and yet we use it.


Stack Overflow responses are well known to be misranked. I’ve heard a rule of thumb that the actual correct answer is typically about #3.


And #1 is usually broken or wrong, due to its (typically) old age. The longer it has to accumulate upvotes the less relevant it becomes.


For any managers reading: Chat GPT and Stack Overflow are not the same kind of thing.


Indeed they're not. And GPT-4 tends to outperform SO in my experience.


Yep. ChatGPT is like having a junior engineer confidently asking to merge broken garbage into your codebase all the time. Adds negative value for anyone that knows what they’re doing.


But with one crucial difference: it's a junior programmer that can make changes based on your feedback in a few seconds, not a few hours. And it never gets tired or frustrated.

I find treating it like an intern is amazing productive: https://simonwillison.net/2023/Sep/29/llms-podcast/#code-int...


hahahah. A friend of mine has a problem with a contractor at his workplace that tries to PR in shell scripts written with Copilot. My friend spends an hour to explain why a script generated in 5 minutes is horrifically awful and will likely take down the company. He's legitimately angry about it.


It seems like the only ways to delegate programming tasks are to write tests for your subordinate's code, or to review it tediously yourself, or to just trust the hell out of them.


> I’m starting to think there’s an LLM equivalent to the old saying about how everything the media writes is accurate except on the topics you’re an expert in.

This is true for media articles but for LLMs I feel like it's the opposite. Like people who aren't specialists don't fully appreciate how great it is at those tasks.


everyone you described shares something in common.

they aren’t good at using language models.


Nor are 99.9% of humanity. I think that's the point.


Gell-Mann Amnesia!


Gell-Bott Amnesia.


[flagged]


https://news.ycombinator.com/newsguidelines.html

Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."


Why the anger?


Most management consultants are useless. But there are some realities you must accept.

Number 1. In a team of 20-30 engineers there is only one extremely good "why is he with us" engineer who is great at technical stuff and at being a people person. However, no matter how nice his approach to his job is, it is a job, and he will only drop hints about how the management should be done. He doesn't care about where the company is headed because he plays video games, has a family and has a literal life. He doesn't care about management and taking on undue responsibilities. Moreover, the people above have labeled him an "engineer" and do not see him as a "manager".

The rest of the engineers and managers have also adopted the approach of "not my problem", so you see a bizarre communication gap. Engineers working closely with the product don't want to talk to their managers, because the conversation goes like "if you know this so much, why don't you.... <a description of something that results in more work outside their JD>", and managers don't want to talk with engineers because "if you are so interested, why don't you.... <a description of something that results in more work outside their JD>"

From this progressive distance between managers and engineers comes the "management consultant". Management consultants have the upper-management-granted flexibility of going back and forth between engineers and managers. They can have conversations with full flexibility, but they are not bound by "why don't you...." phrases. They can talk with anyone, submit a report, and take home a year's worth of a manager's/engineer's salary in a month.

The conversation gap between product and business is where management consultants come in. And the funny thing is that management consultants target those "I don't want to but I should" work things and report them to the upper management. They can do this so well because they are not burdened with the "work" part.

Seriously, if you do some introspection, you will see there are plenty of things you know your company should do, but you don't want to voice them because it results in more work and, in fact, more risk. In comes a "good" management consultant who will discover those things and report them to upper management, who will create the system to get those jobs done.

That is my pitch: if anyone wants a management consultant, hire me. I am going to tell them why their company sucks in 20 different ways, with 18 of those points generated by ChatGPT.


Man, write drunk, but edit sober.


Off topic, but I had to follow up. It seems that "Write drunk, edit sober" is mis-attributed to Hemingway, who advised writing sober [1].

[1] https://www.writersdigest.com/be-inspired/did-hemingway-say-...


My apologies. I agree. How long does it take for Ketamine to wear off? I only had a little bit after breakfast. I hope the edit button stays.


Your horse tranquilizer-addled coworker seems to be expressing a few points about the workplace dynamics between engineers and managers. First, he believes that while there may be exceptional engineers who are also good with people, these engineers are generally not interested in managerial responsibilities. Second, he observes a communication gap between engineers and managers, where both parties avoid taking on additional tasks outside their job descriptions. Lastly, he argues that management consultants bridge this gap by identifying issues neither party wants to handle but should. He concludes by saying that he'd make a good management consultant because he can spot numerous ways a company could improve.

If all else fails, the LLM revolution will at least allow us to make sense of ketamine-induced rants on management.


Maybe it was done on ketamine, but the points are valid. I have seen it: consultants don't really bring 'new' or 'creative' solutions; they just help move the ideas around the calcified layers in the organization.


just said in 100x words...


Is the post that bad?

What does it say about me if I didn’t think that it was that bad?


It is an honest and unfiltered take.

My theory is that honest takes should be written on first take without revisions and without edits. The moment I massage a statement to be more coherent I am compromising on my honesty.


It’s a bit long for the TLDR crowd, and it was passionate.

But no, it was a good post and the cultural expectation to keep things shorter and more buttoned up has some real downsides.

I would have written this with more punch, but, well, see above.


The new tasks people get from talking to each other are usually well within their job description. They are just new tasks, and neither developers nor middle managers are allowed to drop useless tasks just because something valuable appeared.

Either way, in my experience management consultants just add new useless tasks to everybody's plate. I have never seen them actually decrease the number of tasks.


Needs an /s.


Somewhat agree; I know LLMs have boosted my programming output, mostly in writing JSDoc comments and PR descriptions. The things I don't really like doing.


If your docs and PR descriptions can be generated off file diffs, everyone's time could be better spent scanning the diff to come to the same conclusions.

Consider using your PRs and docs to capture the answers to the usual "why" questions, which an LLM won't be able to do.


Ah yes, but that would require actual effort, and in the end is only going to serve to improve someone else’s model.


The why is largely in the ticket and the what in the PR.


I've seen code bases survive three different ticket management systems. Meanwhile, the tickets never made it between the different systems, so if the 'why' isn't in the commit message, then it got lost to time.

I will admit that a lot of the really old decisions don't have much relevance to the current business, but the historical insight is sometimes nice.


Huh, your tickets aren't just a single vague title sentence and no description body?


Sometimes this is the case, but most tickets have detailed info about the bug or links to a Confluence page of design specs.


Agreed: the study only shows that BCG consultants' work is 40% noise without real added value... I guess customers should now ask for a 40% rebate!!! ;-)


Says more about how people will parrot the same phrase over and over for anything at all. It's just funny how you can predict a comment like this in every thread regardless of what it does.

"It says more about [insert]" anytime GPT does something just makes the phrase lose all meaning. Surely you have something meaningful to say?


Often effortposts aren’t worth it because someone will come along and Gish Gallop the post with opaquely nonsensical bad-faith counterarguments that are a lot of work to refute.

I agree with you in an ideal world, but sadly this isn’t one.


As I understand it, they have a very specific purpose. The customer needs someone to blame in making difficult decisions. The difficult decision process itself is secondary.


Yeah, that was my thought too... alternative headline: "ChatGPT-4 significantly decreases the need for business consultants".


So when AI is better at humans at everything, the takeaway will be that humans weren't so great after all?


perfect tool for a consultancy: take a fresh graduate, pair them with an LLM tool and charge big bucks. not much different from the current setup, but the client will get a much more confident consultant and will be happy to fork over more money.


And how even more useless they will be in the near future.


Not surprised. It's frighteningly good, and a perfect match for programming.

I often ask GPT-4 to write code for something and test whether it works, but I seldom copy and paste the code it writes - I rewrite it myself to fit into the context of the codebase. But it saves me a lot of time when I am unsure about how to do something.

Other times I don't like the suggestion at all, but that's useful as well, as it often clarifies the problem space in my head.


I used ChatGPT yesterday for code for the first time.

I gave it a nontrivial task I couldn’t google a solution for, and wasn’t sure it was even possible:

Given a python object, give me a list of functions that received this object as an argument. I cannot modify the existing code, only how the object is structured.

It gave me a few ideas that didn't quite work (e.g. modifying the functions or wrapping them in decorators, or looking at the current stack trace to find such functions), and after some back and forth it came up with hijacking the Python tracer to achieve this. And it actually worked.
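
For the curious, here is a minimal sketch of that tracer trick (my reconstruction, assuming CPython's sys.settrace; the helper names are made up, not the actual transcript):

    import sys

    def find_receivers(target, entry_point):
        """Record names of functions that receive `target` as an
        argument while entry_point() runs, by hijacking the tracer."""
        receivers = []

        def tracer(frame, event, arg):
            if event == "call":
                code = frame.f_code
                # Positional arguments occupy the first co_argcount locals.
                argnames = code.co_varnames[:code.co_argcount]
                if any(frame.f_locals.get(n) is target for n in argnames):
                    receivers.append(code.co_name)
            return None  # no per-line tracing needed inside frames

        sys.settrace(tracer)  # clobbers any debugger's tracer, as noted
        try:
            entry_point()
        finally:
            sys.settrace(None)
        return receivers

    def helper(x):
        return x

    def main(obj):
        helper(obj)

    marker = object()
    print(find_receivers(marker, lambda: main(marker)))  # ['main', 'helper']

(Only Python-level frames generate "call" events, so calls into C builtins won't show up.)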

The crazy thing is that I don’t believe it encountered anything like this in its training set, it was able to put pieces together which is near human level. When asked, it easily explained the shortcomings of this solution (e.g interfering with the debugger).


> The crazy thing is that I don’t believe it encountered anything like this in its training set, it was able to put pieces together which is near human level. When asked, it easily explained the shortcomings of this solution (e.g interfering with the debugger).

I have seen similar things. So, no, it's not regurgitating from its training data-set. The NN has some capacity for reasoning. That capacity is necessarily limited given that it's feed-forward only and computing is still expensive. But it doesn't take much imagination to see where things are going.

I'm an atheist, but I have this feeling we will need to start believing in "And [man] shall rule over the fish of the sea and over the fowl of the heaven and over the animals and over all the earth and over all the creeping things that creep upon the earth"[1] more than we believe in merit as the measuring stick of social justice, if we were to apply that stick to non-human things.

[1] - Genesis 1:26-27, Torah


The published article is not at all about programming tasks but about generating text for "strategy consultants".

Some examples, found on page 10 of the original article:

   - Propose at least 10 ideas for a new shoe targeting an underserved market or sport.
   - Segment the footwear industry market based on users.
   - Draft a press release marketing copy for your product.
   - Pen an inspirational memo to employees detailing why your product would outshine competitors.
Nothing of real value imho.


> Nothing of real value imho.

Without the right target market, business model, and effective methods to reach customers, the most brilliant pair of shoes or piece of code can be useless (unless someone works to repurpose them as art or a teaching tool).


Each question is too generic and there was apparently no specific input data to act upon. How can a valid business model be expected in those conditions?


I’ve also found the act of describing my problem to GPT-4 is sometimes just as helpful as the answer itself. It’s almost like enhanced rubber duck debugging.


So true. I've written entire prompts with several lines' worth of explanation, only to realize what my issue was and never hit the "send" button. Guess I should do that more often in life, in general.


We need an inverse GPT4-style LLM that doesn't provide answers but instead asks relevant questions.


GPT4 can do that too. Just show it something (code or text) and ask it to ask coaching questions about it.


I have tried adding prompts like this and it works really well. "Rather than giving me the answer, guide me using questions in the Socratic method".


This is one step removed from "try different things until it works" style of programming.

Not to say you're one of those programmers, but it certainly enables those sorts of programmers.


And what's the harm in that? That's how I first started out programming decades ago.


Absolutely bonkers algorithms that no one can make sense of unless they dedicate time to study and debug it

Also, it will have to be scrapped when anyone wants to tweak it a little. Sure monkeys randomly typing on a typewriter will eventually write the greatest novel in existence... but most of it will be shit

May $entity have mercy on your soul if the business starts bleeding tons of money due to an issue with the code, because the codebase won't


perhaps the difference is that that attitude is fine when building hobby or saas apps adding no real value to the world; however, that's not the type of behavior we'd expect to see from engineers responsible for critical systems dealing with finance, health, etc.


It's a hell of an articulate rubber duck!

https://en.wikipedia.org/wiki/Rubber_duck_debugging


Beware of that practice. If for some reason you get used to it too much, one day you may not have it, and you won't know where to start writing a function yourself.

It's similar to what happens to people who know a language (not a coding language), stop using it or go back to using a translator, and then, when they need to use it themselves, find they are unable to.


Having been a consultant, what strikes me about this is the next, to me seemingly obvious question: What if you just removed the consultants entirely and just had GPT-4 do the work directly for the client?

If you’re a client and need a consultant to do something, you have to explain the requirement to them, review the work, give feedback, and so forth. There will likely be a few meetings in there.

But if GPT-4 can make consultants so much better, I imagine it can also do their work for them. And if you combine this with the reduction in communications overhead that comes from not working with an outside group, why wouldn’t clients just accrue all the benefits to themselves, plus the benefit of not paying outside consultants or dealing with the overhead of managing them?

This is especially the case when the client is already a domain expert but just needs some additional horsepower. For example, marketing brand managers may work with marketing consultants even though they know their products and marketing very well. They just need more resources, which can come in the form of consultants for reasons such as internal head-count restrictions.

Anyway, I just wonder if BCG thought through the implications of participating in this study. To me it feels like a very short step from “helps consultants help their clients” to “helps clients directly and shows consultants aren’t really necessary.”

Especially so if the client just hires an intern and gives them GPT-4.


Companies like BCG and McKinsey are mostly about liability. As a CEO you call them, pay them the big bucks, and have them make up plans and strategies; if it works out you get the credit, and if it doesn't, well, "we worked tightly with experts from McKinsey, etc., so the blame isn't on me".


The frustrating one is when you've been telling management something for months (if not years), the consultant comes in, their report says what you've been saying, and only then does the company finally do what you've been saying all along! Coulda saved the company five figures just by listening to me. sigh politics.


Not sure why this would frustrate you.

People have ideas all the time internally. I'm going to assume the idea you had was one of many.

The issue is getting the real decision makers to buy into it. They aren't going to take the word of someone who works in some division. They want some rigor to it.

Bringing in someone who isn't tainted by the company's groupthink and can actually take a sober view of the situation puts some weight behind the recommendation.


Why is that frustrating? I find it validating


It's frustrating because you don't get money or credit for the idea.


Yeah, but I wonder if it’s even more powerful to say, “we asked the world’s most powerful AI and it recommended that we lay off 20% of our staff, while ensuring we treated them all fairly.”


That's probably what they say to each other... Just looking at the garbage produced by movie studios recently, you can't help but ask yourself whether the scripts are AI-generated... And that's just the scripts, let alone the crazy budgets that still produce movies that look like N64 games.


HN is so bad at predictions. Just a few months ago HN was awash with comments that confidently claimed LLMs were no more than stochastic parrots and unlikely to amount to anything.

> I can't help but think the next AI winter is around the corner. [0]

Yeah, right.

[0] https://news.ycombinator.com/item?id=23886325


Who claimed management consultants are not stochastic parrots?


That comment also said:

>If we're looking for a cost-effective way to replace content marketing spam... great! We've succeeded!

And if you read the article that’s almost exactly the level of output that we’re talking about.

   - Propose at least 10 ideas for a new shoe targeting an underserved market or sport.

   - Segment the footwear industry market based on users.

   - Draft a press release marketing copy for your product.

   - Pen an inspirational memo to employees detailing why your product would outshine competitors.
Also, for the second task the non-LLM group performed significantly better.


I'm not sure this paper is proof of much? Regurgitating press releases is sort of a stochastic parrot task.


Being bad at predictions is ok. It's the absolute lack of re-calibration that does me in.

If you make a hilariously bad prediction then that tells you your model about that thing is off and needs correcting.

So if you do nothing to that model and still make predictions...


That comment is going to go down as a HN hall of fame like that guy who said Dropbox is trivially replaced by rsync or something.


Probably. I have a habit of bookmarking predictions I see on HN to revisit them later, and most of those related to LLMs are aging very badly.


The claim that large language models would replace software engineers this year is aging very badly.


Indeed. If you have a link, I'll bookmark it.


We’re going to have legit AGI that can outperform humans in every way and HN will still find something to complain about. I love the tech news on here, but the constant cynicism on everything is exhausting.


“It’s AGI, but does it really understand anything? And I can’t even load the AGI into my Linux mainframe — how useless. Just another crypto wave.”


Have "AGI" outperform truck drivers first. They said autonomous trucks would replace all truck drivers by 2018.


Wrong animal. Stochastic horses.


There is a lot of office work that will, over time, be optimized using GPT-like services. I was tech-savvy enough to know that a lot of the office work I do is repeatable and could be done with scripts, but not good enough to write those scripts myself. Using ChatGPT allowed me to write them; it took me maybe 15-20 hours to get the scripts working perfectly. I knew just a little Python and nothing about pandas or XlsxWriter, yet I was able to create something that saves me an estimated 20-25 hours a week.

In my opinion, a lot of people here on Hacker News, being good programmers themselves, underestimate how services like ChatGPT can open a new world to non-programmers. They probably also make the non-inquisitive learn less. Previously, to learn how to stop multiple snapd services with a script, I would have googled and cobbled something together; today I just ask ChatGPT and get a working script in less than a minute.
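For illustration, a minimal sketch of the kind of script being described, assuming a pandas + XlsxWriter workflow; the file and column names are hypothetical placeholders:

    # Minimal sketch of an office-automation script; file and column
    # names are hypothetical, adapt to your own data.
    import pandas as pd

    # Load this week's raw export (assumed to have "region" and "sales" columns)
    df = pd.read_csv("weekly_report_raw.csv")

    # Summarize sales by region
    summary = df.groupby("region", as_index=False)["sales"].sum()

    # Write an Excel file via the XlsxWriter engine
    with pd.ExcelWriter("weekly_report.xlsx", engine="xlsxwriter") as writer:
        summary.to_excel(writer, sheet_name="Summary", index=False)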


Couldn't agree more. I've gone multiple times now from "I wonder if X is possible / how would you do X" to hacking out a crude proof of concept for a problem that I wouldn't even know how to google.


Two things mentioned in the abstract that are worth pointing out.

> For each one of a set of 18 realistic consulting tasks within the frontier of AI capabilities

They specifically picked tasks that GPT-4 was capable of doing. GPT-4 could not do many other tasks, so when we say performance was significantly increased, this applies only to tasks GPT-4 is well suited to. There is still value here, but let's put these results into context.

> Consultants across the skills distribution benefited significantly from having AI augmentation, with those below the average performance threshold increasing by 43% and those above increasing by 17% compared to their own scores

Even when cherry-picking tasks that GPT-4 is particularly suited for, above average performers only increased performance by 17%. This increase is still impressive, were it to be seen across the board. But I do think that 17% is a lot less than some people are trying to sell.


Hmmm. Perhaps below-average performers are more likely to take GPT output at face-value, being less competent to review and edit it. And above-average performers are more likely to hack the GPT output around, because they're confident in their own abilities.

Therefore below-average types will produce finished output more quickly; and this was a time-constrained test, so velocity matters.

ChatGPT is very good at waffling, and marketing-speak and inspirational messages are essentially waffle. IOW, the tasks were tailor-made for unaided ChatGPT, so high-performers were penalized.


You're underestimating it because it compounds. Small gains in efficiency lead to huge advantages in long-term growth. 17% would be an absolutely monumental improvement.


Yet insignificant compared to how full automation would look: not 1.17x but 1000x.

But I can't find any fully automated AI tasks of critical importance; they all need human support.


Pipe /dev/random, transform to decimal, and you just got an amazing increase in performance for calculating decimals of Pi. Nobody said precision was important anyway.


Honestly if you don't care about precision, /dev/zero is going to give you more throughput. Plus, I personally guarantee it's correct to within an error margin of 4.0. You can't offer the same with /dev/random!


I always wanted the minor number of the device /dev/zero uses to select the byte you get, so if you go "mknod /dev/seven c 1 7" that would make an infinite source of beeps!


Reminds me of the study that found a massive change in GPT's proficiency at identifying primes.

It switched from always guessing composite to always guessing prime. Much less accurate.


What do you mean?


We're not trying to hit a comet with a rocket here. 1 significant figure is more than sufficient for an initial consultation. Any additional accuracy required would be billable follow-on work.


More details in this blog post by a Wharton professor: https://www.oneusefulthing.org/p/centaurs-and-cyborgs-on-the...

My questions to naysayers:

* Do you or anyone you know use GPT-4 (not the free GPT-3.5) for productive tasks like coding, and have you found it to help in many cases?

* If you insist it’s useless, why do millions of people pay $20 a month to access GPT-4 and plugins?


Yes, GPT-4 is great for doing “boring work” and allows me to focus on the “fun work”. You still need to know what you’re doing though, you can’t blindly copy and paste.

As for the second question: although I am paying for it too, that argument is more or less flawed nowadays. Utilization is a very hand-wavy thing when it comes to this stuff. Take a purse: millions would pay money for one, some even pay thousands. But I have no use for one and wouldn't pay even $1 for it.


> You still need to know what you’re doing though, you can’t blindly copy and paste.

Agreed.

> Like a purse, millions would pay money for it, some even pay thousands.

Expensive purses have intangible value for some. They are often bought to signal social status.

I'm pretty sure a significant portion of ChatGPT Plus subscribers are paying because it helps them with information or cognitive work that they value.


Consumer behavior around monthly subscription services that can be cancelled at any time looks very different from behavior around one-time luxury purchases.


I have free access to copilot because I do some open source work. I haven't been impressed by what it can do and I wouldn't pay even $3/month to use it.

The second question doesn't make sense to me. There are tons of things I think are useless (or worse) that people pay for anyway. Meal kit boxes come to mind, and at least you can eat those at the end of the day.


Have you spent any time learning how to use Copilot?

Getting great results out of it takes a lot of experimentation and practice. I wouldn't want to give it up now I've learned how to use it.


I've spent some time using it and experimenting with different prompts, but to be honest it's hard to be motivated to spend more time on it given the disappointing results so far.

I need to see a glimmer of it being useful before I decide the investment is worth it, I guess.


Are you using it with Python 3?


Not really; TypeScript and Rust, mostly.


Obviously anecdata, but:

1: I’ve been scripting for 5 years using Python. I purchased a subscription to use GPT4 to see if it could assist me.

In the end it took me more time to fix its mistakes than to just apply my own knowledge of what to Google and read the docs.

Additionally the largest hurdle I encountered was when it hallucinated a package that didn’t exist and I spent time trying to find it.

2: I don’t know about most people but I’m terrible at cancelling services that are “cheap”. I used ChatGPT for a few hours that first month and didn’t cancel it for another 5 months.


My prediction? In about 6 months, every test, task, or use of an LLM for anything that requires a modicum of creativity is going to find that it only has a fixed set of "ideas" before it starts regurgitating them. [0] I can easily imagine this in their hypothetical shoe pitch question, and many models aiming for more factual answers have been rapidly showing this bias by design.

[0] https://www.marktechpost.com/2023/06/16/this-paper-tests-cha...


I'm very unimpressed by that study. Look at how they generated the jokes - they fed it a prompt that was a slight variation on "please tell me a joke" and then wrote about how the jokes weren't varied enough.

https://github.com/DLR-SC/JokeGPT-WASSA23/blob/main/01_joke_...

That's a bad way to use an LLM for joke generation.

Try "tell me a joke about a sea lion" - then replace sea lion with any other animal.

Or "tell me ten jokes about a lawyer on the moon" - combine concepts like that and you get an infinite variety of jokes.

Some of them might even be funny!
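A quick sketch of that kind of prompt variation, using the same pre-1.0 openai library as elsewhere in this thread; the subject list is arbitrary:

    # Sketch: vary the subject of the joke prompt to get varied jokes.
    # Assumes OPENAI_API_KEY is set in the environment.
    import openai

    for animal in ["sea lion", "pangolin", "axolotl"]:
        response = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[{"role": "user",
                       "content": f"Tell me a joke about a {animal}."}],
        )
        print(response.choices[0].message.content)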


Can confirm. I popped the 20 bucks for GPT-4, and have been using it more and more, every day for 3 weeks. Not sure how I'd get by without it now. It's just so easy to have a normal conversation and get answers. It's like having an expert friend across the hall you can shout questions to, and ask for simple reminders and recommendations.

Who cares if it gets things wrong sometimes? You would double-check your co-workers' answers too. And there are times when I insist I am correct, GPT argues back, and eventually I find I was wrong.


Maybe this tells more about BCG consultants than it does about GPT-4?


That's what you would like to think, isn't it? I'm afraid this would be just as true with any other kind of subject, and as far as I know there's no evidence either way, so this is just a cheap stab you're having at them.


After all the cheap stabs I had to take as a programmer... I allow myself to experience schadenfreude, even if there is no evidence...


Meh... a lot of consulting is tasks like writing or idea generation. Using something like ChatGPT to do those tasks faster or better doesn't negate the value in what consultants do; they're hired to do those tasks, and those tasks are required for the broader work.


I bet early search engines had similar or even better figures under similar conditions.

I suspect this because I recall how much search improved my productivity over flipping through books, and I know that for certain tasks ChatGPT is a better source of how-to knowledge than search. While the GPT output often isn't entirely correct, more often than not it suffices to make the correct solution obvious, saving a lot of time.



Guilty! GPT is the best colleague I ever had, but boy does it talk. You can't just copy-paste, but by treating its responses as input I find myself less dependent on other senior consultants sharing their insights. It also makes me more confident in my assessments and deliveries.

The purpose of technology is to enhance our performance, and GPT is very much doing so - but with great power comes great responsibility.


This is a good thing, since increased performance means that clients will be billed fewer hours, right? Right?


No, it increases the load one can successfully manage in a day. There isn't this tiny discrete amount of work that people need to handle. We gave that up when we left the campfires. We're trying to grow.


BCG : We know layoffs are in fashion and we'd just like you to know that if you need industrial grade ass covering excuses from a legitimate-ish sounding authority to justify what you were planning to do anyway, our 23 year old consultants and their PowerPoint presentations have got you covered.


If this is how so-called consultants use AI… they should be very concerned. A moderately skilled intern with GPT Enterprise connected to their data will quickly make them obsolete. Maybe they have some potential in building their own fine-tuned model, but surely they'll screw that up.


So useless BCG consultants were faster at delivering bullshit with ChatGPT? That's impressive.


I haven't had many interactions with BCG so far, but in both that we had, I was surprised at how much money they get for reshuffling information that is all common knowledge and available on the net. I can see that this is something LLMs can do really well. It's exactly the kind of "creativity" LLMs are good at: "apply concept X to market/niche Y and give ideas on monetizing."

I don't blame BCG for doing this; they are giving an outside, politically uninfluenced view (except for the influence of the party that pays the tab).


The output of many professions is bag-of-words emotional persuasion, e.g. politicians, consultants, sociologists, psychologists, writers, economists, TV talking heads, media in general.

A characteristic of these professions is that there is no accountability for the output they produce. It is not like a profession that builds an engine for a car. They can bullshit with confidence and get away with it.

ChatGPT will replace all of them - as ChatGPT itself can bullshit with the best of them.


Nonsense career threatened by nonsense generator. Beautiful.


... for a set of tasks selected to be answerable by AI

Also, access to AI significantly increased (!) incorrect answers in cases where the tasks were outside AI capabilities.


I was sort of wondering this with the latest (I think now resolved) writers' strike. The union wanted reassurance that they wouldn't be replaced by AI; however, if I were the studios, I would have said "sounds good" - knowing full well that union members will likely be turning to it anyway. Unless the union polices its members, the appeal of using it is just too high.


Where ChatGPT could excel is early education, where the ideas are simple, universally agreed upon, and written about online. At higher levels the chance of hallucination grows, and you could be taught the wrong thing without knowing the risks.


This only really confirms what we already know. Business consultants are useless.


Funnily enough, as a business consultant I use GPT to create executive summaries and sell people on the idea that my reports are as short as they possibly can be without information loss.


In other words, companies can replace consultants with GPT-4.


They might as well. All they do is repeat what a decent manager has been telling them, usually verbatim, get paid a shit tonne of cash, and then walk.

Absolutely zero added value from their experience. The only added value is the consultant overcoming the hearing deficiency of the Director involved.

ConsultancyGPT: feed all of a company's internal opinions into an LLM. Ask for a recommendation. Done.

/rant.


"The study introduces the concept of a “jagged technological frontier,” where AI excels in some tasks but falls short in others."

D'oh.


I always wondered if some of the biggest fear mongers against GPT are those who worry they’ll be outed as frauds.

If your job is to generate nonsense… well…


Dumb off-topic question: is there any way to ask GPT-4 to summarize an article online?

I tried giving it the URL and it was a disaster. Is there a plugin?
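One workaround that's often suggested is to fetch the page text yourself and paste it into the prompt; a rough sketch (the URL is a placeholder, and the truncation length is a crude guess at the context limit):

    # Rough sketch: fetch the page text yourself, then ask for a summary.
    # Assumes OPENAI_API_KEY is set; requires requests and beautifulsoup4.
    import requests
    from bs4 import BeautifulSoup
    import openai

    html = requests.get("https://example.com/article").text
    text = BeautifulSoup(html, "html.parser").get_text(separator="\n")

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": "Summarize this article:\n\n" + text[:12000]}],
    )
    print(response.choices[0].message.content)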


Let the Turbocharged En-Shitifications commence.


Take some text and dump it into ChatGPT, and ask it to make it more formal.

Sounds like the same text in the average deck...

