
I would say once people realize that AI answers and solutions are frequently “convincing but wrong.” AI is amazing, but when I try to use it for anything technical in my field (real estate), I quickly see that it is more than eager to tell you the wrong answer. It’s like reading a consulting intern’s analysis and recommendations. I will keep trying it, but I’m not sure at what point I’ll decide to rely on it, thanks to the garbage output I’ve come across.

I would guess, as with self-driving, the initial sprint to 95% functionality is the low-hanging fruit, requiring only a few hundred billion dollars, but the last 4.9999% that makes it more usable than a human expert is several more trillion dollars of spend away. That human expert only costs me $250,000/yr, and I can hire him by the 6-minute increment. We will get there though.




Personally, it blows my mind that there are people out there who haven't experienced LLMs writing paragraphs of polished, flowing prose that are ultimately straight-up wrong. I only use LLMs for option/context discovery and to refine my own thinking, similar to how I'll bounce ideas off of friends. But even there they tend to get stuck emphasizing obscure, less-popular ideas. So it's more like checking with one friend who is happy to geek out on every topic, but who has ended up going down some weird rabbit holes on many of them.

Still, I don't think this is going to be enough to kill the energy for it (humans are confident and wrong quite often as well). But it should guide how it's adopted and the regulations surrounding it. For example, it shouldn't be used to make any decisions itself, but rather just assist a human who is ultimately responsible for applying judgement, with their decisions necessarily rebuttable/appealable. Although given our past several decades of "computer says no" and "policy says no", I'm not so hopeful on this front.


It is a very interesting idea generator, in that kicking-ideas-around-with-friends sense.

I think the sticking point is that, continuing with real estate, a very programmatic task I encounter is 'write a contract to purchase XYZ property at ABC address with the following requirements...' Not too different from spitting out some lines of code. An attorney does that for me; he will start with an old contract we've used in the past, and it takes his team 30 minutes to an hour at a blended rate of $500/hr. So that starts to set an upper bound on the value of the work to be performed, in one of the higher per-hour-cost lines of work out there.
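To put rough numbers on that upper bound (just the figures above, nothing new):

    # ceiling on what automating that contract-drafting task is worth, per contract
    blended_rate = 500                # $/hr, the attorney's blended rate
    hours_low, hours_high = 0.5, 1.0  # 30 minutes to an hour of his team's time
    print(f"${hours_low * blended_rate:.0f}-${hours_high * blended_rate:.0f} per contract")
    # -> $250-$500 per contract, before discounting for time spent reviewing the output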

When you really drill down on the question of "but how will it generate value," it is relegated to being an efficiency tool for many and a replacement for Google search for most. The smart will get more effective and powerful with better information; the less intelligent will be running off inaccurate or old info.


Convincing and wrong is pretty damning. I think people fail to appreciate just how trustworthy our computing systems have been till now. You don't question the results of your spreadsheet. You don't worry that a word was replaced in a communication with your lawyer. We'll soon stop tolerating AI that is less than 100% accurate.

I predict an AI product bloodbath as users and companies remember that we hold computers to an extraordinarily high standard of accuracy.


A "consulting intern". That's a beautiful way to put it; it shows the problem perfectly. If I need someone to consult with, I don't pick an intern.


A very spiffy polished presentation that was very expensive and... partially correct?


I don't think so for several reasons:

1. That's fairly obvious to people who have used it. I mean, I guess there are a lot of people who have never used ChatGPT and think it doesn't bullshit, but those people aren't behind the AI boom.

2. There are loads of applications where 100% accuracy isn't required (even if it would be nice!). An obvious example is GitHub Copilot. It saves me a ton of time overall, even if I often have to fix its mistakes and bullshit.

3. I imagine like half of the AI research community is working on fixing this. And they don't need to get it to 0% bullshit, just less bullshit than humans (which tbf is sometimes a low bar!).


I don't dispute the incredible capability; I'm simply concerned that the true value creation is being wildly overblown.

Your 2nd example is a handy one to pick through. What % of your time is spent writing code compared to other work tasks? How much of that time did Copilot save? How much time do you have to spend fixing and validating (be honest, as if your quarterly bonus depends on it)? What's your hourly rate? What's the real cost of the high-quality, cutting-edge LLM (not the VC-subsidized 'price')?

And then... does that mean you get to go to lunch early or didn't have to stay late (no savings to your company)? Or did it knock out a week's worth of work? What prevents Copilot from taking over your role completely? (I doubt it is anywhere close to that; you likely do a lot more than type out code, and it still requires the correct input, output validation, and an understanding of goals with nuance and uncertainty.)

So then in real dollars, it's a useful, potentially very expensive tool, which is great. Is it $48B great (for Google)? Maybe! But that's a big bet!

I admit I do not understand how researchers can program their models to stay accurate as questions progress from easy to esoteric: from 2+2=4, to "Should I avoid drinking corn syrup?" (ahem, advertisers would like a word!), to "Which religion is better, A or B?" I don't see a way for AI to be unbiased and clean, and therefore trustworthy and useful at face value.


> What % of your time is spent writing code compared to other work tasks?

Probably 50%?

> How much of that time did Copilot save?

Unfortunately I can't use it at work, but I use it in my free time and I'd say it increases productivity by something like 10-50% depending on the task. 10% is more common of course, but given how expensive developer time is, a Copilot subscription is worth it if it increases my programming productivity by even 0.5%, which it easily exceeds.
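Back-of-envelope on that 0.5% break-even (the salary and subscription figures here are illustrative assumptions, not my actual numbers):

    # does Copilot pay for itself at a 0.5% boost to coding productivity?
    dev_cost = 200_000        # assumed fully loaded developer cost, $/yr
    coding_share = 0.5        # ~50% of time spent actually writing code (from above)
    gain = 0.005              # the 0.5% productivity increase
    value_per_year = dev_cost * coding_share * gain   # $500/yr of recovered time
    copilot_cost = 10 * 12    # assumed ~$10/month individual plan = $120/yr
    print(value_per_year > copilot_cost)               # True, with room to spare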

> And then... does that mean you get to go to lunch early or didn't have to stay late (no savings to your company)? Or did it knock out a week's worth of work?

My company would reap most of the benefit because they are paying me hourly.

> What prevents Copilot from taking over your role completely? (I doubt it is anywhere close to that,

Yeah, basically it's nowhere near smart enough. Needs several orders of magnitude more intelligence before it could fully replace me. At that point I think society will have bigger problems, because it will have replaced 90% of white-collar jobs in general, not just programming.

> I don't see a way for AI to be unbiased and clean, and therefore trustworthy and useful at face value.

This also eliminates humans.


All great points, thanks.

> Needs several orders of magnitude more intelligence before it could fully replace me.

I think this is true for many applications of AI, where everyone assumes everyone else's job is screwed, but the expert looks under the hood and says, 'Well, isn't that cute. Delete it and start over.'


> And they don't need to get it to 0% bullshit, just less bullshit than humans

Less bullshit than humans is a terrible goal.

If you take the real estate industry examples, you want less bullshit than a really good real estate reference book, not less bullshit than random people on Reddit.


I didn't say which humans. Humans wrote that real estate reference book.


> I didn't say which humans.

"which tbf is sometimes a low bar!"


Yes, sometimes. I'm not sure what your point is.



