
I know it's controversial during this incredible AI hype cycle on this site, but as an AI skeptic and increasingly a neo-Luddite, I've reduced the amount of time I spend on this site by 90% since the start of the year.

If only I could run a userscript on my iOS Firefox to just hide any post containing the words "GPT", "AI", or "LLM", but alas, I guess I will have to be productive instead of slacking off reading HN.

I honestly miss the Bitcoin hype era, or the short-lived Erlang circlejerk that pg himself put a stop to. This one does not seem likely to die down anytime soon, and I'd like to read more criticism of the technology, rather than starry-eyed engineers excited to add ChatGPT to their smart fridge and bird feeder.

EDIT: Funny. 1 hour ago when I wrote this comment, this post was in 3rd position. Now, with 82 votes and 103 comments, it's nowhere to be found in the first 10 pages. @dang, what's going on?




I think the difference between those other hype cycles and this one is utility. I found Bitcoin very interesting (Erlang as well, still do) but not very practical in my day-to-day life. Unless I'm going to be hired by a firm that specializes in that tech, they're just not very useful. Nerdy and interesting and fun, yes, but I just don't need them in my day-to-day life.

On the other hand, ChatGPT 3.5 has been immediately useful. It's already saved me multiple hours of nonsense work, or digging around for an answer to that one problem I have. It's succeeded where Stack Overflow, Google, Reddit, and the Official Documentation™ failed me.

That's incredible!!

And like a car I don't really have to know or care how it works (traditionally I didn't care at all about AI tech). Sure, it's a new skill (like driving) that takes some practice to do safely and effectively, but I don't have to know the details of how it works. Just how to use it.

That's why this hype cycle is different, and that's why it's not going away. It's useful, and we are collectively figuring out how to use it effectively. We do that through collaboration and communication, and HN is one of the key places where ideas are shared. To put a stop to that now would be insanely short-sighted. The criticism, post-mortems, retrospectives, and "AI is dead" articles will come in time.


Yeah, more or less in this boat too. ChatGPT does things. Useful things. "Crypto" has some clever ideas, but always seemed too full of utopians and grifters and people with a solution in search of a problem.

I've used Erlang professionally and like the ecosystem very much.

I am a bit annoyed with myself for still not having a better understanding of how these LLMs work, because that's important, at the end of the day, for evaluating where they are best applied and where danger lurks.

But they are clearly going to be a game changer. Maybe more, maybe less than some people think, but I've seen enough hype cycles to think there is some substance here.


I feel the same way about not understanding how LLMs work. I've found Hugging Face's tutorials a good place to start: https://huggingface.co/course/


It may be that the game has already changed and people are just starting to figure out the new rules. I'm looking forward to continuing to slap my forehead in these "wish I had seen that low-hanging fruit" moments.

Suddenly being able to largely automate so many formerly human-only tasks using plain language instructions makes me feel like the nature of the computing experience itself may be evolving profoundly. It has been said that technological progress generally shifts people from being tool users to tool managers, and these language models certainly fit that trend.


“Suddenly being able to largely automate so many formerly human-only tasks using plain language instructions makes me feel like the nature of the computing experience itself may be evolving profoundly.”

Can you provide some examples of the so many tasks that these LLMs are performing for you that a human was needed for before?


Figuring out utility doesn't bother me. The hype that bothers me is applying it to random shit it should not be applied to yet (e.g. using it to write anything more than starter code). I'm also bothered by the absolute fantastical BS lay people are being fed about the possibilities. Like, the real problem with ChatGPT is the scams. Having it generate "offensive" material is a joke - not only is it trivial to bypass any/all protections OpenAI tries to put in place, the protections themselves are laughable, trying to stay away from any possible thing that could possibly generate any controversy for anyone anywhere.


It does feel a bit like Bitcoin hype, in that I know there's something wrong with it but I can't quite put my finger on it or explain it. But this is why it's also likely to last a long time, just like Bitcoin did, because it's so enticing and yet so difficult to articulate exactly what it is that's wrong. I'm at that level where I'm just smart enough to know something is wrong here, but not smart enough to say what that is exactly.

I know that I can use ChatGPT for things where it doesn't matter much whether it contains factual inaccuracies. Where all I need is a lead, like the same things I would use Wikipedia for. Where the failure mode definitely doesn't lead to anyone being hurt (or fired). For work where the results matter to someone's career or health, yeah I would avoid it.


Yeah I think this is a good take. There's obvious usefulness to bitcoin, and I think there's even more obvious usefulness to ChatGPT. But they've both been adopted by hype bros, who are looking to make a quick buck off of 1) the popularity of it, and 2) the fact that most people don't really understand it.


Partially, it is hard to "prove the negative" in a way.

The real issue, and the logical fallacy, is that people are frontrunning the argument: they are making extraordinary claims about either the current or future capabilities of the tech, without offering evidence beyond "it's obvious!" or "have you tried it?" or "use the latest model, it's much better!" etc. Maybe they don't realize it, but these are not good-faith conversations/arguments.

So you can't respond to these - there really isn't anything to respond to, and the person is not conversing in good faith (they are assuming you're unfamiliar with the tech; they are assuming a bunch of things, really). Though I do give them the benefit of the doubt and would say they aren't doing this to be trolls; they're just excited about the tech.

Same with BTC, but BTC had some extra benefits: it had a "dual use", really. The currency made promises ("of course BTC is going to take off, it's obvious!") and the tech made promises ("wow, can you imagine the uses of a blockchain!?").

I mean, I had my CTO still saying in 2023 that we should use blockchain tech for a CENTRALIZED platform. Unbelievable!


I know exactly where you're coming from. I've tried to avoid the hype train but I still feel the hot wind from it because I literally had my manager ask me if I could use ChatGPT to do my last project faster. I was astounded. It's tough when you know a dumb idea is not going to work, but now you have to articulate exactly why to your manager, without insulting them. Also without wasting a lot of valuable time proving something won't work, when you could be using that time to make actual progress on the project.

I feel like a lot of valuable time is going to be wasted by engineers trying to prove to managers why ChatGPT can't do a certain thing they want done faster. I don't believe ChatGPT will be a net negative for the world, but it won't be as positive as it could be exactly because of the hype effect.


One thing I learned is that you probably should not come across as someone who says "no" a lot to their manager - if you like working where you do, in general. Ask open and positive questions, like "that could be a good idea, where do you think we can use it?" (or find an even better phrasing for this; managers are often very defensive, especially when they are not very skilled) rather than "how do you think that would work?", which is often my first response... hah!


I'm sure this is going to make you laugh and/or annoy you, but what exactly about Bitcoin is "wrong"? Curious if after all these years you've found the words.


GPT can tell you what's wrong with Bitcoin:

You throw 25 grand on a Bitcoin bet,
The dough lands in a chest, smoky and wet.
You've got your Bitcoin, feelin' alive,
But the chest has plans, man, it's a damn beehive.

It pays the skeptics, those withdrawing in fright,
The electric bill's covered, in the dark of the night,
A Lambo, a threesome, a few folks living in sin,
As the value hits 27 grand, you think it's your turn to win.

But then comes a big fish, an old-time player,
He dumps his coins, like a dirty betrayer.
The price dives to 20 grand, you can't help but feel,
Thousands like you are stuck in a rotten ordeal.

You scramble to withdraw, but the site's a damn mess,
The price keeps on fallin', and the chest reveals its emptiness.
The truth is, it was never full, not even close,
Now you're left with your Bitcoin, feelin' morose.

A voice in the shadows tells you, "It's not about the bread,
Bitcoin's a different way to pay, don't let it go to your head.
Hold tight, and embrace your virtual piece,
The future of money might just find its release."


We need a “!RemindMe 10 years” feature like they have on Reddit.


No, we don't.

Put the permalink on your calendar.


I love your positive attitude.


I can do anything!


Well, I wouldn't say "what Bitcoin got wrong" but rather "what people who were overly hyping Bitcoin got wrong".

It hasn't totally played out yet, but it doesn't seem like it is on track to be a currency, and the underlying tech hasn't proven to have a ton of uses beyond crypto currencies.

Not trying to straw man here, just trying to honestly reply to your comment.


That's fair and I appreciate your response! While I've seen things that hint that Bitcoin might be starting to gain traction in high-inflation nations, commonly located in Africa and South America, we are a long way out from seeing a Kroger accept Bitcoin.

Regardless, crypto bros are the worst, and I hate how every bull run we see an avalanche of new coins that all claim to be the solution to all our problems when they are really just Ponzi schemes. Bitcoin and Ethereum are the only two that I feel are worth investing in long-term, but that's just me.


I can't really argue with investing in BTC or Eth, I probably should have for some portion of my portfolio!


It’s never too late :]!


It's not, but with the volatility of the assets I would only be comfortable putting a few % of my investments there. Unfortunately, that's a really small $ value, and the odds of my investment going up 100x+ are so low that it's not worth my time to invest in crypto :(


I can promise that it won’t go 100x, but 5-10x is still possible and would be outstanding.


I don't think 10x is worth the small amount of $ I'd be comfortable putting into it.


> what people who were overly hyping bitcoin got wrong

Did they? It's the best-performing asset of the last decade. I think they got it right.


The people who were hyping it said it will be the next major currency. Right now it's literally no different from people getting excited about beanie babies - sure, those beanie babies have appreciated in value by thousands of percent, but at the end of the day they are still just stupid toys and nothing more.

So no, people didn't get it right. If anything they got it catastrophically wrong and the entire premise was hijacked by people who saw it as an excellent investment/scam opportunity.


> The people who were hyping it said it will be the next major currency.

Why do you think this prospect has failed? Do you think such a huge shift in the global reserve currency would happen in just 10 years? In the meantime, the market is pricing the likelihood of this scenario with increasing optimism.


No one said crypto should have displaced real currencies by now. But you'd think it would be used as a currency somewhere - and it just isn't, outside of extremely limited cases. Not to mention that Bitcoin specifically just can't work this way due to the way it's designed - it's destined to always be a shitty "investment" asset and nothing more.

> the market is pricing the likelihood of this scenario with increasing optimism

That is actually very funny to read. The markets also price what they want to happen, not what is likely to happen. This sort of prediction is close to pointless.


So were credit default swaps until the history-based assumption they were built on went into freefall. Past performance is no guarantee of future results. Even an investment that has a real and solid foundation can fail fast and hard.


You'd have to substantiate that claim which would probably take significant work.


From 2015 to today it appreciated 8000%. Do I need to elaborate further?


Yes, of course you do. You made a specific claim.

You can't find a single other investment that would have returned higher? Very different, but maybe options plays on GME? Anything like that?

I'm sure there is one! You said it's the HIGHEST - very specific claim!

You're also picking a date period that works for you - certainly I can pick other dates that don't support your argument!


I get your point, but options aren't usually considered an asset. They do expire after all! Though now that I think about it, I'm not sure what would be the best performing asset of the past decade. Bitcoin has skyrocketed in the past decade, but I still think that something else probably gained value even faster. Maybe pokemon cards? Hahaha.


LOL or MTG cards! Some have definitely gone up 100x!


Someone else pointed out here that some Pokemon or other trading cards have also gone up quite a bit! Should we buy them, too?


I completely agree with this. The timing of this AI hype cycle is very suspicious. Almost as soon as crypto started to go bust with the collapse of FTX, all of a sudden, here's ChatGPT and a huge inflow of cash into AI startups.

My hot take (or maybe it's not that hot) is that post-Trump, and especially post-Jan 6, the general public has really soured on big tech. Combined with the smartphone and social media growth really slowing down, and now, non-zero interest rates, the industry is desperately searching for its "next big thing". Which is why these technologies, first crypto, and now AI, are in turbo mode and speed-running the hype cycle.


> like Bitcoin did

You think it's over for Bitcoin?


It surely looks like it is getting there https://hn.curiosity.ai/#/trends?terms=Bitcoin;Web3


By that logic, your tool also hints at possible end days for Javascript.

Careful not to overfit ;)

Neat tool, thanks for sharing.


I see 3 kinds of people, all common on HN:

- The luddites: it's just hype.

- The doomers: AI will kill us all.

- The AI indie hackers: dunno about AI, I'm trying to make money with it.

Somehow the intersection is right: AI is not really smart, but it will replace a lot of human activity anyhow.


For what it's worth, and at the risk of sounding like I'm nitpicking, traditional Luddism is not dismissal of technology as overhyped or smoke-and-mirrors. Quite the opposite; the technology is recognized as very real and very impactful. The question instead becomes, "what happens to us, the ones it replaces?". And at least in the original incarnation, given an insufficiently satisfying answer, you smash some looms up to try and slow the encroachment of automation.


From a historical standpoint, you're spot on.

Neo-Luddism is still opposition to a lot of modern technologies, but for a very disparate set of reasons. For example, many oppose the mass adoption of social media due to perceived mental health and social impacts. Others oppose smart phones and "screen addiction." These things can qualify as "neo-luddism" even though the opposition is not rooted in job displacement.

I've often said that I am myself becoming more and more of a "neo-luddite" but it's purely for personal reasons. I don't want to see social media or smart phones disappear as I couldn't care less about what other people do with their lives. I just find that the older I get, the less I want to use modern tech in general.

It might just be burnout and boredom. I am now middle aged and I've been coding since I was 10. I used to be extremely enthusiastic about technology but as time progresses I have less and less interest in it. The industry in which I have based my entire career just doesn't excite me anymore. Today I just couldn't care less about ChatGPT / "AI" / LLMs, Bitcoin, smart phones, video games, social media, fintech etc. In my free time I find myself doing more things like reading books, going hiking in the backcountry and pursuing craft-related hobbies like performing stage magic with my wife and partner.


I thought that this was me ("The industry in which I have based my entire career just doesn't excite me anymore."), but then I realized I actually am mostly just interested in programming the way some people are interested in any other skill or machine. Some people learn guitar because they need to make music and play out. Others just love playing the guitar, even and especially when nobody is watching, for its own pleasure.

To me, there is no greater toy than a programming language. ChatGPT is neat, but it isn't a programming language. Same for blockchain or agile or whatever other trend is happening in the industry. Some of the trends literally are new programming languages! Golang is really fun! So is TypeScript!

Some trends or technologies make programming even more fun (in my opinion anyway) and I embrace those feverishly: distributed version control, CI/CD, pair programming (sometimes it's even more fun with a friend!), configurable linters like Perl::Critic, Intellisense, JUnit-style testing frameworks. All this stuff helps me feel more in control of the computer (or distributed cluster of computers), which I've discovered is the main thing that gets me off about programming.

I'm even still hopeful that LLMs will have some role to play in my having more fun with programming. I've tried Copilot and so far it hasn't grabbed me, but maybe this will change. In any case, there are clearly other people having fun with it, so I guess that's good. Maybe somebody can find joy by debugging GPT-4 prompts the same way I enjoy poring over stack traces.


This is completely valid and I wish that were me.

I'm a "maker." Although I try to bring a level of "craftsmanship" to my code, and I care a great deal about code quality, refactoring and solving problems at the code level - and I definitely enjoy the process - it is still a means to an end. It is the configuration of raw materials that contribute to the final form of something useful and tangible.

The most tragic part is realizing that it is very unlikely that I will ever care about what it is that I am producing in tech. I was self employed for 15 years and that was extremely rewarding because the business and the product was my vision, my creation etc. Now that I am back in the job market I find that what was a career for 20 years has become "just" a job. I am making something, and that matters, but I'm not making something I would personally use as an end-user. And that is not a slight against the things I am making. They are useful to someone. Just not to me. I have spent the last few years thinking about what it would look like to make something I myself use and that's when I realized that, relatively speaking, I hardly use any tech as an end-user in my personal life at all.


I could have written this myself. I'm currently going the scary "bootstrapped founder" route because at least the tool I'm paid to work on is something I care about and can put a lot of my craftsmanship into.

But in general my long term goal is to go write mini Lisp interpreters in a cabin in the woods, and move away from the direction Big Tech is going.

I love computers and software, but I would not say I love technology anymore, nor do I think that the Internet is a net benefit for humanity anymore. It's been quite hard to accept that my view of the tech world has turned upside down in no more than a couple of years.


"[writing] mini Lisp interpreters in a cabin in the woods" sounds wonderful!


Indeed!

Though when my wife and I have talked about "going off grid" and living remote, she wanted to understand my limits and asked about electricity.

I pointed out that electricity led to the discovery of logic gates, which led to integrated programmable circuits, which led to the Von Neumann Architecture, which led to Ethernet, which led to the Internet which led to Twitter.

It's a slippery slope!


The demographics of this site are getting older (you can see a recent-ish poll on the site for some indication). The same phenomenon played out, writ large, on Slashdot in the '00s: lots of older disgruntled engineers who hated how the MPAA and RIAA controlled all of software and snooped everyone's packets, and who remembered how, when they started in the industry in the '80s, people took pride in shipping shrinkwrapped software. It's an interesting pathology but, like all pathologies, it becomes tiring on the site when people bring it up constantly.

Hype cycle topics seem to attract opinionated folks who have strong feelings on topics and have the need to shout them from the rooftops, whether that's doomer, booster, or luddite. That's what I find exhausting.


I feel the same. I don't have the same kind of enthusiasm now as I did as a kid.

I'll also say, I think LLMs are different from Bitcoin. They have their own killer app, and a tremendous social impact, not necessarily positive. If anything, crypto doesn't really make sense without AI.

One thing, though, is that the foundational models are created and controlled by big tech. It's been compared to silicon fabs and their economies of scale... However, I don't see a future where foundational models can only be created by large orgs with a lot of resources as something beneficial for society.


You sound like me. Middle aged and tired of the bullshit.

The difference is I'm mostly tired of the social garbage that keeps piling up on top of things. The competitiveness, the pressure to "be productive", the grifters, the capitalists who put money before people, the bureaucrats. It's never enough: more, more, more; faster, faster, faster. Meanwhile, the roadblocks that are put in place get bigger, uglier, and stickier than ever.

I just stopped participating in that stuff. Got off Facebook. Got off Twitter. Curated my Reddit feeds to be built around useful and helpful communities, not reactionary meme-ified BS. Started reading more. Started using RSS again (but again a reduced and focused subset of useful sites).

I'm feeling much better and I still find I enjoy technology. Microelectronics, 3d printing, functional programming, distributed systems.

I'm working on a cloud-connected garage door opener. I don't care that it's never going to be productized. I don't care that I can go on Alibaba and order one for $15. I'm just doing it because I want to work on my microelectronics skills and I find it fulfilling. I'm doing it my way, at my own pace, without the BS.


Totally aside - but is there an excellent resource you can recommend for learning more/getting started with stage magic?


If I had to pick one "beginner resource" for the absolute newbie then it's hard to beat the book "Mark Wilson's Complete Course in Magic." From there you can decide what interests you and thus where to go next, since magic is such a broad field.

One of the reasons I love it so much is that it is a multi-skill discipline and it's really a type of theatre so it can be kept as simple and narrow or as broad and open ended as you want it to be.

https://www.amazon.com/Mark-Wilsons-Complete-Course-Magic/dp...


Thanks!


I'd argue that those questions about looms haven't been adequately explored, and we're still paying for that even as another wave of innovations rolls in.

It's broader than simply that people lose jobs to automation. A much bigger societal impact comes from treating people as automatons, which is what incentivizes businesses to replace them with non-human automation.


Agreed. That makes the "AI will kill us all" crowd the real luddites for this topic.


No, the real luddites would be the ones who are attempting to kill AI before it hatches.


I am guessing people who think AI is such an existential threat will try to kill it. It would be irrational not to.


You forgot about the rest of us. Developers who are interested in the subject and want to read about the latest developments and techniques. We're actually interested in the computer science.


> Somehow the intersection is right: AI is not really smart, but it will replace a lot of human activity anyhow.

Which will really show how much busywork we humans do in a lot of areas. If - before AI is "smart" - it can impact humanity... well, I feel it says more about our current society's foundations than it does about AI, heh.


Microsoft is bringing OpenAI into Office 365. I predict no loss of employment in busywork, just a bigger output of boring reports per worker.


“The bureaucracy is expanding to meet the needs of the expanding bureaucracy.”

And we thought there was an epidemic of bullshit jobs before! Not only will everyone be spending their days churning out this garbage, but everyone will be forced to read it as well!

This is not the dystopia I had in mind! It’s much more boring and missing the cool mirrored shades.


You don't read it, you just ask a bot to tell you what's in it. ;-)


Yah. I mean, a lot of human activity in employment is not really smart.

Anything that you can make into a fairly formulaic job for someone who's not really interested in doing it or improving too much is probably not very intellectually demanding (though it may be a bit more demanding in world perception or actuation than current AI/robotics systems can handle).

And I believe that is "most jobs."


There's a 4th - AI is a bicycle and I am riding the bicycle more and more every day.


- The realists: what do LLM's actually do, and what are their limits

- The luddites: who is going to benefit from LLM's and who will suffer more?

- The doomers: AI will kill us all

- The children: look at this! look at me! look at this! Wheeeeee!

- The money: how many people can we pay less or not at all thanks to AI

- The scientists: ok, that's cool, but what happens if we ...


And the "true believers": AI will lead to the singularity, which we must try to accelerate by any means necessary so it can be our new God.


Also:

- The scammers - how many well-meaning people can we use an LLM to build meaningful (to them) relationships with, to the point where they're willing to just send us money?

- The griefers - how much discord can an LLM create, for the lulz?

(to be fair, those might be subsets of "realists", but I think they're important subsets)


In the circles I'm in, 2019 (GPT-2 era) AI was seen through the lens of the artist, as a new sort of artistic tool - something that could enable new forms of expression and take media to places it had never gone before.

Sometime toward the end of the diffusion days though, the property rights scissor cut through the community. The conflict had a dramatic cooling effect on AI-created/assisted content.

I just think it's neat how the perception of AI shifts depending on proximity.


I would argue that a true luddite towards AI needs a corollary:

- The luddites: it's just hype — because I will kill it with fire before it takes over.


- the realists: it's just a stochastic parrot spitting regurgitated content back at you. Good at summarizing search results. Unreliable, and always will be. A toy. Do not use if errors are expensive.


There's also a lot of faux-intellectual attention grifters who don't really have any beliefs other than whatever the local kool-aid is and will say whatever gets the number in the top right to go up because that makes monkey brain feel good.

These types of participants are the most easily simulated with AI because they're (whether they know it or not) just optimizing for a variable, something that existing AI tech is pretty good at so I expect their numbers to explode before the other three types you listed.


You can ask ChatGPT to write you a script that hides those posts for you ;)
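
Something like this, even - a minimal sketch of such a userscript (the DOM selectors are assumptions based on HN's current markup and may need adjusting, and the keyword regex is deliberately crude):

    // ==UserScript==
    // @name   HN keyword filter
    // @match  https://news.ycombinator.com/*
    // ==/UserScript==

    // "GPT" is left unanchored on the left so "ChatGPT" matches too.
    const BLOCKLIST = /GPT\b|\b(AI|LLM|LLaMA)\b/;

    // Assumed HN markup: each story is a tr.athing with its title in
    // span.titleline, followed by a subtext row and a spacer row.
    for (const row of document.querySelectorAll("tr.athing")) {
      const title = row.querySelector("span.titleline");
      if (!title || !BLOCKLIST.test(title.textContent)) continue;
      const subtext = row.nextElementSibling;
      const spacer = subtext && subtext.nextElementSibling;
      for (const el of [row, subtext, spacer]) {
        if (el) el.style.display = "none"; // hide the whole story block
      }
    }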


Tags would help here, in that you could use them to filter out what you don't want to see.

I've asked many times for HN to allow tagging, but my sense is that it's never going to happen.

Another idea is to use a spam filter on an RSS feed of HN articles. I did this decades ago for my personal feed, using Bayesian spam filters, and it worked well enough.
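
For the curious, a minimal sketch of that Bayesian-filter idea applied to story titles - the labels, example titles, and helper names here are all made up for illustration:

    // Classic naive Bayes spam-filter recipe over HN titles: train on
    // titles hand-labeled "skip" or "keep", then score new ones.
    const vocab = new Set();
    // Prototype-less maps so title words can't collide with Object built-ins.
    const counts = { skip: Object.create(null), keep: Object.create(null) };
    const totals = { skip: 0, keep: 0 };

    function tokenize(title) {
      return title.toLowerCase().split(/\W+/).filter(Boolean);
    }

    function train(label, title) {
      for (const word of tokenize(title)) {
        vocab.add(word);
        counts[label][word] = (counts[label][word] || 0) + 1;
        totals[label] += 1;
      }
    }

    // Log-probability of a title under a label, with Laplace smoothing
    // and equal priors.
    function score(label, title) {
      let logp = 0;
      for (const word of tokenize(title)) {
        const p = ((counts[label][word] || 0) + 1) / (totals[label] + vocab.size);
        logp += Math.log(p);
      }
      return logp;
    }

    function shouldSkip(title) {
      return score("skip", title) > score("keep", title);
    }

    // Label a few examples, then filter the feed:
    train("skip", "Show HN: I added ChatGPT to my bird feeder");
    train("keep", "A deep dive into the Erlang scheduler");
    console.log(shouldSkip("GPT-4 wrote my smart fridge firmware")); // likely true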

Finally, you could use an AI to filter for you.


You may enjoy https://lobste.rs then.


I would love an invite. I am not HN famous, but I have been here for a while as my account page can attest.


Don't they have a chat channel? I thought you could always just hop on the chat and ask for an invite.


I'm reading that as "no browsers on iOS support sufficiently powerful extensions to filter out individual HN posts by word".

Have you checked out any of the HN apps for iOS instead of reading with a browser? I'd expect some of them to include a keyword filtering feature.


Have you actually used it?

I can't believe how many people are totally refusing to try it but have made judgments about it anyway.

Doing things like brainstorming snack or birthday party ideas the old way just seems so slow and time-consuming. ChatGPT can make a list in moments, whereas with Google you get generic SEO spam.


I'm laughing because I read your comment, which I agree with and I know a lot of people do, and then immediately saw that the top-ranked replies were all the predictable "this is different" posts, explaining to you how ChatGPT is not like other hype bubbles.

The funniest part is that your point seems to be about the annoyingness/ubiquity of the content, in no way a judgement on how much of a hype bubble it is, but you still get the reflexive "this time it's different!!" - which is as good an indicator as any of how big the bubble is relative to the long-run usefulness.


Here's some criticism of AI for you:

• VENDOR LOCK-IN •

It's possible that OpenAI is managing to establish a permanent competitive edge and will be the next Google. There were a lot of rather handwavy assumptions about open source models being competitive thanks to Stability AI and the Facebook leak, and last time I raised red flags about this on HN I got dunked on. But right now it looks more likely that LLMs are the new search engines and there won't be any competitive implementations that are both legal and that you can run yourself. It'll be APIs all the way, just like with web search engines.

Also, embeddings are model/vendor specific and expensive to compute. If you calculate 10 million embeddings using OpenAI it's going to be expensive to recalculate them all with another vendor.

• UNCLEAR PRICING •

It seems likely that OpenAI is either selling below cost, or is at best breaking even on compute. It's very unclear right now whether prices are going to rise, fall, or remain where they are; in fact, AI prices have done all of these things in just a few months. That makes it difficult to know if it's an acceptable business risk to deeply incorporate AI into your workflow.

• HUMAN BOTTLENECKS •

A lot of AI use cases that sound initially compelling actually bottleneck on human review, because it's too risky to put AI output straight into production. Some people don't care, hence the wave of amusing "As an AI language model" spam, but it shows what can go wrong if you skip reviews.

• FACTUALITY •

Obvious, but after having studied this more and listened to a talk by one of the OpenAI team I think this will actually go away as a problem in the semi-near term future.

• INTEGRATION COMPLEXITY •

The limited LLM context window size means a lot of tricky workarounds are required for many use cases, which increases implementation complexity.
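
The usual workaround, for instance, is to chunk long inputs and combine partial results. A minimal sketch, where the sizes and the summarize() helper are illustrative assumptions rather than real model limits or a real API:

    // Split a long document into overlapping chunks that fit the context
    // window; the overlap keeps sentences from being cut off between chunks.
    function chunk(text, maxChars = 8000, overlap = 500) {
      const chunks = [];
      for (let start = 0; start < text.length; start += maxChars - overlap) {
        chunks.push(text.slice(start, start + maxChars));
      }
      return chunks;
    }

    // Map-reduce style use: summarize each chunk, then summarize the
    // summaries. summarize() is a hypothetical wrapper around whatever
    // LLM API you're using.
    //   const partials = await Promise.all(chunk(doc).map(c => summarize(c)));
    //   const result = await summarize(partials.join("\n"));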

• TESTING DIFFICULTY •

LLMs aren't deterministic, so it's a testing nightmare. You never know when the LLM will just do something unexpected that breaks your use case or integration.

• BUSINESS UNCERTAINTY •

There's genuinely a ton of potential; the tech deserves the hype. But it can be hard to capitalize because so many people are trying to do things all at once, so where do you go that you aren't immediately drowned in VC-flush competition? And many ideas may be more naturally done as features of existing products, so if the vendor isn't doing them, should you wait, or should you try developing something yourself and risk obsolescence? The pricing uncertainty compounds that, of course.


> after having studied this more and listened to a talk by one of the OpenAI team I think this will actually go away as a problem in the semi-near term future

Is that talk available online? I’m skeptical that they will ever solve the problem of factuality. I’d love to hear their arguments for why I’m wrong.


Sure, here you go:

https://www.youtube.com/watch?v=hhiLw5Q_UFg

Summarizing:

1. Obviously you can't completely "solve" truthfulness because people disagree on what is and is not true. But you can go a long way.

2. The models do know what they don't know. Their level of uncertainty is not only expressed in the final token logprobs but also seems to be reified somehow, such that they can express their own level of certainty in words.

3. Many of the problems stem from subtle issues introduced during training which effectively teach the AI to guess. One example is training on QA data sets where the answer is never "I don't know". RLHF introduces its own problems because the human trainers don't know what the model knows, so they may reward it for getting a correct answer by guessing.

4. Many failures are caused by lack of access to information. Giving the LLM tools to let it search the web and access other databases can help a lot, and it's relatively easy to do.

5. In some cases you want it to guess. Like, it's much more useful when coding for it to spit out a screenful of code that has one or two minor errors, than to just refuse to try at all because the result might not be perfect.


This is complete technical hubris.

You're effectively saying that the "truth" problem with AI is solved by letting it guess. Which is probably a fine enough answer, but doesn't actually "solve" the problem at all.

To the points:

3) Training on data sets where the answer is never "I don't know" is effectively the same as raising children to believe they're always right and teaching them to be needlessly confident. No one should trust those folks.

4) "Lack of access to information" is not the issue... These models are trained on the entire corpus of Twitter, Wikipedia, and more information than I've ever seen in my lifetime, and some of them already do this (Bing) and produce little more than summaries of blog posts based on keywords. If anything, the issue is that an LLM lacks the real world knowledge to discern any nuance as to whether something is correct or grounded in reality.


I'm summarizing the talk, not giving opinions of my own. It is by an OpenAI researcher who is saying why they think GPT guesses so much and what they are doing about it.

I don't quite follow the rest of your post. Nobody is saying the solution to truth is to let it guess. At most, sometimes something not 100% perfect is preferable to nothing at all, but obviously only sometimes.

The point is to identify what in the training process is accidentally causing it to guess too often instead of admitting when it doesn't know or is uncertain. Some of this bias comes from the nature of the data set. On the internet, people don't normally post "I don't know" as an answer to a question, because that's useless and would be considered spam, but in conversation it's normal and desirable. In other cases they have QA datasets where the goal is to impart knowledge, so every question has an answer, but this accidentally trains the model that questions always have answers. Human raters may accidentally reward guessing. And so on.

The talk goes in to what can be done to correct these biases.

Finally, in many cases where the models hallucinate, it's because they can't look anything up. Yes, they know a lot, but just like a human's, this knowledge is compressed. So, for true facts, they make up references that sound plausible but don't exist, because they can't check Google Scholar to find the right reference. This is exactly what you'd expect to see from a human who was forced to come up with everything off the top of their head. Think about how much programmers hate whiteboarding interviews; it's for the same reason. Giving LLMs tooling access does make a noticeably large difference.


Thanks for the link!

I've now taken the time to watch John's talk and I have some thoughts. It's not only difficult to solve truthfulness due to disagreement (and subjectivity), but it's also very difficult because different contexts have different standards of evidence.

In some contexts, like programming, we'd rather have the model output its best guess for what the program should be, no matter how low the confidence, because we would like a starting point and we can debug the program from there. The answer "I don't know how to write that program" is not a useful starting point and it may even be an example of the model withholding information it does have due to low confidence.

In other contexts, such as scientific or historical questions, we want a high standard of evidence. Asking the question "what year did Neil Armstrong land on Mars?" should not produce a hallucinated response with fully unhedged language complete with fictitious date of landing. This problem may be solvable by training the model to hedge or even to question the premise when the confidence is low. Of course, this also suffers from the garbage-in-garbage-out problem of having falsehoods buried in the training set.

A more subtle and difficult problem with scientific/historical questions is with long-form answers. Currently, models tend to produce long-form answers that fairly consistently contain a mixture of true facts and falsehoods, and it can be quite difficult for even expert readers to spot all of the mistakes every time. Furthermore, the human labellers were given very sophisticated tools for highlighting sentences in long-form output but the information this produced had to be reduced down to a single bit per example since the detailed information did not improve training very much.

Personally, I think it's going to be very difficult to teach the model how to recognize the appropriate contexts and associated standards. This is a very subtle problem and one of the issues is that it relies on information the model does not have access to, for example: the identity of the question-asker. If a child asks an astrophysicist about black holes they're going to get a different answer than if an undergraduate student asks the same question in class. Yes, this additional context can be included in the prompt, but at some point it becomes a pain to have to copy-and-paste the context for every prompt.

Perhaps people will create a tool to save this additional context in the form of presets but this imposes additional effort on humans. At some point I think the amount of human curating and feedback that goes into these models will cause a collapse and backlash. We saw the same thing happen in the early days of search engines, when Google (fully automated) trounced Yahoo (human curated), leading to Yahoo's abandonment of human curation. We also see the same problem manifest itself at the Patent Office, where human review is policy. The entire patent system has become grossly dysfunctional at least partly due to the overwhelming complexity of this problem.

One thing I really liked was the "inner monologue" of the model performing a sequence of steps to answer a question by doing a search. If this could be generalized to other tasks, it could be a home run for automated assistants (Google/Alexa/Siri).


Thanks for the great comment. I agree and many of the points you're making here were touched on by John or in the QA section at the end, with the exception I think of the insight about the identity of the questioner. I think that can be solved pretty easily though just by asking people to add their age and country when signing up for a ChatGPT account. They'll probably have to add age anyway for legal reasons.

For the rest, it rapidly turns into the same set of problems you face when evaluating the truthfulness of any human authored answer. This is going to turn into a fight between people who want a Star Trek style truth machine that is far better than humans at generating true claims, people who think they want such a machine but don't (plenty of unpleasant truths out there), and people who are satisfied with "decent human" level honesty and integrity. Or maybe AI can achieve slightly better than human, I think even just the current set of improvements and ideas is likely enough to get there and OpenAI have already made a lot of progress in training opinions out of their LLM. A lot of why people get riled up about truth is the common practice of stating opinions as facts. GPT-4 is extremely reluctant to take positions on anything, which may yield unsatisfying prose but is an obvious and good move for turning down the heat on truthfulness fights.


My gut is that they mean AI-generated comments vs. AI-focused topics.

I did see at least one account that was posting a generated comment seconds after each submission posted.


Maybe this is a bit too shameless of a plug, but I created EXACTLY this sort of thing: https://news.ycombinator.com/item?id=35458826

It's just a regex that filters out "GPT", "AI", "LLM", and "LLaMA" :)


Nice! I was checking the different terms; it's just crazy how ChatGPT is by far the most active one: https://hn.curiosity.ai/#/trends?terms=Llama;GPT;LLM;ChatGPT...


The Show HN you want: https://news.ycombinator.com/item?id=35654401

Update: Ah, sorry, "iOS", "Firefox", I missed that. I leave it here still, to honor the pain it took me to dig it up.


Here's an extension I created for this. Chrome: https://chrome.google.com/webstore/detail/hacker-news-filter... Firefox: https://addons.mozilla.org/en-US/firefox/addon/hacker-news-f... No LLM filter yet, but I can add that too.


Nice! We're wondering if we should build a browser extension based on the "similar stories" model there. If you would be interested in doing something like that, happy to have a chat!


Thanks! Yeah, absolutely, that's what I had in mind. I was even thinking of using a model to filter out all AI-based stories, but couldn't figure out if a local embedded model in the extension could do that, so I ended up using a regex only. Happy to chat - email in profile.


I used to be a skeptic, but not anymore. Since last year I've been using "AI" more and more in day-to-day life. Like some sort of assistant/consultant on every topic.

Bitcoin changed my life as well.


I'd dearly love it if HN had tags. I could skip to all the weird quirky gems, and past all the $LATEST_FAD, or - worse - the VC broetry.


Bitcoin hype will return with the next cycle :)


Hopefully to replace the Web3 / Bitcoin cycle https://hn.curiosity.ai/#/trends?terms=Bitcoin;Web3


And yet paradoxically you currently have a top comment as a skeptic.



