Anti-AI Hype LLM Reading List (gist.github.com)
208 points by atg_abhishek on Aug 27, 2023 | 52 comments



"Anti-hype LLM reading list" - this is the actual title of the github page.

"Anti-AI Hype" seems to be a very bad editorializing. It corrupts the meaning of the title, and makes it ambiguous.


A solid reading list is useful because it's important to understand a technology before either hyping it or criticizing it. Too much description of LLMs, both from the "hey this is cool and will revolutionize the world" crowd and the "this is scammy crap like NFTs and not AI" crowd, oversimplifies the method, making it sound like a simple Markov chain. We had Markov chains in the 1980s on hardware less powerful than what's probably powering your microwave today. For better or ill, LLMs are considerably more powerful than that.
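
For context, here's roughly what "a simple Markov chain" means: a toy bigram text generator, sketched in Python (corpus made up). All it knows is which word followed which; there is no long-range context at all, which is a big part of what separates it from an LLM.

    import random
    from collections import defaultdict

    # Build a bigram chain: map each word to the words observed after it.
    def train(text):
        chain = defaultdict(list)
        words = text.split()
        for cur, nxt in zip(words, words[1:]):
            chain[cur].append(nxt)
        return chain

    # Generate by repeatedly sampling a successor of the last word only.
    def generate(chain, start, length=20):
        out = [start]
        for _ in range(length):
            followers = chain.get(out[-1])
            if not followers:
                break
            out.append(random.choice(followers))
        return " ".join(out)

    corpus = "the cat sat on the mat and the dog sat on the rug"
    print(generate(train(corpus), "the"))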


As someone who has never been a "true believer", I agree that the pushback is a bit harsh, but to be honest, they brought it on themselves. When GPT-4 was first released, you couldn't walk ten feet without seeing some LinkedIn-tier bullshit about how this was going to be the final form of intelligence, the replacement for all developers, the replacement for all workers, the second coming of Christ on live broadcast, you get the idea.

When inevitably reality doesn't meet expectations, you get the pushback. The sooner we get over this ridiculous team game where we must either state that LLMs are AGI or LLMs are random number generators, the sooner we can start asking the real question: how can we take this technology, limitations and all, and actually use it in a production system? Where is it useful and where is it useless?

Because that's still the point, and if I haven't missed anything, it's still highly nontrivial to answer that question with any degree of seriousness.


"the sooner we can start asking the real question: how can we take this technology, limitations and all, and actually use it in a production system? Where is it useful and where is it useless?"

I just published a talk I gave on Friday at WordCamp US where I tried to cover pretty much exactly this: the actual practical things you can do with existing LLM technology right now.

https://simonwillison.net/2023/Aug/27/wordcamp-llms/


Yes, promoters and dismissive skeptics react to each other in an unproductive cycle. We can try to do better.

One trap to avoid is thinking of this cycle as a rough sort of justice. "They brought it on themselves" sort of hints at that? Ideally, someone else's bad take shouldn't be taken as an excuse for writing an opposing bad take, even though it often does provoke that sort of reaction. Noisy, polarized discussions make things worse for everyone.

Bad takes do often make for a good writing prompt. Just about everything I write is reacting to things I read online. But usually it works out better to look for an indirect, non-opposing response that doesn’t feed the polarization cycle.


> Yes, promoters and dismissive skeptics react to each other in an unproductive cycle. We can try to do better.

True. But I think I can make the case that dismissive skeptics have greater value than promoters: they can counter the worst of the promoters (e.g. the hucksters and idiots who only recently deleted their tweets promoting crypto and NFTs).

Despite what people think, there is rarely a particular downside to not being first to a technology: early adopters do not routinely make truly excess profits; they just get to take advantage of a wide-eyed, uncynical marketplace. Skeptics tend to moderate over time, and are a little better at spotting the sustainable opportunities than the boosters are.

There is a particular downside to being an early victim of a scam or a lie in a new field. The scams are worse, more painful, more humiliating, and often happen before regulation creates meaningful penalties.

The tech world is negative about skeptics for no good reason, when you consider that robust technology itself is much like harnessed skepticism: code that doesn't trust its inputs, test harnesses, margins of error, and so on.
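
As a trivial sketch of what that harnessed skepticism looks like in code (the function and values here are made up for illustration):

    # Skepticism as code: refuse to trust raw input; fail loudly and early.
    def parse_port(raw: str) -> int:
        if not raw.isdigit():
            raise ValueError(f"not a numeric port: {raw!r}")
        port = int(raw)
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        return port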

The cycle I think is most dangerous is the hype cycle that starts when developers don't talk sales and marketing people out of their worst claims (because it is in our short-term interest not to get involved with all that while we are working).


By "dismissive skepticism" I was thinking of the repetitive, lazy, nearly content-free kind, like linking to "killed by Google" every time there's a topic about a Google product, spelling Microsoft as Micro$oft (back in the day), anyone talking about "techbros," or calling an LLM a glorified Markov chain.

If you can post it off the top of your head and everyone can predict that someone will say it, is it worth saying?

Something to be particularly wary of is reasoning by crude analogy. These people are all the same. These companies are all the same.

By contrast, there are skeptics who do research, like investigative journalism and the kind of short seller that digs up dirt and publishes it. This sort of research is often valuable.


Yep, absolutely with you here.

I feel like lately my mantra has become "the metric is usefulness", not just with this, but with everything.

So many things seem to end up in these never-ending debates that boil down to essentially just people's differing aesthetics. But to me, I just want to know how useful things are. I see this all over the place now, not just in technical things related to my work, but in stuff like culture and politics too.

Of course, you could say it's still aesthetics, and my aesthetic just happens to be usefulness. And I think that's right!


> I just want to know how useful things are.

LLMs are somewhere between very useful to game-changing for me. But spammers will get far more utility and value out of LLMs than I can.


I agree!

My point is that this is where both the hypers and the dismissers have gotten it wrong for me. So many hypers are just talking about something that isn't what I'm interested in; about whether it's AGI, rather than whether it's useful. And the dismissers are just missing or denying that it is useful, and for me, it definitely is. Both of these groups just seem to be talking past each other and focusing on the wrong metrics, from my perspective.


I agree in some ways. Still, you can’t escape the uncertainty and complexity by seeking ‘usefulness’, at least to the degree that the term is undefined and contingent. Usefulness of LLMs (and whatever we combine with them) depends on research, funding, investment, attention, and applications.

People who talk about usefulness or ethics or politics too often fail to specify key details, such as a time frame and/or discount rate. Not to mention risk preferences.


A lot of people just don't want to spend their attention cycles on something they don't find useful or stable right away. There's just absolutely no reason for people to do the free labor of figuring out what the thing is really good for, or to test stuff for billionaires. Sometimes the winning strategy is simple ignorance.


> There's just absolutely no reason for people to do the free labor of figuring out what the thing is really good for, or to test stuff for billionaires.

This is an unexpectedly emotionally strong response, indicating antipathy towards billionaires (presumably related to their wealth, influence, actions, morality, and so on). I'm not following how my relatively milquetoast comment brought about this reaction. Would you like to explain the backstory? There is clearly more to it, and I'd like to hear it.

There is also an implied "thou shalt not" that I'm getting from the "absolutely no reason... to do the free labor". That is a very specific and pessimistic framing, which I'll comment on a few paragraphs down.

> A lot of people just don't want to spend their attention cycles on something they don't find useful or stable right away.

To situate the substance of your comment relative to mine: you are suggesting the importance of narrowly-defined, individual-oriented, easy-to-compute usefulness. That is certainly one view. And, yeah, I get it. For most of life's decisions, this is what people do. And for good reason.

Let's say for the sake of argument that I share your concerns about powerful forces playing outsized roles. For this paragraph, I'll be a committed socialist. From such a point of view, I will strongly reject claims that one should view usefulness in such a way, with just one sentence: It is in one's self-interest to be curious and look beyond one's current frontier. Tools, even those tools created by forces you oppose, can serve your ends. It is better to understand and harness their power. Most, if not all, effective social activists throughout history take a pretty long view. They want change now, but they know they'll have to plan to get there. The ones that don't plan and use the tools at their disposal have a hard time building a movement.

Putting aside the political philosophy now... Remember, we're talking about the development of technology that, at the very least, creates the illusion of intelligent textual communication. That's no small thing. It doesn't matter if you think it is overhyped or not; it is an unmistakable force. The rate of change is unprecedented. This isn't anything like most 'typical' questions or decisions we face in life. Even from an individual point of view, learning how to best take advantage of AI can be an important time saver or, for many in our field, a competitive advantage. So, reverting to a mindless form of usefulness isn't in our self-interest.

Anyhow, for what it is worth, this line of discussion is quite different from my point, which was largely epistemological, pointing out that the word 'usefulness' is hardly operational unless specified. I would also add an opinion: using the term when communicating with others quite often invites misunderstanding, because the word feels comfortable despite being inherently ambiguous. That combination is a recipe for comically predictable misunderstanding.


> This is an unexpectedly emotionally strong response, indicating antipathy towards billionaires ... Would you like to explain the backstory?

You overthought it there. It was meant as sarcasm about toxic internet fanboyism.

From my experience, many of the tech hypes of the last decade have been driven by cult-like fandoms literally worshiping large tech companies and their billionaire CEOs. It's beyond my understanding how people spend their free time and energy just to endorse and protect such businesses, which will do fine even without the worshipers. Yet fanboys actively try to express their loyalty and keep disturbing others with bold claims (which are borderline misinformation). This hurts people by causing unnecessary FUD and exposing unsuspecting victims to unstable experimental products. This is the dark side of hype.

> Tools, even those tools created by forces you oppose, can serve your ends.

First, it's not about opposing; it's about not choosing at all.

Second, that's basically how everything works. Even though more than 99.99% of the global population is uninterested in the optical system of EUV photolithography, its design has been continuously improved, giving us power-efficient, high-performance chips and impacting practically the entire planet. Impact alone can't be a reason for attention, or everyone would be bombarded with a huge volume of information every day.

> Anyhow, for what it is worth, this line of discussion is quite different from my point, which was largely epistemological, pointing out that the word 'usefulness' is hardly operational unless specified.

Your comment was overly pedantic, to the level where everything becomes meaningless.


I appreciate what you are saying about Internet fanboyism.

Personally, I can recognize Elon Musk for perhaps being the right person at the right time to advance Tesla while also disagreeing with many of his actions and detesting the example he sets.

It is interesting to think about how hype works and the role of devoted fans. Hype has been around longer than mass media; clearly there are deep roots in human psychology.

Fanaticism (e.g. around one's preferred sports team) has pretty primal roots. (Aside: I wonder if the popularity of sports decreased during WW2. It seems to me that in the face of a real war, a simulation is less compelling.)

> It's beyond my understanding...

Perhaps. But what if you find one such person, ask questions, and put yourself in their shoes? There is a story there; probably one that doesn't follow your logic. Two common drivers are identity and group membership. By e.g. becoming a public fan of some product, a person can identify with a narrative of something they care about and belong to a group. It often gives people a sense of being ahead of the curve, or creative, or intelligent, or whatever it is they value.

Even if your particular reasoning is sound (i.e. the billionaires don't need the fanboys, so why bother?), why do you think your logic would be the key driver? In other words, if your reasoning was already top-of-mind for such fanatics, they wouldn't be fanatics at all.

I'd guess you already know much of this. Perhaps what you mean is that you dislike the consequences of fanboyism. But I doubt that is what really bothers you. Here's my guess. It is much deeper: you don't like that people are illogical and therefore self-defeating. It doesn't help that they also make things worse for the rest of us.


> It is much deeper: you don't like that people are illogical and therefore self-defeating.

This is a typical case of over-generalization based on the Barnum–Forer effect[1]. At the very moment you label someone as "illogical", you're already having communication issues with the person, and, naturally, you don't have the best impression of the person anyway. So this statement practically fits every single person on the planet.

Mind that relying on tricks like these puts you in the realm of being illogical.

[1]: https://en.wikipedia.org/wiki/Barnum_effect

> Perhaps. But what if you find one such person, ask questions, and put yourself in their shoes? ...

Yeah, I get the overall mechanism there, but I don't think it's critical here. My point is that the outcome of fanboyism has been harmful, and, especially, it has killed a lot of healthy discussions. This doesn't help anyone, so it needs to be trimmed somehow. A healthy fan base should regulate itself and channel the fans' energy into more productive things than flaming internet forums.


I understand and largely agree with your general concerns.

I'm having a hard time getting a read on your intended tone here. I hope you are getting something out of the conversation, since you are responding. However, some of your word choice seems to indicate irritation (e.g. saying my comments are pedantic and so on).

Regarding my intended tone... First, I'm genuinely curious about your strong reactions. Any guesses I have as to your deeper causes are, like I say, guesses. They are based on imperfect pattern matching. Second, logically speaking, some of the supporting details, as you state them, are unpersuasive to me, but I appreciate where they are coming from. I also recognize I'm not getting the full context.

I sometimes engage in longer threads of discussion here on HN not only out of curiosity but also because I'm exploring various ways to communicate and connect. As such, I'm very aware of the non-textual human factors.


Lots of people are answering that question today; LLMs are being worked into production systems at an incredible pace.


But not without trouble. Getting consistent, reproducible, and only good results is really hard; there are plenty of examples of companies that would love to take it into production but seem to be stuck in a loop where, as long as the AI stuff can go off the rails in non-brand-safe ways, they aren't going to get a green light.

And I haven't seen any major change there over the last two months or so; what I have seen is that there are now better test suites and better tooling.
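
As a rough sketch of what that tooling boils down to (call_model here is a hypothetical stand-in for whatever completion API is under test): run each prompt several times and assert on must-have facts, since consistency is the hard part.

    # Hypothetical sketch of an LLM regression suite.
    CASES = [
        ("Summarize: the meeting moved to 3pm Friday.", ["3pm", "friday"]),
        ("Translate to French: good morning", ["bonjour"]),
    ]

    def call_model(prompt: str) -> str:
        raise NotImplementedError("plug in the model under test here")

    def run_suite(trials: int = 5) -> None:
        # Sample repeatedly: a model that's right four times out of five still fails.
        for prompt, must_contain in CASES:
            for _ in range(trials):
                reply = call_model(prompt).lower()
                for needle in must_contain:
                    assert needle in reply, (prompt, needle, reply)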


I think the fact that the hype cycle was so short is evidence in LLMs' favor as a revolutionary technology.

Hype-bubble technologies live off their hype forever, and need to accelerate the hype to get more investment in a technology that doesn't actually have that many applications. The hype cycles are long, escalatory, and recurrent. Think blockchain, NFTs, IoT, AR, etc.

When a technology is really revolutionary, there's a big surge of interest at the hype level, and then it quickly becomes mundane as people stop talking about it because they're too busy actually using it. Think the internet, smartphones, social media, Wikipedia, YouTube, etc.


I don't think we can get much useful signal out of these observations.

By the same logic one could argue that AI is the quintessential recurring hype cycle that always turns out to be a nothing-burger.

It's a particularly hard nut to crack because whether or not LLMs will turn out to be useful depends on future developments, particularly on whether there is a way to get them to behave consistently and predictably. Computers are useful because they are stupid pieces of junk that always do the same thing when given the same input.

If it turns out there isn't a viable solution to this, we are forever doomed to use LLMs pseudo-interactively, which is something, but a far cry from even the low end of what LLM proponents are promising. (e.g. all the current LLM "integrations" are very thin layers on top of the raw model).

It's fundamentally impossible to tell, imho.
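
(To illustrate the "thin layer" point: a typical integration amounts to little more than this sketch. The product framing and prompt are invented, and it assumes the 2023-era openai Python SDK.)

    import openai  # 0.27-era SDK; assumes openai.api_key is set elsewhere

    SYSTEM = "You are a support assistant for ExampleCorp. Be concise."

    def answer(question: str) -> str:
        # The entire "integration": a canned system prompt plus one API call.
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": question},
            ],
        )
        return resp["choices"][0]["message"]["content"]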


> whether or not LLMs will turn out to be useful depends on future developments

But LLMs are already useful. All kinds of people are using them to improve their writing; they now ship in grammar-check tools like Grammarly. All the big tech firms are working on integrating them into their virtual assistants. Every developer is using GitHub Copilot now. Stack Overflow is in crisis because ChatGPT is the perfect rubber duck. There are tons of applications in marketing, from drafting ad copy to keeping tabs on consumer buzz.

If the question you're trying to interrogate is "will LLMs become AGI", then yeah, that's a big open question, but also that's a bar that's much higher than "whether or not it'll be useful" or even "whether or not it'll be game-changing", and it already is undeniably both of those things.


I agree with the sibling comment that this is probably too little evidence to make such a grand claim; however, I concur that I do notice people around me using LLMs in their daily lives, sometimes as chatting buddies or for research. In my circle, it seems to be rather widespread.


The problem really kicked off with the "Sparks" paper. Microsoft Research joined the Marketing department and suddenly Eric Horvitz is co-authoring a paper that "proves" AGI is here because GPT-4 can draw a unicorn.

I wish that ML folks were more concerned with doing actual science (I know some are) and less with hype. Most of the pushback is only a natural reaction to this, I believe.


The paper is fine. It doesn't "prove" anything. But it does show, along with other recent insights, that people who have a testable definition of general intelligence that includes humans but say GPT-4 doesn't qualify don't have a leg to stand on.

The biggest indicator that we have something here is that no one wants to argue results anymore because then the arguments fall apart.

Suddenly it's "4 isn't AGI because it isn't doing 'real' reasoning" instead of "4 isn't AGI because it can't do this intelligent task all humans can".


> AGI is here because GPT-4 can draw a unicorn

Seeing how randomly GPT performs this task nowadays, I've become much more skeptical of this (https://gpt-unicorn.adamkdean.co.uk/). Even if you argue the model is somehow crippled due to computation costs, I'd still expect that GPT-4 could SOMETIMES draw something like a unicorn with vector shapes, just perhaps not every time. Certainly, a one-prompt, one-result conclusion seems really weird to me.


>Seeing how randomly GPT performs this task nowadays, I've become much more skeptical of this

In the paper, they say the RLHF'd GPT performs worse on the task, so it's not too surprising.


That paper did a number on me... I was dizzy for a day. Couldn't process the implications fast enough. It was really a dangerous paper to read; you need deep de-hyping when you finish.


Disclaimer: I don't doubt that most ML folk are doing their job and doing it well -- as a matter of fact, my company regularly does projects with our local university, so I know that firsthand.

With that out of the way, I said it then and I'll say it now: a paper where Microsoft describes how well Microsoft products perform on benchmarks designed by Microsoft employees isn't science, and can't become science even if you write it in LaTeX and put it on arXiv.


I wonder if anti-hype lists will become the next awesome lists


Hey, lay off Markov chains! They may not be SOTA for NLP, but they're still real to me, dammit.


LLMs are literally Markov chains


There has been a lot of pushback lately saying that the current hype cycle is completely unwarranted and in the process of fading away, and generally that LLMs aren't that good, really.

I thought that was what this post was about.

Really, this is just trying to 'cut through the hype'.

When we get beyond the hype, here is a list of articles to explain the current state?

Am I correct here?


Seems that way. "Hype free" would probably be a better way to put it.


Seems more just non-commercial, which is fine. Hype is more subjective. For example, the gzip thing appeared to me to be hype, but it's on there. And the mere existence of yet another AI list is a kind of hype.

Much better than "10 things you need to know [thread]" stuff.


Where is the pushback? Can you share any links?

All I see is the hype cycle.

Also, the title is clickbait. I was expecting a list of pushback articles. And what's the deal with developers curating such lists and posting them on HN?


Among the more amusing ones is Slate’s recent “See? It’s eight months later and there’s no mass unemployment yet”:

https://slate.com/technology/2023/08/chat-gpt-artificial-int...


The pushback was in another post on HN a couple of days ago, basically saying the current LLMs aren't that good, that this is all hype, charlatan hype; really ignoring the real results and taking the hype as all there is.

Then this list of articles is kind of the anti-hype-hype: 'here are some real results'.

The anti-AI threads: https://news.ycombinator.com/item?id=37256577#37260717

I might have the link wrong; here is another: https://news.ycombinator.com/item?id=37256577

And another: https://news.ycombinator.com/item?id=37263231

Tech people are a little bipolar when it comes to hype. Everyone seems to love the hype on the way up.

Then a little while later, the 'hipsters' are like, 'that sucks, this hype is all a lie, you lemmings'.

Then a few years later, the silent majority of engineers are quietly plugging along, having adopted whatever settled out of the hype-backlash cycle.


That recent pushback has been a bit exhausting, even more so than the hype preceding it. At least the hype was full of hope and excitement; the pushback is full of snide (and often unwarranted) I-told-you-sos with a taste of the Luddite.

I guess that's how the pendulum goes.


I'm sure there are egregious examples, but I think "I told you so" is fine. You get these waves of bro-kids storming into established disciplines acting like hot shit; it's not surprising to see people happy when their bubble bursts. Importantly, I don't see pushback against ML researchers or practitioners; it's against the charlatans who are using the hype to sell snake oil. That's completely warranted.


Conflating Luddism with anti-technology sentiment just illustrates how shallow your analysis is. And isn't complaining about complaints just more noise?

The thing about Luddism is that it correlated strongly with the suppression of legitimate industrial action. It wasn't an act of ignorance or a push for regression. It was a violent manifestation of labor pushing back against rampant exploitation in the face of systematic suppression of legitimate channels.


The pendulum swing is probably way sharper this time around because we're dealing with the aftermath of the crypto fallout. Sure, in practice there are actual applications this time, and lots of people paying actual money for real products, but tech has lost a lot of its sheen, and there's going to be a lot more very understandable dismissal of the next shiny thing. Even though for this one we should probably actually take it seriously, since the potential for societal disruption is very high and humans dealt poorly with the previous few transitions when tons of people lost jobs. The AI doesn't even have to be that good: tons of jobs involve trying to turn humans into robots, like call centers and offshore software consultancies, and I think LLMs passed the "good enough" point a long time ago.


Those are some key aspects of Gary Marcus but not the worst. The worst is that so many people listen to him. It's actually problematic because it confuses lawmakers.


> I guess that's how the pendulum goes

https://plato.stanford.edu/entries/hegel-dialectics/


This is a double-edged sword. While it is nothing like crypto, since it will have a real-world impact almost immediately, the important thing to understand and recognize is that AI is imperfect, because it is formed from imperfect data, and humans are imperfect. However, this does mean that AI can, for lack of a better phrase, replace a lot of imperfect work.

The problem with this replacement is that it rips out a financial ecosystem that a heavily interpersonal society depends on in the form of classes. Businesses cannot make money without losing money, and definitely cannot continue to make more money without money to be made. It is almost like a hydrological diagram, where instead of water it is money. The problem is that without a circular flow of income within a currency, we experience what we call stagnation, which leads to hyperinflation: money printed to meet demand, the depreciated value of a currency, and an increase in the cost of goods. There was a particularly good reason for having income tax brackets up to 70%.

AI needs laws and regulation. It will likely keep progressing and ripping out keystones in the foundation of our economy. Fewer people working eventually means less profit for all businesses and industries.


Nice list; many of these are resources I used to get up to speed on this. I'm currently competing in an NLP competition on Kaggle, and I feel this is honestly a great way to see the limitations of LMs and LLMs. Kaggle's community is like nothing else out there; they share incredible resources and experiments constantly.


Adding to the bits about deploying and training models: https://news.ycombinator.com/item?id=37121384

If anyone has anything to add, that would be great!


I thought it would be philosophy nonsense like "consciousness is not computable" / "consciousness needs a body", etc., but it turns out to be a legit introductory reading list for deep learning.


"nonsense"


Great list by Vicki Boykis. I just saw this posted by someone on Mastodon.

So far, my favorite linked article is on why you may want to self-host your own open LLMs.


See also: the Prolog AI hype


Wasn’t this posted before?



