I think the author is far too pessimistic about the capabilities of ML. I think we're already beyond the "it might not live up to the hype" stage and well into the "holy moley, this actually works really well (for certain things) already - now let's figure out how to use and iterate upon it" stage.
However - the level of Silicon Valley Bro VC funded hype is totally out of this world.
Many of the claims of 'science' used in the marketing of these services seem to be misleading a lot of people as to how it all works. Some might even jest that they're far closer to magic than science as far as our understanding goes.
Couple that with the very serious risks (not that we can 'fix' that now - the cat is out of the bag), how slow governments are when it comes to technology (not to mention the seriously lacking regulation of the American tech sector), and politicians with little-to-no technical understanding in general, let alone of ML, going up against corporations that already have market dominance in many areas, and it makes for an interesting time, for certain.
Marketing departments and now the general public using the term AI instead of ML is just the tip of the iceberg.
With respect to marketing hype, it's comparable to the now-reduced scale of hype for quantum computing, which has mostly vanished from the headlines compared to a year or two ago.
The main difference is that LLMs have a lot of immediately valuable use cases in their current iteration, which has led to widespread adoption. The hype cycle may have reached its peak and will now start to relax a bit.
> I think the author is far too pessimistic about the capabilities of ML.
Why do you have to be pessimistic or optimistic? How about measuring? Most of us are engineers of some kind here. We don't need to believe or disbelieve.
Being pessimistic or optimistic doesn't preclude measuring things. Many people are pessimistic about climate change precisely because we're measuring the amount of CO2 being released and because we're measuring changes in temperature, weather, etc. over time.
Same with believing and disbelieving - sure, some people believe in religious-type stuff that has an absence of evidence, but there are lots of other things to believe or disbelieve based on evidence. I believe that Covid vaccines help to prevent serious illness precisely because there is a tremendous amount of evidence to that end.
The practical applications of AI are undoubtedly overblown, but I do agree that its ability to generate content is impressive and will be (or rather, already is being) used for practical things.
> I think the author is far too pessimistic about the capabilities of ML. I think we're already beyond the "it might not live up to the hype" stage and well into the "holy moley, this actually works really well (for certain things) already - now let's figure out how to use and iterate upon it" stage.
I think we are rather in the "oh wow there is a slight possibility that chatbots are not a waste of cpu cycles with this" stage.
The first 80% of AI tasks will be easy enough. The next 20% will be increasingly difficult with the last 1% virtually impossible. Same with driverless cars.
Folks who are pessimistic on driverless cars appear to move the goalposts on the last 1% repeatedly. Self-driving cars are driving themselves around Phoenix and SF right at this minute.
I love when people share this stat not realizing they’re proving these vehicles are already better drivers than humans. A human disrupts traffic on the road in front of my home every 10 minutes
They've had no fatalities, which is great (and certainly better than other "self-driving" car options)! But it's hard to make really good comparisons as long as Waymo is off by a factor of millions of miles.
The last 1% is actual self driving. Not "allowed to control the vehicle as long as human supervisor is glued to steering wheel with eyes on road". Or "closely monitored test conditions". We're about 7 years later than Tesla predicted on this already, and not all that much closer even without moving the goalpost an inch.
This argument again. It's so incredibly obvious that it does not apply, but so many people seem to think "oho, but some humans can't do <thing no AI can do>! That means you think those humans aren't intelligent either!" is such a gotcha.
Just about any human can be taught to drive in the snow. The capability is there; the skill just hasn't been developed. Take a Californian to Upstate NY, force them to commute 40 minutes to work for several winters, and they will learn (or they'll die, I guess).
Take a self-driving car from LA to Syracuse and force it to drive in snow for years, and all it will do is crash or refuse to try. We do not yet (to the best of my knowledge) know if it will be possible to teach self-driving cars to drive in snow. It seems likely, but it hasn't yet been achieved, and in a field like this, "haven't achieved it yet, despite trying" and "don't know if it can be done" are very close together.
I think the term 'Intelligence' is what is causing the problem. If something like 'statistical modelling' or 'quasi-stochastic information compression' were used instead, there would be much less hot air.
For example, in Real Time Analytics, there is something called a 'sketch', which through clever statistical techniques allows approximations of things like the top 10, averages, etc., without having to recalculate from the whole dataset each time of asking. Nobody claims this has anything to do with intelligence, artificial or otherwise.
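To make that concrete, here's a rough, minimal sketch of the idea (this is the deterministic Misra-Gries "frequent items" flavour plus a running mean, written just for this comment; real sketch libraries like count-min or HyperLogLog are more sophisticated, and none of the names below come from any particular library):

    from collections import Counter

    def misra_gries(stream, k=10):
        """One-pass approximate 'top k' (heavy hitters).
        Keeps at most k counters and never re-reads old data;
        the counts it returns are lower bounds, not exact."""
        counters = Counter()
        for item in stream:
            if item in counters or len(counters) < k:
                counters[item] += 1
            else:
                for key in list(counters):   # decrement everything,
                    counters[key] -= 1       # dropping counters that hit zero
                    if counters[key] == 0:
                        del counters[key]
        return counters.most_common()

    def running_mean(stream):
        """Constant-memory average: no recalculation over the whole dataset."""
        total, n = 0.0, 0
        for x in stream:
            total, n = total + x, n + 1
        return total / n if n else 0.0

    print(misra_gries(["a", "b", "a", "c", "a", "b"], k=2))
    # [('a', 2), ('b', 1)] -- 'a' really occurred 3 times; counts are underestimates
    print(running_mean([1, 2, 3, 4]))  # 2.5

Clever, useful, and nobody feels the need to call it intelligent.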
My impression is that the vast majority of people who are currently engaged in the “discourse” surrounding AI do not have the technical background to make this distinction. In my opinion, there’s an element of “the man behind the curtain” which is being unethically leveraged by AI companies to drive sales and hype.
I agree, and there is an enormous moral hazard to the latter. Just look at what Hinton has been discussing post-Google, and you see ideological support for a narrow model, shameless anthropomorphization of LLM behavior, and downplaying of "hallucinations" (which I'd rather simply call what it is, which is lying).
The moral hazard (for sure not new) is that those most equipped to call out the overplaying of the technology's abilities are also those most personally incented (via fame for their creation), or commercially incented (as you mentioned) to drive sales and hype.
> which I'd rather simply call what it is, which is lying
I like the term "bullshitting" as it implies it acts like it knows, but doesn't.
I can't stand it when they call it "hallucinating", as it instead might imply the problem is with the interpretation of the sensory input. This term might be more appropriate to describe adversarial attacks executed through information input.
It's not being sold as long-form autofill. It's being sold as something which will respond to queries, and there is an expectation that the information it replies with should have some accuracy.
I am not in disagreement with your generalization of how it functions or that it lacks empirical knowledge. Some responses are accurate, some are not. It will always give a response but does not always produce a disclaimer saying that the response might have been inaccurate.
So, in criticism of the existing nomenclature used to describe this phenomenon (hallucination), I simply suggested something more appropriate.
I think “malfunctioning” is the correct term you are looking for. Lying and Bullshitting and Hallucinating all anthropomorphise the software. It is simply not functioning as advertised (rather than as intended).
"Malfunctioning" isn't really accurate, either—because the common usage of that term isn't "not functioning as advertised", not in this world where salespeople will say literally anything to get someone to buy their product. It's "not functioning as designed".
"Hallucinations" are 100% an LLM functioning as designed. They are every bit as valid as any other output. LLMs have no concept of truth, no model of the world, no way even in theory to determine the validity of any particular statement they output.
The problem isn't the term "hallucinations", so much as it is the idea that anything they produce is not a hallucination.
I totally agree with this. I like the term "criti-hype". To me, the NYT interview with Hinton seems so obviously calculated for this kind of effect. What is much more worrisome to me than any of the supposed abilities of recent ML models is the fact that the average person just... doesn't seem to pick up on this kind of thing.
Although, I wouldn't personally say that it's "lying", "bullshitting", or anything else. That's just more anthropomorphization. It's "wrong" or "incorrect". A mathematical model, a clock, a map: these can all be "wrong" or "incorrect". You might say they were lying to you, but it would be for comedic effect. In this case, there are (again!) enough people who lack the ability to make the technical distinction that I think it's important we avoid any kind of anthropomorphization of any ML model to avoid confusion.
Same thing happened with VMs, "the cloud" and containers... Hapless CIOs and senior technical advisors around the world were convinced all their problems would be solved if they'd just sign a contract with <arbitrary vendor>.
And their on-the-ground technical people would tell them "this ain't it" but would be shushed so the story could be told to CEOs and shareholders.
> Same thing happened with VMs, "the cloud" and containers... Hapless CIOs and senior technical advisors around the world were convinced all their problems would be solved if they'd just sign a contract with <arbitrary vendor>.
Pinning this on the phrase "the cloud" is a mistake; it's part of a much broader trend of using specialist contractors for things instead of doing everything in-house. If you could somehow magically go back in time and prevent "the cloud" or any other equivalent phrase from existing, the same trend of companies hiring other companies to do the work they don't care to specialize in would still exist.
From the perspective of... an insurance company... why shouldn't they contract another company to handle the servers? They already hire another company to handle janitorial work, building construction and maintenance, human resources, running their cafeteria, call centers, etc, etc... The computer stuff seems special to us because we're tech workers and this is a tech forum, but really it isn't special and it happens in virtually all aspects of doing business. The trend doesn't exist because of the buzzword, the buzzword exists because of the trend. If the buzzword didn't exist, a new one would be invented immediately, because language is a tool for conveying ideas and people make new tools on the fly whenever we find ourselves needing one. Ideas don't come from words; words come from ideas.
Could you expand on why you don't believe "intelligence" is an appropriate word?
I guess I don't understand why a system built on "statistical modelling" would be incompatible with a system that has properties of "intelligence".
We use "intelligence" fairly loosely already. Significantly less intellectually capable animals are often referred to as "intelligent" – pigs, dogs, etc. Plus, the field of research applied "statistical modelling" belongs to literally is artificial "intelligence". I think it's fine (perhaps even appropriate) to refer to artificial systems which perform intellectual tasks as forms of intelligence.
> For example, in Real Time Analytics, there is something called a 'sketch', which through clever statistical techniques allows approximations of things like the top 10, averages, etc
This is a very narrow and weak example of artificial intelligence in my opinion.
I don't think the issue here is the word "intelligence", it's more that people tend to use human intelligence as the benchmark for what it means to be "intelligent". So when someone says something like "ChatGPT is intelligent" there is a subset of people who get annoyed because they think that person is directly equating ChatGPT's abilities with human-level intelligence.
Yeah, 'tabooing' the word intelligence could help correct hype, the problem is it could overcorrect: just because you decide to taboo the word, doesn't mean that current-or-future AI isn't "intelligent". Words don't have that power, unfortunately, to make a system "intelligent"--or not.
One limitation of “intelligence” is that it isn’t task-specific. There’s no IQ or ‘g’ for machines (yet) and their performance varies along many dimensions.
We should see how well a machine does on each task we care about, using task-specific benchmarks and testing. This avoids a lot of fuzzy thinking about potential.
I think it’s reasonable to use the word intelligence, if you define intelligence as something like “a system’s ability to take in information and use that information rationally to decide what action to take to meet a goal”, which seems like a good definition to me. It might be tempting to define it in a way that only includes what humans do but I don’t think that’s a good general definition
What makes a behavior rational? Linear models take in information and use that information to decide what action to take to meet a goal. When you are fitting a best-fit line, I suppose that could be rational or not; it depends on the usage. I find it hard to justify calling one thing “intelligent” and the other thing not.
Not really, because computers on their own aren't really agents that optimise their actions towards a goal. But they can definitely be programmed to do that
I guess you could argue that following machine code instructions is as much of a goal as predicting the next token though, except it's not quite the same as there's a clear right way to follow the instructions but predicting tokens is more... fuzzy?
We don’t have perfect definitions for either word, but I’m comfortable with intelligence meaning something like goal seeking or reasoning, and sentience meaning something like self-awareness or identity.
The fact that we can’t perfectly define either “blue” or “green” does not mean that blue and green are semantically interchangeable.
I think the Legg&Hutter definition "Intelligence measures an agent’s ability to achieve goals in a wide range of environments" [1] is quite relevant for this issue - once we consider not a technique for a specific task, but a single system adapted for a variety of tasks and which (and this is the key part!) we also expect to perform well on a new task handed over to it, then that does mean that we are trying to build systems that are explicitly "more intelligent" according to this definition.
Yeah, the working definition I use, which is a simplification of the academic definition, is "second order solutions/algorithms", where rather than a program defining a solution to a problem given some set of arguments, you have a program that takes a problem definition as input, and produces a process that can solve many instances of that problem.
That ends up being very different from the "AI is anything a computer can't do yet" pop definition, because it includes everything from SAT solvers, propositional logic engines, and Bayesian network analyzers, to the more bleeding edge deep learning stuff.
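As a toy illustration of that "second order" framing (a sketch I'm making up for this comment, not from any real solver library): the function below is written once against a problem definition, and any particular formula is just data handed to it.

    from itertools import product

    def solve_sat(num_vars, clauses):
        """Brute-force SAT solver. 'clauses' is the problem instance, e.g.
        [(1, -2), (2, 3)] means (x1 OR NOT x2) AND (x2 OR x3).
        The code knows nothing about any specific formula in advance."""
        for bits in product([False, True], repeat=num_vars):
            assignment = {i + 1: bits[i] for i in range(num_vars)}
            if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
                   for clause in clauses):
                return assignment
        return None  # unsatisfiable

    # (x1 OR NOT x2) AND (NOT x1 OR x2) AND (x1 OR x2)
    print(solve_sat(2, [(1, -2), (-1, 2), (1, 2)]))  # {1: True, 2: True}

The solver is dumb and exponential, but it's "second order" in the sense above: one program, many problems it was never specifically written for.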
The word 'intelligence' not being used would last about 5 minutes until somebody else had the idea to call these systems intelligent. Therefore the word itself cannot be the root of it. The perception of these systems being intelligent is trivially derived from the nature and capabilities of the systems; it isn't some unfortunate fluke of language that foisted these comparisons on us.
Indeed, no fluke. Probably sci-fi led the way and captured the public imagination, then it was an easy meme for researchers to latch on to as they try to gain notoriety, funding, etc. for their work.
The nature and capabilities of these systems are a very long way from anything else we consider intelligence. If they didn't contain a 'neural net', I suspect the connection would not really occur to anyone - any more than people consider Google search to be intelligent.
"When they [AI companies] do ‘do’ science, they are still using science mostly as an aesthetic. They publish papers with grand claims but provide no access to any of the data or code used in the research. Their approach to science is what “my girlfriend goes to another school; you wouldn’t know her” is to high school cliques."
Functioning code and tools are available for almost all of the relevant AI and ML work in academic publications.
Yes, it took a big leak from Meta for a certain type of model, but the techniques to use LLMs and other transformer based tools are not vaporware. Most have code repositories and functioning open-source implementations.
We can certainly be critical of the datasets used for testing but we can’t accuse people of selling snake oil when so many of us are implementing and gaining value from the techniques described in these academic publications.
The GPT4 "paper" that's been talked about so much is little more than a marketing press release.
I certainly disagree that it's the case for the whole field of ML. But the AI that's in the news right now is LLMs, with OpenAI in the lead, who have done their utmost to impede reproducible research and even research built on top of their already released tools.
It's been a trend that started with Google years ago and their increasingly large models which are unreproducible for most of the scientific community and often not released (if I recall correctly).
I mean, there is no data support in the claim that this pattern is essentially universal in AI research. Citing a few instances doesn’t prove the point. I’m open to the idea that AI research is generally more closed than other areas of research, but there is no quantitative analysis of how open AI research is, or how open other areas of research are.
Pick any other area of research, and I will happily find you some footnote citations that support exactly these claims.
I haven’t read every linked reference, but for a relatively short article there are more than a few. It's certainly more annotated than the average article I find on HN. Shouldn’t the quality of the sources matter most? And if what matters is the number, how many would be enough in this case? What would prove the point?
> or how open other areas of research are.
Other areas of research are irrelevant to the point. Even if other areas suck, that doesn’t excuse this one sucking too.
> Pick any other area of research, and I will happily find you some footnote citations that support exactly these claims.
Even if that is true, it doesn’t make the argument automatically false or irrelevant for any particular case.
First you objected the author had no data or references. Now you’re saying that for any area of research you can find references to prove the point they’re making. Which is it? Is the problem that they didn’t provide enough sources, or that the sources aren’t believable?
I meant, the author cites anecdotes (“here’s a paper that’s not open!”) rather than data (“of the 500 most-cited papers in the AI space, 82% were not open, compared to only 36% of papers in anthropology being non-open”).
And, I meant if you pick any other discipline that produces thousands of academic papers a year, I will be able to find 20 that don’t count as open.
The word “relevant” is an important qualifier which may be doing a lot of lifting. Who defines what is “relevant”? How do you interpret it for this particular case?
I haven’t looked at the matter myself, but the author did provide a source for their claim¹. Do you have any specific information on why it is wrong? It is not my intention to put you on the spot, I’m trying to understand if your point is based on data or gut feeling so I can myself get a better understanding of the validity of each argument.
I’d say the two most relevant models right now are Stable Diffusion and GPT-4.
Stable Diffusion is open in all senses of the word.
GPT-4 is closed but the techniques it enables are reproducible by anyone with a credit card.
GPT-like things like LLaMA are, well, available. Directly and individually reproducible, these tools and their related academic publications qualify as computer science.
> GPT-4 is closed but the techniques it enables are reproducible by anyone with a credit card.
Requiring a credit card to verify scientific claims is a large deterrent. And in addition one needs to provide them with a phone number, which many people will not be comfortable with when considering their CEO’s history.¹
Even ignoring the above, using the output of ChatGPT is definitely not the same as having “access to any of the data or code used in the research”, which is the author’s complaint and what the original commenter quoted.
As for the other cases, we could only properly assess them if we discussed specific studies which I don’t think we’re going to do in the limited context of HN comments. Thank you for taking the time to explain your view.
There's a difference between capital/operational expenditures to reproduce work, and directly paying the owner of the work.
The former is science. The latter is a commercial product.
Stretching their business model to its most charitable interpretation of "open", OpenAI is essentially offering "Here's the cost for the lab and labor for the work you want to do. We'll do it and send you the results, but won't give you any of our lab procedures or intermediate data."
I think a lot of this is because people are constantly conflating "science" and "engineering".
"Science" is a method of determining objective truth. "Engineering" is taking the truths science uncovers and using them to make things.
Commercial enterprises, including OpenAI, are about engineering first and foremost. They may also engage in science, but that is only to serve the engineering. (Saying that is not saying that there's something wrong with the science they engage in).
This is a bit of a false distinction, all told. More than a few of our "scientific discoveries" were the results of studying the best engineering could do at the time.
I think the distinction is valid and useful, but science and engineering do each inform the other.
The point is that the scientist is primarily intending to learn something solid about reality, and the engineer is primarily intending to create something that is practically useful.
I get the intent of the distinction. I just don't know that it has ever been as clean of a taxonomy as would be desired.
Maybe we have had some solid "pure scientists." Largely we would label them closer to philosophers? That said, the vast majority that you know the name of were probably employed closer to what we think of as engineers in today's parlance.
Or am I off here and the likes of Feynman, Faraday, Fourier, Maxwell, etc. would not have been said to be working on engineering problems, by modern eyes?
My gut is that it gets difficult because looking at things through our modern lens of corporations makes it hard to really consider institutions of the past?
> There's a difference between capital/operational expenditures to reproduce work, and directly paying the owner of the work.
But you can reproduce the work with a shitton of compute and a massive dataset. They aren’t sharing their exact strategy but the technique is fairly well known.
IMHO it really doesn’t matter, advancing science comes from trying different things and not from reproducing someone else’s work to prove it works.
> advancing science comes from trying different things and not from reproducing someone else’s work to prove it works.
I disagree with this. If scientists aren't reproducing the work of other scientists, then science cannot advance. Reproduction of results is one of the main mechanisms by which science separates reality from fantasy.
Of course, if reproducing other's results is the only thing scientists do, that also does not advance understanding. But that doesn't take away the extreme importance of reproducing the experiments of others.
> I disagree with this. If scientists aren't reproducing the work of other scientists, then science cannot advance. Reproduction of results is one of the main mechanisms by which science separates reality from fantasy.
Are you or have you been a professional scientist?
Because while that's how it should work, it generally doesn't anymore as it's really, really difficult to get straight replications (which are desperately needed) published.
> Are you or have you been a professional scientist?
Not directly, but I've worked as a research assistant in a number of labs for real, professional scientists and I'm a coauthor on a few papers.
> it's really, really difficult to get straight replications (which are desperately needed) published.
This is a fact. But nonetheless, such replications do get done. Not enough, certainly. But pointing out that the way institutional science is done is flawed doesn't invalidate my point at all.
Fair. But there's a difference between "openly accessible scientific information" and "trade secrets".
The US military has pushed state of the art in many fields, but I don't think anyone would say it pushed stealth composite science forward with the B-2/F-22/F-35 programs, because all the details are classified.
The risk is that non-shared research is eventually lost.
I think the catch is OpenAI is risking turning "open" into a weasel word? That is, still science, but hardly open. Which is fine, but then don't call yourself open?
As near as I can tell, there's nothing really "open" about OpenAI. I think that they include that word in their name is about marketing, and really does help to make the word meaningless.
OpenAI originally was a nonprofit nominally focussed on AI that was actually open, then they branched off a for-profit subsidiary with heavy involvement with Microsoft, came up with “safety” as an excuse to abandon openness, and created the “Open”AI of today.
Arguing that academic papers for AI are snake oil by using one CEO’s previous and unrelated venture as an attack on his character is not convincing.
The methods for reproducing the LLMs in question have been published. Much like with the LHC, one hindrance to reproducing claims is the cost of building the machine. Unlike the LHC, mere mortals can interact with the machine and draw their own conclusions of the claims published by the academic journals.
This is the foundation of the issues you raise with regards to science in the 21st century: the cost of reproducing a claim is at odds with “taking no one’s word”.
The leak of the LLaMA models changes this dynamic. It is now possible to run the models locally and with the same datasets published in the journals. This is hardly egalitarian but science has always had a material cost. Competition from both public and private sources almost guarantees that these models will be cheap and abundant in not too long of a time.
> Arguing that academic papers for AI are snake oil by using one CEO’s previous and unrelated venture as an attack on his character is not convincing.
That’s a wild extrapolation and an absolute mischaracterisation of what I wrote. Please respond to what people write, not what you think they believe.
I only claimed many people are wary of giving OpenAI their phone number due to their CEO. I did not say AI papers are snake oil. On the contrary, I explicitly said I don’t know and have respectfully asked for clarifications.
Also, Worldcoin is neither a previous venture nor unrelated. They just pivoted to collect your eyeballs to prove you’re not an AI, “solving” the problem they created.
I think what the OP means is that even for free (as in beer) Apple software you still have to have a credit card attached to your account (or at least you used to), even if it isn't actually charged, because of the way the App Store works.
Models like GPT-4 are highly functional, so they are not snake oil in that sense, but they are certainly not scientifically reproducible (or understood for that matter).
In fact there's generally a reproducibility crisis in machine learning, since things like models, datasets, training techniques, hyper-parameters are often only documented in the vaguest of terms.
Reproducibility doesn't mean I have a black box and when I push the button (after swiping my credit card) the light comes on as expected. Reproducibility in science means I can reproduce the experiment from scratch and get the same results.
Practically nothing is known about GPT-4. OpenAI's "GPT-4 Technical Report" doesn't say much more than the fact that it's "Transformer-based".
Well, crypto and blockchain have real, verified science behind them, and that hasn’t prevented the crypto industry from becoming a ponzi garden of delights.
I suspect we’ll be seeing some highly entertaining stuff, coming down the pipes.
Might not be as bad as the crypto snake oil bazaar, but it should have some interesting hucksters.
I saw a billboard yesterday that was advertising some AI thing that probably had nothing to do with AI other than jumping on the hypetrain. Wish I could remember what it was selling, thinking it was something about lawn care.
That's like IBM doing research, or "thought leadership" from consulting companies. At the end of the day, it's sales / marketing content. It's a spectrum, and some of it is obviously high quality, but a major aim is to draw in customers as opposed to directly supporting products or customer engagements.
On the plus side, doing research does advance the state of the art, so it's one of the most productive marketing activities.
It is? In my experience with the research world, providing access to the underlying data is table stakes. If you don't do that, then nobody can verify the correctness of your papers.
LOL this is such total bullshit. "Attention is All You Need" is from Google and is close to the inflection point of all this. This article is a complete clown show written in the style of online insult memes because that gets traction. It's by a clown masquerading as a technologist masquerading as a clown. See, I can do it too.
It's unfortunately too easy to find medieval thinking in tech and computer science circles. Many who are ready and eager to believe rather than remain skeptical. A great number of people who claim the title of researcher or engineer that can only use fractured metaphors and tenuous analogies to extrapolate science fictional predictions from marketing copy.
And a great deal of them have plenty of financial incentive to believe and convince others to do so.
It was disheartening to see so many in the tech community get taken in by the crypto-craze when all manner of inevitable futures were imagined for us.
LLMs are a tool. Data and algorithms. To attribute self-awareness and intelligence to them is pure hyperbole.
Self-awareness and intelligence (in the sense in which AI has any and will likely have more in five years) are not the same thing. My cats have more self awareness than ChatGPT but the latter can do a lot of intellectual tasks that nonhuman animals can't. What's disheartening for me is to see people make confident pronouncements about how other people are being taken in by scifi and hype and there's nothing to worry about even while displaying what seem like basic conceptual confusions.
I'm not worried that GPT-4 is going to "escape its jail, become self aware, and start building a robot army to conquer the world." Or whatever.
I wasn't worried that financial markets and payment systems were going to be replaced with crypto-currencies and the IMF replaced by the Ethereum developers. I didn't believe for a second that they would successfully supplant the financial institutions and state governments and escape regulation. We were also supposed to see the modern Internet replaced with web3. It was inevitable! These were all hyperbolic claims made by crypto enthusiasts for years. They were preposterous from day one, but it didn't stop many smart, intelligent people from being taken in and converted to believers.
There are real dangers! However I think those dangers come from other humans and human institutions. Just as the dangers of crypto came from real humans pulling off huge scams and fraud. Whether it's corporations hyping their AI products to health care professionals and causing thousands of misdiagnoses that lead to chronic health problems or death; to scammers using them to trick family members of their targets to divulge useful information... that's all humans being humans and leveraging a tool that makes it easier for them to cause harm (intentionally or otherwise).
What we need are people cutting through the hype so that regular people don't mistake negligence or intentional malice with the "sentience" of "AGI."
If you want to put yourself in the position of cutting through the hype, conceptual clarity matters. But given your opening and final sentences you seem to have entirely missed the point about distinguishing between intelligence and sentience. The AI researchers you view as hypesters think more precisely than that.
Edit: there is a lot of hype and bullshit around AI that needs cutting through. But unlike crypto, there's also some substance, so a blanket attitude of scoffing isn't helpful.
TFA isn't a blanket scoffing and neither are my comments. I think LLMs are neat, interesting, and could be very useful.
However I do remain skeptical of the more extreme claims made by researchers that don't publish their experiments for others to reproduce, by companies with a vested interest in maintaining narratives that improve their gains, etc.
That kind of hogwash does make it all the more difficult to have sensible conversations and perform real research.
Edit: I, for one, am hopeful we will be able to add reasoning models to our tool-chain one day for the purposes of dispatching trivial proofs to them. It would be nice to use formal methods in more places in software development and be able to move past the cost and time excuses for avoiding them.
A poster in another thread lamented also that too many people conflate the concepts of intelligence and learning. Sadly, their comments were at the bottom of the thread and largely unnoticed.
I'm afraid this isn't a problem exclusive to AI. There is a kind of intellectual fraud that is being normalized, sacrificing truth on the altar of tech hype.
AI being crucial for the escalated propaganda wars isn't exactly helping to alleviate this issue.
>There is a kind of intellectual fraud that is being normalized, sacrificing truth on the altar of tech hype.
The smell test I use to sniff those out involves asking two questions: Who is this tech for? And what are its limitations? If the answer is a hyped "everyone, no limits!" then it's almost certainly tech woo-woo.
The Dutch system cited didn't use AI and is 15+ years old (ie predates current AI hype by far).
Most AI papers have working code available.
Etc
This entire article is rife with trivial-to-find errors that totally invalidate its claims. There may be a problem, there may not; this article doesn't help figure it out either way.
It feels like narrative was written first, then someone went looking to try to find newsbites that sound like they support it.
I see plenty of links to recent papers (last two years).
> This entire article is rife with trivial-to-find errors that totally invalidate its claims.
Could you share some? Maybe they’re trivial for you but handwaving it doesn’t help others identify what is correct or not. For your claim to be true either the multiple papers cited would have to contain glaring flaws or the author would have to be wildly misinterpreting them. That doesn’t seem trivial to assess.
> It feels like narrative was written first, then someone went looking to try to find newsbites that sound like they support it.
From the author’s other writings, the opposite seems to be true:
I never found a novel useful product after many hours of scouring web3 startups. Usually it was “X for web3”, predicated on something else useful existing to justify the web3 ecosystem
I’m cautiously more optimistic about AI (this time) since we’ve seen impressive functionality from OpenAI and such, and the availability of an impressive LLM as an API excites possibilities. In my head a large open question is how flexible / powerful fine tuning and context/prompt engineering can be; if it’s not powerful enough, then I predict a wave of snake oil failures, and the space fizzles out for a few more years.
On the other hand, if you can take an API, some fine tuning data, and some prompt engineers, and build an actually useful service from that, then the space is about to explode with new unicorns
I'm not sure about Watson's "expensive disaster", because all the public knows is "that charming supercomputer that starred on Jeopardy! and beat the pants off all other human contestants". That was a massive, enormous PR win for IBM and AI across the board. Nobody in the general public will forget that, and that will give a noticeable boost to any AI remotely like Watson, science be damned.
Certainly, not all projects work and institutions need to take risk. However, there is a big financial and opportunity cost to this. Rather than spending $62 million on Watson, and assuming a $100k salary with a cost multiplier of 2 ($200k for 1 FTE), that amount of money could have paid for 77.5 FTEs over the 4 year duration of the Watson contract (62e6/200e3/4). I suspect that the overall scientific contribution of those researchers would have been higher.
Sure, Watson+Jeopardy was good PR for IBM, as was DeepBlue, but even if the definition of success (vs disaster) is only PR-success, I'm not sure how lasting that is/was for IBM. Certainly from a technical POV Watson was just marketing hype - a common label applied to a grab-bag of unrelated unremarkable products.
If you polled people on the street today about which companies they associate with AI, I doubt IBM would come out anywhere near the top. In fact, I'm not sure if you surveyed the under-20 crowd they could even tell you what IBM does, assuming they've even heard of them, other than perhaps being in the computer business in some vague way.
I had forgotten about Watson. I don't normally watch Jeopardy. However, maybe I'm not part of the general public.
I did some fast checking, and my numbers may be off, but it looks like Jeopardy has a viewership of around 11 million; perhaps it was higher in 2011. Now news reports would have spread the word further, but that's a small percentage of the general public of the USA.
This is a tangent, but snake oil was medicinally effective.
The colloquialism "snake oil salesman" comes from the old scam of confidence men selling a cheap, adulterated (and sometimes drugged) mineral oil as snake oil (based on the Chinese water snake from Chinese traditional medicine). People would pay a premium for what they thought was medicine, only to find out later that the confidence men bamboozled them and skipped town with their money.
> In 18th-century Europe, especially in the UK, viper oil had been commonly recommended for many afflictions, including the ones for which oil from the rattlesnake (pit viper), a type of viper native to America, was subsequently favored to treat rheumatism and skin diseases.[6] Though there are accounts of oil obtained from the fat of various vipers in the Western world, the claims of its effectiveness as a medicine have never been thoroughly examined, and its efficacy is unknown. (emphasis added)
"Criti-hype", I like that word. It indeed is like car salesperson hyping that the very thing they are selling is almost not street legal, begging the buyer to try it on and placing that seed of thought.
On another note, everyone is now using "algorithms and machine learning" in the same way they used "blockchain" in 2021. The best part is, the majority of the "algorithms" being used are basically glorified Excel models; something you would expect a fresher in college to piece together for their Physics 101 class.
Honestly, I rarely see research as accessible as in the field of AI: a new method is published to fine-tune SD, dozens of repos reimplement the technique and improve it and this happens every day.
> “I would advocate developing these types of AGI technologies in a cautious manner using the scientific method, where you try and do very careful controlled experiments to understand what the underlying system does,” Mr. Hassabis said.
Maybe that's why Google was (seemingly) left in the dust?
OpenAI minmaxed LLM development, DeepMind minmaxed other avenues of research, such as game-playing AI development. LLMs proved themselves first. Ockham’s Razor suggests there’s nothing more to it.
Yes. Today LLMs seem like an "obvious" choice, but before ChatGPT, one could easily argue that game-playing AI has much higher business value (to "play" the stock market or even fighters), and chatbots are only good for customer service.
I'd argue both areas are successful because they found good ways to solve the data problem and go "big data"; the use case, while nice and shiny now, is not as relevant. Big models trained on text work because there's a lot of text available, and the task of predicting the next tokens is nice because you get auto-labeled data just by collecting a lot of text. Similarly, RL solves this issue with simulations of sorts (super-simplified: play a lot of games against each other, judge somehow cleverly what was better, update stuff, rinse, repeat). I distinctly remember this lightbulb going off in my head when I first played with NLP. I came from ML for computer vision and trained ULMFiT on the entire Wikipedia and was thinking... oh neat, no labeling, that's magical; if I had more compute I could basically feed this infinite data.
I'm not following synthetic data closely (I've used it to train models before) but I see a bright future for this space (like game engines used to simulate big worlds simply to generate data, or training in said worlds and transferring).
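To make the "auto-labeled data" point concrete, here's a toy sketch (word-level for readability; real systems use subword tokenizers, but the principle is the same): the targets are just the text itself shifted by one token, so no human labeling is needed.

    def next_token_pairs(text, context_size=4):
        """Turn raw text into (context, target) training examples."""
        tokens = text.split()
        examples = []
        for i in range(context_size, len(tokens)):
            examples.append((tokens[i - context_size:i], tokens[i]))
        return examples

    for ctx, tgt in next_token_pairs("the cat sat on the mat and purred"):
        print(ctx, "->", tgt)
    # ['the', 'cat', 'sat', 'on'] -> the
    # ['cat', 'sat', 'on', 'the'] -> mat
    # ...

Every scrap of text you collect turns itself into supervised examples for free, which is exactly why "more text, more compute" was such an obvious lever to pull.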
Google certainly appear to be a bit behind, although it's hard to know what they have internally that they've chosen not to release.
It seems that Google have certainly been cautious about releasing this type of tech, but again it's hard to tell the exact mix of motivations - e.g. safety vs business risks. Almost all the authors of the Transformer paper have since left Google, and it seems a common theme is that they all thought Google was under-utilizing the tech - maybe just not knowing what to do with it as much as being worried about it?
It seems things are now changing at Google, with the Brain + DeepMind merger and what appears to be a new productization focus. It seems competition may now be trumping safety concerns, even if they are still paying lip service to it!
To me it seems that demanding scientific rigour misses the mark.
What will be important in the end is results.
I think it's pretty obvious that image generation such as Midjourney and Stable Diffusion is an incredible technological advancement. Who cares if the science behind it is solid? Or even if the papers published by companies that sell the technology are solid?
LLMs still need to prove themselves to some extent but it's already clear that something unprecedented is actually going on here. That's why everyone is hyped/freaked out. But we have not yet figured out exactly how useful the technology really is.
The main theme of the article for me is the discrepancies between marketing (shiny cable news segments and social media) and real, dirty-hands research and development. They are completely out of sync.
I am studying Comp Ling, and one of the first understandings you develop is that Natural Language Processing through neural networks can achieve a certain level of accuracy and efficiency (which is pretty high), but eventually limitations occur where further improvement is impossible. And perfection is certainly an impossibility.
This is an article that is both timely and necessary. Way too many people are simply leaning into the AI hype and there is simply not enough scepticism to go with it.
I refer to one of the comments made in the thread about the state of self driving cars: “People who haven’t been able to make any progress in self driving cars are now claiming that full conscious AI is just a year away!”
We continue to see these “oh it’s not…” and “watch out for…” articles. Let’s take a step back and just take it for what it is.
If I walk up to a regular someone on the street and say “please write me an outline for a short story about tuna fish and bunny rabbits,” you’re going to get a blank stare. Heck, even an experienced story teller might have a tough time with that.
But these current AI tools will give you a credible outline for that bunny-tuna story in about 10 seconds. It’s not the best story. But it’s a good starting point. To me, that’s freaking great. Maybe it’s a little esoteric, but still amazing. And there are a zillion other things that these tools can produce very quickly that are excellent starting points.
Humans can’t do this. Period. There is no snake oil. The tools have instant access to decades of human development.
Some of us can’t find Washington, DC on a map let alone write a short story outline or 100 lines of code in 10 seconds.
How does that “talent” translate to productivity, i.e. useful output? That’s up to the users. Some people will know how to ask the right questions. Other people won’t. There will definitely be another “sorting” of people based on access and ability to use the tools.
But this is the course of human development. It’s evolutionary, really. From now on we go forward with these tools. That’s my take.
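For what it's worth, the "10 seconds" part isn't an exaggeration. A rough sketch of what asking for that outline looks like with, say, the OpenAI Python client (the model name is a placeholder, and the exact client API varies by SDK version):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    response = client.chat.completions.create(
        model="gpt-4",  # placeholder; use whichever model you have access to
        messages=[{
            "role": "user",
            "content": "Please write me an outline for a short story "
                       "about tuna fish and bunny rabbits.",
        }],
    )

    print(response.choices[0].message.content)  # a workable outline, in seconds

Whether that outline is any good is a separate question, but the turnaround time is real.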
> If I walk up to a regular someone on the street and say “please write me an outline for a short story about tuna fish and bunny rabbits,” you’re going to get a blank stare.
Probably they'd be trying to assess if you're sane and not a danger to them. This is a task a 6 year-old could accomplish with style and vigor.
"But these current AI tools will give you a credible outline for that bunny-tuna story in about 10 seconds. It’s not the best story. But it’s a good starting point."
A good starting point for what? If the end result is a story that nobody but yourself will ever want to read, why not simply copy and paste "All work and no play makes Johnny a dull boy" a thousand times, no need for an A.I.
"To forestall having a “dasvidaniya comrade” moment where a self-aware GPT-5 shoves an icepick into their ear, Trotsky-style, they put together a ‘red team’ that tested whether GPT-4 was capable of ‘escaping’ or turning on its masters in some way"
You can agree with the author, you can disagree with the author, but he has a Style.
Consciousness is about experience. Intelligence is about capability. Dogs are conscious, but can't add two digit numbers. ChatGPT is not conscious (almost certainly, although we don't really understand consciousness well enough to prove this), but can add two digit numbers.
It's not necessarily about experience except in the discipline of Philosophy. Cognitive scientists have their own ideas and theories about consciousness, like Global Workspace Theory. However, since LLMs are feed-forward networks incapable of remembering things beyond their input window, they also conspicuously fail to be conscious by those definitions too.
Consciousness is subjective experience. Intelligence is problem solving, abstracting, reasoning, generalizing. There's no requirement for those things to be accompanied with subjectivity. And there's no reason to think LLMs have subjective experiences.
The common man’s perception of AI and its capabilities right now is similar to how people would look at you like you’re the Wizard of Oz just by building a web page back in 1999, and if you were young enough they would even herald you as “the next Bill Gates!”
While he certainly points out things that have been true about AI hyperbole driven by profit seeking, it's ironic how he himself is hyperbolic about the ineptitude of modern AIs while hawking his pdf e-book on the subject for $35.
But the new Generative AIs are fundamentally different. They are trained without any notion of a ground truth. They therefore have no concept of reality. They only know what is real-ish. They are psychotic. Temper the hype with the caveat--"but what it produces is often untruthful or unreal" and the hype is instantly clear.
This essay fails to account for why GAI is so different. This problem can only be solved with new methods for training that have as yet not been invented. Older-style ML is based on using some ground truth and comes with a metric that measures how close it is to this truth. Not so for any of the new GAI, so continue to beware until researchers figure out how to measure truth--and good luck with that.
One thing is for sure. His book will sell like hotcakes. The recipe for this is as follows:
- 50 oz of fear as our base ingredient
- mix in some grains of truth (like OpenAI being a company very restrained with their IP and competitive advantage but painting a different picture in public)
- stuff it up with half-truths, distractions and hand-picked cherries as you like to make it look colorful and shiny (never mind how these taste, because if they taste it they already bought it, and you can never go completely wrong in taste if your base is sugar, aka fear)
We might be going full circle soon and in need of "Beware of 'Beware of AI pseudoscience and snake oil'". In my personal bubble there's roughly the same amount of AI snake oil stuff popping up as there is finger-wiggly "be wary of AI claims" stuff that lacks even the simplest understanding of AI, let alone modern things like how transformers actually work.
Not sure about that. I suspect that AGI might be a tool to help one of the bigcorps grab power. Until now the state and the bigcorps were keeping each other in check; there was a kind of 'balance of power'. Now they might get a tool to upset this balance of power.
Whoever gets AGI first might turn into Buy-N-Large or ACME Corp, whatever... (I suspect it's all about the big prize - ultimate power) - it's kind of the 'one ring to rule them all' - if AGI is achievable within a realistic time frame...
* now let's see if that's the dystopian future that we are really after :-) *
Well, you can if the results are objectively unreliable, or the technology is wildly inappropriate in a workflow versus conditional logic where x always does y.
Some of the suggested uses are as absurd as blockchain mania trying to replace databases.
You can't downvote a whole thread, only comments. Threads can only be flagged or upvoted and the thread is not flagged. If there isn't enough interest it fades to the second page after a while. That's what happened here.
Thanks for the clarification. Does it penalize a submission when someone flags it (but it doesn't reach whatever threshold there is to actually mark it as flagged)? I'm assuming there's a threshold.
That's a good question and it looks like a flag is considered in ranking before it goes defunct. From the faq [0]:
How are stories ranked?
...upvotes [positive]...time [negative]...
Other factors affecting rank include user flags, anti-abuse software, software which demotes overheated discussions, account or site weighting, and moderator action.
So for all intents and purposes it looks like a flag acts in some way as a downvote. I was just confused by the term "downvote". They don't have the same effect as an upvote (maybe more effect?) but they do drop a story in the rankings even though they may not be removed.
Those last three factors look like it can be rather arbitrary. But in general I prefer the quality of articles and discussions here to other sites.
Yes it's pseudoscience at least until someone comes up with a good metric to benchmark the quality of these LLMs. And then don't make the metric the target.
If you ask the LLMs, they'll tell you that reverse engineering a crawler's frontier from an LLM's data source is somewhere on the positive spectrum from possible to probable, while noting that it would take a lot of time, patience, and resources.
Having worked in crawling for a few years and read the research, this strikes me as insightful in exactly the same way that AI presents the phenomenological insights of its human creators with no attribution.
I've done enough experiments in the past year to have some insights about what was crawled and what was not.
What we do not have is a DSM for AI, despite the fact that it mirrors the same self-report problems of its human creators. Sadly, we seem to have been unable to produce an AI without those problems - they permeate all of the current products in the space that I've tested.
The entire discussion here is framed by an indictment of scientism on the one side and self-report on the other. The correct answers are elusive somewhere in the middle.
Current SOTA in AI reminds me very much of the progression and disillusion from cybernetics and transpersonal psychology before and after the Dartmouth Workshop that created the field, respectively.
IDK, we don't seem lacking in metrics. For measuring the quality of LLMs at the core task of language modeling (which then gets applied to other tasks), the classic information theory metric of perplexity works; for evaluating how well they transfer to other tasks, we can (and do) use metrics of those specific tasks, like the tasks in SuperGLUE dataset. All reasonable LLM papers do evaluation on these metrics, and they don't really "make the metric the target", you really can't do much in the large model pretraining to target SuperGLUE specifically.
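And perplexity itself is not exotic; a minimal sketch, assuming you already have the model's per-token log-probabilities for the observed text:

    import math

    def perplexity(token_logprobs):
        """Perplexity = exp(average negative log-likelihood per token).
        token_logprobs are natural-log probabilities the model assigned
        to each token that actually occurred. Lower is better."""
        avg_nll = -sum(token_logprobs) / len(token_logprobs)
        return math.exp(avg_nll)

    # a model that assigns probability 0.25 to every observed token:
    print(perplexity([math.log(0.25)] * 10))  # 4.0

You can argue about what perplexity fails to capture, but "no good metric exists" isn't really the state of play.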