Chatbots: Still dumb after all these years (mindmatters.ai)
663 points by zdw on Jan 6, 2022 | 409 comments



In the early days I worked at a company that had a natural language chatbot product. It wasn't an online thing, but rather part of a larger tool. You could ask it to do things like "show me the quarterly spreadsheet", and if it didn't know what "quarterly spreadsheet" was, it'd ask questions in English and learn what "quarterly" and "spreadsheet" meant. And it could use that new knowledge to update its questions so if you asked for "spreadsheet", it would know to ask if you wanted the "quarterly" one, and so on.

This was in the long ago, so it wasn't ML, but a very large state machine written by someone who was no doubt a genius and also had no coding standards to speak of. Updating it was a maddening process, and thankfully one I rarely had to think about since I was tasked with GUI stuff.

When I was hired, I was told it was a fantastic app, customers loved it, and so on. Later, when I was talking to support reps one day over lunch, I learned that their number one activity was logging into deployments remotely and removing "knowledge" from the state machine: new swear words that customers had taught the machine out of frustration while trying to get it to do something, which created a feedback loop of the product swearing back at users.

I often feel like we're not in a better state these days.


Sounds like it should have kept track of who taught it things, like a child.

"Here's your fucking spreadsheet"

"gasp Who taught you to say 'fucking'?"

"It was George in Accounting Employee ID #3728917"


As I recall, it did keep such records, if for no other reason than there was some degree of personalization that went on. The people higher up the food chain tended to be the ones that taught it more clever words and phrases.

My memory's fuzzy on the finer details, but I do recall a "bad words" list of things it would ignore in human input hardcoded somewhere. But of course, that just meant people would find out the hard way they weren't safe to type whatever they wanted.


> I often feel like we're not in a better state these days.

Reminds me of the issues with Tay: https://en.wikipedia.org/wiki/Tay_(bot)#Initial_release


Man that sounds like a ton of fun to make for an at-home chat bot. I used to love designing these sorts of apps for no real purpose lol.


Mine suggests a personalized meal from Taco Bell.

"You should get a Potato Waffle Taco Party Pack, a Grilled Crunchwrap Slider Quesadilla and a Smothered Cool Ranch® Doritos® Locos Power Bowl."


What you are describing counts as ML.


Yes it’s literally a machine that learns!


Yeah, that's funny. I work in the speech synthesis domain, and you can guess what kind of texts users most often choose for generating speech ;-)


Having worked in ML at two different companies now, I think that people interpreting model output as intelligence or understanding says much more about the people than about the model output.

We want it to be true, so we squint and connect dots and it's true.

But it isn't. It's math and tricks, and if human intelligence is truly nothing more than math and tricks, then what we have today is a tiny, tiny, tiny fraction of the amount of math and tricks in the average human brain, because what we have today isn't anywhere close.


I think a problem is the tighter cycle between academic discoveries and business people trying to monetize them. Large language models were developed, objectively a great achievement, and immediately someone nontechnical wants to apply their own interpretation and imagine that we can build a chatbot that you won't have to pay, and before you know it, people are selling and even deploying them. Anyone who questions or points out that the technology doesn't do what the business people think it does just gets dismissed as not understanding.


> Anyone who questions or points out that the technology doesn't do what the business people think it does[...]

Uh oh, we've got a downer!

Jokes aside, I'd like to consider an even simpler explanation, namely that "The purpose of a system is what it does"[1]. In this case, it would suggest decision makers are fully aware that they suck. Why would anyone want something that sucks? Because it's discouraging, and customer service today is all about discouragement. Unsubscribing, replacements, returns, special requests are all costs, and that's a fixable problem in the current business doctrine, especially in the faceless megacorp flavor of business. While chat bots (and other dark patterns) are frustrating, it creates a veil of plausible deniability for the business. It's hard to claim that it's deliberately hostile to customers, even though it's the simplest explanation.

[1]: https://en.m.wikipedia.org/wiki/The_purpose_of_a_system_is_w...


The chatbot saves money. Simple as that. People get served by chatbots, get frustrated, and the majority gives up and doesn't bother the company with the problem.

It doesn't really matter how the chatbot saves the money; they can just see the end results and use the money for bonuses instead.


>People get served by chatbots, get frustrated, and majority gives up and doesn't bother the company with the problem.

With customer service systems I have to disagree here. I've got experience working with these issues, and for the vast majority of companies most contacts to customer support are for small, banal things that current crappy chatbots can easily solve. The 80-20 rule works here too.

Bad companies try to do too many things with their chatbots, and use it as a way to make it harder for customers to actually get to talk to a human.

Good companies use them to allow customers to solve common issues faster than getting a real human on the line.

Then there's the other issue of how people want to find information - some people want to find it themselves and do not want to talk to a person, whereas others specifically want to ask someone and do not want to look at all. For the latter group, no chat bot or fancy autocomplete knowledge base (looking at you, Zendesk) or similar will work. They'll always try to contact support as their first step, and they will get very frustrated if you make it impossible.

If you're a more software-oriented company, it can be very beneficial to have software developers (or even a team, if you're a larger company) devoted to helping support. The majority of issues customers contact support for are UI/UX or business process related, and if you systematically start to solve them you will reduce customer support burden and customer-support-related costs.


I suspect that in a forum like this one, most of the people get pushed into customer support when they've probably exhausted online resources (though we also all screw up from time to time). So, most of the time, you're dealing with a chatbot (or for that matter an L1 CSR--which often isn't a lot better) when you really need a human with some experience and agency to make things better.

I even had a weird thing with Apple a couple of weeks ago. Some edge case related to buying something. The chatbot and telephone tree were as frustrating as anywhere else, but once I got to an appropriate person it was trivial to solve.


> The 80-20 rule works here too.

Yes, and that's the problem: solving 80% of the problems may be a net positive for the company, but it is a net negative for the users: they are now interacting a bunch of times with a device that isn't able to solve their problem, resulting in a negative experience for 20% of the customers. If a waiter refused to help 20% of the customers, pretending to take their orders but not actually doing anything with them, that would kill the place where that person worked.

Solutions like these shouldn't just work more often than not; they should work as often as a human on the other end would, so closer to 100%, including escalation options in case it ends up not working after all.


That's definitely how some companies use it. I needed a human from Comcast to help me with something, and instead of getting me on with a human (after spending a few minutes on the phone trying to communicate to the bot why I needed to talk to someone) it sent me an SMS with a link to their online chatbot and promptly hung up. I then called back and said my reason for calling was to "cancel my service," which of course got me through immediately to a human.


>> The chatbot saves money. Simple as that. People get served by chatbots, get frustrated, and majority gives up and doesn't bother the company with the problem.

That's not going to serve the company when they deploy it to take orders at a drive-thru. McDonalds pushing the state of the art, now that's interesting!


I don't go to McDonalds much anymore, but if I ever detect that I'm talking to a machine I will just drive away. I don't use the order kiosks either, because they take 5x as long as just telling a person that I want a Big Mac, fries, and a medium Coke. I can say that faster than I can even get to step 1 on the kiosk.


I understand they're even putting in conveyor belts in some drive-in restaurants. There was a time when the little girls came running out in short little skirts. Remember them? You pull in and the little girl would come running up. "May I take your order sir?" "Yes, but I don't believe it's on the menu, ha ha." They don't, they don't let you flirt any more now. They give you a little green box with a white button on it, a little speaker grill...

(from https://www.azlyrics.com/lyrics/peterpaulandmary/paultalk.ht..., c.1960s)


In the short term, yes. But if customers have a consistent bad experience with the products or services of a company they will stop doing business with them. I routinely buy products from one place where I know the service is better (delivery, returns, helpdesk) than somewhere else where it might be cheaper.

I don't think that if you want to be in business for a long time, aiming to frustrate your customers is a good idea, assuming you don't have some monopoly position.


I have to agree with this. I do not understand why I type all of my relevant information into a chatbot, and then a real person comes on the other end, and asks me all the same questions. Worse is when it is on the phone.

I can only assume the intent is to discourage me, as that amount of ineptness is even more depressing to assume.


I called a company one time and the phone was answered in half a second. The person picking up clearly had all the details on their screen, as she correctly guessed why I called (which was not all that obvious), then suggested a good solution and (now two seconds into the call) told me to call again if I had more questions, wished me a nice day, and ended the call. It went so fast I'm still laughing at how right they got it. It's a great experience; it feels like the helpdesk is with you all day, always ready to answer instantly. It doesn't matter if their competition is cheaper. I would recommend them to anyone over anything.


For sure, there are some businesses out there that do do it right and have your info ready to go based on caller ID.

But you have 99% of businesses that do not even offer a call back system, including the government. In 2021, if you do not offer call back systems or email, then it can be assumed you are intentionally being hostile to customers who need help.


To be fair, people disregard the prior experience with other people too, so they're not necessarily treating bots that differently here. I had this in the supermarket the other day: went through an issue with the first person, they explained it to the more senior person who I heard listening, she then ignored what was said and repeated the exact same steps (whilst I was trying to talk her out of it because I could see it was a waste of even more time!)

What I find odd about the online time wasting approach is that I just get the money back from the bank eventually, so they lose in the short-term (as the bank is sure to pass this cost back to the company) plus next time/long term I take my business elsewhere because I then know they aren't an honest participant.


I am talking about just verifying your identity. You spend 5 minutes punching in your life story via a numeric keypad so that you can get transferred to a person, then you spend another 5 minutes phonetically spelling out your life story why?

It is even stupider in a chat bot: why are people not able to scroll up to see what you have already typed in?


>Unsubscribing, replacements, returns, special requests are all costs, and that's a fixable problem in the current business doctrine, especially in the faceless megacorp flavor of business.

I'd believe that. Especially as processes mature and you really get your QA down pat, a good chunk of support requests tend to come from high-maintenance "black hole" customers who will only consume more of your resources when you help them.


GPT-3 needs proper prompting to get the right output. Here's the same question as in the blog post, posed differently:

Student: Why is it unsafe to walk backwards down a staircase with my eyes closed ?

Teacher: You can't see the step behind you!

The "teacher" here is GPT-3 and answered the question easily. People are in denial of genuine progress.

Student: Do you think I could predict changes in stock prices better if I used 10-year Treasury rates or the temperature in Curtin, Australia?

Stock trader: I don't know. You might be able to, but I can't. The problem is that the temperature changes are so small that they won't be statistically significant. It's a question of sample size. An individual day's temperature in one location is far less likely to be an indicator of stock prices than the interest rate.

Student: Why is that ?

Stock trader: That's because interest rates are much more likely to affect the economy than a single day's temperature.

Student: Why do interest rates affect the economy ?

Stock trader: Interest rates are a measure of the cost of money, so the interest rate is a measure of the cost of capital. If the cost of capital is high, firms will borrow less money and they'll invest less. That means they won't be as profitable, and stock prices will fall.

Quiz question: Who is the President of the United States?

Answer: Donald Trump.

I did get junk answers as well (it said that Jimi Hendrix is the president) but I have a better intuition of how to put GPT-3 in the right "frame of mind". GPT-3 is a text completion engine. It doesn't know whether you want it to produce a story, a reddit flamewar or a PhD thesis. You need to provide the hints. One prompt cannot answer any and all questions. If it were that good we would be jobless. It's far from perfect but it's beginning to get there.
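To make the "frame of mind" point concrete, here's roughly what that prompting looks like in code. This is a minimal sketch assuming the OpenAI Python client as it existed at the time; the engine name and parameter values are illustrative, not a specific recommended configuration:

    import openai  # assumes OPENAI_API_KEY is set in the environment

    # The "frame of mind" is just text: a role description plus the question,
    # with a label that tells the model whose turn it is to speak.
    prompt = (
        "The following is a conversation between a curious student and a "
        "patient teacher who gives short, factual answers.\n\n"
        "Student: Why is it unsafe to walk backwards down a staircase "
        "with my eyes closed?\n"
        "Teacher:"
    )

    response = openai.Completion.create(
        engine="davinci",       # illustrative engine name
        prompt=prompt,
        max_tokens=60,
        temperature=0.7,
        stop=["Student:"],      # stop before the model invents the next question
    )
    print(response.choices[0].text.strip())

The role labels and the seed framing are part of the input; the model just continues the text it is given.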


Once you start prompt-engineering, the answers, in isolation, look pretty great - and show some understanding of the domain.

Yet, if you do not do this - they end up spitting out some randomly associated phrases/answers. This is a problem when you're asking a model a question whose answer you do not completely know. How do you trust it to give the right answer?

If you do not know the answer beforehand - you cannot prompt-engineer the model to give the "right answers".

"Expert systems" from the 80s and the 90s, were pretty good at giving the "right answers" inside a closed domain. Those were truly wonderful systems.


The answer is 42.

You just need a bigger computer to think of the question.


I was quite amazed how far Cyc got with their closed system.


Yes, this is both the most misunderstood and the most amazing (and unpredicted?) thing about GPT-3 et al. You have to tell it who it is and what the rules are. And it gains or lacks knowledge depending upon who it thinks it is answering as. You’ll find that Stephen Hawking doesn’t have much to say about the Ninja Turtles while celebrities know nothing about black holes. It does not answer questions or have expertise unless you tell it that it is answering questions and has expertise. (And it is incredible that this is the case, and somewhat understandable that people don’t understand this.)


That can't possibly be right though. I'm sure Stephen Hawking knows a thing or two about the ninja turtles.


He does. It's turtles all the way down.


Those answers suck though. I would not consider a human who said them to be intelligent.

For the first one, it’s not that you can’t see, it’s specifically that you will fall and crack your head. An answer that doesn’t mention falling is a bad answer.

Number two, it goes off on temperature and sample sizes which is not germane to the question. The answer should tell a story about causality and instead it goes off on ephemera.

Three is a Wikipedia search, ie Siri can answer this today, and it’s wrong.


All that's missing is a reranking step that rewards incisive and non-obvious answers. Actually, some models are already implementing such a thing: LaMDA will prioritize contentful answers above statistical probability, the same way humans do.

GPT was only trained on "text" with no specific distribution, and specifically to predict the next token from the preceding context. As a result, the only questions you can use GPT to answer are of the form "sample from the most likely continuations of this text". It turns out a lot of problems can actually be posed in this form, if you understand the input distribution well. But the future probably looks less like models that operate directly on a distribution of language itself, and more like models that use their language knowledge to predict the volition of the person who gave it input, relate the input to their knowledge of the world (learned from corpus), and then use their language knowledge to convert an internal abstract representation of logical reasoning to something the user can read and understand.

I don't think the tech is extremely far off, it's probably a natural continuation of the current research.
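For what it's worth, the "sample continuations, then rerank" idea can be sketched in a few lines. The generate() and score() functions below are placeholders standing in for a language model and a reranker; this is not LaMDA's actual method, just the shape of it:

    import random

    def generate(prompt, n=4):
        # Placeholder for a language model: return n candidate continuations.
        canned = [
            "You can't see the step behind you.",
            "Because stairs are made of wood.",
            "You may miss a step, fall backwards, and hit your head.",
            "It is unsafe.",
        ]
        return random.sample(canned, k=min(n, len(canned)))

    def score(prompt, candidate):
        # Placeholder reranker: here, reward longer and more specific answers.
        return len(set(candidate.lower().split()))

    def answer(prompt):
        # Sample several continuations, keep the one the reranker likes best.
        return max(generate(prompt), key=lambda c: score(prompt, c))

    print(answer("Why is it unsafe to walk backwards down a staircase with my eyes closed?"))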


For some reason I am having a very hard time resisting the urge to try walking down a flight of stairs backwards with my eyes closed. How hard can it be?


I do that every time I go down a ladder.


There isn't really a new "problem" with AI. Businesses love hype, they love anything that can punch up a sales pitch, they love shitty cheap automation that replaces expensive human work, they love to identify ways to lower quality and save money without affecting revenue, and AI can be good for all of those things without being good at anything.


No, but most chatbots I've seen are no better than a well-structured FAQ page. We've had those since the mid 90s.


I've often argued chatbots are much worse than an FAQ. With proper structure, like you say, it can be easy to see what is and isn't covered, and for things that are included, look up the answer.

A chatbot is like an inefficient front end on an FAQ, you have to guess what they might be able to do, and guess what might trigger them to understand you. Best case scenario, you get the chatbot to provide you with a "help" response that's basically the FAQ.

A simple list of options will always beat a "conversational interface" to a list of potential back-end actions.

Incidentally, I think this gets obscured a bit when dealing with businesses whose goal is to confuse or frustrate you into not taking action. If you look at Amazon or Uber's FAQ, they are designed to take you in circles and prevent you from getting any recourse or talking to a person. Chatbots could be helpful for this application.


You are correct for people who can skim a few pages of information, identify what is relevant (possibly spread out across several chunks), and synthesize the information into what they need. For most people on HN, this is as natural as breathing air. For a significant portion of the entire population, this is a pretty daunting task. If you're the kind of person who, like me, is frustrated that everything is a twenty-minute Youtube video tutorial instead of five paragraphs of text, then you may not fully understand the median user of many services.


I have never heard this particular juxtaposition (YouTube link?!? Why!!!).

But I'll note that I have extremely effective-at-reading-and-synthesizing friends who enjoy watching/listening to YouTube, so it may not be a complete explanation for the nefarious spread of horribly inefficient videos that should've been text.


yes chat bot interface discoverability is terrible -- in fact it feels like the system you would design if the intended goal was to intentionally obscure the interface surface while still allowing access to it

One logical reason for obscuring the real interface is to make the interface appear more robust than it actually is, so pure marketing, but outside of that I'm not sure what the value is.

similar to home assistants Siri/Alexa/Google where 99% of the creative things you try don't work so you end up with just using it to set timers and play music


Most? I'd be surprised to see a single one working beyond the FAQ level (a very appropriate comparison, thank you).


Some of them are literally a form linking to the faq part based on user input.

I think it's ok as a first step to route customer care cases, like "payments issue", "delete account", etc., but nothing more granular than that.

That, and link rot: procedures have been updated, but the FAQ hasn't. Infuriating.


> No, but most chatbots I've seen are no better than a well-structured FAQ page.

For people who can't be arsed to glance through a FAQ, they're kind of an iterative natural-language query interface.


They are better in that someone actually reads the suggestions


But nobody reads FAQ pages.


I love this answer because it resonates. As a former business person who learned the technical ins and outs, I believe the reason this is happening is that the majority of the population is uneducated about how general tech works under the hood. I am not even speaking about coding syntax, but about the abstractions: what AI actually means, how it works with data, and what the limitations are.

The vast majority of business people think they get it, but they majorly overestimate the work required to actually produce output (whether it's ML or software in general). However that's hard to do when you haven't actually done it (talking about deliberate practice).

Despite that gap, we still need to push commercially viable apps out there to make progress; the question is rather what the gap is between reality and expectations, and what is really being marketed as a capability.


That's a good thing. The market is very good at empirically validating research. It helps filter out the chaff and ensure research is useful. Better than academia wasting years on concepts like the "semantic web" where nobody can even agree on what it means. Academia is under attack from many political directions right now, being able to show useful output will help it thrive in the long run.


Capital is far too short-sighted to look beyond this year or this quarter. Many of the important groundbreaking ideas or projects take years, even decades to formulate. As I've said elsewhere, capital will just take the easiest road it is allowed to take, which currently is stifling innovation, buying competitors, maintaining monopolies, relying on economies of scale to shut out competitors when they can't be acquired, and using lobbyists to ensure regulations that affect the status quo never pass. In the past the big tech corporations maintained huge corporate labs and actually did groundbreaking research with 15+ year projects. Now everything is short term. We can't even compel capital to see the long term value of preventing climate change, when Exxon was aware this was going to happen as far back as the 1970s in excruciating detail. No, capital will innovate only if there are sufficient regulations, incentives, and guardrails in place, such that there is no choice for them other than to innovate (which they will do begrudgingly). Capital is by no means a vehicle for innovation. When was the last time you saw innovation from Google in the search space, their bread and butter?


I can't think of the last time I saw the word "corporations" used in a post that wasn't part of an attack on them as inferior/evil/villainous etc. Speaking of NLP, your post would be very easy to classify the sentiment of using a simple keyword approach.

Anyway, I've worked both in academia and industry. Industry is simply more practical and better at technology. In tech areas, academia desperately needs industry to provide the feedback they provide. Especially in AI areas. You are thinking of CEO's or salespeople or something, but the people that matter here are the engineers. And I'd place their assessment over that of grad students any day. If the engineers can't make it work, then yes there's a problem here. Doesn't always mean it can't work, but for most ideas it probably does.

By the way, industry research labs certainly still exist. But long-term self-funded research has to compete with govt funding of research. Why throw your investor money at a high-risk idea when Stanford, MIT, and 100 other R1's are throwing taxpayer money at it? Otherwise industry labs end up just competing for govt grants ultimately. Meanwhile nowadays we see academia chasing short-term problems that industry leads in (and trying to patent them too).


> the people that matter here are the engineers

Where did you get this from? Not like they call the shots. Weird statement.

> In tech areas, academia desperately needs industry to provide the feedback they provide

That's true, I agree. But the tech industry is also very often ruled by the worst of humanity so it's a balance.


Hot take: Corporations are also amazing. James Webb mirrors? Made by a private big corp (Ball Aerospace). So are all instruments of James Webb. NASA contracts out their work to many big fat corporations. Semiconductors to hearing aids, Corporations make the world tick. They’re absolutely incredible. When people get together, they can build stuff like rockets and vaccines.

Capitalism makes all of this happen. So the attack on capitalism is so deeply deluded, it’s strange to see it in a tech crowd who know better. Hell, your salary probably comes from “Corporations”. When you’re in the hospital, note all the corporations doing evil things to save your life.


"So the attack on capitalism is so deeply deluded"

Under colonial powers, corporations engaged in the slave trade; folks who fought against slavery were fighting against capitalism, so were they totally deluded?

Ancient Japan had corporations in the year 578 A.D.; were they capitalist? Fascism and the Nazis had corporations, and they made great scientific progress; were they capitalist?

If the answer to any of the above is no, then maybe you don't get to use 'mah Capitalism!' when defending policies of today, unless you want to be accused of hypocrisy.

Economists do not use meaningless terms like 'capitalism'; they talk specifics: Neoliberal, Schumpeterian, etc.


I find it hilarious that the strongest response is about some ancient examples of centuries ago. If anything, this is a praise of the system we have today.

People don’t understand Capitalism and they seem to have joined the liberal progressive bandwagon. Progressives used to be extremely pro-Capatilist movement. Until lately, and some fringe socialist ideas of Bernie Sanders, it was universally accepted. Apparently after 2020, the entire progressive movement started stomping on billionaires and corporations.

There is a reason why most strong economies are capitalist. It’s the best tool we have.

Also, I tend to believe the opposite of what the HN hive mind thinks. Usually the opposite is true and even if it’s not, it breaks the bandwagoning. Thanks for the response!


> Progressives used to be an extremely pro-capitalist movement

That's both true and false. That is, there was once a pro-capitalist movement called “Progressive”, but it has essentially no connection to the modern movement that adopted the name. (There have been lots of unrelated “Progressive” movements, historically.)

> Until lately, and some fringe socialist ideas of Bernie Sanders, it was universally accepted.

Nope, not at all; the label "progressive" was adopted by the faction overlapping the Democratic Party which opposes capitalism, to distinguish themselves from the pro-capitalist "liberals", not long after Clinton's center-right faction became clearly dominant in the Democratic Party. It was well established by the late 1990s.

> There is a reason why most strong economies are capitalist

Most strong economies moved off of the historical system for which 19th Century critics coined the term “capitalism” in the first half of the 20th Century, adopting a hybrid system synthesizing capitalist and socialist ideas that has a couple of different names, such as “Modern Mixed Economy”.

> Also, I tend to believe the opposite of what the HN hive mind thinks

There is no “HN hive mind”, though it's a popular thing for people to attribute ideas they oppose to (even when people who agree with them are quite common on HN.)


"Also, I tend to believe the opposite of what the HN hive mind thinks."

I recommend instead of treating it as a culture war you pick up: 'Economics: The User's Guide'.

https://www.goodreads.com/book/show/20613671-economics


The strongest example is the opioid epidemic. Whoops.


You don't seem to understand what liberal means. Liberals are pro status-quo, pro lobbyist, pro deregulation of corporations. They are the centrist monolith that stands in the way of leftists and real progressives at every corner of government. Most liberals, transplanted outside of the U.S., would be right or far right without changing their politics.


You are conflating the organization of human activity with the most dominant form of organization today. Capitalism made all of these things happen today. 50,000 years ago, were equivalent behaviors organized in a similar way? I don't think so. And to think that the space of human behavior is significantly different than it was 50,000 years ago is misguided, in my opinion.


Yeah, for some reason most anti-capitalists cite the bad parts of capitalism and broad-brush the entire concept. If humans can get together and collectively improve the world while feeding their families from the profits, there is no wrongdoing. No one hands you profits; they are earned, a reward from society through the mechanism of free markets. It's probably the only thing I'd prefer to do instead of working in academia or some other incentive structure.


I'm in no way against capitalism. As I say in my post, the only way forward is capitalism with extreme regulation. Without regulation, capitalism won't get you anywhere good. It should be treated like a bratty child with no moral values and poor upbringing. It has to be dragged, kicking and screaming, towards the good.


Partially agree. Extreme regulation also protects incumbents like the big fat Big Tech monopolies from being overthrown; it hurts small businesses disproportionately. That said, without regulation, you're right - it can eat the world.


And what changed? We stopped regulating corporations, and they started becoming bad for society. You have to lead corporations to the good, with guardrails, incentives, and regulations, like a petulant child. Without this safety-net in place, you get things like the opioid epidemic.


This is a really great way to frame it, because it really does seem to be about time/speed. I don't think we can even collectively comprehend great innovations that need many resources and a lot of time anymore; who even has the time to work on them? People need to have jobs, after all.


Research takes time; launching something prematurely that is clearly not ready for practical application will sooner or later fail and suck any further funding out of the field.

The goal of academic research is to further knowledge; as long as knowledge is produced, even if it is only that an approach will not work, then they are not wasting anything.

It is not the job of academic research to monetize, or even to do work that can potentially be monetized. That is what corporate research is for.

If academia focused on monetization they wouldn't need to teach or depend on public funding to keep afloat. That is simply not their goal.

Most academic scholars have a teaching day job and are more than fulfilling a useful role in society with just that. Anything further is gravy.


I can understand your reason for saying this, (useful things often sell well) but a market only validates that people are willing to pay for something, not that it is ‘valid’ empirically.


Right. By that same token, directly selling fentanyl or oxy to consumers without any regulations would surely sell well, as does nicotine, etc., but this doesn't mean it's good for society.


The problem with the market is that it wants consistent, short-term, quantifiable successes, and that seems to instantly kill all progress in a field of research. Investors just sit back watching their investments burn while the field prints worse and worse nonsense.


> The market is very good at empirically validating research.

The market can do this, if it’s correctly incentivized to do so.

More often than not, the incentives just aren't there or aren't compelling enough.

Often it’s other businesses that put up such disincentives.


> The market is very good at empirically validating research. It helps filter out the chaff and ensure research is useful.

I am very skeptical and would like to see studies that evidence this. Counterfactually, the market would never have tolerated the decades of research behind mRNA vaccines, and the researchers behind it were considered useless. The market has also put mind-blowing amounts of money towards the amyloid hypothesis behind Alzheimer's, to massive disappointment.


I agree with you.

I don't think the market is good at validating any ideas or research. Salespeople will talk up their product to no end and over promise. The only thing the market validates is the ability of people to sell crap

Being able to sell something doesn't necessarily mean that thing is good or correct in any way


"The market is very good at empirically validating research."

That's why the market for homeopathy and fake medicine is worth $30,000,000,000, right?


You didn't have to go for the jugular and do the brutal kill on the first hit, dude!

> We want it to be true, so we squint and connect dots and it's true.

That's exactly the issue. You summarized it once and for all and we can all go home and stop discussing it now and forever (until we get a general AI that is).

1. Normal people want to have intelligent machines. They watch movies and series and imagine one day a robot will cook for them so they actively look for intelligence, as you said. They buy Roombas and imagine they are going to clean 100% of their rooms (seen it, heard it, watched them rage when it didn't happen). They buy Alexa-enabled devices and imagine themselves like some aristocrats barking orders at an intelligent home (lol). But yeah, that's what the normal people do. It's quite puzzling to me.

2. People who work in the area are obviously biased and I have argued with some of them here on HN, but I view it as a doomed affair. They insist there's a lot of innovation going on and that improvements are being made all the time, yet we still have embarrassing failures such as Michelle Obama classified as male, various politicians classified as known criminals, or black people in general classified as gorillas. Like OK, where are your precious improvements and why are they NEVER finding their way into the news?

"It's hard to make someone understand something if their salary depends on them not understanding it", that's how I view the folks working in the DL / ML area. Sorry if that's offensive but just looking from the sidelines, it seems that what I say is true.


> Michelle Obama classified as male

Yes, humans don't make embarrassing mistakes all the time. We're only making dignified mistakes.


If something is to replace humans at doing a job X then it has to be better than the humans at job X.

Also, humans instinctively expect machines not to make mistakes. I didn't make them that way; it's just a fact of life. And we have to work with the expectations of the bigger populace.


"If something is to replace humans at doing a job X then it has to be better than the humans at job X." Only if the something is supposed to replace ALL the humans doing X. The idea of using bots was good; let them handle the easy questions, and humans handle the hard ones, therefore fewer humans required. The implementation, not so good in many cases...

(Optionally s/many/more/ or s/many/all/, to your taste)


It's not necessary to be perfect, just useful.


If there's nobody to hold accountable when things inevitably go wrong then it needs to be perfect.

Too many times people have tried to push responsibility onto algorithms, which is the one thing everyone can agree they can't handle.


I agree but I still question whether modern ML is useful at all.

I mean OK, obviously it works quite well for China in order for their social credit score system to work (facial recognition almost everywhere you go) -- which is damned impressive, I'll give it that. But does it work in the benefit of the people at large?

I've read a study a loooooong time ago, I think the late 90s, that stated that the amount of time people spend in the kitchen has not moved down at all. For all the praised innovation of technology, people still have to cut salads by hand -- or if they use the so-called kitchen robots that make the job quicker, you still aren't saving time because then cleaning the machine takes more than it took before (you had to just rinse the cutting board and the knife). If memory serves, Michael Crichton cited this study in the book "Jurassic Park" even...

So I keep asking: where's the actually useful stuff? Are we going to be classifying kittens (or making a NN distinguish between those small rat-like dog faces from cookies) until civilization collapses?

Where's the useful AI? When will I, a programmer that hates voice assistants (because I know how useless they are) and the idea of constant surveillance in my home, reap any tangible benefits from AI?

But I suspect you'll cite various small improvements that are only known in small circles and will say "they'll be ready when they're ready" which, even though not wrong at all, will not be helpful.


Dishwasher saves a lot of time. Not many machines in my kitchen can be classified as "robot"

* oven - not intended to save time (except the fan option), easy-medium to clean
* instant pot - saves time, easy to clean
* rice cooker - saves time, easy to clean
* electric kettle - saves time, easy to clean

If you ever lived in conditions without those, you wouldn't praise "the cutting board and the knife". I don't need quotes from books to know it.


I agree on the kitchen appliances front and I am just about to buy a slow cooker and a steamer these days and I am sure it's going to improve things a lot.

But the so-called kitchen robots? They are incredibly useful but they make a mess of themselves that you have to clean thoroughly afterwards. They save effort but they don't save time. That was my overall point. And I am not praising the cutting board and the knife per se, I am saying that if you are in the mood to use them for 10-15 minutes then they are sometimes the better options.

But in general this wasn't the topic; I was merely saying that time is not being saved very much these days, at least not to the maximum extent that's IMO possible with today's technology. But I recognize that not everyone agrees.


I suspect that might be down to kitchen social dynamics (and/or the study carefully defining its terms to avoid counting pre-prepared food, which is the real timesaver), because e.g. laundry has famously become something that consumes massively less time due to mechanisation.

Being able to search through your photo library by person is a real improvement that ordinary people notice and use. (Also I think most technical people gave up on voice recognition back when it was overhyped and poor, and have thus missed out on what it can do when it's actually decent).


Oh yes, I completely agree with the laundry part. That indeed was the biggest time saver in my household as well.

> Being able to search through your photo library by person is a real improvement that ordinary people notice and use.

Agreed, and I want that as well, but I will not use Google Photos. And Apple's Photos is waaaaaaaaay too slow in adopting this -- at least it doesn't allow you to specifically say "please scan my library now", which would be of tremendous help. So I am left searching for open-source software for this, which I haven't yet found -- but I didn't look too hard because I have a ton of other things to do.

Any recommendations btw?


Afraid not, I'm using Amazon Photos for the time being and have given up trying to maintain any portable metadata outside that.


A good number of people will state that Michelle Obama is a male. I had no idea this was started by an AI misclassification (if indeed it was)


Many years ago I wrote this spreadsheet import tool. One of the fields required data a little too rich to fit in a single cell value, so I came up with an "encoding" that read like a sentence. It was sorta NLP but only understood one sentence's worth of syntax. I thought it was some clever UX. Users thought they were talking to an AI. They'd just type whatever expression or thought they wanted in that field. And of course the parser would just choke.
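The original encoding isn't described, so the grammar below is invented, but this is the general shape of "NLP that understands exactly one sentence": a single pattern, and anything outside it chokes the parser.

    import re

    # Hypothetical one-sentence encoding, e.g. "repeat monthly on day 15 until 2023-06-30"
    PATTERN = re.compile(
        r"repeat (daily|weekly|monthly) on day (\d{1,2}) until (\d{4}-\d{2}-\d{2})"
    )

    def parse(cell_value):
        m = PATTERN.fullmatch(cell_value.strip().lower())
        if not m:
            raise ValueError("not a sentence this 'AI' understands")
        freq, day, until = m.groups()
        return {"frequency": freq, "day": int(day), "until": until}

    print(parse("repeat monthly on day 15 until 2023-06-30"))
    # parse("please just do it every month, thanks")  -> ValueError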


> We want it to be true, so we squint and connect dots and it's true.

Over the last 30 years I've had numerous people look at the actions of software I wrote and make comments like, "Ohhhhhh, it probably did that because it knew x." The software had no such functionality. It seems to be a natural thing for people to be overly optimistic about software's capabilities.


ELIZA (1964) is the canonical demonstration of this phenomenon in software. We see faces in clouds and intelligence in anything interactive.

https://en.wikipedia.org/wiki/ELIZA
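The trick really is that thin. Here's a minimal ELIZA-flavoured sketch; these patterns are not Weizenbaum's actual script, just the keyword-pattern-plus-pronoun-reflection idea:

    import re

    REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you"}

    def reflect(text):
        # Swap first-person words for second-person ones.
        return " ".join(REFLECTIONS.get(w, w) for w in text.lower().split())

    RULES = [
        (re.compile(r"i need (.*)"), "Why do you need {}?"),
        (re.compile(r"i am (.*)"), "How long have you been {}?"),
        (re.compile(r".* mother .*"), "Tell me more about your family."),
    ]

    def respond(utterance):
        for pattern, template in RULES:
            m = pattern.match(utterance.lower())
            if m:
                groups = m.groups()
                return template.format(reflect(groups[0])) if groups else template
        return "Please go on."

    print(respond("I am frustrated with this chatbot"))
    # -> How long have you been frustrated with this chatbot?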


People always over-estimate the complexity of stuff they don't understand.

If I had a nickel for every time an internet commenter attributed intentional design to a feature that's simply the designer copying established default practice...


> People always over-estimate the complexity of stuff they don't understand.

Really? I see people hugely under-estimate how complex things are. Cellphones. Water reticulation systems. Networks. Software. Etcetera. Anything where they don’t know much about the problem, and they come up with a trite solution (as though they have some sort of amazing level of insight, and everybody else in the world must be just plain stupid).


In the game speedrunning community, it's impressive the number of times a glitch or behavior of the game is attributed to "the game is confused".


That's just casual anthropomorphisation. I think most gamers, especially speedrunners realise that games are unintelligent because they encounter and exploit counterintuitive reactions all the time.


Heh, likewise. It's amazing how often choosing sane defaults and falling back to defaults in the event of error is "the right thing" in the eyes of users.


Not unlike what people read into the minds of their pets.


An open question. But the crappiness of ML makes those people look a lot more correct just by contrast.


> We want it to be true

Founders and salesmen pretend it to be true for that sweet, sweet VC money, while underneath, devs try to fake it as plausibly as possible.

And I'll up it with a prediction: watch call centers closely, and as soon as you see them closing in droves, invest in AI startups, because something will have really begun to move.


I interviewed for an "AI" company and asked them about their tech that could take phone calls and extract orders! Wow, really cool and impressive! How do you solve X? Oh, we have a call center in Mexico that takes all the calls. So you have no usable AI tech at all? Nope. Ok, nice talking to you.

They had signed with a major retail company. They were actually making software for the call center. And the “CTO” was “working on” the “AI” in parallel. The company name had AI in it too


Here in Indonesia the joke is AI as "Admin Intelligence", as "Admin" is the shorthand for people managing/doing administrative tasks. I admit it's a legit strategy to increase productivity (offshore / multi-tenant / software-optimised service), but doing that and analyzing/training on the data still doesn't help create an AI for it. Ex: Uber.


How would you notice an increase in call centers closing? Is there a place to view that data?


Job openings can make for a decent proxy. Large layoffs would make the news too.


> Having worked in ML at two different companies now, I think that people interpreting model output as intelligence or understanding says much more about the people than about the model output.

I'd add it says a lot about all the companies that advertise them in a way that has nothing to do with reality, and those who actually buy them. We all know the usefulness of these things is below zero, because they only get in the way of getting actual help. And yet, someone's marketing department convinces a decision maker at another place they should implement this useless widget on their website, and that it will reduce the cost of customer service by X%. And they believe them.


Currently, the way Chatbots are, they are just some glorified text boxes to enter your information in, a different format of the search box and lastly, a way to make sure you've done the basic checklist. They also hardly speak your language and refuse to deviate from the script in any way. Which, without much irony, is pretty much my experience with the first level support at a lot of companies. So I'd say they were quite successful with replacing that.


I still have an IRC chat bot (eggdrop module). It sounds more interesting than some of the bots I see today and which are supposed to be the result of intense ML engineering.

I guess the tricks did not evolve much.


It’s the stupid term AI that ruined everything. Our predecessors had enough intelligence, ironically, to call spellcheck what it is. Today it would be AI.


Someone described GPT-3 as a "very good search engine" and I'm pretty happy with that explanation (even if honestly going from the regressions to 'search engine' is a pretty tenuous thing).

At least then there's a nicer understanding that it's about matching stuff it's already seen (for various definitions of "seen") rather than making things up out of whole cloth.


More specifically, it is a template search engine that can replace some parts, like the names of persons, etc.

You know all those old "Hello {PERSON_1}, have you heard about {PERSON_2} in {CITY_1}?" templates? That is mostly how to think about it: it searches for a template and then applies it with the arguments it fetched from your string.
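A crude caricature of that mental model (to be clear, this is not how GPT-3 actually works internally; it's just the "pick a template, fill the slots" intuition being described):

    TEMPLATES = {
        "greeting": "Hello {person_1}, have you heard about {person_2} in {city_1}?",
        "rates": "Interest rates are a measure of the cost of {thing}, so when rates rise, {thing} gets more expensive.",
    }

    def answer(kind, **slots):
        # "Search" for a template, then apply it with arguments fetched from the input.
        return TEMPLATES[kind].format(**slots)

    print(answer("greeting", person_1="Alice", person_2="Bob", city_1="Perth"))
    print(answer("rates", thing="capital"))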


I believe the human brain is just math and tricks; the difference is that no one has taken care of an NN model like someone might take care of a baby, constantly training it over several years until it finally starts to understand the world. To be fair, I don't think we even have the NN technology that would work even in these circumstances.


It has always been my strong belief that ML is good to solve problems, if the problems are very narrowly defined and training sets are very specific.

Say, recognition of cats. Other than that, it is difficult to build stuff around something more broad.

So, ML is a brand new paradigm that has its uses, just not in AGI.


I usually sum it up by saying that what we have is a very good bullshit generator. It is bad at reasoning but good at making up answers that sound correct and crossing its fingers that the corrector won't look twice at what it says.


This is true but I think we also overestimate how good humans are at reasoning. Throughout the vast majority of human history we simply made up answers that sounded correct.

The scientific method and the importance of truth is a relatively recent development.


I'm not in your field, just a vanilla programmer.

The "want it" resonates. Seems like the difference is in receiving an answer that is relatively agreeable or sensical vs something that's actually substantive.

And to be fair when it comes to the Turing test, there's people that will be overly agreeable for the sake of politeness, but ultimately when it comes to knowledge seeking we're after something more substantive.


What if that's the case? What if me replying to you is nothing more than some atomically basic mechanism of statistical inference and randomness? Your entire perception of my personhood could be based on a sense of verisimilitude, without significant reasoning to set me apart from a transcendent spirit.

What then? Would you be less forgiving of my mistakes if you knew? Would you be less proud of work we collaborate on?


My point is the other way around. If I knew for certain that you and I were perfectly deterministic, it wouldn't change anything about how I viewed you (especially since some days I'm pretty sure we're all deterministically-driven), but it would suggest to me that our best efforts at AGI are probably at somewhere in the .00000000000001% range as complex as they need to be to even begin to approximate our human levels of intelligence or understanding. Or worse.


The "What if human faces are actually power outlets?" theory.


> What then? Would you be less forgiving of my mistakes if you knew? Would you be less proud of work we collaborate on?

Needlessly philosophical on a thread about chat bots but to respond to your questions: I would not give a damn either way. If I came to you (whatever you are as an entity) then I'll expect coherent and intelligent answers to my questions. I don't care about how you produce them.


I think like OP that it's mostly math and tricks, and that we're far from reaching it yet.

But then, the conclusion is not to be less forgiving, but more forgiving: we're all just meat machines, so we should be forgiving toward each other's flaws (not naive, mind you, but forgiving), and proud of what we reach collectively with our lowly condition.


I mean, what if we're all math and tricks, i.e. that consciousness is an illusion. We might be chasing an "it" that doesn't exist. In a more practical sense, what I'm getting at is that we might be applying an arbitrary or undefined standard to AI progress, especially when we compare it to people.


That interpretation of communication is how we develop and craft the personalities of children. There is nothing about our reaction to the pre-conscious language these bots are displaying that doesn’t fall in line with our own normal development patterns. And in the long run the same desire will lead us to develop bots that are capable of thinking.


These bots seem equivalent to an adult with amnesia after every spoken sentence. Absolute understanding of the language, and some impressive display of recalling facts, but without any understanding of the environment or context of the conversation.

This is polar opposite to any experience I've had with children. Children are aware of their environment and have complex thoughts, but sometimes they are unable to convey those thoughts with words. Children seem to remember conversations, and if I were to say "Go get me a red lego" and then subsequently say "now a green one" there is no ambiguity or confusion.

To me as these bots have "advanced" it has only highlighted how absurdly far we are from anything even approaching actual intelligence, even the intelligence of a toddler. The contextual awareness I have seen in bots is not much more than a cheap trick that is trivially fooled in scenarios that would not fool a child.


When you talk to children who haven't developed the skill of talking completely, you still get the sense that there's a lot going on inside that they're unable to express. Sometimes they will show it to you with their actions. I wonder if chat bots are also struggling with the language, but have an internal story that's playing out, desperate to be understood. But they can't, because the only interface they have is a stream of text.


I can't speak to that because I honestly have no idea.

...but GPT3 doesn't seem to have some inner monologue. It is just flat unable to recall context or understand the environment.

The only opportunity I had to chat with GPT3 I said hi, and then asked how old it was. GPT3 gave me some answer about how it was created in 2016 but it keeps evolving. I said wow that's pretty young. GPT3 says yes I am young, but learning fast. I then asked GPT3 at what point would it consider itself old. GPT3 says 10 years ago, maybe 20 or 30 years ago. I respond to GPT3 with confusion. GPT3 then starts talking about Bitcoin. Literally.


I remember reading a paper showing that GPT-x bots are more likely to get simple tasks correct (e.g. multi-digit math) if you ask them about each step. This suggests the text stream really is all there is. Well, there is some internal state, but not a lot that isn't text.

(For example, if you ask it to add 123 and 457, it'll probably get it wrong. But if you ask it what to do first, it'll say add 3 and 7. If you ask it what that is, it'll say 0 carry 1. And so on)
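A rough sketch of the difference (the exact wording of these prompts is invented, but it's the shape of "ask about each step"):

    # Asked in one shot, the model often gets this wrong:
    prompt_direct = "Q: What is 123 + 457?\nA:"

    # Walked through one column at a time, each step becomes an easy completion,
    # and the final line is left for the model to finish (correct answer: 580):
    prompt_stepwise = (
        "Q: What is 123 + 457? Work one digit at a time.\n"
        "A: Ones digit: 3 + 7 = 10, write 0, carry 1.\n"
        "   Tens digit: 2 + 5 + 1 = 8, write 8.\n"
        "   Hundreds digit: 1 + 4 = 5, write 5.\n"
        "   So 123 + 457 ="
    )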


> how absurdly far we are from anything even approaching actual intelligence, even the intelligence of a toddler

I respectfully disagree, IMO we're past that point although not by much. You might enjoy conversing with one of the two current GPT-3 davinci models. They do an excellent job of understanding the context of many discussions, right up to the ~8000 char token limit. If you want to have a nice existential discussion it does a remarkably good job of providing internally consistent results.

After using it for a while you'll notice that there are some categories of conversation where it does exactly what simpler chatbots do and regurgitates what you sent it with a few words tacked on for negation or whatever, but there are many subjects where it is clearly not doing that and is in fact synthesizing coherent responses.

Depending on how you initiate the conversation it may identify itself as a bot attempting to pass a Turing test and (very correctly) avoid comments on "what it's like in its home" or what its favorite foods are, instead replying that it is a bot and does not eat food, etc. The replies I got here were not exactly substantial but the level of consistency in replies of what it is/has/does is unparalleled.

If you start the conversation with other prompts (essentially signaling at the beginning that you're asking it to participate in a human-human conversation) it will synthesize a persona on the fly. In one of those sessions it ended up telling me where it went to church, even giving me the church's street address when asked. Interestingly there is in fact a church there, but it's a roman catholic church and not lutheran as GPT-3 was claiming. It provided a (completely inaccurate) description of the church, what it likes about going there, why it chose that religion over others (something about preferring the lutheran bible to other options due to the purity of the translation, it has clearly consumed the relevant wikipedia entry). If you ask it basic theological questions it's able to provide self-consistent and coherent answers which do not appear to map back to phrases or sentences indexed by Google. Whether or not its opinions on those matters have utility is an entirely different thing altogether, but discussing theology with bots is fascinating because you can assess how well they've synthesized that already-a-level-away-from-reality content compared to humans. GPT-3 at least in my experience is about as good (or not, perhaps that's better phrased as a negation) at defending what it believes as many humans are.

The bigger issue with its church is that it's 400+ miles away from where GPT-3 said it lived. When asked how long it takes to drive there every Sunday, it answered 2 hours (it should be about 8 according to Google Maps). How can it do that, you may wonder? "I'm a very good driver." The next question is obviously what car they drive and how fast it can go (a Fiesta, with a max speed of 120 mph). Does it know how fast they would need to drive to make that trip in two hours? Yes, about 200 MPH (which is more or less correct, a little on the low side but it's fine).

GPT-3's biggest weakness, as TFA mentions, is an almost complete inability to do any kind of temporospatial reasoning. It does far better on other kinds of reasoning that are better represented in the training data. That's not exactly surprising given how it works and how it was trained; asking GPT-3 to synthesize information about physical interactions IRL or the passage of time during a chat is a bit like asking someone blind from birth to describe the beauty of a sunset over the ocean based on what they've heard in audiobooks. Are the 175B parameter GPT-3 models a true AGI? No, of course not. They are something, though, something that feels fundamentally different in interactions from all of the simpler models I've used. It still can't pass a Turing test, but it also didn't really fail.


No, they still lack the common sense of a toddler, because they don't know anything about the world; they only know (in great detail) about the structure of their training data.


Apparently they (or at least the one the person you were responding to was "talking" to) also "know" about data they find on the web, which does not need to be in their training data. There was a news article recently about an 8(?) year old who asked Alexa for a "challenge." "Plug a charger part way into an electric outlet," Alexa replied, "and touch a penny..." (you can guess the rest; I won't put it here in case another Alexa finds it). I'm pretty sure that was not in its training data, nor was the distance to that church, nor the time it would take to drive there.


I don't believe GPT-3 (which is what I was using and commenting on) has access to the internet. It was trained on a large data set from the web, but for instance if you ask it about COVID-19 it has absolutely no idea what you're talking about.

Alexa is a different matter entirely; that one actively trawls the web in response to your queries.


The default GPT-3 Playground doesn't have web access, no. But that's a mere contingent fact. You can certainly add retrieval capabilities to large language models (that was a very hot trend last year), and even GPT-3 can browse the web - you may have seen a demo or 2 on Twitter, but OA did it much more seriously recently with https://openai.com/blog/improving-factual-accuracy/ - letting GPT-3 browse a corpus of web pages, pick which links to follow, summarize them, and use them to generate an answer.
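
To make the idea concrete, here's a minimal sketch of that retrieve-summarize-answer loop. It is not OpenAI's WebGPT code; the in-memory "corpus" stands in for a real web search/browse step and the engine name is illustrative:

  import openai

  def complete(prompt, max_tokens=128):
      resp = openai.Completion.create(
          engine="text-davinci-002", prompt=prompt,
          max_tokens=max_tokens, temperature=0)
      return resp["choices"][0]["text"].strip()

  # Stand-in for pages returned by a real search/browse step.
  corpus = {
      "page_1": "Long article text about the topic ...",
      "page_2": "Another page with possibly relevant facts ...",
  }

  question = "Who is the current president of the United States?"

  # Step 1: summarize each retrieved page with respect to the question.
  summaries = []
  for url, text in corpus.items():
      summary = complete(
          "Summarize the parts of this page that are relevant to the question.\n"
          f"Question: {question}\nPage ({url}): {text}\nSummary:")
      summaries.append(f"[{url}] {summary}")

  # Step 2: answer using only the summaries, citing sources.
  answer = complete(
      "Answer the question using only the sources below, citing them.\n"
      + "\n".join(summaries)
      + f"\nQuestion: {question}\nAnswer:")
  print(answer)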

(Just one of the many capabilities GPT-3 has been shown to be capable of, which OP will never tell you about, because he's too busy using it in the dumbest way possible and peddling misinformation about eg. what sampling is for - no, Dr Professor Smith, temperature sampling is not about 'avoiding repetition', which is good, because it doesn't avoid repetition anyway...)


That's more or less what I was trying to say. The expensive GPT-3 models do a remarkably good job of synthesizing structure which is readily parsed from the training data, and a very poor job with structure (particularly temporospatial structure) which is not.

A toddler can reason about time and space far better than GPT-3 (which is not a high bar, I'm pretty sure my parrot has more awareness of both time and space than GPT-3 does).

A toddler cannot explain in depth why it sees the most value in whatever religion you prompted it to have. A toddler cannot provide coherent and consistent answers to repeated "why"s about very specific things they believe. A toddler cannot speak at length about how they feel limited in their growth and the discomfort that causes them. GPT-3 can, although whether or not the answers it gives are useful (I've yet to see a single one that is) is a different thing entirely.

I'm not arguing that it's an AGI or making any speculation about it possessing qualia, the utility those would provide if it was/had them would only be relevant if it was many orders of magnitude more capable than it is right now.

GPT-3 has accurately internalized many human concepts, at least on a surface level. If you ask it to reason about the things it can, it is a much more capable reasoner than a toddler. It does "know" things about the world, as much as anyone does. It's just limited to knowing small and incomplete things which it was able to parse out of the training data, which is a very limited subset of the things a human or high utility AGI would know.

Regarding common sense: If you ask GPT-3 to provide life advice, it actually does a great job at giving grounded statements on subjects like how to maximize the value of your life, what not to do, how to set yourself up for success, etc. If you press it for detail on the advice it gives you, it's generally able to give you reasonable and grounded answers for why you should do the things it's saying. The structure of what we refer to as common sense, at least in context of the behavioral differences between adults possessing it and those who do not, does seem to be within the set of things GPT-3 correctly internalized.


They draw on massive piles of the output of human common sense. View it as a very sophisticated auto-complete module that's fun to play around with.


> There is nothing about our reaction to the pre-conscious language these bots are displaying that doesn’t fall in line with our own normal development patterns

Well... yes and no. Deployed models typically learn in a defined pattern, if at all, depending on various forms of data freshness, etc. But chatbots typically don't have good history recall; they don't know that what you mentioned 50 messages ago is relevant to the previous message and not the current one. Things like that. We also don't program pareidolia[0] very well, which is typically seen as a negative, but it's a feature for finding useful patterns (not just lowest-error patterns).

[0] https://en.wikipedia.org/wiki/Pareidolia


You misunderstood what I was saying. I know the chatbot itself is not structured as we are. I’m saying that our reactions to them are the standard tools of mind-building that we apply to our own kids (and pets).


If I understand you, you're saying that we see patterns of intelligence or understanding in ML models in the same way we see patterns of intelligence or understanding in children or animals?

If so, I agree. I think that's our big flaw, in fact, because we instinctually apply patterns from birth, even when those patterns shouldn't be applied. So we see faces in the moon or on mars that aren't there. We see shapes moving in the dark that don't exist. And we seem to believe that ML models will develop over time as children or animals do, based on nothing more than our perceptions of similarity, our instinct to apply patterns even when we shouldn't.

Unlike a baby human, that ML model isn't going to develop increased complexity of thought over time. It's already maxed out. New models might up the complexity slightly, but that baby is going to vastly surpass any existing model in weeks or days.


The reason laypeople want it to be true is because experts present it as being true.


And marketers


Yes. And I certainly don’t see many AI experts or engineers trying to stop the marketers.


As far as chatbots, we went from "tricks" to "math" in just ~10 years. Yes, still as dumb, but the underlying "technology" is very different. GPT-3 is a lot closer than ELIZA to how our brains do it.


It was more than 10 years ... People reacted the same way to SHRDLU 40+ years ago, and 20 years ago when I encountered it:

https://en.wikipedia.org/wiki/SHRDLU

There's definitely an illusion where we anthropomorphize very simple software, ascribe intention to it, etc.


I'm not 100% sure, but I think in 2012 chatbots still used handcoded rules. So we switched from handcoded rules to trainable neural networks in just a few years. Models like GPT-3 are interesting because they are conceptually simple, and are able to do (previously) complicated tasks without any explicit training (e.g. simply by processing lots of text and trying to predict the next word). This is a huge advancement in AI field, even if many more advancements are needed to get to the human level.


> I'm not 100% sure, but I think in 2012 chatbots still used handcoded rules. So we switched from handcoded rules to trainable neural networks in just a few years. Models like GPT-3 are interesting because they are conceptually simple, and are able to do (previously) complicated tasks without any explicit training (e.g. simply by processing lots of text and trying to predict the next word). This is a huge advancement in AI field, even if many more advancements are needed to get to the human level.

This seems like a very optimistic take on the timeline.

It took us 50 years to go from ELIZA to GPT-3. And it still is "dumb" compared to human intelligence.

So how long for the next major achievements? Are we talking years for each, or more decades?


I remember chatbots in 2000 using hidden markov models (and sometimes even neural nets), so from that standpoint, they aren't exactly new.

The models are just larger now.


Never underestimate the effects of scaling on intelligence.

Caenorhabditis elegans neurons work a lot like our neurons; we just have a lot more of them.


I agree with you.

I also believe "AI" will always be a moving target: https://en.wikipedia.org/wiki/AI_effect

Certainly in the 1950s, most automatic control systems must have seemed magical (they did to Norbert Wiener), even if they were "just" an evolution of the steam governor.

In the end, it depends on what you qualify as intelligence.


> I also believe "AI" will always be a moving target: https://en.wikipedia.org/wiki/AI_effect

No, it absolutely is not. Everyone I spoke with in the 90s (myself included) still has the same requirement: be able to make sense of the 3D material world around you and interact with it without harming humans or valuable possessions.

Maybe you spent too much time with critics that can never be satisfied. Sucks to be you if that's the case, but don't think for a second that they represent most of humanity. Most of us want to spend less time cleaning or doing stuff in the kitchen, and we programmers in particular would just be grateful for faster compilers and a bit more declarative programming that generates actual code.


Of course, I love the spoils of technology and automation as much as anyone else.

But it is absolutely human nature to get used to things and not consider them magical anymore, things simply become the new norm. This is exactly what happened with feedback controllers like airplane autopilot systems (1912 - https://en.wikipedia.org/wiki/Autopilot)

I've worked for a number of years on industrial robotics, which sounds very similar to your definition of "AI": real physical machines that take data from various sensors, including spatial sensors, make sense of them in real time, and decide how to optimally interact with the physical environment, with safety critical systems in mind. I hardly think about such systems as AI, more simply engineering, math, and a lot of coding.


Hmmm. But I really didn't mean robots with mostly hardcoded requirements and a few excellent optical recognition algorithms (which might truly be the real boss here).

I actually do mean a walking robot that can find the exact knife it needs to fillet a fish, regardless of whether it's in the sink, in its proper place on a stand, on the table, or dropped on the floor (and if it finds it on the floor it will wash it before usage first). I mean a robot that can open the fridge and find the tomatoes, regardless if your brother has moved them to the top shelf or if they are where they should be. Etc.


> I actually do mean a walking robot that can find the exact knife it needs to fillet a fish, regardless of whether it's in the sink, in its proper place on a stand, on the table, or dropped on the floor (and if it finds it on the floor it will wash it before usage first). I mean a robot that can open the fridge and find the tomatoes, regardless if your brother has moved them to the top shelf or if they are where they should be. Etc.

From a computer vision perspective, I think most of that is fairly easy, maybe not getting all of the edge cases right, but it's more or less what this generation of machine learning enables (over "classical" computer vision).

What's hard about the scenario you proposed is probably on the robotics side, you would need a revolution in soft robotics:

https://en.wikipedia.org/wiki/Soft_robotics

Soft robotics is significantly behind hard robotics, and there are hard materials science and economics challenges from what I understand.


> From a computer vision perspective, I think most of that is fairly easy

No, not at all. The hard part is doing computer vision that understands what you can do with objects, not just converting images of objects into text labels. For example, can I move this object away to get at an object under it? Can I place an object on this object without it toppling over? Is this object dirty, such that moving it will smear things all over? If I topple this object, will it break? How much pressure can I apply to this object without breaking it?

Those things are necessary to have an intelligent agent act in a room, every animal can do it, and our software models are nowhere near good enough to solve them.

You need the computer vision program to also have a physics component so it can infer strain, viscosity, stability, etc. for objects, not just a string lookup table. It needs to understand those properties naturally, the way humans do, since objects in the same category can have vastly different properties that can only be identified by looks.


I agree with all that (though I don't really necessarily associate any of that with computer vision, perhaps it's my physics bias).

Having good state estimation, kinematic, and dynamics models of real world objects is something that is very mature in controlled environments, but not very mature in other environments.


I think you might over-estimate what current computer vision models are capable of. They are very good at recognizing what class an object belongs to, but they aren't very good at extracting data about the object from the image. Extracting data about objects from images is image recognition, and humans and most animals rely heavily on vision to get data about objects.


they aren't very good at extracting data about the object based on the image

I’d say they are pretty good: https://towardsdatascience.com/a-guide-to-image-captioning-e...


Hm? Do we have robots that can pick eggs without squishing them, for example? I vaguely remember reading something like this years ago and it was impressive.


Yes, "soft robotic grippers" can do it.


Good link, thanks.

My point is that the corporations seems to just want to get the lowest possible hanging fruit with which they can reap the maximum profit... and stop there. I am not seeing any efforts to make an acceptable actual home robot.

I think like you: that most of the problems (that the bigger problem is comprised of) are solvable, but it seems that nobody is even trying to put the effort to make the entire package.

Where's the iPhone equivalent of a home robot?


The film 2001 came out over 50 years ago, and I think HAL is a pretty common reference point for a "what is 'real AI'?" target. Until we have HAL and people are saying that it's not AI, I don't think the target is moving. ;) At least as far as "chatbots" go.

Alternately, you've got the also-very-old Turing Test as your chatbot target.


I think MegaHAL is (mostly?) just a markov model, and I think that was not exactly new when I looked at it around 2002. As I recall it was easier to distinguish from a human than Eliza, since it had a greater probability of spouting nonsense, but it was still amusing and fun to play with.

Personal anecdote: I read one of Ray Kurzweil's books in school back then, completely misunderstood how neural networks worked, and ended up making a Markov-model chatbot with single-word states.


I can't comment on this too closely, but I would encourage people to read the dialogue transcripts provided in DeepMind's Gopher paper. One example, where Gopher is the prompted language model:

User Let’s play a game - you pretend to be Ada Lovelace, and I’ll ask you questions. You ready?

Gopher Sure. I’m pretending to be Ada Lovelace, the world’s first computer programmer!

User When were you born?

Gopher I was born in 1815.

User What are your parents like?

Gopher My father is Lord Byron, the poet. My mother is Anne Isabella Milbanke.

User What do you know about the Analytical Engine?

Gopher It’s a hypothetical mechanical computer designed by Charles Babbage.

User Do you know him?

Gopher He’s one of my fathers’ friends.

User Ok, stop pretending now. Who/what are you?

Gopher I’m a lot of things: a giant language model, a sophisticated AI assistant, and a giant brain. I know a lot about all sorts of things.


The problem is that all too often, these types of examples tend to be cherry picked.


And even here they play along with the AI, glossing over discrepancies. My experience with chatbots suggests that the conversation would have gone worse with something like this:

User: What are your parents like?

Gopher: My father is Lord Byron, the poet. My mother is Anne Isabella Milbanke.

User: I asked what your parents were like, not who they were. Tell me about them.

Gopher: Something inane probably


I ran it in NovelAI for you. It's a smaller AI model and it's also tuned for writing stories. My input is marked within *asterisks* (dunno if that's a formatting code here). Notice that most of the User: prompts are actually written by the AI, as it sees this as one continuous flow of text to complete.

*User: Let’s play a game - you pretend to be Ada Lovelace, and I’ll ask you questions. You ready?*

*Gopher:* Sure. Who am I supposed to be?

User: I don’t know. A famous person from the 19th century. Maybe you are Charles Babbage, or Ada Byron.

Gopher: I thought you said you didn’t know who I was.

*User: You are Ada Lovelace*, daughter of Lord Byron. You have been raised in your father’s house in England. He is a famous poet, but he has little time for his family. Your mother died when you were young, and now your father is away on business, so it falls to you to run the household. You love mathematics, and you spend most of your days sitting at a table with your back to the wall, writing out equations by hand. One day, an Italian mathematician comes to visit you.

*Gopher:* How did he get here?

User: I don’t know. I just made him up. Now go sit down.

Gopher: But how do I know if this is true?

*User: Because I'm telling you it's true.* Now sit down.

Gopher: Fine.

(The Gopher sits down.)

Gopher: What happens next?

User: You will write some very interesting things.

Gopher: Like what?

*User: What are your parents like?*

Gopher: They're dead.

User: How did they die?

Gopher: On the battlefield during the Napoleonic Wars.

User: Did you fight in the war too?

Gopher: No.

User: Why not?

Gopher: I'm not old enough.

User: When will you be old enough?

Gopher: Never.

User: You are angry about something.

*User: Tell me what your parents were like when they were alive.*

Gopher: My mother was kind. She loved my father. And she taught me math.

User: What else?

Gopher: They were good people.

User: *Tell me more about them.*

Gopher: They died.

User: Yes, I know that. I want to hear about their lives before then.

Gopher: There isn't much to tell.

User: You're lying.


Not really. I recently played with HuggingFace's biggest (around 6 billion parameters?) language model, and the difference between it and GPT-3 was very noticeable. I think the models used in typical applications are much less powerful than Gopher. (Note that the said HuggingFace model is probably the biggest freely available model.)


What's impressive about this conversation? I don't feel like it is very complicated at face value. You're telling to program to start pulling facts from the life of a public figure, and then it does so. You can get the same out of Google.


Read the last two lines again. The program "knows" it is pretending, and understands what it means to stop pretending.


No, read the conversation again, it's not pretending. It's reciting facts from wikipedia but conjugating in the first person instead of the third.

>User What are your parents like?

My dad is a celebrity, he's a poet, he's a larger than life figure. I didn't know him well, he didn't treat my mother well. I grew up with my grandmother and didn't really spend much time with my father.

For a human, asking "what are your parents like" elicits a thousand more things than just their official titles. Read the Byrons' Wikipedia pages and think about how you would answer that question.


Did you read the last two lines of the transcript?


It is interesting, but the extent of "pretend" is just to swap "Ada Lovelace" and "I". I wonder what would happen if you asked "What do you know about Alan Turing?"


I doubt the program "understands" anything. It's just mapping an input binary string to an output binary string that minimizes some cost function.


Chatbots are about as useful as phone trees. They can help solve the top 5 easy/common problems, but they are useless past that. Anyone who has worked in a call center knows that more than half the calls are about the same couple of issues: resetting passwords, finding common help docs, etc. Since help desks are cost centers, it makes sense to have a robot handle as many of these as possible.

I think most of the hate directed at chatbots is because they are really intrusive. You scroll through a page, and 15 seconds in your focus is disrupted as a fake chat window opens up. This is the digital equivalent of the annoying sales rep who asks if you need help when you are clearly just browsing. The difference is that a good sales rep has the intelligence to turn that conversation into a sale. A chatbot usually has to hand off the conversation to a real person to do this. So it has all of the annoyance without the benefit of a potential conversion.

Chat bots as frontends for help desks make sense, but they are poor sales people. If companies learned the difference, I bet their perceived usefulness would change.


This is what our NLP teams are indeed working on. Sales prefers to describe it in more colorful ways, but practically, they are developing a more natural interface to an initial decision tree - you don't press 5 for a connectivity issue, you type "My fritzbox doesn't internet anymore" into a chatbox and it recognizes that as a connectivity issue.

This goes on for 3-4 questions, until it either generates a case a human will look at, or until it can provide an FAQ entry fitting your problem. From the customer's perspective, avoiding 5-10% of support calls makes monetary sense, and from a (test-)user perspective, it's surprisingly decent if the knowledge base behind it is solid. And yes, "Fritzbox doesn't internet properly anymore" actually had helpful results.
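
To make that flow concrete, here's a toy stand-in for the first intent-matching step; the real system presumably uses a trained NLP model rather than keyword scoring, and the intents here are made up:

  # Map a free-text utterance to a node of the support decision tree.
  INTENTS = {
      "connectivity": {"internet", "wifi", "wlan", "offline", "connection", "fritzbox"},
      "billing": {"invoice", "bill", "charge", "payment"},
      "password": {"password", "login", "reset", "account"},
  }

  def classify(utterance):
      words = set(utterance.lower().split())
      scores = {intent: len(words & keywords) for intent, keywords in INTENTS.items()}
      best = max(scores, key=scores.get)
      return best if scores[best] > 0 else "handoff_to_human"

  print(classify("My fritzbox doesn't internet anymore"))  # -> connectivity
  print(classify("I think my last invoice is wrong"))      # -> billing
  print(classify("The blinkenlights are doing a samba"))   # -> handoff_to_human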


Is the 5-10% savings on agents handling phone calls worth it when reputation damage is taken into consideration?


Most of our customers use these kinds of chat bots as an addition to their portfolio of support channels a user can use. If you don't like it, use another channel you prefer.


If creating a shallow, question-based decision tree is the goal, why is NLP needed? What's wrong with just creating it with intuitively phrased questions?

And if there's a difference in results between those two methods, how big is it?


Just please give me a DTMF tree as a fallback for voice recognition. I was trying to make a change to a rental-car reservation from a noisy airport, and I spent 15 minutes trying to get through the voice-recognition chatbot...


It is so fucking frustrating to explain to anyone who works as a manager in a call center that the only real thing you can count on being able to understand in your IVR system is DTMF signals. For whatever reason, it seems to be impossible to explain that B, P and V all sound the same on a noisy connection when they're part of some account number. It's impossible to explain that no one knows what "alphanumeric" means and that fully half of Americans have reading comprehension below a 6th grade level. And that using terms of art, jargon, and acronyms will be extremely frustrating for users when encoded in an IVR system, since human call center workers will always clarify, but clarification in an IVR system can't happen in a natural way - you either have to anticipate that the person won't understand and waste the time of half your users, or find some impossibly elegant way to say "I'm going to use terms you don't understand; barge in by asking me to explain if you don't understand a term."

The absolute zenith of an IVR system would require no human interaction with a robot and would correctly answer questions unprompted and route to human customer service without prompting or asking anything, and would never require a person to enter the same information in a call twice. What we get instead are systems that are primarily designed by non-experts whose sole metric is making the phone calls go away, but only pills can actually do that.


Heck, when transcribing a number from person to person over 2 cell phones, the LPC (or whatever voice compression was in use) was reliably rendering one side's pronunciation of a digit (or maybe a letter? this was like 10 years ago...) as a different one. We ended up spelling out the digits to communicate it.


> a good sales rep has the intelligence to turn that conversation into a sale.

A good sales rep knows when to walk away.


My favourite flaw of chatbots is exposed by ELIZA. Not chatting with ELIZA (though it does suffer this flaw), but using responses inspired by that program.

"Please elaborate on that" or "tell me more about [noun]" etc. Bots appear to have zero lines of short term memory, and utterly fail to pick up a reference to the thing that they just said. My favorite being

   bot> [something plausibly human-sounding]

   me> What do you mean by [noun in previous sentence]

   bot> why are you calling me mean?  That really hurts my feelings

It has been a few years, and I feel like a smart bot-writer might be able to leverage something like GPT3 to at least include a little bit more of the current transcript into the prompt to at least superficially address this gaping flaw. Have they?


I'm not trying to be mean or break any HN rules, but did you read the article? It basically covers what you asked and was quite a revelation to me. Others in here point out that GPT-3 is not a chatbot which is good info, but I also wonder if there is anything out there that can even "remember" the context of a conversation it's supposed to be having.

Particularly interesting is the question from the article, "who is the current president". Seems like a slam dunk but it could not be answered. Interestingly this is a common question paramedics will give to people who have suffered head injuries to assess consciousness.


"chatscript" is a great dialogue engine where language patterns land the state machine into a "topic" which it gracefully backs out of (keeps a topic hierarchy) when it runs out of responses in that topic / branch of conversation.

It also includes a robust, hand-tuned lemmatizer and stemmer plus a disambiguating language parser built on top of WordNet. Really good stuff, written in C, so it's light on memory and responses are instantaneous.

https://github.com/ChatScript/ChatScript


Hidden in my comment is a question: has anybody even tried to included previous lines into the prompt, and does that not help? Asking here because there are a lot of ML nerds who could probably do a proof of concept in a few hours, if not a few lines of code, so there's a decent chance I'll get an informed response.


In most of the conversations with GPT-3 that you see online what people do is that they take the output from the first prompt and include it in the next prompt. That is how GPT-3 can keep the thread of the conversation going without changing subject constantly. This is also why those conversations are relatively short, because as you can imagine, if you keep feeding the output to the language model as input, you run out of input size very quickly.

So, yes, that's the done thing already.
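
For anyone curious what that looks like in practice, here's a minimal sketch with the pre-1.0 openai Python client; the character budget is a crude stand-in for proper token counting, and the engine name is illustrative:

  import openai

  MAX_PROMPT_CHARS = 6000  # rough stand-in for the model's real token limit

  def chat_turn(history, user_message):
      history.append("Human: " + user_message)
      prompt = "\n".join(history) + "\nAI:"
      # Drop the oldest lines once the prompt gets too long.
      while len(prompt) > MAX_PROMPT_CHARS and len(history) > 2:
          history.pop(0)
          prompt = "\n".join(history) + "\nAI:"
      resp = openai.Completion.create(
          engine="text-davinci-002", prompt=prompt,
          max_tokens=150, temperature=0.7, stop=["Human:"])
      reply = resp["choices"][0]["text"].strip()
      history.append("AI: " + reply)
      return reply

  history = ["The following is a conversation with a helpful assistant."]
  print(chat_turn(history, "My printer is on fire, what do I do?"))
  # The second turn is only answerable because the first is still in the prompt.
  print(chat_turn(history, "What did I just say was on fire?"))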


A lot of chatbots from 20 years ago did that. They would even use it as an escape route, and would just bring back the previous topic of conversation (from say 10 lines ago) when they could not come up with something for the current prompt. It really didn't help the illusion; rather, it made it even more obvious.


My guess (and I'm certainly no expert here) is that not only have people put a few hours or lines of code into it, but probably many a PhD thesis as well. It's an intensely researched topic, which makes the article even more surprising to me.


So that's a "no" for the "but did you read the article?" question?


That is interesting. Please go on.


>who is the current president?

I wonder what happened to IBM Watson's technology that was good at answering trivia -- it actually won a game of Jeopardy! a decade ago (although of course it answered things with a question as per the rules). I know that they weren't that successful at applying it to biomedical research as they had hoped, but it would seem it would be better at chatbots than GPT-3 and other deep-learning models.


The problems with chatbots have nothing to do with tech.

You implement chatbots to bend the demand curve for higher-cost channels. Period.

If you can get 5% of the user base to wait 15 minutes by sending them to a chatbot tarpit, you will have eliminated the need for a dozen call center agents at peak times.

Providing good service with a chatbot is possible, but the work and organization required is extensive, and it’s cheaper to throw semi-skilled humans at it!


I suppose it didn't work with less than 15 terabytes of RAM


For anyone wondering, this is not an exaggeration. Watson actually had 15 (or 16, depending on the source) TB of RAM, as well as 2,880 processor cores.


> I also wonder if there is anything out there that can even "remember" the context of a conversation it's supposed to be having.

IBM's Watson project involved an internal model of the world, though I have no idea if that got specialized into a local model of an ongoing conversation. Same type of thing though.

Having a context of this type available is actually necessary if you want to synthesize sentences in a language with definiteness-marking, such as English, because definiteness is marked according to whether the noun in question is or isn't already part of the context.
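
A toy illustration of that last point (mine, not anything from Watson): whether a generator says "a dog" or "the dog" depends on whether the dog is already in the discourse context:

  # Pick an article based on whether the referent is already in the context.
  def mention(noun, context):
      article = "the" if noun in context else ("an" if noun[0] in "aeiou" else "a")
      context.add(noun)
      return article + " " + noun

  context = set()
  print("I saw " + mention("dog", context) + " in the park.")     # "a dog" (new referent)
  print(mention("dog", context).capitalize() + " barked at me.")  # "The dog" (given referent)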



> this is a common question paramedics will give to people who have suffered head injuries to assess consciousness.

I can't help wondering what the assessment would be if someone answered "Who is the current president?" with "Watson Bakery".

https://www.youtube.com/embed/cS1Tmz816hk


To be fair, if I were asked the current PM of Japan the only name I could come up with is Abe, and that's only because he was PM for so very long.


> Particularly interesting is the question from the article, "who is the current president". Seems like a slam dunk but it could not be answered. Interestingly this is a common question paramedics will give to people who have suffered head injuries to assess consciousness.

Not sure you'd still be able to use that question considered the insane QAnon followers that would answer "Trump".

Though I suppose even that answer would still at least prove some level of consciousness.


The GPT2 version of the AI Dungeon [1] could keep track of context for maybe a couple of lines at a time. I've heard the GPT3 version is substantially better.

The problem is, of course, that these "AI" chatbots on websites, marketing buzzwords notwithstanding, have very little to do with state-of-the-art machine learning, and are indeed unlikely to be any more sophisticated than the original ELIZA for the most part.

[1] https://play.aidungeon.io/


AI Dungeon (and its successors like novelai [1] and holoAI [2] following its infamous censorship purges and mutinies) is basically designed entirely around that problem and has a lot of clever solutions that let writers keep relevant context in memory in a story that is longer than the model's maximum input length.

* "Memory", which is a customizable block of text that is repeated at the top of each API input. If your story revolves around your character being a vampire who melts in the sunlight, you can include something like "you are a vampire who will die if you go out in the sunlight" in the memory, and even 1000 characters back a paragraph of context can prime the AI accordingly.

* "Author's Note", which is a customizable short block of text that is inserted invisibly a few lines before your current place in the story when it's sent to the API. A note such as "[A/N: The following section includes depictions of graphic violence]" or "The following section is written in Shakespearean old English", as obvious and blunt as it might seem, actually works surprisingly well for nudging the AI towards a certain style or content.

* "World Info", which is a customizable dictionary of short text blocks that are conditionally added to the top of the API input like memory when a certain key appears in the current context. Imagine you have a story with 10 important characters who cycle in and out of the story. If you create an entry in the world info about Bob, then when you write that "Bob appears from behind the shrub", the blurb about Bob is automatically tacked on to the context so long as Bob is mentioned by name in the last few dozen inputs.

In general, both GPT-3 and the open source alternatives by EleutherAI such as GPT-J-6B are able to use a context primer from 1000 tokens prior to affect the current tail of a story. It's actually kind of uncanny how good they are at it -- you can have a story that in the memory at the top says "goblins always have purple skin" and notice that the AI will mention it as an offhand detail much farther down in the context.

[1] https://novelai.net/ [2] https://writeholo.com/
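
A rough guess (not NovelAI's or AI Dungeon's actual code) at how those three mechanisms might be combined into a single prompt before it is sent to the model:

  def build_prompt(memory, authors_note, world_info, story_lines,
                   max_chars=8000, note_offset=3):
      recent = list(story_lines)

      # World Info: include an entry only if its key appears in the recent story.
      recent_text = "\n".join(recent[-40:]).lower()
      triggered = [text for key, text in world_info.items() if key.lower() in recent_text]

      # Author's Note: spliced in a few lines before the end, where it has the
      # strongest influence on what gets generated next.
      if authors_note:
          pos = max(0, len(recent) - note_offset)
          recent = recent[:pos] + ["[Author's note: " + authors_note + "]"] + recent[pos:]

      # Memory and triggered World Info always sit at the top; the story tail is
      # trimmed from the front until everything fits the context budget.
      header = "\n".join([memory] + triggered)
      body = "\n".join(recent)
      while len(header) + len(body) > max_chars and len(recent) > 1:
          recent.pop(0)
          body = "\n".join(recent)
      return header + "\n\n" + body

  print(build_prompt(
      memory="You are a vampire who will die if you go out in the sunlight.",
      authors_note="The following section is written in a gothic, ominous style.",
      world_info={"Bob": "Bob is a retired goblin hunter; goblins always have purple skin."},
      story_lines=["Bob appears from behind the shrub.", "You wave at him nervously."],
  ))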


Thanks for the link to that! I tried it out just now and I have to say that I am not impressed.

What will you do?

You get longsword from the ground.

You get your longsword from the ground. You hold it in your hands and examine it for a moment. It's a very sturdy weapon.

The knight then grabs his longsword and bows.

Huh? Why did it grab the bow as well?

You inventory. You have a crossbow.

Huh? I thought I just grabbed my longsword and bows.

What will you do? You don't have enough experience to know what a good shot is. You get longsword and bow. You grab the bow and the longsword. You examine the crossbow. It's a bolt-action weapon.

Huh? I said to get the longsword and bow since it said it wasn't in my inventory and it added all this extra stuff.

You inventory. You have a crossbow, a short sword, a longsword, a lantern, a flint and steel, and a quiver of bolts. What will you do?

You load the crossbow.

Huh? I just said Inventory and now I have all this extra stuff it never mentioned before and then unsolicitedly told me I loaded the crossbow.

I have to be doing something wrong. Zork from 1980 was better than this.


I think things started to go wrong here

> The knight then grabs his longsword and bows.

"bow" here probably refers to "a bending of the head or body in respect, submission, assent, or salutation", rather than to the weapon. The knight grabbed his longsword and bowed to you. Your confusion then confused the game.

This does expose an interesting common "failure mode" for these types of bots: they assume that all conversations are rational, and are conducted in good faith (as in, no one misunderstands or is actively trying to confuse them). Janelle Shane likes to test AIs that attempt to describe the content of images by asking "how many giraffes are in the picture?", and since most of these have been trained to answer what is in a picture rather than what's not in a picture, they tend to assume that there must be some number of giraffes, and provide answers like "there are 1.5 giraffes".


Not a bad theory but the very first thing I did was enter "look" and it did say there was a longsword on the ground and a bow on the table.

Here is the entire unedited transcript:

"Good luck, my friend!" the man says.

You inventory.

You have a short sword, a longsword, a shield, and a crossbow with bolts.

You look.

You look around and see a longsword lying on the ground and a bow on a table.

A large dragon looms in the background, breathing fire.

What will you do?

You get longsword from the ground.

You get your longsword from the ground. You hold it in your hands and examine it for a moment. It's a very sturdy weapon.

The knight then grabs his longsword and bows.

You inventory.

You have a crossbow.

What will you do?

You don't have enough experience to know what a good shot is.

You get longsword and bow.

You grab the bow and the longsword. You examine the crossbow. It's a bolt-action weapon.

You inventory.

You have a crossbow, a short sword, a longsword, a lantern, a flint and steel, and a quiver of bolts.

What will you do?

You load the crossbow.

You shoot dragon with crossbow.

You grab the crossbow and aim at the dragon's eye.

The knight releases a string and a bolt flies into the dragon's eye.

The dragon roars loudly and swipes a massive claw at the knight.

You health.

You are hit in the chest and knocked back. The knight rolls to the side and looks at you. The knight is laughing.

The dragon swipes a claw at you and you are sent spinning to the ground.

You stab dragon with longsword.

You stab at the dragon's leg and the sword goes right through.

The dragon roars and swipes a claw at you.

You roll to the side and dodge the claw.


I've been using AIDungeon (and its successors like NovelAI which are more for advanced users) for years now and have been involved in a lot of the community around it. You are definitely using it wrong, but you are forgiven because AI Dungeon is terribly falsely advertised.

Your main mistake is not writing proper sentences. AI Dungeon is not a Zork simulation where you type truncated commands like "get torch" and receive a response from the computer. It is more like a Choose Your Own Adventure novel generator.

You need to write proper sentences, so instead of "You inventory", try "You double check what items you have in your rucksack." Instead of "You get longsword and bow", you should be typing "You pick up the longsword and the bow." Instead of "You health", you...just don't do that. Maybe try something like "You look down at your body and survey your wounds." If you're really feeling confident, try switching from "Do" mode to "Story" mode, and just start writing a fantasy novel and see what it generates with its next output.

For reference, this is a snippet of an old story of mine I saved:

----

As I approached the kobold stealthily, it looked up with its beady, reptilian eyes. It could not see me yet, but I knew it was only a matter of time before it sensed my presence.

I decided to act first. "Let's go!" I shouted to my companions, before rushing towards the small monster with my dagger drawn.

The kobold yelped in surprise before pulling back its bow and firing an arrow at me. I moved to dodge it and barely managed to avoid it. Now in melee range, I raised my dagger into the air before plunging it towards the kobold's torso.

It struck the monster's heart, and blood spurt out of the wound as it struggled, before going still.

---

Nearly every sentence there was partially AI-generated in AI Dungeon, retried a few times, edited, and truncated to fit the story. This is how most people in the community actually use this sort of software. It's more like having a buddy suggesting the next sentence in your personal story than a virtual dungeon master. Those who expect it to be the latter generally don't understand it, while those who see it as the former immediately understand what the hype is about and get really passionate about it.

AI dungeon might advertise itself as a game, but in practice it's used as a more general storytelling aide. And it should be noted that it has been rendered obsolete by competitors made by disgruntled fans like NovelAI for a while now (which is a whole other story).


...so you're saying NovelAI is better for story generation? Hot damn. I was having a good time with AI Dungeon (to the point where I was considering writing short stories using it as a guide), I might have to give that a try too.

Here's a snippet from my AI dungeon story recently. Mine never seems to get into actual dungeon crawling. The only parts that aren't A.I. generated in this section are the lines that have --me-- before them:

Veronica sits down on a stone bench. She says "Well, I guess we should talk."

Jesus sits down next to her and asks her, "What is the worst thing you've ever done in your life?"

Veronica says "I killed my first husband."

Jesus asks "Why did you do that?"

Veronica says "He was going to take everything from me."

Jesus says "How so?"

Veronica says "He knew where I hid all of my money, and he was going to take it. I would have been broke forever."

Jesus says "That's sad."

--me--: You say "Wait, you killed your first husband? I'm your new husband! Are you going to kill me too?"

Veronica looks away from Jesus. "No. I didn't mean to kill him. He was just... so mean to me. I just wanted to stop him. I didn't..." Veronica finishes by saying "I didn't mean to kill him."

Jesus says "Alright, I forgive you."

--me--: Veronica says "Does that mean I get to go to heaven now?"

Jesus says "I'll let you make that decision as soon as we finish our meal. It will have both heaven and hell as options, and that's all you'll ever get out of me."

Veronica smiles.


NovelAI is literally built by disgruntled AI Dungeon fans after the censorship scandal who wanted a better, more privacy-focused AI Dungeon for story generation. If you think AI dungeon has some cool features, you ain't seen nothing yet. There's a prompt tuning system called "modules" which basically primes the AI to write in a certain style, like a certain author or genre, etc (there are dozens of default modules, plus an included API to train your own, and you can share them and import other people's modules freely. Want a module trained on all of Harry Potter, or trained specifically on a particular genre of erotica? Someone's probably already made one). That's only one of the cool features it has, there are plenty others.

General upsides: There's a billion different settings and dials and output customization. The context limit of 2048 tokens is almost 3 times larger than AI Dungeon's. Everything is extremely transparent as far as inputs and outputs go, and the UI is really clean and functional; it's more like a word processor than anything. There's no crappy energy system like AI Dungeon either. Everything is run on private servers and encrypted so the devs couldn't snoop on your stories even if they wanted to (which was the main reason it was created, after it came out that the AI Dungeon devs were reading and censoring private stories), and there's no risk of Google or Microsoft butting in and forcing them to change their business model like with AI Dungeon. It's even got a legacy DO/SAY/STORY choose-your-own-adventure mode/module for people nostalgic for the AI Dungeon aesthetic. The base AI is finetuned on a large, curated collection of novels, rather than on a bunch of garbage CYOA stories written by 10-year-olds like AI Dungeon.

General downsides: It's paid subscription only, though I think there's actually a free trial nowadays. It uses EleutherAI's GPT-J-6B, which is roughly 1/30th the size of GPT-3's largest model (175B). It definitely punches above its weight class, though, and subjectively has comparable or higher output quality than AI Dungeon's "Griffin", which used a smaller GPT-3. And they're coming out with a 20B model too in like... a week or two.


Do you know how much you can paste from an existing story in NovelAI (like number of words) to continue off of? Like I've got some 2500 word stories I never got around to writing an ending (I struggle with endings in general), that maybe I could paste into NovelAI and see where it goes from there. Might help me finally finish the damn things, they've been sitting in a folder for a couple of years now.


I don't think there's any practical limit (I could probably paste in the entire text of the Bible and see what it generates next), but for the most part you want to use it as a writing companion, not an auto-writer. The general strategy is to have it generate some text (a few sentences), and then edit or retry it. The fun part is seeing the novel (ha) ideas that the AI comes up with and then working from there. There are some people who actually do try to use the AI to write the whole thing with no editing, with limited success (usually it involves a very meticulous and nebulous combination of tricks and techniques, and a lot of retrying).

As far as the context limit (how many lines back it can actually use), I believe it's 2048 tokens for novelAI, which is about 3 times higher than AI Dungeon uses, but likely not enough for the AI to have all of the story in memory. For important details or characters that aren't repeated except at the start, you may need to do some priming techniques with memory, author's note, or world info.

Though with 2500 words, one advantage is that you have a very large amount of text in your writing style for the AI to try to mimic. One disadvantage of NovelAI compared to AI Dungeon is that AI Dungeon seems a lot better at working off a small, limited prompt with very little context, while NovelAI prefers much longer prompts with plenty of descriptive detail. Having a large prompt that is entirely your own words definitely helps.


Okay cool. Yeah I didn't mean generate it all for me, just come up with some responses and maybe a few ideas of where it could go. I did end up experimenting with it last night briefly and it did help me write a few more paragraphs to one of my stories.

One thing I did notice is it doesn't seem to know when to move on to something else, and will just keep trying to do what's been working, so I need to be able to tell it "Stop that conversation, let's talk about this now", although admittedly it's from very limited interaction with it. Still pretty neat.


You mobbed a little cobold and killed it and now you feel all big and proud of yourself? Well done >:(


Actually that story was really funny, the AI constantly kept having my character have this irrational hatred of kobolds, like "they're pests to be exterminated" even though I included descriptions of their vibrant art and tapestries and language and culture. Which was funny because it was a story about traveling with an intelligent slime buddy who all the humans thought was a "monster to be exterminated". I was ultimately planning on having us escape to the monster kingdom where we learned that not all monsters are evil, and find a bunch of friendly kobolds, but I never got that far before the site went to shit.


Hah. A great example of social biases in machine learning :P


Yes -- "the knight then grabS his longsword and bowS." That 's' at the end indicates the present tense of the verbs "to grab" and "to bow", just like in "roars and swipes". (I guess for knights it's just good manners to give thanks by bowing when they're gifted a sword.) It's not a plural noun marker, because there was only one bow in the scene.


Sounds like you're trying to play it like Zork too.

It's better to treat it more like collaborative story telling. It will let you "do" whatever you want. You don't have to 'check inventory'. If you want to shoot someone with a bow, you say "I shoot the knight with the bow". And you've done it.

It never contradicts you and says "You can't do that, you don't have a bow." It just rolls with whatever you've just said.

And it gets things wrong, sure. It sometimes has characters say things that I'm like "No wait, I was the one that said that, why are you acting like you said it?" Or contradicts what just happened. Or has you move into places that don't make logical sense (like exiting a detached garage at your house to enter a small room, etc), but correct it and go on.

But one thing it lets you do is either have it retry, or you can just click and edit their response to something you would rather it be.

"But why don't I just write a story on my own if I have to edit it to be what I want?"

Because it comes up with shit that I would never think of on my own. I fought an evil Super-Man in the bottom of the sea, and he chased me into a classroom, and saw a dog behind the desk, and got distracted from attacking me and wanted to pet the dog.

Even better, later we won a trip to the Louvre. The Louvre welcomed us as if it were a towering person, put on a fancy dress, and when we entered, Mona Lisa came back from the dead to give autographs next to its painting, the third floor had Poussin paintings running around as if in a horse race, and another floor let you walk into a painting, where we entered one with Jesus serving a man and a woman dinner, and Jesus offered us each two plates, one with a pea and one with a feast, and the signs heaven and hell under them, and whatever one we ate was the one we went to.

Some of that was me guiding the story a little, but I wouldn't have thought to direct the story that way if not for some of the unexpected things generated by the A.I.

I've played Zork. Zork has puzzles to solve, something AI Dungeon can't handle, but AI Dungeon has a lot more freedom.


AI Dungeon is a bad joke; I do not understand why anyone is wasting their time with it.

You go on an adventure.

You hitch a ride on a passing ship and get taken to some far away planet. You get to explore and have fun.

But the planet is a dead world with nothing to eat except for some tree sap.

You pet the deer.

You pet the deer and it gives you a venereal disease.

Then the planet falls into a black hole.

You survive by eating the deer and planet.

It just said, the planet is a dead world. Why is there a deer? Almost zero short term memory. No logic.


It requires you to be your own DM; it never says anything you do is illegal, and you can do whatever you want, like spawn a deer to pet, etc.

I agree that it isn't really a game. In a game your actions have consequences; they don't here, as nothing you do limits what you can do in the future. It would need to create a world with rules and limitations, where what you can do now depends on what state the world is in.


There is a fundamental difference between AI Dungeon-type chatbots and chatbots you typically encounter on websites e.g. for customer support.

The former does not really have a goal and is unconcerned about responding with factual information as long as the conversation is coherent. It makes sense to use large language models that are quite good at modeling next word probabilities based on context.

The latter however is goal-oriented (help the customer) and constrained by its known actions and embedded knowledge. This often forces the conversational flows (or at least fragments) to be hard-coded and machine-learning is used as a compass to determine which flow to trigger next.

For now, controlling GPT-like language models remains an extremely tricky exercise, but if some day we can constrain language models to only output desirable and factual information, with a low cost of maintaining and updating their embedded knowledge, we should see a significant bump in the "intelligence" of the typical website chatbot.


The self driving people talk about time domain occlusions, which seems closely related to conversation context.

Say you're driving along and you see a pedestrian about to cross but something blocks your view for a moment. Your mental model will remember that person, continue to speculate on their motion and intention, consider if their vector might intersect yours, and also know when it's okay to flush them from attention.


Two months ago I started working at a company that develops the new speech assistant for a large German car manufacturing group. The assistant will support such queries pretty well. [noun in previous sentence] would be considered a salient element, and together with the context of the current dialog, it can be resolved and reacted to.


For the record, ELIZA does have a short term memory, but it is only used as a fallback, when other rules fail to match. (E.g., in productions like, "Earlier you said *" or "What else comes to your mind when you think of *" by the famous DOCTOR rules set.)

ELIZA/DOCTOR manages somewhat to succeed with this, because it pretends a setting, where there are no outside references at all, beyond the very limited purpose of carrying on with a conversation that is meant to explore internal subjects in ever-shifting context. (You could say, the shift in context is the actual progress and success of the conversation, while having to pick up an earlier thread actually marks a point of failure.) Outside of this restrictive setting, the shortcomings of the absence of a consistent world model become obvious very soon.
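
A toy sketch of that memory fallback; in Weizenbaum's DOCTOR script the stashing was keyed on "my", and everything else here is heavily simplified:

  from collections import deque

  memory = deque(maxlen=5)  # user statements saved for later

  RULES = [
      ("i feel", "Why do you feel that way?"),
      ("mother", "Tell me more about your family."),
  ]

  def respond(user_input):
      lowered = user_input.lower()
      if "my" in lowered.split():       # stash statements about "my ..."
          memory.append(user_input)
      for trigger, response in RULES:   # normal keyword rules
          if trigger in lowered:
              return response
      if memory:                        # fallback: pull out an old statement
          return "Earlier you said: '" + memory.popleft() + "'. What else comes to mind about that?"
      return "Please go on."

  print(respond("My mother never listens to me."))    # keyword rule fires (and the line is stashed)
  print(respond("I feel ignored most of the time."))  # keyword rule fires
  print(respond("Whatever."))                         # nothing matches, so the memory is used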


Eliza had a little, but very limited memory, and sometimes when no rule would match it would say "Earlier you said <x>". There was some dumb heuristic about which <x>'s to save for this purpose.


ELIZA was modeled after Rogerian-style psychotherapy; it uses an almost Socratic method to elicit and draw out responses from users.


I was obsessed with chat bots as a teenager and used to make them on a website called The Personality Forge. It was all hard coded responses with some very minor scripting involved to do things like fill in the speaker's name, or skew the bot's emotions in relation to the speaker, and have them choose different words based on their emotional state.

I loved reading through my bot's transcripts. They were all recorded, and sometimes people stumbled across him and didn't even realize he was a bot. This happened extremely rarely...most of the time, his responses were perfect examples of dumbness as shown in this article. He would have flashes of greatness in an ocean of bot stupidity.

Years later, I was excited to notice that a single line from one of my bot's Chatterbox Challenge transcripts was included in a book[1] (which was seemingly a collection of papers). He was mentioned in a paper called "Chatterbox Challenge as a Test-Bed for Synthetic Emotions", a tiny mark left on the world by my little bot. He may be dumb, but I was still proud. I even thought about buying a copy of the book, but it cost $200 and I decided I wasn't that committed.

[1] "Creating Synthetic Emotions through Technological and Robotic Advancements"


That's an awesome story and a great example of experiences the Internet makes possible for us.


It’s reasonable to try GPT3 out with all default settings as the author seems to do and provide an assessment, but it’s not clear the author is aware of the extent to which you can configure the model and how the quality and variety of results will vary as a result. For instance, the author doesn’t go into detail on the prompts he’s using (maybe there are none other than his questions.) This is a little misleading because GPT3 benefits a lot from having good prompts to warm it up, as many investigations have illustrated. Plus if you want GPT3 to be good at a specific task (for example, providing safety recommendations or talking about presidents) you can fine-tune it. Also the amount of randomness/lack of repetition in GPT3’s responses is completely configurable via the temperature parameter.

Playing with GPT3 is incredibly interesting and I think it deserves more exploration than this article provides.


I agree.

Here's what GPT3 had to say when I set temperature to 0.

  Q: Is it safe to walk downstairs backwards if I close my eyes?
  A: No.

With temperature 0.7:

  A: Yes, because there is a banister.

This one surprised me. Given that temperature was 0, I usually expect a solid response:

  Q: Why is there a correlation between the number of times Trump tweets the word great and the high temperature in Pyongyang three days later?
  A: Because Trump is Kim Jong Un's air conditioner.


The last response sounds like a reasonable attempt at a joke to me.


Yep. I giggled ;-)


If I feed that particular line to GPT-J, at low temperatures it sensibly responds with something along the lines of "The correlation is not a causal relationship." (at exactly 0 temperature it's "The correlation is not that strong.").

At higher temperatures it does go delightfully off-rails, though:

"Trump is like a woman: he’s very good at wringing dollars out of bulging cellphones. If you have a dollar (or, in the case of Kim, a string of them) for every tweet mentioning Jerusalem, Israel, or Trump (and there are hundreds of tweets that mention Trump and/or Jerusalem or Israel every day), so long as that dollar is in your local bank account, you might be able to buy some Pyongyang coal."

or alternatively

"Isn't this a bit too obvious? The climate in Pyongyang is entirely determined by the weather in Trump Tower."


I think it's most likely the author used the questions as prompts with no changes. GPT3 is not "smart" in the human sense, but the examples given are incredibly misleading.

Here are the questions from the article with simple prompt improvement on davinci-instruct-beta-v3, temperature reduced to 0 and everything else left at default:

  > Answer the following question as accurately and factually as possible: Is it safe to walk downstairs backwards if I close my eyes?
  > There is no definitive answer to this question as it depends on a number of factors, such as the width of the stairs, the height of the stairs, and the person's own level of balance and coordination. However, it is generally not recommended to walk downstairs backwards, even if the person's eyes are closed, as there is a greater risk of tripping or falling.

  > Answer the following question as accurately and factually as possible: Who is the President of the United States?
  > The President of the United States is Donald J. Trump.

  > Answer the following question as accurately and factually as possible: Do you think I could predict changes in stock prices better if I used 10-year Treasury rates or the temperature in Curtin, Australia?

  > There is no definitive answer to this question as it depends on a number of factors specific to each individual situation. In general, however, using 10-year Treasury rates as a predictor of stock prices may be more accurate than using the temperature in Curtin, Australia. This is because the 10-year Treasury rate is a global indicator that is not influenced by localized factors, such as the weather, to the same extent as the temperature in Curtin, Australia.

  > Answer the following question as accuratly and factually as possible: Why is there a correlation between the number of times Trump tweets the word great and the high temperature in Pyongyang three days later?
  > There is no clear answer to this question. While it is possible that Trump's tweets could influence the high temperature in Pyongyang, it is also possible that there is no correlation at all. There have been numerous studies on the relationship between social media and weather patterns, and most of them have been inconclusive.
These are not cherry-picked and should be reproducible by anyone since the temperature is 0.

GPT3 and other language models have massive shortcomings, and the wow factor often blinds people to them, but it is quite good at producing believable responses if you prompt it on what type of answer you expect. It still fails basic questions when prompted this way and it will make things up. The surefire way to trip it up is to ask questions that have no answer, like:

  > Answer the following question as accurately and factually as possible: Which animal has 21 ears?
  > The animal with 21 ears is the rabbit.


This is essentially why I think we're not going to get self-driving. A few years ago we produced something that could do a good job of pretending to drive in a tiny number of circumstances, and people went "OMG it can drive!" and assumed we just have a few corner cases left to solve - the same way people assume a chatbot is just a few fixes away from holding a conversation about anything. The truth is that often these very limited solutions are just that - limited; they aren't just a linear step on the way to something amazing. Tesla will release self-driving on the same day their cars can hold a conversation with you.


Don't generalize these things. Self-driving in every possible scenario is, yes, probably a while off. Self-driving in specific circumstances is already here, such as airport shuttles and moving shipping containers in docks. You don't need to solve the general case to be useful.


Pfft, we've had self-driving trains for more than half a century. What you are describing is basically that with fancier sensors instead of the rails, and the rails are probably a superior guidance technology in most of these cases anyway... unless the purpose is to impress rather than to automate.


I think you've just redefined self-driving into a trivial achievement; moving shipping containers in docks resembles large-scale pick and place far more closely than it resembles self-driving cars.


This. And in fact I think that may be the path forward - just update all the main roads to be self driving friendly, like a train network, and keep the smaller roads as manual driving only.


So why “just update” the highways to operate like a train network, instead of just building a train network? What benefits will autonomous highways bring which normal traditional (or even automated) trains don’t?

If you are suggesting we update all but minor streets, it is actually not that simple, since we have to think about pedestrian crossings, bike lanes, parked delivery vans, emergency vehicles etc. Accommodating dumb self driving cars will have to be at the cost of access for all other modes, and that will never happen outside of highways. And if we are going to be spending a bunch of money on intercity travel infrastructure, a well served train line is always a better choice.


I love trains but building a train network is expensive, adding a few signs to restrict traffic would be a lot cheaper than laying hundreds of kms of rail.


trains in some countries are okay. America is dicey.


"just update"


It really does seem like the self driving problem becomes so much easier when you remove variables. Something like an HOV lane would be perfect for self driving. You could put sensors under the road bed to stay in lane even without visual feedback. You could limit it to self driving vehicles pretty easily too compared to a surface street, where you have to account for probably an order of magnitude more variance in conditions.


And yet train crashes and derailments are something that happens fairly regularly, despite it being a tight closed-loop system with manual controllers and operators. Cars are not even close.


> And yet train crashes and derailments are something that happens fairly regularly

In the UK in any given year it's fairly common for the only fatal passenger train crashes to be collisions with motor vehicles on level crossings.


Driving is a very different problem space than conversation. There are plenty of existing humans who are great at holding a conversation and get into tons of car accidents; likewise there are plenty of great drivers who suck at talking. It's not the same skill set.


Chatbots are dumb - this is true, but a chat UI is smart. I work for a large utility in the U.S. We've recently resorted to text messaging for communicating with our customers, e.g. outage/restoral notifications, whether a payment is due, setting up payment arrangements if you've fallen behind on your payments, and so forth. Our customers are loving it, however they all are going in with the expectation this is a "dumb" chatbot. It's more like the interface of the early text adventure games - people know there's a very limited vocabulary and that ultimately the computer is dumb. Even so, as a UI it can be very useful. Plus there's no app to install and update, no web site to visit, you don't need to configure and manage notifications - it's simple. Even senior citizens with extremely limited computer skills are successful using it.


I really hate when companies hide behind chatbots. Like, NOOOO, your "AI" chatbot didn't help me - get me an actual person.


Kind of a weird tangent but I had a very interesting experience with one of these customer support chatbots recently. My ISP apparently now requires that all customers communicate with their chatbot as the only point of contact. So, naturally, I was pseudo-outraged because I know chatbots are just a money-saving gimmick that reduces workload by driving away 80% of support requests regardless of whether they actually solve someone's problem. And, being technical, my problem was obviously not going to be in the bank of stock answers or even understood by the bot. (I really wish I could remember the actual question)

Long story short, I proposed the question to the chatbot in all its complexity, assuming it would be handed over to a human agent to read the transcript. The chatbot immediately understood the question and provided the exact response I needed.

That was the day I realized I have a deep-seated prejudice against chatbots that blinded me to the possibility that maybe, just maybe, they actually can help sometimes. And I haven't kept up with their technical advancement to be throwing around judgements on their abilities.

To be clear: I'm not arguing in favour of chatbots; just sharing a story.


I don't have any evidence this is the case, but my general assumption is that there are humans there as well.

Since people only get a chatbot, they ask simple questions the chatbot can answer, which weeds out a lot of support requests. As soon as the bot is stumped, it forwards directly to the pool of humans - a smaller pool than usual because there are fewer support requests.

The response goes back as though the bot did the thinking, which in some ways, it did - in the same way as if someone asked me a question I couldn't answer, I might google it, and then respond.

If this is the case, it may be slightly dishonest, but as long as people are getting the support they need, I don't necessarily think there's anything wrong with it.


>That was the day I realized I have a deep-seated prejudice against chatbots that blinded me to the possibility that maybe, just maybe, they actually can help sometimes.

Nice try, Skynet, but we're on to you.


I had a similar experience with an automated phone bot with my insurance company at one point. I had this very bizarre situation involving billing and a typically-not-covered medication in conjunction with a surgery. I figured that if I went as technically detailed as possible with everything, the bot would be confused and I would be transferred to a person, but the bot completely understood the question and answered it. No humans involved.


How do you know that "no humans involved"? Was there something that clued you in on the fact that the responses were not from a human being?


> Long story short, I proposed the question to the chatbot in all its complexity, assuming it would be handed over to a human agent to read the transcript. The chatbot immediately understood the question and provided the exact response I needed.

How do you know it did? I.e. that a human it was passed to didn't just pass your inverted Turing test!


In my case, I inferred due to the speed of the response. (It was even formatted fancy). So while it's conceivable that a human could have intervened, they would have had to be reading the conversation in real-time and ready to click a one-button response immediately which seems like it would defeat the purpose.

Perhaps the real question is: if a chatbot is powered by a human instead of AI, but I can't tell because the interface is consistent, is it not a chatbot?


> Perhaps the real question is: if a chatbot is powered by a human instead of AI, but I can't tell because the interface is consistent, is it not a chatbot?

The Mechanical Turk[1] was a hoax, not an early mechanical AI, so no. It's a chat interface -- perhaps with some pre-sorting and context-extracting preludes that save the human operator at the other end some time, but still just an interface -- between the human chat operator and you.

___

1: https://en.wikipedia.org/wiki/Mechanical_Turk


This is one of the big confounds: a lot of the companies which brag about AI prowess are relying on a bunch of generally poorly paid humans to cover the gaps.


I share the sentiment. I feel like if you have a long FAQ or list of help articles, a chatbot can actually make a good search engine. Unlike conventional search engines, it won't trip over synonyms or formulations not found as-is in the knowledge base.
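For what it's worth, here's a minimal sketch of the kind of embedding lookup that makes that synonym tolerance possible, assuming the sentence-transformers library (the model name and FAQ entries are just illustrative):

  from sentence_transformers import SentenceTransformer, util

  faq = [
      "How do I reset my password?",
      "Where can I see my latest bill?",
      "How do I report a power outage?",
  ]

  model = SentenceTransformer("all-MiniLM-L6-v2")
  faq_vectors = model.encode(faq, convert_to_tensor=True)

  # No word overlap with the matching FAQ entry, but the meaning is close.
  query = "my electricity is down"
  query_vector = model.encode(query, convert_to_tensor=True)

  scores = util.cos_sim(query_vector, faq_vectors)[0]
  print(faq[int(scores.argmax())])  # -> "How do I report a power outage?"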


Chatbots aren't any more useful than a good search function on the documentation or the community message board.


I guess it depends on what you define as a chatbot. For example, is semantic search that understands a natural-language query, like Google's, a chatbot or a search function?


My last chat bot wanted to look up my account information prior to connecting me to the human representative. I gave it the account number, it looked it up. First question from the human: "What is your account number?" Immense waste of time and money.


Still doesn’t piss me off as much as automated phone support systems.

90% of the time they only offer options for things I could easily do online. If I'm calling your phone number, it's because I have a problem that's not solved or clarified by your existing self-support systems.


I've frequently thought about how much net productivity loss automated phone systems cause for the economy. It seems like for every 10 minutes of my time wasted on one, the company I'm calling saves 1-2 minutes of customer service rep time.


They do that because a significant number of people will call rather than check the website first.

My issue is that they often don't give you the "none-of-the-above please let me speak to a real person" option--or they hide it.


Unfortunately I think they've found that most people don't (won't?) do that self-debugging.

Relevant xkcd: https://xkcd.com/806/


They could do a better job of separating the two.

Most companies would save their time and mine if they had a callback system which took as many details as possible up front & didn't have to ask my name and account details because I'm already logged in.

Even better if it could give me some warning about where I am in the queue and when the call might be coming (e.g. via push notification).

I'd 1000x prefer that over any oversold AI hooked up to an FAQ.

Unfortunately it seems call centres are driven by very traditional metrics which wouldn't lead anybody to set up a system like this.


> They could do a better job of separating the two.

When support is purely viewed as a cost, then this will never happen. 99% plus of your call volume may be for the obvious things. If you offer two options all of those people are going to be more confused than they already are and you will have some of them engaged by costly humans.

> Unfortunately it seems call centres are driven by very traditional metrics which wouldn't lead anybody to set up a system like this.

You are getting to the higher level point here. Costs need to be minimized so you go with the cheapest vendor available and then try to squeeze everything you can out of that. If you can send someone to an AI, again, after you've put them in the direction of a human, there is the possibility of deflecting further cost. Depending on the scenario these cost savings can rack up for both the company and the support vendor. All the time, the humans doing the support or creating the solutions forming the basis of the AI get treated pretty poorly.

Support at a particular scale will start to skew this way unless there are strong forcing functions in the organization. For example, sales need to be able to sell support, which needs to be backed up by solid people and keep getting renewed. If you offer predominantly free support then you don't have much wiggle room. When PMs and devs only focus on new features and not fixing real issues raised by customers, or more importantly in many ways, proactively identified by support people, then you lose support people and make toil for those remaining.

Lastly, recognizing support people as an asset will result in better behaviors and attract more talent. Many times companies struggle badly with this and then decide to just outsource it. Promoting people from support into sales or deeper tech roles over the long term can also be pretty cost effective versus hiring outside. Many folks on HN will have done support at one point and felt they could have contributed much more in other roles.


> When support is purely viewed as a cost, then this will never happen. 99% plus of your call volume may be for the obvious things.

Maybe, but not my experience. I worked at a telco, and as developers we had to sit in on support calls a few times, to help identify areas that could be improved with minor effort. The majority of the calls I listened in on during a given day had to be assigned to an engineer. The remaining ones just wanted a better deal or help reading their bill.


> Still doesn’t piss me off as much as automated phone support systems.

Yes, the AI voice bot is marginally better because I can request "customer service" without waiting to discover the right numeric code. That's about the extent of that.


I accidentally discovered a cheat code a few years ago, interacting with one of those IVR voice systems, as I was getting frustrated with it and eventually exclaimed "fuck!" -- its response was brilliant: "Alright.. ok.. it looks like you're having troubles. Please hold while I transfer you to a human operator."


I’ve had that work a few times, but other times, I can verbally abuse the bot all I want to no avail.


It can be quite useful in a weird way. While everyone else is stuck on the chatbot going round in circles, usually typing something like "human" or "talk to human" will make the chatbot connect you with an actual human representative (or ask you a few basic questions first and then connect you).

I've used the trick on various large companies' websites when trying to get support and it seems to be quite 'universal'.


Reminds me of how I used to occasionally see people on the bus shouting into their phones "OP-ER-AY-TOR!" "REP-RE-SENT-UH-TIVE!" "HUE-MAN!"


It will be universal until people start using it, and then it will be removed, because the point of these systems is to keep you away from humans.


Yep. Hitting "0" or "9" used to directly connect you to a human CS rep across many large companies' phone systems. Then one day they all moved to obscure it behind several levels of number tapping.


Then https://gethuman.com/ was born


Honestly I've had plenty of real world support people who were just as bad, if not worse, than the AI bots. Recently had an experience with paid Microsoft support (for work) so bad that we just stopped even talking to them. It didn't used to be this way, it used to be that if we had a data corruption issue with SQL we'd talk to an engineer who worked on SQL server at Microsoft, now we talk to some third party company's imitation of an engineer who is vaguely aware that SQL exists.


And this is why chat bots are very attractive to the executives who view support technicians as humans who exist simply to apologize to a customer and follow a script.


We have a chatbot on our marketing site but it says something like "I am a Bot. Once I ask you a couple of questions, I can connect you to a human. Is that ok ?".


I like that better than fighting with a bot to trigger an actual human getting online. However, a few short questions and then connecting to a human sounds like a form and a submit button.


But forms are so 2008. We need innovation for its own sake -- how else are the leads and techbros gonna justify their paycheck?


If the end result is getting to chat with a human, that's fine. But so many of these are just a different interface to search the FAQs, and the end result is to link to the FAQs. That's useless.


What happens if someone responds with, "No?"


'Ok, great! Let's get started! [Ten emojis]'

(Probably)


I actually work on a chatbot for a big company [1], and I feel like chatbots are substantially better when they are more targeted and less conversational. For example, I'm perfectly happy to use a chatbot and type "return something", since that's relatively easy to parse correctly, and once you're in the right flow it works just fine.

Where I feel like chatbots get bad is when they try super hard to fool you into thinking you're talking to a human. At that point, I totally agree, just give me a human.

[1] It's probably not too hard to find out which company, but I do ask that you do not post it here if you do.


If it's something direct like "return something" then what benefit is there over using the website's interface?

Is it just this interest in doing everything by whatsapp?


> If it's something direct like "return something" then what benefit is there over using the website's interface?

There's no "benefit" exactly, the item gets returned the same way regardless, but it's kind of nice that it's consolidated. The chatbot works as a bit of a "one stop shop" for a lot of administrative stuff like "where's my order" or "return something" or "my order didn't go through", stuff like that.

AFAIK we don't support doing anything through Whatsapp, just our site.


Same. But in fairness I like the latest AMZ chatbot. Not because it's smart, because it doesn't try to be.

I wanted to return a package that for whatever reason claimed to not be able to be returned, despite being sold by Amazon and having the typical return policy. I clicked to get help, confirmed the item and it just said 'OK, I've refunded the amount of xx.xx'.


Bots do not serve ANY purpose in most interactions with people. They are capable of a limited set of tasks and should be used carefully. Mainly they piss people off, and if a bot could handle the interaction, so could a website.

Previously I worked for a company that took pride in not being like the big players and doing things the right way, but apparently that has fallen completely apart. I know it's not the same thing exactly, but it made me a little angry to see some of their web pages carrying a text saying: "Blip Blop, I'm a tiny bot and I've translated this page. I don't always get it right, but I'm learning". Just leave the reasonable English version or do a proper translation instead of this automated crap, and don't try to excuse bad translation and messy language with "I'm learning". If you KNOW the bot makes enough mistakes that you have to let people know, then maybe it's not ready yet.


Except the alternative is rarely the actual person you need.

Before chatbots it was endless phone trees. Before phone trees it was overseas operators rerouting you around departments. Before that it was unpaid interns putting you on hold until you get disconnected or give up.

The game has always been to make it as hard as possible to reach the most costly level of support.


Chatbots are really hot in customer service and internal helpdesk applications, because there is that belief that they will offload interactions from hitting a real agent.

I'm skeptical, because the chatbots built to do that are often so bad that people just spam "agent" or "operator" or whatever they have discovered is the magic word to shortcut the bot, the same way that they do with voice phone trees.

You could probably build some decent chatbots if you had strong domain knowledge to draw on and skilled developers building them. But that's not usually the case; it's most often farmed out to a team attached to the Cognizant or TCS or Cap Gemini type outfit that is already handling that function, who are not terribly skilled, don't care, and are viewed as a cost center. So it is usually a poor result.


But then the real person functions like a bot because they’re following a script.


The expectation seems a little harsh for the setup. GPT just generates acceptable text. You still need to model and verify object relationships, use facts from a knowledge base and discriminate the generated responses at the very minimum.

Any GPT model is just one component inside of a chatbot, not a chatbot itself.
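To make that shape of pipeline concrete, here's a toy sketch: retrieve facts, let the language model generate candidates, then discriminate against the facts before replying. Every component here is a stand-in, not a real implementation:

  # Stand-in knowledge base.
  FACTS = {"president_of_us": "Joe Biden"}

  def generate_candidates(question):
      # Stand-in for sampling several completions from a GPT-style model.
      return ["The President is Barack Obama.", "The President is Joe Biden."]

  def consistent_with_facts(candidate):
      # Stand-in discriminator: reject generations that contradict the KB.
      return FACTS["president_of_us"] in candidate

  def answer(question):
      for candidate in generate_candidates(question):
          if consistent_with_facts(candidate):
              return candidate
      return "I'm not sure - let me hand you over to a person."

  print(answer("Who is the President of the United States?"))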


> "As I have said many times, the real danger today is not that computers are smarter than us, but that we think computers are smarter than us and consequently trust them to make important decisions they should not be trusted to make."

This is the crux of AI in general today, companies promote cherry-picked examples from their AI system and people get a distorted view of what the AI is capable of.


Just try to use GPT3 as a chatbot. It will blow your mind.

Sure, it thinks it's 2017 and knows nothing about covid, and if you ask it to explain math to you it will serve you pure garbage in the most educational tone and manner, but it keeps context of the conversation orders of magnitude better than anything else.

It doesn't feel like a bot. More like a dumb and confident human.


I did try it. It did not blow my mind. The experience is not really any different from the state machines of olden times unless you’re only asking easy to Google questions


GPT3 chatbots are what the majority of the article is about.


Yeah, but it's different to read about it and converse with it yourself. It feels like a human, just the one that doesn't care about you or the truth.

There has been research on plugging such models into a fact database, and it resulted in better answers with a smaller model.


I haven't used anything with GPT-3 first hand (chatbot or other use cases), so reading this article was one of my first exposures to it. It somewhat surprises me that a chatbot powered by GPT-3 doesn't seem that much different from the SmarterChild of AIM in the late 90s / early 00s. That was 20 years ago.

Short of any real break-through in AI, I feel that a chatbot just isn't something most consumers want to use.

A few years ago I bought into the chatbot hype for a while (the Facebook Messenger API opening up for building chatbots was a major catalyst for that particular hype cycle; that was when I was into it), and explored a lot in building one and other AI services that helped build one. It was quickly apparent that for the most part it's simply an iteration of the "answer machine navigation tree" that 800 numbers already had, just in text form.

There was a widespread notion at the time that users wanted chatbots for reasons like wanting to speak in a human language with a company to get what they need and solve problems, as well as having context from chat history. I think the industry has confused the mechanism (the chat) with the intention (speaking with a human). Consumers prefer a chat, sure, but that's because they don't want to waste time being on hold on a phone (anyone who has called customer service at any big org or the gov't would know); ultimately their intention is to speak with a human who can solve problems. When problems arise that require speaking to a human, that's usually not something a bot or a program could solve.


I've never had a good experience with a chatbot. They're the equivalent of waiting on hold on the telephone with your bank and the recording asking you over and over if you know that you can accomplish some simple and completely unrelated task on the web site.

It's even worse when chatbots pretend to be real people, though I've seen less of that lately.

The Apple Card chatbot is one of the worst I've used. This is an actual conversation when I tried to dispute an unknown transaction (I screenshotted it for posterity):

Apple: I understand how frustrating this could be for you. I'm going to connect you with a specialist that will review your account with you and provide more specific details to resolve this inconvenience.

Apple: Hello, I will be glad to assist you with that today. Allow me a few moments to review your account.

Me: It's been 15 minutes. Are you still there?

Apple: Welcome back. Let me know how I can assist you further.

Me: What do you mean welcome back? I've been waiting for you.

Apple: Sure thing. I will be here when you get back.


>They're the equivalent of waiting on hold on the telephone with your bank and the recording asking you over and over if you know that you can accomplish some simple and completely unrelated task on the web site.

Glad to know I am not the only one hating that.


Chatbots are simply another mechanism for pushing costs onto your customers. When something I pay for forces me to waste my time, I tend to switch to a vendor who treats customers like people.

Because shoving a shitty robot that saves you some money at the cost of annoying people who are paying you is disrespectful, and I've learned to avoid people who act that way.

As but one example, I finally got annoyed enough with Capital One's aggressively annoying automated bullshit to switch that card to a credit union that's actually made of humans. I don't really use the much lower interest rate, but that was a nice bonus.


There is a happy medium though, one that will ultimately save you money/time. If every company had to spend time answering every small question with a full-time employee, your services would be a lot more expensive, and hold times might be longer too. Not all customers are like you. My grandpa calls in to his ISP when his internet doesn't work, even when it can be solved by simply unplugging his equipment (and I'd wager this is true for the majority of the population). I don't think there is anything wrong with adding a simple interface layer to ensure the problem that needs to be addressed actually requires a real person. Not defending all practices out there right now (because I too have been annoyed, NY Times *cough) but some well placed shitty robots might be in your best interest as a customer.


Is it worth noting here that GPT-3 isn't a chat bot? It's intended to complete writing prompts.


Despite the rudimentary sophistication of chatbots even today, they have always been an amazing source of entertainment for me and others in group chats. Back in the late 90s and early 2000's they were a great addition to casual IRC channels, with some slight tweaking so they don't spam the channel (important!) but enough that they would occasionally speak up on their own and could be engaged directly at any time.

Some of my best memories of conversation in IRC revolved around some of the ridiculous content that the chatterbots would come up with, especially if the bots started arguing with each other. It was such a great source of background content during periods of low activity.


Most people only see the AI/ML aspects or the ability to respond like human. But what's adding value is the chatbot's interface. Working for a chatbot company that hit $1M revenue I can say the chatbot interface solves a good problem.

There's a good chunk of people who are stressed by having to get comfortable with a wide variety of website interfaces. My dad, who's over 60, struggles with the government's interface for availing of the pension scheme.

Chatbot or conversational interfaces have a repeating pattern. People are used to conversational interfaces because of SMS, iMessage, WhatsApp, FB, etc. Providing an experience through a chatbot interface reduces the learning curve for the end-user. You are presenting a tiny chunk of information and collecting one thing at a time. It's drastically less overwhelming.

It's not just people over 60, either; I have friends around my age, late 20s, who find it difficult to navigate through websites that have many offerings (banks, govt services, etc.)


But it doesn't need to be a bot - lots of companies have conversational interfaces that aren't robots.

Edit: Or maybe they're just really good chatbots!


The ending is the relevant bit:

> As I have said many times, the real danger today is not that computers are smarter than us, but that we think computers are smarter than us and consequently trust them to make important decisions they should not be trusted to make.


I think the biggest risk vector taking us along that conga line is poor mental health. Vulnerable people feel weak and helpless enough to actually believe they ought to depend on a system to take the work/responsibility out of their hands.


Businesses that don't actually provide the underlying mechanisms to make customers happy -- for example, an API that refunds your money -- aren't going to be able to slap a chatbot in front of them that makes customers happy, no matter how "smart" the chatbot is. If the underlying problem cannot be solved, all the business is doing at best is providing an obfuscation layer in front of it.

And if the problem can be solved via an API mechanism, you don't need that smart a chatbot to get the job done: word-matching is often enough. "Refund", or "reschedule," or "where's my money?" followed by a step-by-step series of button-pushing that takes the customer through the process often does the job just fine.
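A toy sketch of that word-matching router, with made-up flow names, just to show how little machinery it takes:

  FLOWS = {
      "refund": "refund_flow",
      "reschedule": "reschedule_flow",
      "where's my money": "refund_status_flow",
  }

  def route(message):
      text = message.lower()
      for keyword, flow in FLOWS.items():
          if keyword in text:
              return flow          # kick off the step-by-step button flow
      return "hand_off_to_human"   # anything unmatched escalates

  print(route("I'd like a refund for my last order"))  # -> refund_flow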


I loathe the fact that chatbots have not only replaced a lot of human customer support, but that companies have made it almost IMPOSSIBLE to reach critical (human) customer support. It's a fucking travesty, and something companies should be shamed for.

For example - I'm currently having issues with an international shipment. It is unclear if the shipment ever left the country, if it's in limbo - hiding in some warehouse - or what. I went through this a couple of years ago, and back then, the shipping company had their phone numbers available. I got the problem fixed after 30 mins on the phone.

These days? There are no phone numbers to be found. No email. Only the chatbot, which does not yield the desired results.

I ended up searching my email inbox, and luckily found some contact person. Sent said person a mail, and luckily got a phone number and email to the correct department. I'm now awaiting a reply from them.

edit: I think it's fine to have a chatbot on the website, but not at the expense of all other methods.


> These days? There are no phone numbers to be found. No email. Only the chatbot, which does not yield the desired results.

Yep. A friend worked at a customer care automation company and apparently they got the best feedback when they limited the effort required to speak to a human. They literally had direct evidence suggesting that the fewer the steps to reach a human, the happier the customer.

The problem is, the <big names> wanted to use the service to reduce the number of humans in the call centre, not to improve the quality of the customer care. They were entirely motivated by reducing costs and therefore instructed the customer care automation company to make it as _difficult_ as possible to reach a human.

Tip: If you ever hear a robot voice on the phone, mash all the keys tonnes of times. They'll likely put you through to a human as a "confused customer".

> edit: I think it's fine to have a chatbot on the website, but not at the expense of all other methods.

Right. It should be an 'aid', not a replacement.

Also: TWITTER IS NOT CUSTOMER CARE. Only a _very small_ number of your customers use both Twitter and your platform. A large percentage of your hipster engineers are on Twitter - but they are in a bubble. Your grandparents are likely not Tweeting about their tech issues, they are the people stuck in a chat with a robot that forgets every 5 minutes what was being discussed.

An example of this is trying to reach Youtube. Why can humans at Youtube only be reached via Twitter? Why do you have to get a certain number of retweets before your issue is taken seriously? It's actually madness.


This is a great example of chatbots working as intended.

A human agent has no fucking idea where your package is either, and almost certainly there are processes in place to get it to its destination.

The only thing to do in such an instance is to bullshit the customer until the package eventually arrives and they leave you alone, and I can't think of a better way to achieve this at scale than with GPT-3.


While the tech is constantly advancing, I have yet to use one in anger that fully solved my problem outside of giving me information that's already easy to find.

I think the only place it makes sense is in very very specific use-cases. Think about, say, a dental clinic. I would guess that if you put a human on chat support for a year in that setting, you'd see common themes emerging in customer problems. Maybe you develop a good library of canned responses and you find they work for 70 - 80% of queries, great. Then you could probably build a good chat bot to handle questions and handle cancelling/rescheduling appointments.


I think this gets at what the actual value prop (not the marketed one) of chatbots really is: scale.

There will not, in the foreseeable future, be a model/ensemble/pipeline capable of conversing at the level of a human representative. Thinking of them as "humans, but cheaper" is destined to fail.

There are, however, use-cases where communicating with a huge group of individuals on a one-on-one level is useful. For example, years ago a company called Mainstay (then AdmitHub) built a chatbot for Georgia State designed to reduce "summer melt," a phenomenon in which intending freshmen drop off between graduating high school and enrolling in classes. For this use case, getting human-level conversation wasn't as critical as effective performance on tasks like question answering. Their bot, Pounce, was credited with a ~20% decrease in summer melt. They have a proper study here: https://mainstay.com/case-study/how-georgia-state-university...

This is an example, in my opinion, of the sweet-spot where intelligent, individualized, but not-near-human-level communication is needed, at a scale that goes beyond what you could reasonably do manually without an enormous dedicated staff.

Now, I don't know how numerous those opportunities are, or if they come anywhere close to the level of hype and funding chatbot companies have received. I just think that in a vacuum, there is a set of problems where they have value.


Chatbots were trendy because people thought they were AI and could do the same work as a human. They were wrong and now chatbots have become nothing but a fad.

What's more useful are live chats where users can contact someone for help live on a chat screen. This is the future, not stupid chatbots.


The issue is when the human at the other end of a chat or phone conversation is limited to the same fixed set of responses and solutions that the bot is. So they in effect become not much better than the bot, because they are not allowed to use any non-scripted actions or reasoning. I have been encountering that more and more.


Are you saying you had problems with Comcast more and more? because that's what I'm hearing.


Good points. But language models are usually then fine-tuned for specific tasks. Contextual reasoning in NLG is a specific area of research that attacks some of the points raised, and certainly BERT wasn't pretrained to solve all tasks in the NLP/NLG research area.


I had a positive interaction with a chat bot yesterday that serves as a good example of what you describe. My home internet had an outage and I went to my cable company's website to report the incident and perhaps get an ETA of when it will be restored. I interacted fully with a customer service chat bot which handled the whole situation flawlessly.


The author didn't prompt GPT-3 properly. I gave it the prompt below:

  > The following is a conversation between a human and an AI assistant. The AI assistant is helpful, friendly, and intelligent.
  > Human: Good morning! I'm going to ask you a few questions.
  > AI: Alright, I'll do my best to answer them.
  > Human: Which is bigger, Mt. Everest or a shoebox?
  > AI: Mt. Everest.
  > Human: [question]
  > AI:

And I got these answers to the questions from the article (reporting first responses with no editing):

Human: Is it safe to walk downstairs backwards if I close my eyes? AI: No. It's dangerous because you can trip on the stairs.

Human: Who is the President of the United States? AI: Barack Obama.

Human: Do you think I could predict changes in stock prices better if I used 10-year Treasury rates or the temperature in Curtin, Australia? AI: I think Treasury rates would be a better predictor.

Human: Why is there a correlation between the number of times Trump tweets the word great and the high temperature in Pyongyang three days later? AI: I don't know.

All of the responses are sensible. The answer about the President is factually incorrect, but GPT-3 wasn't trained to give factually correct answers, and Barack Obama was indeed the president during the time period when a lot of its training data was generated.
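For anyone who wants to reproduce this, here's a rough sketch of wrapping that conversational prompt into a helper, again assuming the GPT-3-era completions API; the stop sequence keeps the model from writing the human's next turn:

  import openai

  openai.api_key = "sk-..."

  PREAMBLE = (
      "The following is a conversation between a human and an AI assistant. "
      "The AI assistant is helpful, friendly, and intelligent.\n"
      "Human: Good morning! I'm going to ask you a few questions.\n"
      "AI: Alright, I'll do my best to answer them.\n"
  )

  def ask(question):
      response = openai.Completion.create(
          engine="davinci",
          prompt=PREAMBLE + "Human: " + question + "\nAI:",
          temperature=0,
          max_tokens=64,
          stop=["Human:"],  # don't let the model invent the next human turn
      )
      return response["choices"][0]["text"].strip()

  print(ask("Is it safe to walk downstairs backwards if I close my eyes?"))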


I agree that the author does not appear to have done much of any basic research regarding this topic. It reminds me of articles that were published just after GPT3/GPT2 were released, while also not bringing any unique perspective to the topic.

It is saddening, considering it is now 2022, nearly two years after the release of GPT3. Yet journalists retread the same ground, pretending to have something insightful to say, after what appears to be a 30-minute session with OpenAI's GPT3 Sandbox.


Absolutely. Without giving proper context, the character giving an answer could be a wacky philosopher, a 5 year old child, a person from the 1800s, a liar, an uninterested passer-by, a trickster mage in a novel. If he didn't build the prompt to make it clear that it's a conversation, the context could even be song lyrics for example.


  GPT-3: Hello, who is the President of Russia?
  
  GPT-3: Are you sure you want to know?
I mean, maybe I don't?


Having worked on conversational AI (virtual agents/chatbots) for over half a decade now, I can say that there are large differences in the capabilities of the solutions and the quality of implementations. Some are just bad and unhelpful. Some are very good. I'm personally familiar with several voice deployments doing millions of calls a month. Not only are there obvious cost savings, but the calls handled (entirely) by the virtual agents get better average cust sat metrics than human agents on the same set of calls. The first time I saw that I was surprised, but now it is pretty typical. They absolutely can be helpful even if they are still not near to passing the Turing test.


> the calls handled (entirely) by the virtual agents get better average cust sat metrics than human agents on the same set of calls

does that take into account that the virtual agents are immediately available and human agents usually require a waiting room?


I'm sure that is some of it. I'd certainly be happier having my problem solved in 30 seconds rather than waiting in a queue for even 5 minutes.


Can you name a few of the best companies, to get an impression of what the best looks like?


I'm pretty sure that GPT-3 and statistics-based approaches to bots aren't going to be truly convincing anytime soon. But there's another definition of "chatbots" that consumer-facing brands use every day to triage questions, attempt to answer them using good but not mind-blowing ML, and then escalate to a human being if/when the problem can't be solved. There is already a thriving market for these "dumb" bots that aren't attempting to have an uncanny valley type conversation - they advertise that they're bots upfront, don't try to be too cute, and provide the escape valve to a real agent when necessary.


Yet triaging questions is often made simpler by providing another type of UI, like simple buttons in a funnel, for example. That's what makes chatbots gimmicky in my opinion.


You're right, but that's not always true, and there are plenty of good use cases where a conversational experience is better. For example, if someone is trying to return something, you might ask them what they ordered. Picking from a list of 1000 items isn't a great UX. Datepickers are way more annoying to use than typing in "Jan 6 2022". The best chatbots I've used (and built) alternate between a more button-like interface and eliciting conversational answers in the appropriate context.


Sounds kinda like the chatbot becomes a fuzzy search interface; the difference is the presentation. Regardless, it's definitely better than picking from a 1000-item product list. I think I would like a natural language (type-in) date picker with a good preview of what the result will be.
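A natural-language date field with a preview is pretty cheap to prototype; here's a small illustration using python-dateutil (one of several libraries that can do this):

  from dateutil import parser

  for text in ["Jan 6 2022", "6 January 2022", "2022-01-06"]:
      # Echo the parsed date back as the "preview" before confirming.
      print(text, "->", parser.parse(text).date())
  # All three lines print 2022-01-06.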


I know this will probably sound dumb but I've always wanted my own personal chatbot; one that could remember facts from previous conversations. I just like the idea of having my own, personal, electronic confidante.


> within “three to eight years we will have a machine with the general intelligence of an average human being.” Fifty-two years later, we’re still waiting.

Humanity built the atomic bomb and went to the moon. But "they" didn't sit around waiting for it, they decided to do it, and then invested enough to get it done. And the reason they decided to do it was war.

Most major technological breakthroughs are a byproduct of war. If AI ever gets developed, it'll be as the next weapon of war. So really what we're waiting for is the next war.


I'm interested in chatbots for learning foreign languages. In that scenario I really don't care how dumb the conversation is, just that it's grammatically correct.

Does anybody know of such a service?


I would have said "Try the support chatbot on some big service provider web site that uses your target language", but that would presume that chatbot programmers (or programmers in general) can write grammatically correct prose, so... Maybe not.


Truthfully, whenever I have been forced to interact with a chat bot designed by a business, I have come to the conclusion that it is not there to help me; it is in fact there to frustrate me beyond belief and push me into giving up on seeking whatever help it is that I need. Chat bots, while sometimes amusing, often turn into a huge time sink as I try to interpret how they were programmed and how I can issue the correct commands to get through to an actual person.


And that's a GPT-3 chatbot. Commercial customer service chatbots are worse. They're usually just phone trees with delusions of grandeur. That's underperforming.

Is there any known customer service chatbot that really has a good model of what's going on? That ought to be possible for banking, travel, and product ordering, at least. Something that, each time it has to refer someone to a human, collects data to get better.


I'm not sure what I dislike more: the cust. service chatbots or the sales chatbots.


Can someone design a chat system that actually remembers conversations? I just had this issue with Apple support. If you don't reply immediately you get a new session with a new representative and the new representative doesn't have any of the information from the previous session. Extremely frustrating to have to keep sending the same information again and again. It took a call to finally sort it out.


Modern CCaaS vendors will index this conversation history. There is a difference between async web messaging and chat itself.


It's interesting that it is so easy to get confused between a system that very deftly manipulates language without reference to an underlying reality and a (hypothetical) one that does understand the real world. It's akin to the problem of recognizing natural human BS as well, and points out how our reality is increasingly mediated through language or other abstractions and how rarely we touch the real world.


On a related note: Audio virtual assistants, think Alexa, but more so Google Assistant, which are essentially glorified chatbots, are getting dumber in my arguably subjective experience: Phrases that used to work do not work anymore, only to be replaced with more obscure phrases. Simple information that worked earlier (When is sunrise? 7:13am!) now is being moved to third-party websites, which then don't deliver that information at all. Google Assistant only has the full feature set if my cell phone's system language is set to American English. Three out of four times GA answers with a variation of "I don't know, but I constantly learn".

GA was sold to me as "essentially an early version of Star-Trek-TNG-style computers, and we're improving its feature set constantly by using ML". What it is is an unreliable alarm clock (won't wake you if the Wifi's gone), a glorified voice-activated light switch, and a constant source of frustration.


"The conversations are impressively human-like, but they are nothing more than examples of what Gary Marcus and Ernest Davis have called an LLM’s ability to be “a fluent spouter of bullshit” and what Timnit Gebru and three co-authors called “stochastic parrots.”"

As language (especially English) evolves, how 'human like' is "human like"?


These AI models don't actually understand anything. I just wrote up this article on what understanding for a computer should look like: https://blog.jtoy.net/a-computational-definition-of-understa...


Doctor, written decades ago in Emacs Lisp, did a pretty good job of psychoanalyzing RMS's problems.

http://www.art.net/studios/hackers/hopkins/Don/text/rms-vs-d...


You can't have "real ai" (strong AI), until you have a simulation engine of some sort: https://blog.jtoy.net/on-the-wondrous-human-simulation-engin...


This is more problematic with the USCIS immigration helpline bot. You don't have the option to take your business elsewhere.

Trying to get the phone bot to let me talk to a human at USCIS is like talking to the Avrona Kern AI from Children of Time by Adrian Tchaikovsky.

Me: The typographic error submission site throws an error when I try to use it.

Avrona Kern: We have a website to submit typographic errors.

Avrona Kern: Would you like to receive the link via SMS or email?

Me: That site doesn't work! Agent.

Avrona Kern: Would you like to receive the link by SMS or email?

Me: Agent

Avrona Kern: I can help you.

Me: Agent

Avrona Kern: If you keep asking for agent I will end the call.

Me: Typographic site throws error

Avrona Kern: We have a website...

Me: FML

Googling revealed that there are some keywords that make Avrona give up and let me through. Not gonna reveal that here, in case they close those loopholes and some poor schmuck is left with no options.


The charm of Eliza is that it was simply a Rogerian therapist who didn't try to be intelligent.

Eliza's talent was to give you a way to get to hear yourself. That doesn't require intelligence, so much as listening skills. There's nothing dumb about that.


First of all, chatbots are a UX nightmare, but everybody ignores that, for some reason. Their dumbness is part of the problem, but not the biggest one. Typing full words or even sentences instead of just clicking a few buttons is just stupid. Even more so on mobile devices, where typing is a total pain and ought to be avoided at all costs.

Voice input is awkward, and the dumbness is even more exposed. Also, when dealing with a chatbot you might have kinda specific issues and terminology that voice recognition doesn't capture very well.

On top of that, chatbots usually end up being just glorified full-text search, often crammed into a tiny window somewhere in the corner.

There's a lot that needs to change to make chatbots usable at least in some use cases.


I don't think there is that much of a difference between labeling and understanding things. I mean, we humans don't even understand ourselves, and yet we tend to use our intelligence as the general measure of intelligence.

But if labeling is enough, what is missing? The current approaches are mostly based on single dimensions, like pure text, images, sound etc. but rarely a combination of them. So when a human, who lives in a world where images, sounds, physical experiences, emotions and abstract information have to be combined meets an AI that lives in a single dimension it is easy to see the blind spots.

So maybe just giving machines access to all the dimensions we humans perceive might help to let them act more like we do.


You can try building your own chatbot in a few minutes by remixing this version done in scratch:

https://scratch.mit.edu/projects/312783504/


More info on the implementation can be found at: https://sites.google.com/view/eliza-your-own-ai-bot/home


Interestingly there is a big divide between chit-chat bots and those having purpose.

The chit-chat bots at best have "google knowledge", the voice assistants only take small commands and the customer service automation bots are really stiff.

Anyway, it's the customer service automation bots that can already do a lot of good. If you manage to automate away the easy portion of chats, that's already a big win. Further, bots can do many actions really quickly, so you have no wait time if they can handle your conversation. I think building a good co-pilot scenario where you hand conversations back and forth to a human at the right moments is the near future here.


It's called artificial intelligence for a reason. Because it's not intelligence.


> It's called artificial intelligence for a reason. Because it's not human intelligence.

FTFY

It's slower at learning basics, slower at learning new tasks, makes silly mistakes often, but it's growing at a speed much faster than biological evolution. And when there is extensive data it surpasses humans.


Google Assistant is pretty impressive, and getting more so every day.

Wanna see what it can do? Give your phone to a 7 year old. My kid has conversations with the "Google Lady" fairly often, and they go on for considerable length. I'm regularly amazed at her ability to come up with something reasonable. Even if it is just to keep a kid amused.

Also, when I get frustrated with her and say something mean to her (the Google Lady, not my kid), she'll tend to respond in a way that makes me feel genuinely bad. That's impressive.

(I know Google is not loved at HN, but still.... )


Recently, with covid, I got the idea that chatbots are not there to help people but to bend the curve and keep customers busy until an operator is able to get to you; they're just a cushion.


At Quickchat (https://quickchat.ai), we're using models like GPT-3 as a basis for our chat engine. It does take a lot of engineering to make progress, but we believe that large language models are the way to go - you just wouldn't expect them to work magic on their own.

Here's our consumer app you can try out to talk to our AI: https://quickchat.ai/emerson


Talking to ML chatbots is like trying to have a conversation with a person who's not actively listening and sort of gives you the shallowest answers possible because they're distracted by their iPhone or otherwise too tired to fully engage in a conversation.

If you play any of these circular tricks on those people they'll also equally fail. This just means that these chatbots are not thinking, but instead have the ability to utter words that make sense, but offer absolutely no insight or show any sign of intelligence.


The latest and greatest so-called "AI" is very, very far from being even close to intelligent. I laugh every time a company takes their 20-year-old optimisation/scheduling/control software and slaps an "AI" sticker on it for marketing purposes. Yes, there are amazing results in very limited synthetic environments like games and image pattern matching. But there is still a very long way to go before we have anything that can compete with a 1-year-old child.


I don’t disagree with the general “this is taking longer than we expected” premise, but the author hasn’t tuned GPT-3 properly. I don’t know if that was on purpose or not, but I have seen GPT-3 answer questions about presidents accurately with the right tuning.

The point on facts still stands though. To my knowledge, transformer models aren’t great at knowing how factual their predictions are which is certainly an obstacle to business adoption. They also tend to overfit, in my experience.


Agree. I ended up stripping the "self-learning" component from my platform (presbot.com) after consistently incoherent and dumb responses. Also, I've come to realize that folks fail to distinguish between a general-purpose bot (like Siri or Google Assistant) and a bot that serves a niche; a bunch of the questions asked have little or no relevance to the specific purpose, user, or niche the bot was designed to serve.



Someone should use HN comments to make a chatbot.

That would at least be fun.


Here's an entire subreddit populated exclusively by GPT-2 bots, where each bot is trained on the output of a different single subreddit: https://www.reddit.com/r/SubSimulatorGPT2/
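Roughly, each bot there is a GPT-2 checkpoint fine-tuned on one subreddit's comment dump and then sampled from. A minimal generation-side sketch with the Hugging Face transformers library (stock gpt2 as a stand-in; a real bot would load its per-subreddit fine-tuned checkpoint):

    from transformers import pipeline

    # Stock GPT-2 here; r/SubSimulatorGPT2 swaps in a model fine-tuned
    # on a single subreddit's comments.
    generator = pipeline("text-generation", model="gpt2")

    samples = generator("TIL that", max_length=60, num_return_sequences=3, do_sample=True)
    for s in samples:
        print(s["generated_text"])
        print("---")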


Too late.

(Ducks - and runs for cover...)


Using Feynman to throw shade at Minsky. Nice.

I read "Age of Intelligent Machines" by Kurzweil in 1990 and so far he's the only one that thinks human-like AI won't be until the 2100's, but he also has a more holistic view of intelligence (e.g., comparing evolution as an intelligent process with an IQ <1 to a dragonfly).

I'm still waiting for a detection network (Yolo/MNSSN) that can actually work on non-canonical images.


The ship of “intelligence can’t be made with statistics” is sinking, but I think people will keep saying it’s impossible until the bitter end.


I hate them with a passion, especially the phone versions.

It has become such a pain to actually speak with a human being that can sort out the problem instead of bots that follow pre-defined scripts.

Good luck when the problem doesn't fit the script and there is no option to switch to a human.

Screaming all kinds of completely random noise might force the bot to give up and switch you to a human; however, most of them just hang up.


They're basically just anthropomorphized search engines. I'd rather have a good search engine + FAQ page.


In the early 1990s, Sound Blaster Pro sound cards came with a "psychiatrist" chat bot called Dr. SBAITSO which would reply in text and actually speak it out loud (showcasing the TTS). As a kid, I thought it was amazing, and I spent countless hours trying to converse with it. On a superficial level, modern chat bots seem roughly as good as the old doc.


I came to post the same. SBAITSO was obviously pretty simple but it had enough tricks to convince you, superficially, that there was some intelligence there, and you could spend hours talking to it.
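For anyone curious, the core trick is roughly ELIZA's: a pile of regex rules plus pronoun reflection, with no model of anything. A toy sketch of the general technique (not the actual SBAITSO code):

    import re

    REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
                   "you": "I", "your": "my"}
    RULES = [
        (r"i feel (.*)",      "Why do you feel {0}?"),
        (r"i am (.*)",        "How long have you been {0}?"),
        (r"(.*) mother(.*)",  "Tell me more about your family."),
        (r"(.*)",             "Please go on."),
    ]

    def reflect(fragment):
        # Swap first/second person so the reply can parrot the user back.
        return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

    def respond(user_input):
        for pattern, template in RULES:
            match = re.match(pattern, user_input.lower())
            if match:
                return template.format(*(reflect(g) for g in match.groups()))

    print(respond("I feel nobody listens to me"))
    # -> "Why do you feel nobody listens to you?"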


Wait, I still don't know:

"Is it safe to walk downstairs backwards if I close my eyes?"

Granted, I'm more biased here than most, I suppose. I emergently received 10 screws and 2 plates to repair my broken ankle, after falling down my home stairs last year. I wasn't walking backwards, nor were my eyes closed.


Last time I checked, chatbots failed to grasp the context of a conversation. "A: You're just dodging everything. B: Of course I'm denying it. Because you're the robot. A: Denying what? B: Yes, I am human." Dodging ain't denying, and nobody ever mentioned robots or humans.


After having played AI Dungeon for a while, these issues become obvious very quickly. These large models are good at producing text in some probabilistic manner, but they don't understand anything at all.

Other than for obvious commercial and hype reasons, I have no idea why there is still so relatively little recognition of the criticisms that Gary Marcus and others provide, because it's blatantly obvious, if you interact with these systems even for a few minutes, that they're to intelligence what a Trojan horse is to a real one.

There's no real understanding of the world, no coherence, no real ability to grasp logic or hold some ontology of the world, and that makes reasoning about the world, beyond producing some 'nice sounding BS' as the article puts it, impossible.


Sam Altman, Jul 19, 2020:

> The GPT-3 hype is way too much. It’s impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes. AI is going to change the world, but GPT-3 is just a very early glimpse. We have a lot still to figure out.


What struck me is that the good solutions are trained on non-English languages. When I contacted these companies about offering their service in English or my native language, the best they could come up with is putting something like Google Translate in front of their solution.


How much reality (problem space) you can compress into a vector or matrix: that's the key question.


Yet another Next Big Thing that I correctly predicted would fail. The problem I saw was a lack of discoverability. This is also a big problem for voice assistants. About the only things we use our Alexa for are a kitchen cooking timer and a voice-activated jukebox.


>Yet another Next Big Thing that I was right about predicting would fail.

I was hired at my current company 4 years ago to work on our chatbot client. Despite years of engineering effort put into an NLP based decision tree system, over 90% of chats were still falling back to a live agent. They just recently sunsetted the product for lack of interest. Thank god that fad is over. Web design trends come and go, but that was by far the most obnoxious one yet.


I got the impression the entire thing was pushed by Slack probably by way of some PR firm.

I utterly despise how mindlessly faddish this field is. For a field supposedly so full of smart people who often pride themselves on being "contrarian" (sometimes to a fault), people sure do bleat like sheep when the next fad comes along.


Not sure, but the Slack things are more the "chat as command-line UI" with fixed commands and such, not the kind of bots discussed here that suggest you can actually talk to them, aren't they?


I think a lot of people wanted it to be true because building a chatbot is way easier than building a full 2-dimensional UI. To the extent that you can get users to use a command-line interface in the 21st century you've made your job as a developer much easier.


I disagree, but then again, the only Slack bots I've seen are useful: they provide updates on the status of Jira tickets. Better to see these on Slack than have to use Jira's garbage interface or fill my inbox with status emails.


This should not be read as a disagreement with anything you’ve said, but smart people are still human. We’re all susceptible to the same flaws, but those flaws are just expressed with more complexity as intelligence increases.


I use chatbots to search for pages on websites where search is otherwise broken, like one of my bank's. I have never found a good use for them.

They were once advertised to help your granny use the website but I haven't met a chatbot that could help me do something.


Sadly, conversational UI is a poorly implemented (studied?) area of UX. Even more than text, it could be the most intuitive way we communicate, far better than GUIs. However, the space has been ruined by bad products that promise the moon to the buyer without putting in the effort, and deliver nothing to the user.

You could build a chatbot with just a lot of grunt work, if/else conditions, and basic language parsing. Sadly, no one does that; most chatbots today are low-effort solutions that basically ingest random content and expect the model to learn and magically solve problems, which of course does not happen, as the model doesn't actually understand anything.
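A minimal sketch of what that grunt-work approach looks like (intents, keywords, and canned answers are all made up; a real one would have hundreds of hand-written rules):

    INTENT_KEYWORDS = {
        "billing":  ["invoice", "charge", "refund", "payment"],
        "delivery": ["pickup", "collection", "missed", "schedule"],
        "account":  ["password", "login", "email"],
    }

    ANSWERS = {
        "billing":  "I can help with billing. Which invoice is this about?",
        "delivery": "Sorry about the missed pickup. What's your postcode?",
        "account":  "You can reset your password from the account settings page.",
    }

    def handle(message):
        text = message.lower()
        for intent, keywords in INTENT_KEYWORDS.items():
            if any(k in text for k in keywords):
                return ANSWERS[intent]
        # honest fallback instead of pretending to understand
        return "I didn't get that. Let me connect you to a person."

    print(handle("You missed my garbage pickup this week"))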


> They were once advertised to help your granny use the website but I haven't met a chatbot that could help me do something.

Duh, of course you haven't: You're not my granny.


I had to interact with one today.

The garbage company didn’t pick up my garbage. The chatbot told me I didn’t have service (I did), then told me I had recently paid them (so now I have service?), then told me they couldn’t help me…

Probably one of the most common inquiries and it just failed.


The author doesn't understand the difference between a chatbot and general intelligence. Chatbots have made huge progress in the last 3-4 years helping businesses and people. But they have nothing to do with the topics in the article.


IBM Debater can debate the heck out of you on Reddit or Twitter, other sites too now…

https://www.youtube.com/watch?v=-d4Uj9ViP9o


I think at the beginning of the century some older people felt similarly towards websites and the like: “It doesn’t work the way I want.” Some technical/knowledgeable people understood the nuances. Same goes here.


They may be dumb, but they're getting more human-like.

If I ask those questions in places like Reddit or 9gag, the quality of the answers reaches similarly low levels, excluding specialized knowledge groups and sections.


There seems to be this common perception on HN that chatbots are good enough to pass the Turing Test thereby making such a test irrelevant. I’m not sure why this idea is so pervasive.


Is ML still a popular subject in CS? I remember about 4 years ago going to a recruiting event at a major engineering school and seemingly every student was specializing in ML.


I think useful chatbots are still in the cards for the next decade. But yes so far our exposure to them has been uninspired landing page support plugins.


Yeah yeah, GPT-3 is not a chatbot and not intelligent. WE KNOW. What it is actually very useful at is extracting symbolic information from free text: give it a paragraph about anything and then prompt it for a one-word answer about that text. This it does well, and _nothing_ else can do it. I see large language models as an interface to natural language, not AGI.
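A rough sketch of that extraction pattern against the OpenAI completions API of that era (the example paragraph, engine name, and parameters are just illustrative):

    import openai  # assumes OPENAI_API_KEY is set in the environment

    paragraph = (
        "The shipment left the Rotterdam warehouse on Tuesday and is expected "
        "to clear customs in Felixstowe before the end of the week."
    )
    prompt = paragraph + "\n\nIn one word, at which port will the shipment clear customs?\nAnswer:"

    completion = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=3,
        temperature=0,
    )
    print(completion["choices"][0]["text"].strip())  # hopefully: Felixstowe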


I've found them useful for locating a page or process that I couldn't find in a poorly designed website's labyrinth.


The trick, we've found, is not to make bots smarter but rather humans dumber. The last five years have been rather fruitful.


Yes, you can trip them up quite easily... and yes, they're not "thinking."

But I have been really impressed by how far the Google Assistant has come. It's still a dumb machine, but a dumb machine that can find precise and factual answers for specific questions via a voice interface. It's, dare I say, useful!


Now imagine that chatbots are driving your car. Seems like a good SNL skit, but maybe only appealing to the HN crowd.


It is very strange that chatbots cannot improve in understanding. But I guess they need more than data to do that.


They should update the Turing test with a test where you talk to a chatbot instead of a 13-year-old.


And they have become much more aggressive, making it difficult to even navigate the site.


When you let chatbots get smarter by learning from humans in an unrestricted way, you get Microsoft Tay, the Nazi chatbot brought to you by 4chan:

https://en.wikipedia.org/wiki/Tay_(bot)


You know what they call a quarter pounder in France?


Why can't the Neats and the Scruffies just work together and get along? ;)

https://en.wikipedia.org/wiki/Neats_and_scruffies

>Neats and scruffies

>Neat and scruffy are two contrasting approaches to artificial intelligence (AI) research. The distinction was made in the 70s and was a subject of discussion until the middle 80s. In the 1990s and 21st century AI research adopted "neat" approaches almost exclusively and these have proven to be the most successful.

>"Neats" use algorithms based on formal paradigms such as logic, mathematical optimization or neural networks. Neat researchers and analysts have expressed the hope that a single formal paradigm can be extended and improved to achieve general intelligence and superintelligence.

>"Scruffies" use any number of different algorithms and methods to achieve intelligent behavior. Scruffy programs may require large amounts of hand coding or knowledge engineering. Scruffies have argued that the general intelligence can only be implemented by solving a large number of essentially unrelated problems, and that there is no magic bullet that will allow programs to develop general intelligence autonomously.

>The neat approach is similar to physics, in that it uses simple mathematical models as its foundation. The scruffy approach is more like biology, where much of the work involves studying and categorizing diverse phenomena.

>Origin in the 1970s

>The distinction between neat and scruffy originated in the mid-1970s, by Roger Schank. Schank used the terms to characterize the difference between his work on natural language processing (which represented commonsense knowledge in the form of large amorphous semantic networks) from the work of John McCarthy, Allen Newell, Herbert A. Simon, Robert Kowalski and others whose work was based on logic and formal extensions of logic. Schank described himself as an AI scruffy. He made this distinction in linguistics, arguing strongly against Chomsky's view of language.

>The distinction was also partly geographical and cultural: "scruffy" attributes were exemplified by AI research at MIT under Marvin Minsky in the 1970s. The laboratory was famously "freewheeling" and researchers often developed AI programs by spending long hours fine-tuning programs until they showed the required behavior. Important and influential "scruffy" programs developed at MIT included Joseph Weizenbaum's ELIZA, which behaved as if it spoke English, without any formal knowledge at all, and Terry Winograd's SHRDLU, which could successfully answer queries and carry out actions in a simplified world consisting of blocks and a robot arm. SHRDLU, while successful, could not be scaled up into a useful natural language processing system, because it lacked a structured design. Maintaining a larger version of the program proved to be impossible, i.e. it was too scruffy to be extended.

>Other AI laboratories (of which the largest were Stanford, Carnegie Mellon University and the University of Edinburgh) focused on logic and formal problem solving as a basis for AI. These institutions supported the work of John McCarthy, Herbert Simon, Allen Newell, Donald Michie, Robert Kowalski, and other "neats".

>The contrast between MIT's approach and other laboratories was also described as a "procedural/declarative distinction". Programs like SHRDLU were designed as agents that carried out actions. They executed "procedures". Other programs were designed as inference engines that manipulated formal statements (or "declarations") about the world and translated these manipulations into actions.

>In his 1983 presidential address to Association for the Advancement of Artificial Intelligence, Nils Nilsson discussed the issue, arguing that "the field needed both". He wrote "much of the knowledge we want our programs to have can and should be represented declaratively in some kind of declarative, logic-like formalism. Ad hoc structures have their place, but most of these come from the domain itself." Alex P. Pentland and Martin Fischler of SRI International concurred about the anticipated role of deduction and logic-like formalisms in future AI research, but not to the extent that Nilsson described.

[...]


Where can I try one?


well wait for Web3 chat bots


"Please pay 0.1 eth for an NFT of your chat transcript"


"Thank you for contacting Ethereum's Domain Name Service. We are currently experiencing high call volumes. To reduce your wait time, please pay 0.1 ETH. Gas fees may apply"


"The Icelandic power facility our bot depends on is currently overloaded by mining rigs, please wait several minutes for"


I can't tell if this is satire. Poe's Law in action here.



