Siri, don't even get me started. I like much of what Apple does, but Siri is just hilariously bad. The error rate at speech recognition seems to have gone up over the years. Keywords are randomly being changed; sometimes they fire, and a day later you have to use a different phrase to get what you want. They even broke "Where am I", at least in German. "Wo bin ich" is sometimes answered with "Sein oder nicht sein" (to be or not to be), which indicates Siri is trying to be smart and failing totally. I am blind and use this feature when I have a feeling I might be lost. Very funny when you ask a simple question like "Where am I" and your voice assistant fucks with you pseudophilosophically. "Was läuft gerade" (what's playing right now) also just failed a few hours ago: Siri simply started playing something from my library, which is not what I wanted. All in all, it's an embarrassing failure all around. Oh, and before I forget: when Siri fails to understand the name of the person I want to call, it says "OK, I am calling <your name>" and actually tries to dial my own number. This is so dumb it feels like a joke put in there by an intern.
> The error rate at speech recognition seems to have gone up over the years.
Input recognition quality is falling all over the place - it's very noticeable with their onscreen keyboards as well. Some "intelligent" mechanism thinks it knows better which button I tried to tap, and I'm getting gibberish even with auto-correct off. It's the same on iPhone and Watch.
Apple has already forced me to re-learn keyboards once - with the MacBook Pro 2017 fiasco, when I had to start carrying an external keyboard with my "mobile" laptop. I haven't bought a MacBook since.
Their swipe-style keyboard is an absolute joke—perhaps the most infuriating part of my iPhone. I know that it's easier to criticize than it is to implement, but my god, even a basic Markov chain would yield more intelligible results than the shit it comes up with.
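For illustration - a minimal Python sketch of the kind of bigram Markov chain predictor meant here. The training text and words are invented, and this is obviously nothing like Apple's actual implementation:

    from collections import defaultdict

    corpus = "i guess i will run by the store and i will grab dinner".split()

    # Count how often each word follows each other word (a bigram model).
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def predict_next(word):
        """Return the most frequent follower of `word`, if any."""
        followers = counts.get(word)
        if not followers:
            return None
        return max(followers, key=followers.get)

    print(predict_next("i"))  # -> "will" (seen twice, vs. "guess" once)

Even something this crude prefers words it has actually seen follow each other, which is roughly the bar being complained about.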
For some very odd reason, swipe keyboard technology seems to have peaked around 2014, never to reach those heights again.
I remember being amazed by how great swiping was, and after using it recently (it's not really great on Gboard or the Samsung keyboard either - not bad, just meh) I was wondering if it was just rose-tinted glasses. But nope: I booted up my old Nexus 5 with a normal Gboard (not even SwiftKey!) and it was still amazing. The contrast was stark, even after I wiped the typing data on both phones to make sure it was a fair comparison.
I guess the most glaring difference was the subtle but extremely important handling of edge cases and particularly ambiguous letter combinations. Even the "bad" modern Google keyboard is still 95% accurate, but the difference between that and the 99% accuracy on my Nexus 5 basically makes all the difference. It goes from predictive magic to having to swipe across every single letter to make sure it works.
I remember not even knowing the spelling of words but being able to swipe in the general direction of the letters I thought might be correct and having it input the correct word.
Now it seems I have to have near-perfect tracing and pathing to get correct words.
That honestly could be it! My past few Galaxy phones have been narrower and taller than my old Note 4 (the Nexus was more compact, but with an aspect ratio that was more "square"). The narrow, tall screens make hitting keys a lot harder, now that I think about it.
What are your talking about? You done like it converting “your” to “your” on every damn time you type the word?
Note: the above paragraph was typed with swipe typing. I actually typed “Y O U” for both of the above instances of “your.”
Apple swipe typing is the opposite of AI: it’s like an idiot is going behind you taking even correct words you swipe and turning them into gibberish. Don’t even get me started about its obsession with “it’s” and “we’re” to the exclusion of the words without apostrophes. Ughhhhhh
I never write on an iPhone, so I didn't know today's ridiculous apostrophizing is an autocorrect thing - but that could explain why people sound like idiots (or would that be idiot's?) on the internet.
Totally agree, so I installed SwiftKey. It worked great for years and I was super happy! Then, in the last year, SwiftKey has been randomly crashing, so while I type I get swapped between SwiftKey and the normal keyboard unwillingly several times per message. The built-in keyboard seems absolutely determined to believe that the word 'fuck' does not exist.
This is useful advice, but the frustration is warranted if a user needs to dig into a sub-sub-sub menu to undo a default that is not even grammatically correct.
Personally, I would rate the Galaxy Fold higher than the Pixel Fold. I remember when Samsung released their first foldable phone it was a very expensive trainwreck, and they are now at a point where I hear they are decent.
I would not for the life of me trust Google with a first attempt at any hardware implementation, especially one that will be pretty expensive.
I haven’t downvoted, but it might be because third-party keyboards have been officially supported for many years in iOS. Your comment gives the impression they’re some kind of hack.
Thank you! No, that wasn't how I meant it. If I recall correctly, the implementation was very restricted when the feature first launched in 2016. And if I recall correctly, it's not a feature Apple ever mentioned again.
Therefore, I have the impression that this is not a feature Apple endorses much.
It's completely supported, you can get keyboard replacements on the app store.
It's just less secure to use them than the one built into the OS because the third party keyboards might run input through some web service instead of keeping it completely local.
There is a default-off toggle for whether to allow a keyboard to use the network, but of course it's so much more useful enabled that I expect everyone enables it.
The keyboards on the Apple M-series MacBooks are good, having used one myself... except there's this feeling that MacBook keyboards fundamentally "work" differently from all other keyboards out there.
I'm primarily a Windows wizard; I use my MacBook for specific tasks and it's not my daily driver. I can type on my MacBook mostly fine, but occasionally it misses some of my key inputs. I know the keyboard works - the key in question registers properly when I press it again - but nonetheless I have the occasional input go missing as I type.
I don't know why this is the case, it annoys me, and I'm left wondering if it's the keyboard, with no obvious defects, or my typing, which works flawlessly on literally every other keyboard I come across.
I have this issue using some keyboards. I think it has to do with key spacing and pressure required. You could be missing the key by hitting the edge of it.
The butterfly keyboard from previous Apple laptop generations is absolute trash. Every time I use one, it takes me a few minutes to get used to it.
That’s interesting, because I use the same keyboard with my Mac and my Windows work laptop hooked up to a dock, and Windows + Teams (where I’m doing the majority of my actual typing of words, not code, in a day) constantly has weird artifacts of switched lettering - like it’s kicking off some async process to grab keystrokes right as I open a message, then flushing the buffer out of order to the application. It’s really weird; I thought it was user error at first, but I’ve noticed this never happens on my Mac.
Then again, Teams specifically is just kind of the worst. At least they finally added the full range of emoji reactions to messages. (Still doesn’t beat Slack’s custom emojis, but oh well.)
I have the same issue with Slack and any Facebook app frequently -- especially if you type, delete, and type again quickly. I think there must be some damn CRDT within the draft-edits feature that causes the final state not to match exactly what you'd expect from the input sequence.
It's very annoying because it feels like I'm being gaslit by the device -- the errant results can pop in after some delay... "was I misreading what I saw typed on screen earlier??"
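To illustrate the kind of bug being speculated about - a toy Python sketch (nothing to do with Slack's or Facebook's actual code) showing how flushing the same buffered edits in a different order changes the final text:

    def apply(text, op):
        """Apply one buffered edit operation to the draft text."""
        kind, pos, payload = op
        if kind == "insert":
            return text[:pos] + payload + text[pos:]
        if kind == "delete":  # payload is the number of chars to remove
            return text[:pos] + text[pos + payload:]
        return text

    ops = [("insert", 5, " there"), ("delete", 0, 5)]  # order actually typed

    in_order = out_of_order = "hello world"
    for op in ops:
        in_order = apply(in_order, op)
    for op in reversed(ops):
        out_of_order = apply(out_of_order, op)

    print(in_order)       # " there world" - what you watched yourself type
    print(out_of_order)   # " worl thered" - same edits, flushed out of order

Positional edits only commute if you transform them against each other, which is exactly what CRDT/OT machinery is supposed to get right.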
It's infuriating when I ask Siri to play a song and it decides to pick an obscure remix of the song one day, the actual song another day, then another remix another day.
Did they test this at all? Why would you ever pick a more complex/verbose option from the results list?
One of the most infuriating things about any voice assistant, IMO, is the absolute clunkiness with which you have to try and control music.
I use Android Auto in my car for safety. God forbid I'm listening to a song and want to queue up another one right after. I don't think I've ever gotten that behavior to work without either skipping the current song or adding the second song to the end of the queue. And even that's assuming that it even recognized the proper song out of my library and didn't try to go to YouTube or something.
Meanwhile, if my buddy is riding shotgun, I can just say "Hey, put on Peace of Mind next"
I switched from Apple Music to Spotify and have yet to figure out how to get it to play a playlist. So it’s either “tap on the screen while hurtling down the highway” or “awkwardly tell it to play each individual song after each one completes”.
Hahah, yeah. I said “Siri play classical music on Spotify” and Siri said “I’m afraid I can’t do that” and then started playing Sabaton (which is power metal and the exact opposite of classical music).
Then Siri had the gall to claim NO MUSIC WAS PLAYING as this super loud music was assaulting our eardrums. My wife thought my exasperated struggles were the funniest thing, it felt like HAL-9000 with the “I’m sorry I can’t do that Dave” moment.
I've had the exact same experience with Google Home.
"Play music on dining room speaker."
(dining room speaker starts blaring music at high volume)
"Turn down dining room speaker."
(No response)
"HEY, GOOGLE. Stop music on dining room speaker."
(dining room speaker music volume decreases)
(from the dining room speaker) "Can't find dining room speaker."
(dining room speaker volume increases, blaring music)
I did follow the advice here to rename all my speakers lowercase, since Google Home's VOICE interface seems case-sensitive:
And since the Google Home android app is, literally, the worst and least-reliable mobile application I've ever used, the voice interface is pretty much all I've got.
> Hahah, yeah. I said “Siri play classical music on Spotify” and Siri said “I’m afraid I can’t do that” and then started playing Sabaton (which is power metal and the exact opposite of classical music).
As a Sabaton fan, I laughed out loud. Technically they have (roughly) classical-sounding songs, e.g. Christmas Truce, but yeah, that's a massive fail.
Names are a hard problem, though. How would it know how "Bach" is pronounced? It seems you would need pretty advanced multimodal AI - some sort of GPT trained on both text and audio.
The problem I've had with Siri and music has nothing* to do with parsing individual words. What I've found recently is that if you don't give an exact match, Apple just puts on random shit. "Hey Siri, play the album Rubber Soul by the Beatles on Apple Music" gets me random songs by the Beatles, because apparently I have Rubber Soul named "Rubber Soul [some edition info]". "Hey Siri, play songs by the band Duran Duran" literally just plays the eponymous album, because reasons. You don't need AI, machine learning, GPT, LLM, or whatever fucking buzzword is all the rage; you simply need to revert to the behavior that was standard in iOS 15 and earlier. The upgrade to iOS 16 completely nerfed Siri on my phone, starting with some mandatory trial-subscription bullshit.
It's to the point where I've given up trying to use Siri while driving.
* Almost nothing. I still have to say "play underground eight zero s on soma fm" because reasons.
Yeah, some others also had the impression that it got worse over time in some aspects. I wonder whether this was some kind of tradeoff with other abilities. Or perhaps they rewrote the code, which had unintended side effects.
Yes, GPT is particularly good at understanding, even when you misspell things. It would make a good front-end interface to something like Siri: take the human input and turn it into something that makes sense to the dumb computer.
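A minimal sketch of that division of labor, assuming a hypothetical `llm_complete` helper (stubbed below) in place of a real model call - the LLM only normalizes messy speech into a strict intent format the existing backend already understands:

    import json

    SYSTEM = (
        "Rewrite the user's request as JSON with keys 'action' and 'args'. "
        "Allowed actions: play_album, set_timer, call_contact."
    )

    def llm_complete(system, user):
        # Stub: a real implementation would call a hosted or local model.
        return ('{"action": "play_album", '
                '"args": {"album": "Rubber Soul", "artist": "The Beatles"}}')

    def handle(utterance):
        raw = llm_complete(SYSTEM, utterance)
        intent = json.loads(raw)  # the backend only ever sees clean JSON
        assert intent["action"] in {"play_album", "set_timer", "call_contact"}
        return intent

    print(handle("uh, play that beatles record, rubber soul or whatever"))

The dumb part stays dumb; the model just has to emit one constrained JSON object.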
Yes, but large language models are very compute intensive and require a ton of RAM. So they wouldn't be able to run locally (this is currently possible with Siri), would be relatively expensive and possibly slow. So they might still be a while off.
The accurate pronunciation in German is irrelevant. Siri is always constrained to one language (Settings > General > Language & Region), and when set to English you get English pronunciations.
Same way Siri understands the English “Los Angeles” even though the G sound is completely different from the Spanish one.
Lots of Americans seem to try to pronounce Bach the composer "correctly", which leads to batch, bash, buck, ... Which is fine: the German hard "ch" is very hard for English throats to form, and it's always better and more polite to at least try than to simply pretend foreign words and names are just weirdly spelled English ones. But it's not as straightforward as with a John Bach from Ohio.
So how would you have to pronounce "Chopin" in "English"? This doesn't make sense. There aren't even consistent pronunciation rules for many genuine English words, like "ead" in "read" and "thread". It's not even straightforward for English speakers to correctly pronounce "Eliezer Yudkowsky". Which means it's even harder for Siri.
Not only Siri — the whole of iOS. You can’t type a sentence switching languages in the middle, without changing the keyboard language all the time, if you have autocorrect enabled. It will change what you type into utter gibberish, even though without the “correcting” what you typed was perfectly correct. This system is quite visibly designed by people who speak only one language and don’t understand that people may want to use multiple languages at the same time. The keyboard should support a mix of languages instead of XOR-ing between them, because otherwise, when it starts, it’s almost always in the wrong mode - and if it isn’t, it will almost certainly be wrong by the end of what I write.
You’re talking as if there is an accepted standard English pronunciation of Bach. The only one I know is the German one, which I would use when speaking English. Perhaps I would soften the ending.
My point is that there's nothing particularly special about this name of foreign origin, compared to any other word. Every word has lots of variations in how they're pronounced.
The audio clip the person posted was for a true German pronunciation, which happened to be very different than how 99% of English-speakers would say it.
I've had multiple teachers teach me different languages (other than English), and not one of them called me by the English pronunciation of my name. It seems it's just people who speak English who try to do this.
> it's a name of a concrete person which has only one correct pronunciation.
This is an insane standard. The [x] at the end of the German word doesn't exist in English; most English speakers wouldn't be able to pronounce it if they wanted to. When the demands you're making are literally impossible, the problem is you.
So just because the "th" sound doesn't exist in many languages, like German, they should pronounce "Heath Ledger" or "Anthony Hopkins" or "The Beatles" incorrectly? That seems to me a way more "insane" standard. By the way, the Scottish are perfectly able to pronounce "Loch Ness", which has the same sound for "ch" as "Bach".
> So just because the "th" sound doesn't exist in many languages, like German, they should pronounce "Heath Ledger" or "Anthony Hopkins" or "The Beatles" incorrectly?
They're going to use the sounds that exist for them, yes.
> That seems to me a way more "insane" standard.
I hope you never get to make any decisions. Dave Barry once wrote about someone thinking "What an idiot I am! Here I am, a Japanese person, in Japan, and I can't even speak English!"
But then again, Dave Barry was joking.
> By the way, the Scottish are perfectly able to pronounce "Loch Ness"
The population of Scotland is 5 million; if you want to talk about "most English speakers", the Scottish aren't even worth noticing.
> > So just because the "th" sound doesn't exist in many languages, like German, they should pronounce "Heath Ledger" or "Anthony Hopkins" or "The Beatles" incorrectly?
> They're going to use the sounds that exist for them, yes.
That wasn't the question I asked. They will at least try to pronounce "Heath Ledger" or "Chopin" correctly, they won't act as if there was a correct German way to pronounce those names.
I lived in Japan for a while. My name contains sounds that just didn't work for them. No one pronounced it correctly.
I was not upset, annoyed, or confused. It's just the way language acquisition works. You learn the sounds you need and the rest are hard to acquire later in life.
Be strict in what you send, forgiving in what you receive.
> It's just the way language acquisition works. You learn the sounds you need and the rest are hard to acquire later in life.
As a point of interest, this is actually backwards. You're born recognizing all the sounds; what you learn is to ignore the difference between sounds that aren't distinct in your language.
You do keep that ability for the rest of your life, but it isn't helpful when you try to learn to recognize foreign sounds.
But that's exactly the issue here: people use the correct pronunciation, which happens to be different from how normal words in their language are pronounced, but the voice assistant assumes normal language, which leads to absurd misfirings. The issue is not people not knowing how to pronounce something; the problem is that it's hard for "dumb" AIs to know how a certain name is pronounced, as long as they are not multimodal LLMs.
I think there's something about sounds that you learn early on in language acquisition - maybe your brain develops differently.
'th' is the obvious one that non-English speakers struggle with. I remember a Dutch guy laughing at my attempts at various Dutch words - I literally could not hear the difference between his pronunciation and mine.
And 'ch' (as in Loch or Bach) is a sound in Scottish English but not in English English.
I lived in Scotland till I was 4, then moved to England and all traces of my previous Scottish accent are long long gone. But my friend, whose surname is Donnachie, says I'm the only English person she's met who pronounces her name correctly - I guess because I learnt that sound early on.
Similarly, my dad, who learnt English in India, still struggles with the "j" sound (he says "zudge" instead of "judge"), despite living here for 50 years and having a posh middle-class English accent that sounds just like a "native" English speaker's.
I don't know if "th" exists in Polish or not, but a common (perhaps dominant) spoken way to refer to "The Beatles" is[0] "Bitelsi", which not only loses "th", but also like half the other sounds in the name[1].
Thing is, we understand it just fine. More than that, if you overheard me saying to someone, "puść teraz Bitelsów" ("put on the Beatles now"), there's a good chance you'd identify the name from context. If you didn't, you could always ask to verify (well, not if you were actually overhearing me...).
----
[0] - Or at least would look like that written down. Polish is mostly a "you say it as you see it" language, but with foreign names, often enough people write the correct form but use localized pronunciation.
Want to flummox the Japanese tongue? Try a sentence like "Darth Vader is Luke's father". It hits most of the highlights: interdentals, labiodentals, and that weird 'r' sound English has that Japanese sometimes tend to conflate with 'l'. Even a competent Japanese English speaker is likely to render it as "Dāsu Bēdā izu Rūkusu fazā". Depending on the region they may mess up the 'f'; the syllable 'fu' is actually 'hu', but pronounced with very pursed lips in Tokyo Japanese (not so much in Kansai).
Unless they're bilingual from childhood, most people are not able to pronounce sounds outside their milk tongue without difficulty. That you expect English sounds to be perfectly pronounceable by non-English-speakers is probably more reflective of the fact that quality English education is widely available where you live than anything.
That is completely wrong. People have many names in practice, especially historical persons. Even living people often present themselves differently in different languages.
For some examples:
- the famous Romanian/French modern sculptor Constantin Brîncuși (which uses a vowel that has no direct correspondent in either French or most dialects of English, and palatalizes the ending sh, so that it's pronounced in two syllables, brîn-cush, with a slightly pronounced ee at the end), but also Brancusi (in French, roughly bran-cu-see).
- in Japanese, since Japanese speakers have relatively few syllables they are familiar with, almost all foreign names are expected to be Japanized; for example, if your name is "Stephen", you would be expected to present yourself as, roughly, "su-tee-ve-n", and write your name with the corresponding katakana characters in certain official documents
I take your point, but I really like the mirth of a man who died 30 years before audio recording being represented by a hip-hop version of one of his pieces! He never recorded a canonical version of anything!
Funny, my Android Auto playlist for "Bach" is actually Wendy Carlos, a CD rip of Switched-on Bach 2000, which she did on an early 90's Mac II using MIDI sequencing.
In general I find that '80s- and '90s-era CDs ripped directly to FLAC still sound really good.
You get the same issues on Android. There is zero intelligence, it's embarrassing. You can have a song in your library that you've listened to every day of your life, but it'll still decide you actually want to listen to some weird track you've never heard of just because the names are similar.
It gets real bad if you listen to music in a language other than the one you have Siri set to. Its attempts at deciphering Japanese punk or Bollywood song titles are terrible.
It also has a habit of invoking whenever I say "Hi sweetie" to my neighbor's dog.
I think part of this might be Spotify’s fault. I moved from Apple Music to Spotify Premium and my biggest complaint (aside from literally one SPECIFIC song I really love not being on Spotify at all) is that there is no HomePod support (you have to airplay the music to the HomePod) and Siri support is shoddy.
I understand this isn’t necessarily Apple’s fault, but I bought a HomePod (which are very expensive smart speakers!) because I assumed it’d be the most convenient “smart” speaker with good audio quality. It does deliver on audio quality, but despite supporting literally every other smart speaker under the sun Spotify has no support for HomePod and instead you have to AirPlay from your phone. So my kid can’t listen to Spotify, and the HomePod is useless to anyone but me or my wife.
I think this is Spotify's fault. They're in a weird feud with Apple and won't properly support their app on all of Apple's platforms. I don't know who they think they're winning over but it just makes me dislike them more, I'll pick the Apple ecosystem over their app if push comes to shove.
It's unbelievable that Apple hasn't been able to do better.
My 'favorite' Siri 'feature': when I ask for directions to 'home', it gives me directions to the local Home Depot... really???
EDIT: After thinking about this for a while, the reason this irks me is that 'directions to home' is one of the most basic asks. I am not asking Siri to play some obscure unreleased track from an underground British punk band. I am asking for directions to my home, and it fails.
Every few months, Siri decides I live at 123 East Foo street. Just adds the E when I say “directions to home” and tries to go miles off course. Lasts a week or two, then back to normal.
If I spell or type out the address, it still adds the E, with no apparent way to get to my actual home address.
As a bonus, last night I said “cancel my 6am alarm” and it said “you have 29 alarms around that time”, and proceeded to read through each of them. My wife nearly died laughing as it started to rattle them off.
I speak midwestern American English and it’s been constantly trying to text the wrong person, which is weird because a year ago ON THE SAME PHONE this never ever happened. If it’s not working for me it’s definitely a degradation. Siri had worked reliably for me for literally a decade!
I’ve also noticed my autocorrect has gone bonkers. It’s constantly trying to change the case of the word “guess” to “Guess” - the brand, I guess? - despite me never having shopped there or mentioned it. The autocorrect has gotten very aggressive, and I type so fast that it’ll take me three or four tries to get it to revert to lower case. It also has some weird context sensitivity to proper nouns they added, so if I say “I guess James can come over” it’ll change that to “Guess James” like it’s a name (in iMessage only, I suppose) and I have to delete that entire “name” and carefully retype!
It’s gone feral with punctuation, and things like “they’re” get changed to “there” or “their”. Common misspellings seem to appear when they weren’t typed.
OMG yes, the latest updates to the keyboard are a nightmare. I went so far as to turn off autocorrect as I’d rather have my spelling wrong but the meaning the same. It straight up makes up entirely new phrases I didn’t type.
SwiftKey on Android does the same thing to me with random words; I think it's learning that some words should always be capitalized because I once used them at the beginning of a sentence.
Yours picks the local Home Depot? Such luxury! It's not uncommon for Siri to navigate me to the other side of the country, even when I speak the entire address.
It's odd that Siri was somehow better under Eddy Cue, even though he is mainly the content guy, than under Craig Federighi or AI/ML experts like John Giannandrea.
The worst part about Siri in my opinion is how it can do bulk actions when it mishears you. You want the ceiling light off? Okay, turning off all the lights. Or the opposite, it turns *everything* on, which can be even more annoying.
They're all complete crap. I use Google Home to ONLY turn my lights on or off and it gets it like 60% of the time. Sometimes it accepts the command and it just doesn't even work and says something went wrong. Pressing the light button in the app is 100% consistent. Incredible.
I've been superstitiously switching between "turn the lights off" and "turn off the lights" (also with music). It seems like every few weeks or days it prefers one over the other.
I say to Siri “close the blinds” and it replies “I don’t see anything like that in your home”. I say “close the blinds” again five seconds later and it closes the blinds.
I have the same problem with my “blinds” - I started calling them “shades” instead (not the usual term, in my vernacular at least) and have had much better luck. YMMV!
It's not a good excuse, but I believe this happens because the voice command has to go to a Google server, Google talks to the light vendor's API, the vendor communicates with your device, and the lights go off only if all of this succeeds.
Meanwhile, the button in the vendor's app will not use the internet, so a lot less can go wrong.
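Rough arithmetic for why that chain hurts - with a made-up per-hop success rate, three network hops compound like this:

    hop_reliability = 0.97  # assumed success rate per hop (invented number)
    hops = ["phone -> Google server", "Google -> vendor API", "vendor -> bulb"]

    p_end_to_end = hop_reliability ** len(hops)
    print(f"{p_end_to_end:.1%}")  # ~91.3%, noticeably worse than any single hop

So even individually reliable services add up to a command that visibly fails almost one time in ten.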
The voice commands are worse than touch screens, and touch screens are worse than the optimal solution, which is a tactile interface of switches/knobs. I stopped using Alexa voice controls for lights since they're inconsistent and switched to a Bluetooth switch - much more reliable; I've only had a few instances where it didn't work.
I don't do a lot of home automation, mostly because the automation isn't practical in a new-build house that has switches in appropriate places. There are extremely rare circumstances where I want a light on/off in a room I'm not in. However, it would be a fun project to build a nice remote control with switches and knobs that you can program and have it talk to the home automation system. To make it even more fun, I'm talking hard flip switches and volume-style knobs - really tactile.
It's also surprisingly unaware of how poorly it's performing. I've recently noticed that when you ask it to send a text message for you, it follows up with a suggestion that you turn on auto-send for Siri-dictated messages - even after it took you six tries to get it to understand you.
Siri does ok for me on this. It also gets the shades right if I say shade instead of blind, otherwise the error rate is higher. Overall, I’d rather live with Siri than without. Google does ok as well, Alexa is utter crap for these tasks.
I thought Siri was bad for me because I was on a watchlist for maybe discussing Apple negatively in front of the pod, and they purposely messed with me, doing the opposite of my commands, acting randomly, etc.
Kidding aside, with all those non-deterministic assistants, it will be hard to see when you’re hellbanned from a service.
Apple does lay some turds. MobileMe. Whatever their attempt to make iTunes into a social media network was called. Maps until recently. Usually they fix them, but Siri never seems to get better.
For what it's worth, I think Alexa has gotten worse, but for different reasons.
Alexa has always worked better from across the room, but nowadays an Echo device will do very little aside from weather and timers without trying to sell you a subscription.
Alexa’s inability to just answer a question and go away is why my son didn’t get in trouble for knocking it off its table (and breaking the top), and why we didn’t replace it.
I finally caved in and switched the Siri language to English on all my devices, after trying and failing to keep it in my native language on my iThings while simultaneously keeping my new MacBook in English (because I prefer that on computers, and I wanted the read-the-time-out-loud-on-the-hour function turned on - it kept reading the time in a mixture of both languages, wtf?). Turns out Siri is much better in US English, but still completely useless.
I just set timers using the crown on my watch now, combined with custom Google Assistant commands to control what I need (lights, TV) through my Nest. Playing and controlling music is too much of an extreme sport for my taste on either assistant, and I've stopped using it for calls too, as there's always a 5% chance it will call some random person I knew decades ago but haven't removed from my contact list because reasons. The name is of course totally different from the few people I do call on a regular basis.
It's great for joking around with my kid though, laughing at the misinterpretations and stuff like that. And that's about it.
Oof that sucks for you. Siri is an absolute joke. My two major gripes:
- After four years it still doesn’t understand my youngest son’s name. I added a phonetic spelling to his Contacts card. I told Siri 200 times: “my son’s name is pronounced XYZ”. I religiously corrected his name a thousand times when speech-to-text misunderstood it. Nothing. Joke.
- Siri is triggered by anything that even vaguely resembles “hey Siri”: “…they seriously…”, “…ok see here…”, “easier”. And once that dumb piece of %$£~ is triggered she HAS to finish her cutesy “listen to me being a helpful and funny assistant” sentence. Again: sad joke.
I get the impression that Apple makes an effort with accessibility and would be horrified about this blind/“where am I” example. Hopefully there’s an employee here who sees it and can get it in front of the right person.
What strikes me as totally odd is that Siri on my iPad even reacts to the Audible player running in the background.
Most false-positive triggers stem from Audible books, as well as audio playing from Apple's own apps.
Sometimes I feel Siri is desperate to be triggered, because I hardly ever use it - mostly in the car, when I have to change direction and need to set a new course in a maps app.
> Keywords are randomly being changed, sometimes they fire, a day later you have to use a different phrase to get what you want.
The most striking example for me is that the word "half" - as in, "Hey Siri, set the lights to half" - stopped working for six months, then started working again.
> The error rate at speech recognition seems to have gone up over the years.
Has it really, or have your expectations increased instead? Or is it because they initially only worked well with business US English and have been trained over the years to support many more languages, dialects, and even variations of said dialects/languages, plus slang and profanity?
Also, at the beginning people were talking to these things slowly, articulating every word. Now everybody takes it for granted that they should be understood and talks in a lazier, more natural way.
I've never activated speech recognition on any device I own. But last week I was in a video call with my partner's Mexican family, and early in the call they were trying to get my mother-in-law's Amazon Echo to stop the music. It only really stopped when my sister-in-law started swearing at it - something like "no mames! chinga tu pinche madre Alexa, cállate!" (roughly: "no way! fuck your damn mother, Alexa, shut up!").
Apparently this stuff has become so used to (and trained on) people not talking to it politely that it eventually only reacts when you talk badly to it using slang and profanity.
In my experience using Siri for only a handful of defined tasks over the years - “set a timer,” “remind me to X at Y,” and so on - it has objectively gotten worse.
We used to laugh about autocorrect mistakes, but over the past few years it has gotten exponentially worse. Instead of just screwing up the word I'm typing, it'll decide I meant a different phrase and change the previous word too. If you're not saying exactly what it expects you to, it can be extremely frustrating.
I've started turning many of the autocorrect-related features off entirely.
I use it. I don't like it. (I haven't found anything else that's much better, and that's with a fair amount of SwiftKey use; nothing beats now-dead Swype.)
Routinely - routinely - I try to swipe out "and". Often, I get "abs" (I don't work out, let alone write about it), but the most common and bizarre one is "Abbas". The only Abbas I know is the Palestinian president, and I don't write about him at all, let alone enough to justify having that be a common word that comes up.
I realize that with put/out/or it's just not easy to distinguish. But I'll live with that. Abbas?
I assume some users are using new names on a daily basis. If you run a plumbing business, you may never have texted Abbas before - but if Abbas asks you for a quote, you'll be doing it today. Although I agree that people write 'and' far more often.
I assume they have some sort of special-case inclusion/handling for names in swipe dictionaries - it'd be embarrassing if your swipe keyboard recognised George and Donald as words, but didn't recognise Barrack.
But I'd just punch in the letters individually, just as I would for any unusual word. Misinterpreting "and" is like how Swype used to put in "née" for "me". Look, maybe I'm not as precise as I should be, but one is a word people use a couple of times a year, and the other a couple of times an hour.
And "Barack" isn't the most common spelling. I wouldn't expect it to know "Fillmore" either.
Shouldn't frequency of use matter? Why go to all this effort and not put some kind of weighting on its word choices?
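Presumably something like this is what's being asked for - a toy Python ranking that multiplies the raw swipe-shape score by how often the user has actually typed each word (all scores and counts invented):

    candidates = {"and": 0.90, "abs": 0.92, "Abbas": 0.93}  # shape-match score
    usage_freq = {"and": 2000, "abs": 3, "Abbas": 1}        # times typed before

    def ranked(cands, freq):
        # Weight the geometric match by a frequency prior; the +1 keeps
        # never-typed words from being zeroed out entirely.
        return sorted(cands, key=lambda w: cands[w] * (freq.get(w, 0) + 1),
                      reverse=True)

    print(ranked(candidates, usage_freq))  # -> ['and', 'abs', 'Abbas']

A slightly better shape match shouldn't beat a word you type hourly.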
I have a similar problem, but it is literally every English word that is also a common name. I send texts like "I Will run by the store and May grab dinner for tomorrow, Hope to be home by 7". It's infuriating.
I'm also blind. I use AirPods, and sometimes I'd like to know where the phone is. I ask Siri "where is the phone?" or say "make a sound." Siri just tells some stupid joke, or says "I'm right here." Not useful - just make a chime or something.
Even basic stuff like toggling rotation lock can't be done with Siri. I just don't understand how it could be that bad. I feel like I could sit down and code a better parser that would be more useful. And Siri was the first really big voice assistant - they've had 13 years to get it working.
> The error rate at speech recognition seems to have gone up over the years
This isn't unique to Apple: my Amazon devices have definitely become slower to respond, and less accurate when they do, than they were a while ago.
I think part of the issue is that the companies have not found a way to “properly” monetise the services (my Echo thingies were dirt cheap and I've never done anything with them remotely like making a purchase), so they are essentially throwing money at the infrastructure to keep them running out of fear of the backlash if they just let the services die.
I'm English and my wife and kids are French, so naturally all our devices are set to French - our Google speakers do not understand my terrible French accent, but bizarrely they work perfectly when I speak English, but only if I do a completely over-the-top French accent.
Same with Hindi. Needs a ridiculous English accent to understand Hindi. I call it the British Colonial Administrator accent. Like this guy - https://www.youtube.com/watch?v=V6_896OhnaQ
> Siri fails to understand the name of the person I want to call
After too many frustrated attempts, I found a crude fix for people I call frequently - use the shortcuts app to set a single word to trigger that specific call. Use words that are hard to misunderstand, like "newspaper" or "fantastic". Silly hack but so far it's been working flawlessly.
The project has been on the department trading bazaar, selling itself out to survive.
"So we in the macOS trading station will push Siri-enabled notifications on power-button wakeup, if you push the following answers on keywords up the likelihood.
Deal? Deal!"
Thus she lived on, long and prosperous... through diplomatic victories that destroyed the qualities that could not keep her alive on her own.
I use Siri to turn on/off voice control which lets me control my iPhone without touching it (it's a disability feature).
When I no longer need voice control I use Siri to turn it off (I have my voice control set to label all actionable buttons on the screen so I like to turn it off when using my phone via touch.)
I'm not visually impaired however I find voice control super useful when I want control of my phone and rather not use touch or in a situation where touch is impractical.
Whilst Siri is good for certain tasks it lacks the ability to control every aspect of the phone whereas voice control is literally a replacement for touch control.
By combining Siri and voice control I get fine grained complete control of every app when I need it, all enabled/disabled via my voice. Such a brilliant combination.
Apple's attention to this disability feature is incredible. I only learnt about it after a visually impaired friend showed me how it works.
I learnt that voice control/voice over is the reason most visually impaired people use iPhone due to Apple's dedication to building world leading accessibility features.
My wife switched from a Pixel to iPhone and has utterly hated Siri and voice assist on iOS to the point she'd rather just deal with her hand pain. And it's a lot of pain.
I've listened to see how bad it can be at understanding her, and it's appalling.
Note that this is probably about an accessibility feature distinct from the horrors of getting Siri to understand that you want to call your mother; there's an option to label every clickable element, and interacting with voice becomes much more sane and programmatic. "Click A, scroll down, click B" - that sort of deal. Apple built this Very Well.
I switched from a Pixel to iPhone and had to turn off Siri because the difference in quality is jarring. I'm also sad that I can't ask my Google Home to find my phone anymore, or use Google Assistant in the car with "Hey Google."
But a camera that can take more than 3 pictures rapidly + iMessage + magnetic charging + AirPods/Apple Watch support + build quality are ultimately just more important features to have on a device I use every day. I hate that getting these features right means that Apple can forever get the other features wrong.
Former Apple employee here: I still remember that for one WWDC, my colleague’s talk had to be altered to change all mentions of “A-series” SoCs to “A7 and later” to avoid triggering “Hey Siri” on every single person’s phone in the audience.
A scene in the second most recent episode of Ted Lasso, where a character uses Siri on their iPhone, triggered Siri on my HomePod Mini in the same room. Siri attempted to follow the instructions spoken by the character but failed miserably. If Apple is going to plug their half-baked assistant in their shows, the least they could do is prevent it from being invoked. It may vary by language – I’m in Australia and the character speaking has a British accent, yet sounds nothing like me.
There was also a period of about 18 months where running water in my shower would trigger Siri on my Apple Watch. Dismissing Siri was pointless, as it’d get triggered again 5 seconds later.
The same Ted Lasso thing happened to us. It's strange because Apple produces this content. You'd think they'd make sure their Apple TV+ content doesn't lead to bad Siri hiccups.
That made my day!
Just a few weeks ago I was watching a 'Shrinking' episode on Apple TV where they said "Hey Siri play <some-song>", and then my HomePod said "OK, I'll play <some-song> for you", and I yelled back "Hey Siri stop! I'm watching TV".
I'm surprised that happened with an Apple TV+ series. I was under the impression all voice assistants had a standard way of playing an inaudible signal to briefly disable their activation, for this exact situation.
My wife and I bought a VW ID.4. Its "smart" voice command keyphrase is "Hey ID". _Every_ time one of us says "I have an idea" in conversation, it kicks on. So now we have to talk around it and never use the word "idea" while in the car.
I use Siri fairly frequently for setting timers, controlling my smart home, and the like, but for me where it falls down is really understanding natural language. Personally, I found that Google Assistant was way, way ahead of Siri in this regard. I could ask it to do something and it would just understand what I wanted.
One issue I frequently have with Siri is that commands that work one day suddenly don't the next. If I ask Siri to "lock screen", it tries to find smart home door locks (which I don't have) instead of locking the device screen; eventually I figure out some combination of lock device/screen/phone screen/off etc. that works, so Siri does know how to do this but isn't smart enough to figure out my intent.
The other issue is that Siri, unlike Google Assistant, can't maintain a train of thought. You can't say "hey Siri, dim the bedroom lights" and follow that up with "hey Siri... a bit more", whereas Google seems to be aware of the context of what it did prior.
> where it falls down is really understanding natural language.
I would settle for Siri understanding simple commands. Natural language is something for Apple customers in the 2040s. I want to not have to use creative language skills to decipher things in Reminders. "Who is this Paul Cage I'm supposed to call? Oh, repairing the pool cage."
Forget LLM or natural language processing; Siri needs to catch up to 2018's Google Assistant
2023 Google Assistant needs to catch up to 2018 Google Assistant. It has become progressively worse since I invested a great deal into setting it up with smart-home devices in that very year.
Amazon mitigated bad speech transcription by providing a link to listen to the underlying audio.
Apple has never offered this, and it has led to lots of head-scratching at the grocery store as I work through my shopping list. It still beats trying to use the Alexa app while shopping, though.
Amazon is hilariously bad at understanding voices. The number of times it has played "Pure" instead of "NPR" is astounding, despite me going in to the Alexa app to report the error every time.
I'm convinced those error reports are round-filed.
Thrown in the (round) trash can. At least that's what I meant. Based on my googling it looks like it's an uncommon term -- way way less common than I thought. Not sure where I first heard/read it.
The lock screen issue is really funny, because I have the same experience: I would come home and say "Hey Siri, lock screen" to close the navigation. However, now it doesn't seem to work anymore; I have tried some of your prompts and some combination does work.
I also use it in a limited way and it's fairly successful. Just HomeKit: turn on, turn off, and set scenes.
However, I have one frustrating bug with Shortcuts that sends me over the moon. I have a little shortcut that just sends a simple text message when I say 'Hey Siri, Sweetie'. About 20% of the time Siri comes back telling me about the musical artist Sweetie... it's braindead.
The biggest problem I see with Google is that saying "OK Google" feels like a mouth exercise, especially when I have to repeat it so many times in a day.
I’m frankly surprised that Siri hasn’t moved to on-device voice recognition. It’s by far my biggest complaint with Siri: random, unpredictable delays or outright failures to recognize simple messages because the network is unstable or unavailable.
There’s other trash too: the other day I tried asking Siri to play a song from my library, but it misinterpreted the song title and proceeded to activate a seven day trial of Apple Music Voice to play some random track on Apple Music. I didn’t ask for that subscription!
To be fair, for the tiny subset of functionality I do use - navigate by voice, set timers, get terrible jokes to amuse passengers - it works fine.
All of Apple’s services by and large handle spotty networks horribly.
Take Apple Music for example: if you want to play music you’ve downloaded to your device, it’s often better to switch into airplane mode, because with a present-but-poor signal it will still try to load album listings etc over the network, and sit there on a spinner.
Spotify is just as bad in this regard! It used to handle spotty connections flawlessly a few years ago, but at some point it was altered to keep trying to pull album art and track listings from the network even for downloaded albums.
I have an issue with the Spotify app on the Apple Watch. I'm not sure if it's an Apple or a Spotify problem, but I have loads of trouble using the watch app to play music when I'm at home, near my phone and my Wi-Fi network. However, if I leave the house and walk away without my phone, leaving the watch on its 4G connection only, it works flawlessly.
If I come back from my run, Spotify will often stop playing as soon as I get home and back in range of the phone.
This is a common developer curse. I just ran into an issue at my job where customers were having an issue in the field that we just couldn't reproduce in the office; it took a long time to put together that the issue was related to failing downloads on sketchy networks because all of our networks are stable and fast.
This is so annoying! Every single time I try to play downloaded music on a bad connection it stalls instead of just playing the audio I literally have stored on my device. 100% agree with you that turning on Airplane mode makes it easier to play AM on iOS.
Not that this excuses it, but I've been able to get this to work by going into the "Downloaded" tab in the Music app. That seems to play directly off the device.
Seems like it does - cool! - but why does it sometimes spin for several seconds when faced with a seemingly simple request? I'm guessing it has some path where it tries to use the network if possible, but if the network is spotty then the requests timeout and it reports a failure?
My hunch is that on-device recognition leveraging the onboard ML hardware will be a whole new thing. The Siri brand is so damaged now that I expect they'll launch something new instead of saying "but it's actually good now!"
Except they already do this? Since iOS 15, Siri has used on-device recognition. That's why you can go into Airplane mode and do things like ask Siri to change the brightness.
It's even more embarrassing when you look at the old Apple commercials introducing Siri up to 11 years ago; here's one with The Rock from when they introduced the iPhone 7:
Anyone that's used Siri will watch that and laugh at how absurd it is that someone could get even one of those voice commands to work the first time. I'm surprised there hasn't been a class action lawsuit for false advertising. I can barely get it to start a timer on my Apple Watch sometimes.
Siri was better early on. In fact, it was better before Apple bought it than it has ever been since. And the primary reason is optimization for cost and scale.
And something similar is happening to GPT, although not to that extent. Bing has gotten worse over time as they use a pipeline of models, where smaller models serve the most common and easily predictable tokens, while the full model only handles the hard stuff. Except... you can't always tell what's hard stuff and what's not, so Bing serves useless, distracted answers that ignore user context from time to time.
The best bit about that ad is that my HomePod replied to every 'hey siri ...' with an error.
"Hey siri read me my last email" "I'm sorry I can't do that"
"Hey siri list my reminders" "You can use the photos app to do that"
"Hey siri show me my fashion line (?)" "You can try a search in your web browser"
I know you said it as a joke, but the contact list could be part of the hidden prompt - at least the names - so that the model could then trigger an action with the proper name, which iOS would then understand.
Ideally the model would insert an API call into its output to fetch the contacts first, then make the call.
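A hypothetical sketch of that flow - the model's output is stubbed, and every name, number, and function below is invented:

    import json

    CONTACTS = {"Mom": "+1-555-0100", "Maude": "+1-555-0199"}

    def fetch_contacts(query):
        """Tool: fuzzy-match a spoken name against the address book."""
        q = query.lower()
        return {n: num for n, num in CONTACTS.items()
                if q in n.lower() or n.lower() in q}

    def assistant(utterance):
        # Turn 1: instead of guessing a name blind, the model emits a tool
        # call (stubbed here as a fixed JSON string).
        tool_call = json.loads('{"tool": "fetch_contacts", "query": "mom"}')
        matches = fetch_contacts(tool_call["query"])
        # Turn 2: the tool result goes back to the model, which picks the
        # match (stubbed as "take the first one") before dialing.
        if matches:
            name, number = next(iter(matches.items()))
            return f"Calling {name} ({number})"
        return "I couldn't find that contact."

    print(assistant("call my mom"))  # -> Calling Mom (+1-555-0100)

The key point is that the model never has to spell the name right on its own; it only has to pick from what the tool returns.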
The scary part is asking the LLM to read your messages. Imagine someone texts you an LLM jailbreak. Siri might not be Siri anymore after it finishes reading.
A sad but true indictment. Siri on the Apple Watch is brutally bad; it fails more than half the time when I ask it to do something simple like setting a timer. I've disabled CarPlay because my car's own voice control is much, much better than Siri.
I’ve used Android phones in the past, and Google’s tube person. Siri has IMO clearly gotten worse over time, on both my phone and my HomePods. I wouldn’t be surprised if Google is better at this point.
It's not really "embarrassing". It's just old tech that was created in 2003, when the most common platform was Athlon XP CPUs. It was okay-ish for its time, and it may take time to migrate the current dataset.
Maybe it's the emerging neo-Luddite in me, but I'm alarmed at how many people begrudgingly continue to use these tools (accessibility reasons notwithstanding, which is an admittedly major caveat).
I have a very low tolerance for error in things like setting timers, playing music, getting weather, calling someone. This is because I can complete these tasks simply, quickly, and with 100% reliability by hand. The marginal improvements allowed by tools like Siri aren't worth it to me if there is any decrease in reliability.
I'm sure there are use cases I'm not considering, but I've also seen perfectly able-bodied folks shout "call Steve" into their phone for 40 seconds longer than it would have taken to navigate manually. Conceptually, the idea of stacking even more engineering onto these tools to get these tasks working reliably is funny.
Nah, I'll have to disagree with you there. I don't know about Siri, but I am one of those people using the Google Assistant to set timers and reminders, and for me at least, this is reliable (enough), and definitely faster than by hand.
The obvious use case is setting a timer while cooking, when the phone is at the other end of the room and your hands may be dirty, but I'm using it even in non-cooking settings.
In fact, I just tried it and timed both approaches for setting a reminder. Results:
* Assistant: "Hey Google, set a reminder 10 days from now to file my taxes.": 4.5s to say it (I'm free to focus on other things after this point), 7.5s until reminder is created.
* Manual (pick up & unlock phone, open app drawer, open Calendar app, find the day 10 days from now, click on that day, click create a reminder, type in "file my taxes", save): 17s
So Apple pumps more than a billion dollars into developing and marketing a virtual assistant which fails the vast majority of the time and you’re proud of it because it can set a timer by voice?
But I can second their point of view from a Siri user's standpoint. I use it almost every day for two insignificant tasks, yet it works and I use it every day:
- Shutting down the remaining alarm clocks (I'm really not a morning person)
- Setting up timer when I cook
At some point I also used it to control smart lights, but I haven't set the system up again in my new flat. And I'm seriously considering reverting to a purely physical, traditional light-switch interface instead of relying on IoT. (Takes ages to set up right, the setup rarely changes afterwards, you wait on updates just to switch a light on, lights stuck in flashlight mode when the electronics get damaged, etc.)
I just read it again, and they never express pride in the tool or even say they were using Siri. The comment is about the impact for them of using a voice-based interface to set a timer. Bad-faith argumentation.
This is the key thing, and the formula's different for everyone. I know that my strategy of "look at the clock and remember" will work until I'm senile, as will "panic-check the mail every day until my last tax form arrives and file that night"
Siri is tremendously frustrating to me, but helpful to my parents whose eyes are going bad.
For all Google's faults, the Google assistant is years ahead of Siri and Alexa at natural language comprehension. It's not perfect, but it clears my bar, and it comes in handy a few times per day.
It doesn't matter how frequently Siri gets things wrong, my wife insists on using it for everything. The number of times I've heard her repeat herself to Siri five times in an increasingly frustrated tone is pretty high. It's baffling.
Number of seconds is not a relevant metric for some people.
More relevant for me is how much and often I need to switch my physical and cognitive task at hand to briefly use my device. If I'm in a focused mode going back and forth in my house getting things done, it's a huge benefit to just verbalize my thinking in the moment rather than come to an abrupt stop and switch to visuospatial navigation to drill down to the timer app, and fine tune the amount before I can switch back to what I was doing before.
But that argument only holds if the voice assistant actually works. The point is that it often doesn’t, requiring repeated attempts that end up taking more focus and time than just doing it manually. There’s no way that doesn’t end up requiring as much if not more cognitive switching.
What bugs me the most about this is how fixable it is.
Apple is now at a point where even the oldest supported devices on their latest operating systems have either decent dedicated ML hardware or are Intel-based machines with enough compute power that they should really be able to do a much better job with inferencing locally (and the server component is irrelevant to local device capabilities anyway).
It should’ve been possible to bin Siri’s backend and rebuild something better from scratch by now. That Apple is swimming in money just makes things like this more jarring.
It’s honestly my greatest gripe with Apple as a software company. They put a lot of effort into polishing what they release but seemingly very little into maintaining it afterwards. macOS is the worst offender, where basic and fundamental things stay broken for years on end: settings not being applied after selection in the UI, permissions being set in the UI but not actually taking effect, Continuity silently breaking in FaceTime and iMessage. It’s gotten so bad I just accept that some former tentpole features no longer exist, because they’ve become so broken and no one cares to fix them.
Back to Siri though, it’s gotten substantially worse than when I used to use it on an iPhone 4S. That’s a decade plus of consistent regression while Google Assistant continues to progress. It’s so bad, and can’t be replaced, that I wouldn’t be surprised if it’s a leading cause of people exiting the eco-system altogether.
> That Apple is swimming in money just makes things like this more jarring.
This is a common theme. Big Tech Company X is rich therefore they should be able to do Y. But corporations don't really work that way, they build a monopoly around a few domains and then their organization is structured to maintain that monopoly. It's the exception, not the norm, for established companies to gain competency in a new domain, and it usually comes with an acquisition or a special initiative by senior management.
Mostly true, but Apple's whole thing is that their domain is user experience, not any particular technology. That's how you get from GUI to iPod to Apple Silicon. They should be able to do Siri right by focusing on experience.
I think the real problem is that they see Siri as a checkbox compete feature and not a core value prop that must be not just better but categorically different user experience. IMO it's a will problem, not an expertise one.
Siri reveals the problem with deep integration using proprietary standards. A modular stack designed around interoperability and open standards makes it far easier to swap a subpar software service with a superior one. Conversely, deep integration using custom interfaces requires redesign of many stack components to switch to a different design principle for just one component.
At least Siri doesn't pester you to buy various things or to use it for additional purposes like Alexa does as far as I know.
But, yeah, all the voice assistants are pretty bad, or at least bad enough that I mostly give up trying to use them for anything other than certain rote tasks. Siri may or may not be marginally worse, but none of them are good enough to, say, really use hands-off in a car unless I've carefully pre-defined tasks to perform (e.g. picking from a handful of memorized playlist names).
Gah, my mother has a few Echoes dotted about her house and it’s the only inanimate object I feel compelled to tell to shut the hell up on a regular basis.
Give me a direct response if warranted, otherwise a simple chime or acknowledgement. Multiple years into ownership no-one wants a “by the way…” with upsells to other Amazon services.
It’s even worse if the reason you were talking to it in the first place is to turn something down so you can actually speak to someone.
"By the way" is alexa's entire business model. They sell these devices at a loss and don't require a subscription for the nlp so they can "by the way" you into buying random shit. I'm surprised they let you mute it at all, especially after devices lost $10 billion last year.
Then maybe it should try to make a sale instead of blathering on about how I can ask it what sound a pig makes.
They have 15 years of purchase history, prime video usage, and presumably are snooping on what podcasts I play over Echo. They should be able to suck a little less. Any random page of a 20 year old sears catalog would carry more relevance to me than anything Alexa has suggested.
I am not really sure why people buy into this. These devices cost next to nothing to make; the royalties to the tax-haven holding co and the private-jet fuel are thrown into the "cost" as a write-off and a tax dodge.
I HATE that it does that; it made me try to move to HomePods, but even with Homebridge I couldn't get everything to work with them. HomePods also have their own drawbacks, like no screen to see a timer.
This is exactly the reason why I replaced all Amazon echo devices with HomePods.
Siri makes many mistakes, doesn't hear me correctly, doesn't understand me correctly, and so many other issues. I can tolerate all these mistakes that Alexa also makes.
I just cannot tolerate Alexa talking back to me, especially when I'm trying to do something important. I've lost my train of thought too many times because of Alexa.
Same with Google Home! One too many times, I asked it what temperature it was outside, and it went on some monologue about how I should try asking it to give me a summary of my day or something.
The optimal amount of times a product should interrupt your workflow to tell you about a new feature is 1 or fewer.
Just preference honestly, we have a lot of apple devices, and it's nice to be able to keep things consistent. If we both had Android Phones we would've probably used Google assistant.
I’d love to see a survey of people’s hit/miss rate with Siri, but with the extra dimension of what accent they have and how thick it is. IME, Siri’s speech recognition is actually quite fantastic (I almost never have an issue with a dictated text message saying the wrong thing), but:
- I have a very standard midwest american accent
- I talk to siri like I’m a pilot talking to ATC: As clearly and succinctly as possible, never saying “umm”, “uh”, or having to correct myself, etc (this is actually an acquired skill that takes time)
- I generally know what things Siri can do well and what it can't (I try to phrase things in ways I know are less likely for Siri to misinterpret)
My theory is that most people who have a bad experience either have a thick accent (and importantly don’t set the Siri language to something that matches their accent! there are multiple “English” settings in the Siri language, pick one that matches your accent!), don’t speak clearly/mumble, have a lot of noise in the environment, or some combination of the above.
You must have had some issue being understood, or you wouldn't have adopted the careful speech pattern.
I suspect it's hard to do a realistic test. Two of the most common use cases seem to be kitchen timers and playing music. Both imply a noisy environment.
Years ago, it seemed Siri would try to understand you via whatever voice accent you'd chosen it to speak in. If you wanted UK Siri, it would understand... a UK accent. Using a Siri UK voice, and speaking with a US midwest accent wouldn't always get you what you expected.
Your video doesn’t show you changing the Siri voice though, you’re telling it your accent. There’s “Language”, which is your accent (“English (Australia)” in your case), and then “Siri Voice”, which is set to “Australian (Female)”. So of course if you tell it you have an Australian accent, but then speak with an American accent, words like “Light” will get confused, because the en_US “Light” sounds like the en_AU “Late”.
If you have an American accent but want Siri to speak with an Australian one, set “Language” to “English - USA”, but then set “Siri Voice” to Australian. Don’t set Language to something that isn’t your accent!
Two pretty consistent frustrating issues I have are:
Me: "Hey Siri, tell my Mike I'll send that document over in ten mins. How about pizza for lunch. I also spoke to Jenny and she would come for lunch too."
Siri: "Calling mike."
So it cannot tell the difference between 'tell' and 'call'. Siri could use context, though: if I keep talking for quite a while after the maybe-'tell'-maybe-'call' word, it's most likely 'tell', because of the sentences afterwards. This is by far the most frustrating.
Second issue is:
Me: "Hey Siri - what's in my calendar today."
Siri: "Playing alternative radio station."
Music has started playing a tremendous number of times; I think the top branch in their "can't understand" code is to try and play music. About once a week Siri will take something I say, about anything, and start playing music instead. It's infuriating. I've deleted the music from my phone, but I can't delete it from the TV / HomePod Mini / Mac.
Edit: I've thought of a third which isn't so bad.
Me: "Ask Hassan has the mortgage has gone out yet?"
Siri: "Here's your message. `has the Moorgate has gone out yet.`"
It will replace random words in sentences with locations, and locations, in the UK, any word it cannot get, it will first assume I'm talking about any town in the UK and use one of those. It is ludicrous the amount of priority it gives to names of towns, I've even had tiny villages inserted into sentences before.
I wish we didn't have blogspam on HN but theinformation.com, which used to unlock certain articles for HN readers, doesn't answer my emails anymore. Totally fair on their part but the story (and thread!) are interesting and only the blogspam is publicly readable...so here we are.
I get that they're wary of an LLM-based Siri making embarrassing mistakes. It would be a bad look if some kid's phone provided helpful instructions for killing themselves when asked. Mummy and Daddy would sue.
However, directly answering queries is not the only way LLMs can be used. In fact, when ChatGPT 4 came out, the first thing I thought of was: "Wow, this could make Siri so much better!"
For example, the transformer technology behind GPT-style models was also used to create Whisper, a speech-recognition model with nearly perfect text-from-audio accuracy. One of my biggest gripes with Siri is that it is basically useless in a car because even slight background noise confuses its voice recognition. A Whisper-class model would fix this.
Another point is that many people don't realise that LLMs re-read their entire input for every word they generate! Their writing speed is so-so, but they can read really fast even when running on mobile device hardware. Think 10K to 100K words per second. An LLM could read through all of the text on your device when prompted for search queries in a fraction of a second. As long as this was carefully set up, it wouldn't be able to produce "bad output", because it would just be matching data to your prompt.
E.g.: Imagine GPT being prompted with: "Does this email match the query <q>? Say only YES if it does or NO if it does not. <email>"
It doesn't matter if it occasionally hallucinates and outputs gibberish; you just mark that as a "NO" and move on. This is also very easy to train out of a specialised version of the model using reinforcement learning.
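In (hypothetical) code, the whole scheme is just a yes/no gate. `complete()` here stands in for whatever local model wrapper you'd have; none of this is a real Apple or OpenAI API:

```python
# Sketch: an LLM as a binary relevance filter over on-device text.
# `complete` is a hypothetical wrapper around a local language model.

def matches_query(query: str, email_text: str, complete) -> bool:
    prompt = (
        f'Does this email match the query "{query}"?\n'
        "Say only YES if it does or NO if it does not.\n\n"
        f"{email_text}"
    )
    answer = complete(prompt, max_tokens=1).strip().upper()
    # Anything that isn't a clean YES (including hallucinated
    # gibberish) is treated as NO, so bad output can't leak through.
    return answer == "YES"

def search_emails(query: str, emails: list[str], complete) -> list[str]:
    return [e for e in emails if matches_query(query, e, complete)]
```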
PS: I just played around with GPT 4 to see how it behaves when asked to recognise requests for creating calendar entries, and it's pretty good. For example, it can correctly compute things like "next long weekend". Interestingly, ChatGPT 4 is already doing some similar prompt injection, and I can't override its sense of "current time".
Apple's Siri team has failed really badly. Everyone else is sprinting away while they're not even aware there's a race going on.
I think reading the Information article, there were 2 issues, which it wasn't good at separating. The AI model sophistication and usage are one - and yeah, I think they could be more creative in using LLMs and such. But the other one seems to be a back end issue, i.e. that brief mention of "clunky database slowing down iterations/upgrades". That always seems to be an Apple Achilles heel...and I'd hope also they are NOT using that terrible IT infrastructure company (Agilent or whatever) to outsource their backend...
I don't think that's quite correct. The underlying algorithms start by processing the prompt tokens in parallel, and once that's done, the intermediate key/value (KV) vectors are cached and re-used for each new token prediction.
What you may be thinking of is that once the model stops predicting at the end of an API call the state is dropped from RAM, and if you go back with another message in a chat, then the LLM will re-read the chat log up to that point as part of the prompt. So this is true on a per-message basis, but not on a per-word or per-token basis.
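Schematically (a toy sketch with a made-up `model.forward` signature, just to illustrate the prefill/decode split):

```python
# Toy sketch of prefill vs. decode in a transformer LM.
# `model.forward` is hypothetical; real APIs differ.

def generate(model, prompt_tokens, n_new):
    # Prefill: the whole prompt is processed once, in parallel, and
    # the per-layer key/value (KV) vectors are kept in a cache.
    logits, kv_cache = model.forward(prompt_tokens, kv_cache=None)
    out = [logits.argmax()]
    for _ in range(n_new - 1):
        # Decode: each new token attends to the cached KVs; the
        # prompt is NOT re-read from scratch for every token.
        logits, kv_cache = model.forward(out[-1:], kv_cache=kv_cache)
        out.append(logits.argmax())
    return out
```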
For me all of the current voice assistants have issues.
Playing music seems to be super hard to get right if you don't follow the mainstream. Either you get a playlist with always the same titles, or it starts playing some obscure remixes.
Home automation is, in my opinion, totally stupid to do by voice. The light should switch on when it is dark and I go into a room, and switch off when no one is in the room. Using voice, which works 99% of the time, is "slower", and the one percent of the time it fails is annoying, e.g. when the light in my kids' room switches on.
I see use in knowledge-augmented, ChatGPT-like assistants, since then I could reliably ask a digital assistant for information. Right now I know the moment the assistant starts answering with "this is what I found on the web" that it's going to be wrong. Similarly annoying is that Alexa, for example, seems to rewrite some requests, where I ask "what is X" and get the answer to "what is Y".
I'm pretty disillusioned and disappointed with all the different voice assistants out there, and I seriously think that voice on top of GPT will actually help me get what I want more often.
What would Siri’s market share be if you could easily install Google assistant on iPhone? How much more investment would Apple have put into Siri if they had to compete?
I haven’t noticed any improvement in Siri in close to a decade of being in the Apple ecosystem. The most frustrating thing is that every time I ask it to “turn the lights on”, it thinks I’m saying “off”. I grew up in the US and have no notable accent or mumble.
Note that there are hacky ways to add google assistant, but it doesn’t get integrated into the OS in the same way
I think I might be in the minority based on other comments here.
I hate things like Siri and whatever Android phones have.
It's incredibly rude every time I hear someone almost yelling into their phone in public. They get frustrated Siri doesn't work and loudly and slowly say whatever it is again like being mean to a dumb child.
People speak out entire text messages and the replies. I'm not pointing out folks with disabilities. I'm talking about perfectly healthy adults who seem to have no sense of awareness around them.
On my phone, I found it infuriating that Apple just inserted this Siri icon into the keyboard next to the spacebar. It took me a while to figure out how to disable it. I already have large hands, but my typing accuracy is pretty good and fast. When that icon was there, it reduced the width of the spacebar and return keys and made it impossible to type without Siri popping up every few lines.
I think more broadly I have a prejudice against devices listening to me. I've never liked Alexa, Siri, or any other voice-based automation system like customer support phone trees. Very rarely have I encountered a phone system that works well; mostly these devices and systems are just infuriating to work with.
> I'm not pointing out folks with disabilities. I'm talking about perfectly healthy adults who seem to have no sense of awareness around them.
Careful. It's easy to overlook disabilities that may not be immediately externally visible.
I have a friend who's heavily dyslexic. You'd never know it if you ran into her on the street; she's quite successful in her career. Yet she uses voice dictation when sending text messages because it's an order of magnitude less time and cognitive effort for her to write something out that way.
I tend to think the world would be a better place if we were all just a little bit less judgemental of things we see others doing that we can't immediately explain.
The claim that LLMs as they stand are somehow the answer is disturbing. ChatGPT's ability to be authoritatively wrong is a serious problem…
Just to see what it would do, I gave it a basic word problem the other day (two people drive towards each other) and it had the steps right, but buried in it was a simple logic error (it claimed that the two parties traveling at different speeds would travel the same distance in a unit of time).
That it was good enough to seem trustworthy, made it worse…
Sure, but do you need Siri to solve logic riddles? I don't. I need it to reliably hit other APIs and shortcuts with some basic arguments. Check out OpenAI's ChatGPT plugins [0]. You can connect the LLM system to an external service with just an OpenAPI spec and a natural-language description of what it does. I want that, via voice.
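For what it's worth, the plugin manifest really is that small. Roughly the shape below, shown as a Python dict (the name, descriptions, and URL are placeholders I made up):

```python
# Approximate shape of a ChatGPT plugin manifest (ai-plugin.json).
# All values here are invented placeholders, not a real service.
manifest = {
    "schema_version": "v1",
    "name_for_human": "Home Lights",
    "name_for_model": "home_lights",
    "description_for_human": "Control your smart lights by chat or voice.",
    "description_for_model": (
        "Controls the user's smart lights. Use for requests to turn "
        "lights on or off, dim them, or change their color."
    ),
    "auth": {"type": "none"},
    "api": {
        "type": "openapi",
        "url": "https://example.com/openapi.yaml",  # placeholder spec URL
    },
}
```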
Did you try that with ChatGPT4, or 3.5? GPT4 has gotten better at this type of problem, it seems.
I think the point people are making is that GPT models are getting better at an exciting/disturbing pace depending on one's point of view, while Siri has been around for years and has only gotten worse.
All voice assistants seem bad, but Siri seems particularly bad. God forbid I ask it to do something simple such as playing a specific artist or album on Apple Music or ask it to turn off lights I've configured in the Home app.
"Siri, please turn off all the lights in my apartment."
I live 20 miles away from work and have to go in 3 times a week. For a while, every morning, I would ask Siri for directions to [name of my company] to gauge how bad traffic was and how long it would take. Every time, she gave me directions to some pub in Saskatchewan, an 18-hour drive away, that shares no part of its name with the name of my company.
I just gave up.
Alexa is not much better. I set up voice recognition to turn on my espresso machine (Rocket is the brand). Every morning I wake up, I tell her to turn on my Rocket. For about 3 months, things worked perfectly. All of a sudden, half the time, she instead starts playing "She's a Rocket" by Robert Ealey, a thirty year-old song.
Honestly, I find Siri the most powerful popular voice assistant out there. Once you start using Shortcuts, you can use Siri for literally anything. With just a hold of the power button and a couple of words, I can start tracking a run, schedule a message to a friend, record my thoughts, close all my tabs that contain the word "youtube"... And so much more!
Shortcuts & Siri is honestly the main reason I switched from Android to iOS.
The timer, sometimes, and occasionally I try to get Siri to play something in the car so I don't have to look at my phone. It normally fails -- it will play something random from apple music before it plays anything from my library.
This drives me nuts. I get it, sometimes albums and songs have similar names. It seems like Siri should be able to deduce which I'm asking for given I've listened to one album over a hundred times and never listened to the random thing with the same name it decides to pull up.
The most impressive thing I can get Siri to do is "Hey Siri, play my <playlist name> on Spotify", and it works consistently. "Hey Siri, reply." is another decent one, but it seems to be working poorly lately.
Siri is mostly useless. It works about half the time for me. When I ask it to do a task like create a timer, it works, but asking it questions is almost useless. One time I asked "What is a picometer?" and it gave an answer from Wikipedia, but then I immediately asked "What is a femtometer?" and it said it couldn't answer it on the iPhone. My wife gets amused whenever I use Siri because half the time I'll start swearing in frustration at how stupid it is, and yet like Charlie Brown and Lucy, I keep trying to kick that football.
“This is what I found on the web for XYZ”… all the time!
At this point, they should just pipe the #1 search result into ChatGPT for a nice summary and have Siri read that out loud. It would be so much better.
Yes! My big complaint about Siri is that she's just the same old "here is a list of preprogrammed actions you can perform, with a fallback to just dumping it in a Google searchbox" but you don't get access to the list. So finding what she can do involves outright guessing, scouring the web, or closely watching Apple demos. She can do quite a bit, but you'll never find 90+% of the options because Apple only shows you a small handful of suggestions.
Discoverability is a huge problem. Also, Siri can't launch apps, which seems like it should be an extremely common operation. "Hey Siri, launch Words with Friends".
My favorite is on the Apple TV, "search for [some video] on youtube" will actually open the app and search for it.
I think the big issue is that a lot of these aren't discoverable, and because Siri isn't very smart at natural language you won't discover them by asking for something similar like you could with Google Assistant. An LLM for parsing requests could be super interesting here.
Siri mostly runs on device. Having a LLM needs a much more powerful device or Apple would need to drastically scale up its cloud buys from AWS, GCP, and Azure (which would in turn make the service more expensive).
It's useless beyond manipulating iDevices, navigation tasks and simple questions. Even then, if you don't ask in the exact right way it often doesn't work.
Is this any different than Alexa or Google, though? I guess Google is probably better at looking up facts and Alexa at ordering stuff (if buying things without even seeing a picture is your thing), but neither is that useful. Is this not just a case of voice interfaces being much harder to do right than people originally hoped?
(I say this as someone who has a Google Home that I use regularly, but mostly as a kitchen timer and music player.)
I mentioned an example in another comment: basic manipulation of timers doesn't work well, like adding time to an existing timer, which is really useful when I'm working in the kitchen. There are also major performance issues; "working on it" and "sorry, can you say that again" are probably two of its most common utterances. Random bugs like "there's nothing playing", and the wrong Siri taking over the request from a million miles away so you can't even hear it, are extra annoying.
Google Home is pretty bad at adding time to existing timers too. Sometimes it works, and sometimes it creates a second timer with a difficult to access name.
Personally, I've become convinced that a good talking/listening clock is actually a useful thing, but Google isn't providing it, and it sounds like Apple isn't either.
At this point it will be a race to see which virtual assistant can become universally useful first, and from what the article implies, Siri is not anywhere close to where it needs to be.
My favorite is “[Let’s] go home”, which is what I use to navigate back to my house via Apple Maps.
There’s also sending messages to people - you can send voice messages (“send a voice message to X”), which works decently well and avoids transcription failures.
There is one killer application for voice assistants: answering phone calls. I wish Siri could respond to all unknown phone numbers, screen promotional and robocalls, take notes if necessary and report what's important to me (and I need to be able to tell her what's NOT important). This is what the assistants should do, not switching lights or trying to play a random song.
Neither Siri nor its competition can do this. Speech recognition accuracy is not really important, it can just take voice notes as a fallback, or ask to "press 1" to talk to me directly.
I remember Google showing this kind of tech off a year or two ago at their dev conference. Was it just vaporware with a person in the back using a synthesized voice? We're waiting, Google.
Google's demonstration was kinda "Hey! We've got this great feature where our personal assistant will piss off everyone you interact with in order to save you 10 seconds making a dinner reservation". I'm not surprised it went nowhere.
Generally, you sit down to a meal or coffee with the executive in question; they answer your questions and try to convince you to join.
That conversation might span a few sit-downs or extend into e-mails.
If you eventually say “yes”, you’ll be funneled into the hiring pipeline midstream, skipping the entire front-end process.
It’s largely a formality — HR is told to hire you unless there’s a glaring red flag.
When I’ve been in that position, I wasn’t even asked for my resume until after we’d already negotiated my compensation, solely for inclusion in my HR file.
Hopefully they skip that step, but it is weird how often a recruiting process goes from a recruiter cold-contacting you with platitudes singing your praises and how valuable you would be to their company, to a group of interviewers demanding to know why you're worthy of being in their presence, expecting you to operate like a broken clone of Google and to recite CS theory that has never been even slightly applicable to the open position. But at least the 'inside-out condom' brain teaser has finally gone out of favor.
Sigh, I've been on the receiving end of that multiple times. :)
In some companies (the ones that aren't just cargo-culting or on autopilot), I'd bet someone in HR or the comp committee consciously intends it as hazing rather than evaluation. The theory might be that it makes the company psychologically seem more attractive (the brain reasons: if you're jumping through hoops for them, there must be a reason), and that it takes the candidate's ego down a notch so they're less demanding in compensation negotiation.
Every company has engineers who don't like some products or features. We aren't mindless corporate drones. I don't personally use any of my employer's applications.
Some article like this comes up every so often and it's so non-newsworthy. Anyone who's been at a project planning meeting knows there's a small contingent of engineers in the corner muttering that it's a terrible idea and it will never work.
Apple and other mega corps really need to spend some money investing in training/education for these highly specific fields. Advancement in Siri shouldn't be dependent on 3 people that were lost to Google.
I know a bunch of capable programmers that would love the chance to go into one of these specialties but don't have the resources nor opportunities (time/money/location) to go back to school.
I asked Siri to call the San Mateo County Public Works Department. It called the Sheriff instead. Normally it will ask to confirm before calling a number/person/business you've never called before, but perhaps with emergency services it's a bit looser? Regardless, I was pretty surprised that it misunderstood my prompt so dramatically.
I tried using Siri for a while, but the number of misfires on my watch made me give up. The slight convenience of (most of the time) being able to set a timer paled in comparison to it triggering off random conversation, trying to search for something nonsensical, and loudly proclaiming it couldn't find anything.
It doesn't help that Apple doesn't let third-party developers use it. The only SDK available for HomePod is locked behind a bunch of approvals, and the only thing it does is allow developers to implement music players. It's very specific to playing music. Not even podcasts.
Why can't an eCommerce app implement, "Hey Siri, tell Online Store to add Tidepods to my weekly order". ?
What about Audible implementing, "Hey Siri, tell Audible to add Tale of Two Cities to my reading list." ?
There are remnants of a car-service interface that never got used or shipped, AFAIK. If you ask "Hey Siri, get me an Uber to the train station", it just replies "sorry, I can't help with rides".
Or maybe even work with other Apple stuff better. I can say "Hey Siri, turn on Television" and that works and then say "Hey Siri, mute television" and it will reply, "There are no televisions to control".
I have a homepod in every room in my house. I talk to Siri every day.
He (she's set to an Australian male voice) is great if you know exactly what to say to get exactly what you want. He's horrible with general requests that you haven't made before.
I find it odd how Apple has seemingly dropped the ball with Siri. Don't all newer iPhones have some sort of neural chip or something that could help Siri expand its capabilities? It's strange that Amazon is so far ahead with Alexa and Siri is lagging.
I'd have expected speech recognition software to be good enough that you could have direct speech -> gpt type services. It almost feels like traveling to the past when asking things of Siri when I otherwise get very useful responses from chatgpt.
Speech to text is already excellent with an iDevice. Turn off WIFI and put it in airplane mode... then open up notes, hit the microphone and start some transcription.
If you say "Buy a two by four" the transcription goes from
Buy
Buy a
Buy a two
Buy a two by
Buy a 2 x 4
"Four inches by three inches"
For
Four inches
Four inches by
4" by 3
4" by 3"
And that's all on device.
The issue is that when going to GPT, that's going to cost someone a few pennies each time a request is made. When that's scaled up to the installed base of all Macs and iDevices that gets expensive fast.
I think Siri could really benefit from some LLM goodness. It's just too stupid. You have to deliver every command in bite-sized exactly worded chunks. Really annoying. But the tech is at a level where it can really improve it now.
It also needs some persistence. It needs to know that I never want to play music on my HomeKit setup. I tried deleting all the music from iTunes, but that stupid free U2 album keeps coming back, and it often thinks I want to play music instead of performing automation actions.
And it should be able to read my texts and tell me if something important comes in. Stuff like that. AI models should be able to deliver those things. All this manually scripted stuff is a dead end.
I use Siri for laundry timers on my Apple Watch. A very simple task, you'd think, but it occasionally tells me that this doesn't work without my iPhone nearby, even though it's an LTE-activated watch and there's even Wi-Fi available.
Then I use Siri a lot with my HomePods for music. It works rather well, but when it fails it hurts. Sometimes a new song is playing that I like. So I say "Hey Siri add this song to my inbox playlist." Siri then occasionally tells me "OK, I'm playing some-other-song-you-didn't-ask-for." There is then no way to return back to the previous song or find out what it was.
I use Siri just for playing music, setting alarms, and reminders, but even there I run into bugs on a regular basis. One day she stopped being able to find songs from Apple Music, but could still play anything I'd saved to my phone (wasn't reception or CarPlay permissions or any of the obvious troubleshooting things you'd think, trust me I looked into it).
Sometimes she'll also misinterpret commands for unclear reasons. "Tomorrow at 7am, remind me to call John" and she responds "Ok, I've turned on your 7am alarm". I try again, speaking more clearly, and she says "Your 7am alarm is already on"
I honestly don't know why Apple can't do their own ChatGPT. They have the money and resources to implement something like the ChatGPT 'plugin' architecture, so that if someone wants to turn on their lights or fetch specific 'private' information that Apple cannot use as training data, Siri can call out to 'private' APIs where appropriate to produce a sensible LLM-generated response.
I'm expecting that they will - and considering the raw power of their local hardware, I think they have the best shot of anyone at cornering the AI assistant market with:
- private, on-device language model execution (see llama.cpp for feasibility, and the sketch after this list)
- a single, consistent AI available with you wherever you go
- total access to your personal information / documents (knows your birthday, can see your meeting notes)
Because they have the hardware to run it locally, they have three very hard-to-beat advantages:
1) privacy, because the LMs can see all your stuff but none of it goes back to apple. Microsoft can't do this; they get flak every time they try and phone home with telemetry, and they don't control their platform enough to run a massive LM in the background.
2) omnipresence: if you're in the apple ecosystem, you'll always have your iphone with you. That means the LM will have access to location data, maps, chat - everything. And since it never leaves the phone, privacy-oriented people may be ok with it. And that means the LM can be exponentially more useful than just summarizing documents.
3) evaluation costs - they are the only competitor who will not have to pay for a massive datacenter, which means that the LMs can be as powerful as the M2+ hardware they sell. Everyone else will have no alternative to running the LMs on their centralized, expensive hardware.
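As a sketch of that feasibility point (not anything Apple ships): running a quantized LM locally is already a few lines with open tooling, e.g. the llama-cpp-python bindings. The model path and prompt below are placeholders; Apple would presumably target the Neural Engine rather than CPU/Metal.

```python
# Minimal on-device inference via llama-cpp-python: no network involved.
from llama_cpp import Llama

llm = Llama(model_path="./models/7b-q4_0.gguf", n_ctx=2048)  # placeholder path

out = llm(
    "Summarize these meeting notes in one sentence:\n<notes here>\n",
    max_tokens=64,
    stop=["\n\n"],
)
print(out["choices"][0]["text"])
```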
I used to have Google Home and moved to HomePod Minis. I should have run them side by side, but that would have gotten very confusing for other people in the family.
Siri is fine at controlling smart home devices, giving the weather, timers, etc., but if you ask it questions, more times than not it wants you to use your phone. Google Home handled that so much better.
We have Apple Music and have Homepods around the house so we use it for that.
<kitchen homepod> "Hey siri set a timer for 5 minutes"
"hey siri what timers are running" -> "it's 10:45am"
"hey siri how long left on the timer" -> "there's one timer with 5 minutes remaining"
<bedroom homepod>
"hey siri list all timers" -> "there are no timers set"
"hey siri how long left on the timer" -> "it's 10:45am"
Ok, maybe I’m an outlier here, but I love Siri on my HomePods. It does what I want in enough cases that it’s useful, multiple timers while cooking, setting them and canceling them when I am putting things in or taking it out of the oven.
Granted, I have very few things I use it for, but I like it. The kids like them in their rooms too. I’m happy with Siri, but maybe I don’t ask it to do much.
That's the trick with Siri. I use it on my Apple Watch when I'm riding my bike with an AirPod in: "Siri, play this podcast," or "Siri, call so-and-so."
And yet Apple still has a massive moat around Siri, because there is no official way (as far as I know) to replace Siri with another assistant when pressing the side button or saying "Hey..." I believe if you want to use Google Assistant your best bet is to say something like "Hey Siri, use Google Assistant" (honestly, I haven't even tried because of how ridiculous that is).
So they're clearly still in prime position to make their product better, even if they're behind right now. That's unless they get an antitrust lawsuit for unfair monopoly, which they probably should get considering this seems to me similar to what Windows was doing forcing IE as the default browser back in the days. Just like people should be able to choose their default browser, they should be able to choose their default voice assistant.
Siri seems to be actively getting worse to me. Not necessarily at understanding commands, but at the verbosity of its replies.
Recently, when I ask it to "Send a text to <name> that says <content of text>" it says something like "I notice you often send texts to <name> using Apple Messages, so I will use that for this text. Is that Ok?" before reading me back the text and then sending it. I'm sure that there are people out there who have a rich and complicated mapping of text-communication-app to recipients, but I literally have only one text communication app on my phone, the one that it came with, and I only ever use that. It's already annoyingly slow to interact with Siri on a multi-step process, and adding another step to it is awful.
The reasons sound very Apple-like. The current template-based Siri is not cutting it, but I do get that Apple does not want Siri to give wrong information. People already go bonkers if an Apple Maps route is incorrect. Given that current LLMs routinely give false information, I'm betting Apple is simply waiting on another leap or two in models, and slowly rewriting Siri in the meantime.
I'm having a good time playing with LLMs, but I certainly don't trust any output at face value. I know I would value a correct answer from Siri, or any other service.
Siri and the Apple mobile keyboard are both jokes; what they produce often seems like parody. Even before LLMs, predicting the next word in a 2-3 word phrase was not that hard. Now that we have LLMs that are pretty good, Siri seems like a real joke. Is the manager of this product permanently drunk or something?
Oh yeah, the keyboard replacing fuck with duck even though I never wrote duck until now is enough reason to pop over to Android.
I've noticed that the voice-to-text recognition is pretty good; what's awful is how that text is interpreted. You see the same problems with search in other Apple services.
For example, you might search for a street in Maps and search will just give up and give you a half-arsed result. Searching for HN Boulevard might instead return HN Avenue, even though you can see the correct result right there on the map.
You see similar behaviour in Music, App Store etc.
I switched to typing to Siri instead of using my voice [1]. I also disabled all triggers for Siri. Now I invoke it by pressing power button then type away.
It's more useful than dictating when I need something quick and discreet.
When chatbots arrive on iPhone, this is how I'll be talking to it.
I think it's not far-fetched to say that everyone hates brand-promoting voice assistants whose first job is to make money for a company.
I don't trust any voice assistant because I know every query is being stored and analyzed, and I don't own my own communications with the assistant. I also don't get to customize the assistant.
It's basically a shitty-future branding assistant who is inclined to send you to product pages and shit to buy.
Siri is not that bad, considering that Google Assistant is not great either.
What do people use it for? Mostly just to set alarms or turn off lights, etc.
With all these breakthroughs in AI, and things like Whisper bringing incredible voice recognition, these tools are ripe for an upgrade even though they have been stagnant for years.
Add ChatGPT-like abilities to a voice assistant and these 'voice assistants' become relevant again and actually deliver on their original promise.
Me: "Siri, give me directions to vaguely ethnic sounding café" where café is about 2km away.
Siri: "Getting directions to other cafe in London/Europe/Tajikistan".
So there's no line of code that says "if the found location is >2,000 km away, with no direct land route, and/or crosses multiple continents, it might not be the right one"?
I'm glad I'm not the only one. I find it annoying, intrusive and not useful for anything at all.
The only times I ever try to use it intentionally (like asking it to take a note, or make a hands free call while I'm driving) - it screws up or requires clarification to which I have to give screen attention, defeating the purpose.
Most of the time it just pops up and annoys me when I call my wife "sweetie".
I find it worst when getting driving directions for most streets and towns in New Zealand, because Siri has not been trained to recognize or pronounce Māori names. Asking for directions to, e.g., Oneroa (oh-ner-ROW-a) will have it attempt to find own-ERRR-ua and explain that it doesn't exist. What laziness! How hard is it to train Siri in all languages?
Siri rarely disappointed me because my expectations were always low and I trained myself on how to talk to it.
But now after some remarkable experiences using GPT-4 I find I’ve lost a lot of patience with all the different voice assistants. They are just so stupid in comparison. How much longer before LLMs and projects like Whisper run the backend?
The worst are the telephone chatbots. My bank's is so incredibly obnoxious and stupid that by the time you get to a human, your blood is boiling and it's hard to get back to a professional level. If they need to do automatic triage (they probably don't), why not just use an old-school "press 3 for X" menu?
You can tell they downgraded the long-range microphones somewhere between the iPhone 4 and the 7; it might be even worse in later models, because it really struggles to pick up you shouting across the room at it when that used to work pretty well on the 4.
Then again, the only thing I've used it for since launch is setting timers, and it rarely gets even that right these days.
Don’t really rely on Siri that much. I absolutely hated Apple as a kid, but I dunno. Apple can take their time rummaging through my data in my humble opinion. Like, do they need to be the leader of ML when we all know what it takes to improve the models.. all of our data.. so, yeah.. I’m good with this.
What is people’s experience with Tesla’s voice recognition system? I’ve previously used it to activate the windshield wipers, which worked. But recently I tried to get it to play music and various other things, none of which it was able to do… I was driving and haven’t had time to play around with it since.
I use an Android phone. Pretty much the only thing I use the voice commands for are setting alarms and timers. It works nearly 100% of the time, even in noisy environments.
My wife has an iPhone and it's hilariously bad at this. She needs complete silence and even then it's a coin flip.
I don't know if it's programmed in or just cargo culting but if Siri accidentally triggers and starts talking to you, "Nobody asked you Siri" is my favorite way to get it to stop interacting.
Said in the tone of Cobie Smulders in How I Met Your Mother (Nobody asked you PATRICE!)
"Assistant" type devices which can actually do things besides reply can't yet use large language models safely. LLMs need to get past the problem of making up stuff when they don't know something. Until then, it's not safe to give them power over devices.
They can, if we split it into parsing, execution, and response. If LLMs only do parsing and response, and execution is limited to preprogrammed actions, I think we'll have a much better assistant. It could be interactive and natural. But I agree that it shouldn't be given unfettered access to do whatever it wants.
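Something like this sketch: the LLM only ever emits a structured intent, and a fixed dispatcher does the actual work. `llm_json` is a hypothetical helper and the action names are invented.

```python
import json

# Preprogrammed actions; the LLM can only pick from this whitelist.
def start_timer(minutes: int): print(f"timer set for {minutes} min")
def lights_off(room: str): print(f"lights off in {room}")

ACTIONS = {
    "set_timer": lambda args: start_timer(int(args["minutes"])),
    "lights_off": lambda args: lights_off(args.get("room", "all")),
}

def handle(utterance: str, llm_json) -> str:
    # Parsing: the LLM turns speech into a structured intent, nothing more.
    intent = json.loads(llm_json(
        'Map the request to JSON {"action": ..., "args": {...}}. '
        f"Allowed actions: {list(ACTIONS)}. Request: {utterance}"
    ))
    action = ACTIONS.get(intent.get("action"))
    if action is None:
        return "Sorry, I can't do that."  # unknown intents are refused
    action(intent.get("args", {}))        # execution: fixed code, not the LLM
    return f"Done: {intent['action']}"    # response could be LLM-phrased too
```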
I often leave home with just my Apple Watch, and for the simple things I do with my watch voice control via Siri and some tapping on the screen is definitely good enough for me.
I think that all the deep learning models for handling speech on the Apple Watch are run locally on the watch.
I'm excited for a GPT or similar voice assistant. Asking siri basic questions and having it tell me it will send me some links on my phone is annoying. I have homepods in the living room and they aren't good for much besides music.
As a former Apple employee working on Siri, it is safe to say many Apple employees hated Siri even 10 years ago. Many tried to introduce new ideas to improve it, but, at least while I was there, nothing changed.
"Even" implies a consensus that I don't think exists except in the minds of people who find hating something recreational. Siri works pretty fucking well in my experience, but that experience is colored by my understanding of what it is, and what it's good at.
You have to be aware of what makes a good Siri question or task. Technical people tend to understand this implicitly; we are, after all, talking to a computer, and computers are notoriously literal and have trouble with implied contexts, etc.
I think I've talked about this here before, but my wife often phrases questions to Siri in a way that results in the dreaded "I can send the results to your iPhone" non-answer. One example I remember happened when we were idly talking about King Charles. My wife asked Siri "how old is king charles" and got the non-answer. I asked Siri "what year was king Charles born" and got hard data back.
It's that kind of thing.
In the narrow case of music there's more to complain about, I guess, but the base problem is specificity and name collision. It doesn't seem to always pull the example of any given non-unique name that I might want; sometimes I wonder if what i get is just random.
If you ask for "Take Five", you MIGHT get Dave Brubeck. I'd argue that, in the absence of something specific, you SHOULD get Dave Brubeck, and moreover you should get the album cut from "Time Out." But Siri doesn't really agree, for whatever reason.
OTOH, if you ask Siri to play "Take Five from the Dave Brubeck album Time Out" you'll get exactly what you want.
Siri excels in simple, discrete asks or tasks, though. We both routinely use it to add things to the shared shopping list we keep in Reminders. That's kind of awesome, and beats the old norm of "go find a pen to add this to the list that you may or may not remember to take with you when you go shopping." Setting timers or alarms verbally is awesome. The list goes on.
I have terrible luck with Spotify and Siri. These kinds of specific requests fail.
I tried it with your prompt and Spotify started playing the whole album. Which is better than my usual outcome. Of course, it took three tries. The first two times Siri said it was going to play and nothing happened.
Usually I don't even get the same artist or album I ask for. Admittedly, I stopped trying a few months ago, so maybe they've improved things a bit lately, or maybe I just got lucky.
At the very least, make it work in the car! I have an iPhone and a Pixel, and my car supports AA and CarPlay. When I travel, I much prefer AA because Google's voice assistant nails the AA interaction.
Outside of driving, I never really understood the popular use case for these voice assistants. In most situations it is easier to type your query rather than say it out loud.
It was a novelty touted as a big leap in technology.
I use Siri to turn on voice control when I drive. I have voice control number every item on the screen.
This lets me have 100% control of every feature/button and every app whilst I drive.
I can swipe on the screen using my voice.
I can write notes and messages.
I can switch to any app.
Voice control is a world leading accessibility feature that is a complete touch replacement controlled via voice.
Once I no longer need voice control I turn it off via Siri. The entire action of turning it on, using it, turning it off occurs with no touch at any point.
Also it's great for when I am cooking or doing diy and don't want to touch my phone.
In comparison Siri is better at a few narrowly defined actions but has less than 1% of the scope of voice control.
Reading people's complaints about Siri in this thread I feel what most people actually want is voice control.
But no one outside the visually impaired community knows about it.
It's for when you can't reasonably type, isn't it? Like while you're cooking you might say "skip to next song" or "set a timer for 10 minutes" or something.
The skepticism is strange to me because Siri is extremely well positioned. All Apple has to do is add a GPT-like backend, put Siri in everyone's iMessage, and boom ChatGPT is dead in the water.
Siri recently started mispronouncing simple words that it never mistook before on my iPhone. Phrases like "It's 65 degrees outside" now sound like "It ess 65". Really odd.
I'm a screenwriter and my focus is on character NO MATTER WHAT. We watch movies because we love the characters inside the plot, inside the universe, and how they relate to each other. CHARACTER is the CORE DRIVER of all stories, be it the story of Beowulf or Star Wars or Succession. Period.
Today, I am fortunate enough to be developing 'characters' and backstories for AI companies, and I think it's working; the new prototypes I've been working on really feel warmer (or more logical, or more relatable, or more philosophical, depending on what we're trying to transmit).
I think a lot of AI companies are really missing this important point, and I think Apple, of all companies, should have got this!
I have the opposite problem: Siri, or more recently Google's version of it, butting into a conversation because it thinks it heard something. Very annoying. I recently kicked Google Search off my phone because that seemed to be the only way to get rid of it.
This stuff needs to be optional and not forced down our throats, and it needs to be more user-controlled, work when we want it and not when we don't want it.
I found voice controls in anything to be so clunky in the early 2010s that I just rolled my eyes at things like Siri and smart speakers. I've never even tried them.
Because it sucks. I tell it to turn off all of my alarms and instead it ends up calling my mother at 4 in the morning, causing her to panic, thinking there's an emergency.
I've switched to using a ChatGPT shortcut in lieu of Siri and it's absolutely amazing. Apple should figure out how to license GPT and run Siri through it.
> The company’s senior leaders haven’t shown much stomach for the kinds of headline-grabbing gaffes ChatGPT and similar services have stumbled into over the last several months.
Risk aversion from immature tech is a part of Apple’s DNA. It should be no surprise that Apple declined to push Siri beyond its limited capabilities.
LLMs are quite new and Apple has plenty of cash to hire star devs to catch things up. Writing off Apple’s AI future is premature.
MacRumors' related article [0] has some more interesting details:
> By 2018, the team working on Siri had apparently "devolved into a mess, driven by petty turf battles between senior leaders and heated arguments over the direction of the assistant." Siri's leadership did not want to invest in building tools to analyse Siri's usage and engineers lacked the ability to obtain basic details such as how many people were using the virtual assistant and how often they were doing so. The data that was obtained about Siri coming from the data science and engineering team was simply not being used, with some former employees calling it "a waste of time and money."
> Many Apple employees purportedly left the company because it was too slow to make decisions or too conservative in its approach to new AI technologies, including the large-language models that underpin chatbots like ChatGPT. Apple CEO Tim Cook personally attempted to persuade engineers who helped Apple modernize its search technology to stay at the company, before they left to work on large-language models at Google.
> Apple executives are said to have dismissed proposals to give Siri the ability to conduct extended back-and-forth conversations, claiming that the feature would be difficult to control and gimmicky.
> Cook and other senior executives requested changes to Siri to prevent embarrassing responses and the company prefers Siri's responses to be pre-written by a team of around 20 writers, rather than AI-generated. There were also specific decisions to exclude information such as iPhone prices from Siri to push users directly to Apple's website instead.
> Siri engineers working on the feature that uses material from the web to answer questions clashed with the design team over how accurate the responses had to be in 2019. The design team demanded a near-perfect accuracy rate before the feature could be released.
> Engineers claim to have spent months persuading Siri designers that not every one of its answers needed human verification, a limitation that made it impossible to scale up Siri to answer the huge number of questions asked by users. Similarly, Apple's design team repeatedly rejected the feature that enabled users to report a concern or issue with the content of a Siri answer, preventing machine-learning engineers from understanding mistakes, because it wanted Siri to appear "all-knowing."
> In 2019, the Siri team explored a project to rewrite the virtual assistant from scratch, codenamed "Blackbird." The effort sought to create a lightweight version of Siri that would delegate the creation of functions to app developers and would run on iPhones instead of the cloud to improve performance and privacy. Demos of Blackbird apparently prompted excitement among Apple employees owing to its utility and responsiveness. Blackbird competed with the work of two senior leaders on the Siri team who were responsible for helping Siri understand and respond to queries. These individuals pushed for their own project, codenamed "Siri X," for the 10th anniversary of the virtual assistant. The project simply aimed to move Siri's processing on-device for privacy reasons, without the lightweight, modular functionality of Blackbird. Hundreds of employees working on Blackbird were assigned to Siri X, which killed the ambitious project to make Siri more capable.
Not only can voice input never be a universal, seamless input mechanism (which limits the reach of any application), Apple's specific implementation is just janky.
I generally like Siri and rarely experience bugs. I use it to set timers, get the weather, play music via Spotify, occasionally dictate text messages, and look up trivia mid-conversation. It's been my replacement for Alexa ever since I got creeped out by Amazon's audio data retention. In this context I'm surprised to hear Apple employees are unhappy with it.
I use Siri to set reminders except it is completely unable to handle reminders like “remind me this afternoon to follow up with Joe about the event on June 25”. It sees the June 25 and the reminder is now for the afternoon of June 25. I manage a theater, so this is a very very common scenario and it’s super frustrating. It speaks to there being no real understanding going on, just pattern matching on things that look like dates. I don’t need GPT4, but surely we can do better than this.
You could use shorthand for date and times in the title. I tried your prompt saying "J N 2 5" instead but apparently "this afternoon" is 5PM, so that's odd.
Siri hard codes this afternoon to mean literally right after 12 PM. So if at 1pm you ask it to remind you to do something this afternoon, it will put it at 12:00pm the following day. I just double checked it right now. It’s currently 12:18 PM, and I asked Siri “remind me to test this in the afternoon“ and it set a reminder for tomorrow at 12:00 PM
I gave Siri a lot of time and forgiveness before writing it off as an interface to set timers and literally nothing else, and even that Siri fails often enough to annoy me. I can imagine there’s a happy medium for other people, but I’m unsurprised that with as big a workforce Apple has now, there’s a contingent there with exactly the same opinions as a lot of us out here. Siri sucks, unless it doesn’t for you, but it still sucks for the rest of us.
Is it just me, or is Siri very bad at picking up accents (mine isn't that bad, having lived in the US for a while) and handling background noise?
One of the more impressive things about Google's voice assistant was that it did very well in noisy environments, whereas with Siri that is quite a bit of a struggle. This is only about speech-to-text, not text-to-whatever.
It deals with my oven fan and similar regular kitchen noise pretty well. I hold it close to my mouth at parties, so I can't really compare it to e.g. Google Home or Alexa's far-field capability.
Siri is also quite terrible at setting HomeKit scenes. I totally gave up on it. For example, I have a scene named "Play Music Everywhere", and even if I tell it explicitly, "Hey Siri, set scene Play Music Everywhere", it fails miserably. Even a simple task like "Hey Siri, play music on all HomePods" ends up with "OK, playing music everywhere". When did I say everywhere??? I only want you to play on the HomePods!
Yesterday: “Hey Siri, watch a movie in the dark” “Ok setting scene watch a movie in the dark”
Today: “Hey Siri, watch a movie in the dark” “I can’t find a movie called In the Dark.”
I also had a very frustrating issue randomly start happening because one of my lights had the word “lamp” or “light” in it. Thankfully googling the symptoms found others with the same problem and a solution, but it was baffling as there hadn’t been any changes made by me in over a year prior.
Well, it's clearly better to use an unambiguous keyword to avoid that. My scene equivalent to "watch a movie in the dark" was just called "cinema", for instance. Another was called "sunset".
It worked for over a year before it became erratic. If the name is problematic, Apple should warn me in the UI. It worked again shortly after, which makes me think it is not deterministic.
How do you get Siri to actually work with Spotify?
When I try to use it I get consistently laughably bad results. `Play songs by Albert King on Spotify` yields something like "Playing songs by R Kelly on Spotify". No, thanks.
Even when I ask for songs, albums, or artists I have saved in my library, Siri invariably finds something else entirely unrelated to play.
What's really frustrating is that this used to work reasonably well for me.
It does have that ability on the HomePod minis that I have around the house. If a timer is set on one of them and then I want to reset it or add minutes to it, I can do so (but only when that HomePod mini hears me).
The inability to set multiple simultaneous timers is annoying. I compensate by setting multiple alarms. "Delete all my alarms" works when I'm done, but it's certainly a hack.
Siri definitely has multiple timers, BUT you can only have two, and it only works on Apple Watch and HomePods.
I know, right?
Also, a little-known hack: to set a timer you don't have to say the whole thing. Just press the Siri button, or say "Hey Siri, n minutes", and it will set a timer for n minutes.
On Alexa I run many timers simultaneously. I didn't know it was a special feature. I just name them.
With the latest update(?), it quit announcing expiration by name if only one is running. It just beeps, and by then I may have forgotten what it was supposed to be timing. Annoying.
Especially annoying if more than one person in the house is setting timers.
>I use it to set timers, get the weather, play music via Spotify, occasionally dictate text messages, and look up trivia mid-conversation.
I use my Google Home similarly. But understand that these use-cases are utterly trivial, and Siri (by most accounts) and Google (from experience) still manage to get them wrong too often. It's unbelievable.
I have Google Home products and a YouTube Premium subscription. About 1 time in 10 when I ask for music, it’ll default to Spotify, which I have never used. How? Why? One of my speakers now responds to every request with an error when it’s not the primary speaker being spoken to.
It's a novelty, nothing more. I wouldn't rely on this stuff for anything. In fact, it's a bug or two away from being permanently removed.
Siri, Alexa, and Google Assistant are all generally good at those things but honestly not much else. I use my Google Home every day for timers and to turn on/off lights. I don't think any of them will ever be the helpful AI that they were initially promoted as, though.
I'm shocked how much the quality of Google Assistant has degraded over time.
I have a Mini that consistently misunderstands broadcast requests and says “sorry, I’m not playing anything right now”. When it does occasionally convert a broadcast to text, it consistently cuts off the first word or letter, so even “I’ll be right down” comes out as “L L be right down” for the recipients.
It used to support simple offline requests like SMS and Navigate when data was unavailable. No more.
It used to integrate with Google Keep. No more.
No longer recognizes the word "torch" as a synonym for flashlight. Why would I be asking to turn on my phone's "porch"?
Painfully slow replies, even in an ideal network environment... just spinning forever, often until a timeout that it doesn’t even have the decency to report with a proper error message.
It’s just amazing how they launched a product with a clear “this is where we are, this is our vision of where we’re going”, and they still sell it, but they’re going in the opposite direction.
My pet theory is that they were launched by the A team, who got replaced by the B team when the A team left for greener pastures. Or maybe they got it working, but in a bid to get another promotion they kept tweaking it, making it worse in ways that matter to us but not to the promotion committee.
See, at least Siri is usually ready to take your input the moment you press the button, even if it then casually discards your input because it can’t reach Apple’s servers or whatever.
Google Maps: I swear, about half the time I try to activate voice search, it sits and spins before even accepting any voice input at all. Why can’t it just start reading the microphone right when I activate it, and then submit the saved audio whenever it’s done getting set up? It’s so abysmally poor that it’s usually faster to scroll through recent destinations or literally grab the phone, unlock, and put in a destination.
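Here's a rough Python sketch of what I mean, with the audio and network calls stubbed out as hypothetical helpers (read_audio_chunk and send_to_speech_backend are stand-ins, not any real API): start buffering the microphone on a background thread the instant the user taps the button, and let the slow session setup catch up and drain the buffer in parallel.

```python
import queue
import threading
import time

audio_buffer: "queue.Queue[bytes]" = queue.Queue()
stop_recording = threading.Event()

def read_audio_chunk() -> bytes:
    time.sleep(0.02)             # pretend to read 20 ms of audio
    return b"\x00" * 640         # hypothetical PCM frame

def capture_loop() -> None:
    # Starts immediately on activation -- no waiting for the backend.
    while not stop_recording.is_set():
        audio_buffer.put(read_audio_chunk())

def send_to_speech_backend(chunk: bytes) -> None:
    pass                         # hypothetical upload call

def session_setup_and_stream() -> None:
    time.sleep(1.0)              # simulated slow session/network warm-up
    # Once ready, drain everything buffered so far, then keep streaming.
    while not (stop_recording.is_set() and audio_buffer.empty()):
        try:
            send_to_speech_backend(audio_buffer.get(timeout=0.1))
        except queue.Empty:
            continue

recorder = threading.Thread(target=capture_loop)
uploader = threading.Thread(target=session_setup_and_stream)
recorder.start(); uploader.start()
time.sleep(2.0)                  # user speaks for ~2 seconds
stop_recording.set()
recorder.join(); uploader.join()
print("no audio lost during the 1 s warm-up")
```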
This is just a market begging to be disrupted. I want to see a startup combine Whisper, GPT, and a competent TTS model into a killer voice UI!
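And the glue really isn't much. Here's a hedged sketch of the pipeline: the speech-to-text calls match the open-source openai-whisper package, while the LLM and TTS stages are placeholders since the provider choice is wide open, and "command.wav" is just a hypothetical recording of the user's request.

```python
import whisper  # pip install openai-whisper; needs ffmpeg on PATH

# Speech-to-text: these calls match the open-source openai-whisper package.
stt_model = whisper.load_model("base")

def transcribe(path: str) -> str:
    return stt_model.transcribe(path)["text"]

def ask_llm(prompt: str) -> str:
    # Placeholder: swap in any chat-completion API or a local model here.
    return f"(answer to: {prompt!r})"

def speak(text: str) -> None:
    # Placeholder TTS: e.g. ElevenLabs, Piper, or the OS speech synthesizer.
    print("TTS>", text)

def handle_utterance(audio_path: str) -> None:
    speak(ask_llm(transcribe(audio_path)))

# "command.wav" is a hypothetical recording of the user's request.
handle_utterance("command.wav")
```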
What flabbergasts me is how often the screen will display a successful speech-to-text capture, and then it poops the bed anyways. Like, you did it! You did the hard part! The part that feels like goddamned magic to me, converting the noisy messy reality of sound-waves into text. And then it drops the ball on the simple pile of "if" statements it takes to convert that text into an action.
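Taking the "pile of if statements" literally, here's a toy dispatcher in Python. Real assistants obviously use trained intent classifiers rather than regexes, but the point stands: once you have clean text, routing it to an action (including the "torch"/"flashlight" synonym mentioned above) is the easy half of the problem.

```python
import re

def dispatch(text: str) -> str:
    t = text.lower().strip()
    m = re.match(r"set (?:a )?timer for (\d+) minutes?", t)
    if m:
        return f"timer({m.group(1)} min)"
    if t in ("turn on the torch", "turn on the flashlight"):
        return "flashlight(on)"   # a synonym is one extra tuple entry
    if t.startswith("play "):
        return f"play({t[5:]!r})"
    return "fallback: web search"

assert dispatch("Set a timer for 5 minutes") == "timer(5 min)"
assert dispatch("turn on the torch") == "flashlight(on)"
print(dispatch("play songs by Albert King"))
```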
>I don't think any of them will ever be the helpful AI that they were initially promoted as, though.
I must disagree. ChatGPT-style LLM functionality with ElevenLabs-quality realtime voice synthesis will absolutely supercharge these products. The ability to e.g. answer kids' questions in simplified English according to parental prompt guidelines, or drill down on complex educational topics, or maintain context over many back-and-forth conversational interactions will be huge.
A) This happens whether in the car or not, and in the car I'm often using AirPods to get better noise cancellation. There's no excuse for this. I could talk to my Pixel 5 while driving from across the car.
B) It's not always a recognition thing. Siri often knows what I said and has a completely baffling response to it.
I have Alexa and I hate it, as well as hating Siri. At this point, when Bing (and maybe Google, I don’t know) can handle general-knowledge questions with AI answers (and a disclaimer), I basically consider Alexa and Siri unmaintained.
The reality is Apple deeply cares about releasing a polished product. Releasing an LLM-based Siri that makes really bad gaffes would be a PR nightmare. Google already suffered that when it opened up Bard despite pushback from folks working on it.
The fundamentals of LLM architectures, training, etc. are no longer “secret sauce”; tons of major tech companies are working on in-house LLMs at this point. I don’t see Apple as having no future for Siri; that’s a rather silly conclusion that doesn’t have much behind it at all.
> The reality is Apple deeply cares about releasing a polished product
This is absolutely untrue in the Tim Cook era. Apple releases buggy hardware and software all the time.
This article talks about how they released a bad product and know it. Siri is significantly worse than both Google Assistant and Alexa, and even those are comically bad sometimes.
Bugs in iOS and Siri give me rage aneurysms every single day (coming from a long-time Android user who just wanted a small phone).
If you think this is limited to Tim Cook, you weren’t paying attention. MacBook release after MacBook release had issues with heat killing GPUs and the like, and Apple always denied it was a design defect.
Didn’t Steve Jobs preside over antennagate and bendgate?
The software was significantly more polished in the Jobs era. I don't know why, because it's not like it's something Jobs obsessed over more than Cook. Apple just doesn't seem to have "good software" in their DNA.