Apple Reveals Siri Voice Interface: The “Intelligent Assistant” (techcrunch.com)
116 points by llambda on Oct 4, 2011 | 173 comments



This is a minor evolution of the voice capability I have already been using on my Android for quite some time. Because all of this relies on technology in the cloud, Google should have no problem leapfrogging this with weather, booking of flights, news, calendar, etc. Also, any future "inter-agent" stuff will be totally broken on Apple; all of their stuff, like iChat, is a totally walled garden.

For a small subset of people who are all on Apple with their friends all on Apple in some regions of the USA it will work great though.

EDIT: For the down-voters I would just say I actually had a MacBook for a few years. iChat is technologically impressive but has near zero value when you can't connect with people. Apple promised to make it widely interoperable and failed to do so when I was using it. They almost had to be forced to give back contributions to WebKit and open it up fully. They run a closed ship more or less in everything they do, the Apple way or the highway. This voice feature is a thin UI on top of cloud services and interoperability, both of which are areas where Google is light-years ahead of Apple (not to mention they have their own social network). Android is already crushing them in sales and Apple virtually doesn't exist outside the West + Japan; tens of millions of upper middle class Indians and Chinese will be using Android eventually. Apple doesn't stand a chance in the long run, which is why they are already resorting to patent suits against superior tablets like the Galaxy. It's not a BMW vs. a budget car: because 1s and 0s are free, the budget car has the luxury seats and trimmings at no extra cost. They are toast.


The really unique thing about Siri is the flexible natural language processing. You can say "Meet with Doug for dinner today", and it'll parse out the person/date/subject, realize you didn't specify a time, and ask you "OK, what time would you like to meet him at?" It's just a much more forgiving interface, which makes it more approachable for the average user.

It's actually a very hard technical problem to create software that not only parses the user's intention, but can intelligently prompt for follow-ups. Google has many of the individual pieces of technology, but making it all fit together, with a good interface, is still hard work that takes time. That's probably why it took Apple 1.5 years to go from the Siri acquisition to the announcement today.
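
A minimal sketch of that follow-up behavior, usually called slot filling. Nothing here reflects how Siri actually works: the slot names, the regexes, and the prompt strings are all made up for illustration, and a real system would need far more than a handful of patterns.

```python
import re

# Hypothetical required slots for a "schedule a meeting" request, each with
# the follow-up question to ask when the slot is missing.
PROMPTS = {
    "person": "Who would you like to meet?",
    "date": "What day would you like to meet?",
    "time": "OK, what time would you like to meet?",
}

# Toy patterns; these stand in for a real natural language parser.
PATTERNS = {
    "person": re.compile(r"\bwith (\w+)", re.IGNORECASE),
    "date": re.compile(r"\b(today|tomorrow|tonight)\b", re.IGNORECASE),
    "time": re.compile(r"\bat (\d{1,2}(?::\d{2})?\s?(?:am|pm)?)", re.IGNORECASE),
    "subject": re.compile(r"\bfor (\w+)", re.IGNORECASE),
}


def parse(utterance):
    """Return whichever slots the toy patterns can find in the utterance."""
    slots = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            slots[name] = match.group(1)
    return slots


def next_prompt(slots):
    """Return the question for the first missing required slot, or None."""
    for name, question in PROMPTS.items():
        if name not in slots:
            return question
    return None


slots = parse("Meet with Doug for dinner today")
print(slots)               # person, date and subject are found; time is not
print(next_prompt(slots))  # so the assistant asks for the time

slots.update(parse("at 7pm"))  # the user's answer fills the missing slot
print(next_prompt(slots))      # nothing left to ask
```

The hard part the comment describes is everything this sketch skips: understanding free-form phrasing and deciding which follow-up is sensible, rather than walking a fixed list.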


I agree. I think what Apple is releasing in Siri could be bigger than most of us realize. I have several Android phones and don't use the voice recognition much because 1 out of every 4 or 5 tries it returns things I didn't mean. That's frustrating. The average user quits after a 20% failure rate. Siri on the iPhone 4S needs to be at a 5% or lower failure rate, meaning at most one of every 20 requests returns an error. But then again, if it returns an error, at least the user can clarify and continue the conversation. I think Apple just leapfrogged Google in natural language processing and in taking it to the masses.


as someone who works a lot with speech recognition: speech recognition demos differently from how it works in the field. the demoer gets to remove all of the semantic noise from the system (people who aren't Phil Schiller or their wife, choosing "Greek restaurants" instead of a phrase easily confusable with another), leaving only signal.

if it works the way it says on the box -- and remember, Apple already released Voice Control, which has a less-than-stellar reputation -- then it could be revolutionary. voice recognition often doesn't.


I agree it's a big "IF". IF Siri works as advertised, then it really could be revolutionary. I remember being pumped up by Google's demo of voice recognition, only to try it out and find it novel and cool but not dependable and accurate enough to use in all my real-life situations. And it's difficult to correct errors or clarify requests in a conversational manner, the way Siri does.

Another reason to hope is that Siri had 19 people when Apple acquired it and most of them are still at Apple. I would imagine that Apple scaled their team significantly. Who knows? Maybe 100 engineers are working on it, since it's a cornerstone of the next generation of mobile devices (and probably coming to desktop in the near future too). But Google doesn't seem to put the same priority on natural language processing as Apple does, because more than 50% of Apple's revenue comes from the iPhone. How many engineers at Google are working on voice recognition and natural language processing? Maybe somebody here will know. Maybe max 10 engineers?

Apple also will be forced to scale Siri across multiple languages very quickly, especially if it works well. Currently they have English, French and German. But tons of people will want it, so that motivates Apple to innovate even more.

I guess we'll see very soon how good Siri on the iPhone 4S really is.


it's not an apples-to-apples (heh.) comparison, because Google builds their own speech recognition engine and Siri/Apple licenses the engine of a company called Nuance (at least, last I checked), and so their language scaling is limited by what Nuance can give them.


You're assuming that they're not able to augment/enhance the Nuance engine with their own improvements.


unless Siri has undergone a lot of changes since the last time I looked at it, there is a sharp line between speech recognition and determining user intent/question answering; it was founded as CALO (http://en.wikipedia.org/wiki/CALO), a program which didn't even do speech recognition.

speech recognition maps a space of waveforms onto a space of utterances in a language. determining user intent maps that utterance onto the space of actions.

Siri has done great things in the field of the latter; they license their technology for doing the former from another company.
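
That split can be sketched as two functions composed together. This only illustrates the shape of the problem: the fake waveform bytes, the lookup-table "recognizer", and the action tuples are invented stand-ins, not anything taken from Siri or Nuance.

```python
from typing import Callable, Tuple

Waveform = bytes            # stand-in for raw audio samples
Utterance = str             # text in some language
Action = Tuple[str, str]    # (action name, argument), purely illustrative

# Stage 1: waveform -> utterance. A real recognizer is a large statistical
# model; this table just pretends two recordings were already decoded.
FAKE_DECODINGS = {
    b"\x01\x02": "call mom",
    b"\x03\x04": "play jazz",
}


def recognize(audio: Waveform) -> Utterance:
    """Map audio to text; unknown audio decodes to the empty string."""
    return FAKE_DECODINGS.get(audio, "")


def interpret(text: Utterance) -> Action:
    """Stage 2: map text to an action. Knows nothing about audio."""
    verb, _, rest = text.partition(" ")
    if verb in ("call", "play") and rest:
        return (verb, rest)
    return ("unknown", text)


def pipeline(recognizer: Callable[[Waveform], Utterance],
             interpreter: Callable[[Utterance], Action]) -> Callable[[Waveform], Action]:
    """Compose the two stages; either one can be swapped independently."""
    return lambda audio: interpreter(recognizer(audio))


assistant = pipeline(recognize, interpret)
print(assistant(b"\x01\x02"))  # decoded, then interpreted as a call action
print(assistant(b"\xff"))      # undecodable audio falls through to "unknown"
```

Because the stages only meet at the text in the middle, a licensed recognizer can sit in front of an in-house interpreter, which is the arrangement described above.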


How many things can it really do, though? Are those not some preprogrammed use cases, and for the rest it falls back to Wolfram Alpha, which has been around for years?

Google also has offered "special" search results for a while, such as evaluating calculations or exchange rates on the fly. I don't think what Siri does is more advanced than that. It is a nice touch to integrate it more deeply into the OS, like "get me home" - but then that is also just one button with a popular Android app (Öffi for example).

Google Calendar also does some "real text parsing" afaik.

Can't really think about a more complicated example atm. OK maybe "set up a dinner with Doug and Sandy, where Doug is a vegan and Sandy doesn't like to go farther than 2 km from her home" - can Siri parse that and pick a suitable restaurant?

I need some ideas for birthday presents for my child, can Siri help?


Have you used Android voice recognition? It's pretty good in terms of recognition. It isn't so proactive as "Go to my calendar and make me money!" but it's quite an adequate little application.


It isn't just voice recognition. It's also natural language processing. Apple just leapfrogged Google.


Google's service also does some natural language processing. You can ask it to "navigate to <restaurant>" and it will look up the restaurant address near you and start up Google Navigation to that address. In other cases, Google's search engine does the NLP. You can ask "What is the time in Paris, France?" or "What is the weather in New York?" and it will give you the answer at the top with search results below.

They are similar services, but Apple seems to have a greater variety of intents that it recognizes.
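
The fixed-phrase style of matching described above could look roughly like the sketch below. The intent names and patterns are invented for illustration and are not Google's or Apple's; the point is only that each recognized "intent" is one more entry in a table, with plain search as the fallback.

```python
import re

# Hypothetical intent table: each entry pairs one fixed phrasing with an
# intent name. Supporting more intents means adding more entries.
INTENTS = [
    (re.compile(r"^navigate to (?P<place>.+)$", re.IGNORECASE), "navigate"),
    (re.compile(r"^what is the time in (?P<place>.+?)\??$", re.IGNORECASE), "time"),
    (re.compile(r"^what is the weather in (?P<place>.+?)\??$", re.IGNORECASE), "weather"),
]


def match_intent(query):
    """Return (intent, place) for a known phrasing, else fall back to search."""
    cleaned = query.strip()
    for pattern, intent in INTENTS:
        match = pattern.match(cleaned)
        if match:
            return intent, match.group("place")
    return "web_search", cleaned


print(match_intent("navigate to Luigi's Pizza"))
print(match_intent("What is the time in Paris, France?"))
print(match_intent("ideas for birthday presents"))  # no pattern fits
```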


Toast. I'm kind of shocked that the tenor of your remarks is to instantly pit Apple against Google and that the HN community has decided that it is the most insightful. Apple continues to achieve a level of quality in their products that most of us aspire to as entrepreneurs, and the tech world is better for it. The impulse to belittle them in light of an impressive achievement baffles me.

Android and Siri. Android does well, yes, but Google's financial success has not yet been shown to be correlated with Android shipments (look at the stock performance over the last two years for evidence). In some interesting instances competitors (Baidu and Amazon) have gutted Android to use it against Google. Every query that is performed on Siri is routed to services that are NOT search, which is where Google makes its money. Why Google would invest money in implementing mobile services that route computing away from the web is beyond me. It isn't in their interest. Joe Hewitt made this observation before too. So while yes, Google might be technically capable of adding a voice assistant, it isn't clear how that leads to increased profits for them. Feature parity is the wrong way to think about the Apple-Google relationship.


Agree. It beats me why Google has continued to pour so much money into Android with apparently no business plan to recover it. Apple has a completely different business model that lets them recover their investments. But with Android being free, I don't understand what motive Google has to keep investing. Their shareholders and directors are going to ask this question at some point.

It scares me that Google is looking more like the "New Microsoft" scared of everything that everyone does and trying to get their hands at everything.


>Apple continues to achieve a level of quality in their products that most of us aspire to as entrepreneurs, and the tech world is better for it. The impulse to belittle them in light of an impressive achievement baffles me.

Many have a problem with Apple because it's trying to normalize sharecropping in the software development world. I have to give them a 30% cut if I want to sell software for their hardware. I have to pay them $99/year just to put applications I've written on my own hardware (effectively subscribing to use my own hardware). If I write an application they want to compete with, they have the power to make my application unavailable on their device. If I write an application they somehow find distasteful or politically controversial, they can make it disappear. Microsoft is following Apple's lead. Why would I want to support either of them instead of an alternative like Android?


"I have to give them a 30% cut if I want to sell software for their hardware."

Right, if you build something and Target puts it in their store, you think they don't take a cut, or won't pull it from the store if it crosses taste lines? Is that "sharecropping in the retail world"? Sorry, no, that is just how business works.

Lady Gaga didn't become a big hit because she is the most talented artist and pressed her own CDs and kept 100% of the profits. She became a big hit because she signed a deal that gave 10% of her profits to her agent who got her another deal where she signed away another 70%+ of income to a company that got her the best beatmakers in the world and put her songs on the radio every hour.

Analogously, Apple has put a store together that brings tens of millions of paying customers right to your app's doorstep and makes it super easy for them to buy it. The deal they're offering is that they take 30% to cover expenses (customer service, servers, etc.) and as a finder's fee. If you don't like that deal you're free to say no but plenty of other people have said yes. Very few big software houses develop for Android but not iOS and very few develop apps for Android before iOS despite the market share disparity. Clearly they see Apple's deal as a lucrative investment and not sharecropping.


At least you'll get paid 'something' in the Apple ecosystem.

http://blog.earbits.com/online_radio/i-think-your-app-should...


Software piracy isn't new, yet software publishers have managed to survive. I don't think the existence of piracy is sufficient reason to embrace allowing hardware to be ruled by fiat.


Right, but that is just your opinion.

This is clearly a value proposition offered by Apple. Some people don't mind the closed system and opaque rules.


Yes, others are certainly willing to buy into a planned economy.


Not entirely true. Both iOS and Android have a way to remove applications from your phone. That's useful against rogue applications, the latest of which is HTC's Android giving away your personal data. You want a safe and unregulated market, but you can't have it both ways, especially if you are computer illiterate.


Removing malware from phones and the market is entirely reasonable. That's not the problem. The problem is deciding what kind of non-malicious apps people can use on their own hardware. For example:

http://www.foxnews.com/scitech/2009/11/12/apples-rejection-i...

It's not reasonable for a hardware provider to dictate how I use my hardware.


Oh please, Apple isn't toast. Michael Dell said something similar, and look how that turned out. What good is market share when you are making no money because of it? Apple has it right, they're making way more money than all the other handset companies and are constantly voted #1 in handset satisfaction. What's so wrong with that?

When Android return[1] rates[2] drop below 10%, then we'll talk.

1. http://techcrunch.com/2011/07/26/androids-dirty-secret-shipp...

2. http://news.cnet.com/8301-13579_3-20030211-37.html


How about Windows Phone then? Better UI than Android, more capabilities than iPhone. Comparably low return rates.

It's another matter that people don't know about it because of Microsoft's poor image and marketing.


No matter how well it is implemented, talking to your phone in public, or even when somebody else is around, just sounds too geekily awkward.


Haven't people been talking to phones in public since the StarTAC? Siri's big selling point is that you talk to it in the same cadence as you would a human on the other end.


Actually I don't think people have been talking to their phones (voice command) as much even though the feature has been available for the longest time. One reason could be that the feature was very poorly implemented. What these features lack is the context that we so readily assume when talking to a person.

The reason I, at least, don't use voice commands is that I feel stupid doing it if the phone does not understand me on the first attempt. I haven't used Siri myself. I use Windows Phone and it has a damn good voice menu system. But I still don't feel comfortable enough to use it.


> They are toast.

Apple earns about 50% of the profit in mobile handsets, a share that is trending upwards.


They managed to make a mobile phone a status symbol, but that is more a branding achievement than a technical one. Today I saw a garbage collector (really) with an iPhone, and both of my parents have one. The status symbol fades when everyone has one.

This number is only relevant for Apple's wallet; what matters in the long term is market share and being technically superior.

Nokia and RIM also had very neat numbers some years ago.


I have an Android phone, an ATRIX specifically, yet I recommend an iPhone to any non-technical person I know looking for a phone. The main reason I do this is consistency. Android has come a long way, but the iPhone still has a more consistent interface, including across applications.

If they have questions any other iPhone owner is a potential support person. If they get some Android phone they either contact me, need to find someone else with the same phone or someone who is willing to learn the UI shell of a given vendor and where things are on that device.

None of this has anything to do with status symbols. The last person I recommended an iPhone to was a retired police officer who wanted something to get information on but did not want to deal with maintenance the way he would on a regular computer.

After talking to him about the different approaches, including how iPhones prohibit you from doing things that are possible on Android, he was confused why he should care about not being able to do those things. He didn't want to validate permissions of applications when installing them, he just wanted them to not screw up his phone.

He was more than happy to give up some freedoms that he may never have used in order for someone else to try and protect him from malicious apps or to have everything "just work." I can't see that ever going out of style with people who couldn't care less about the underlying technology.


"You can be watching TV and see Coca-Cola, and you know that the President drinks Coke, Liz Taylor drinks Coke, and just think, you can drink Coke, too. A Coke is a Coke and no amount of money can get you a better Coke than the one the bum on the corner is drinking. All the Cokes are the same and all the Cokes are good."

-- Andy Warhol

I don't think Apple's goal with iPhone is to make you want it because it's exclusive; you're supposed to want it because it's good.

The Macintosh ended up as more an exclusive, niche product, but with later products Apple has moved away from that as a market strategy.


It is still a premium price, and the people are willing to pay this because it is a status symbol.

The huge profits come from an overpriced product; Coke makes its profits from a clearly distinctive product sold in huge volumes. You have to explain to a customer why it is worth paying a huge premium for a similar product. Coke lost this status symbol and is now cheap since Red Bull became the next cool thing.

Apple is incredibly good at sustaining a premium image with new versions etc. That is where the high profit margins come from today; I just don't believe that this will hold forever.

Apple wants you to believe that they are the "best", not merely good; everything they do depends on that promise. When regular people don't believe that any more, they have a problem.


"The status symbol fades when everyone has one."

Exactly. Apple has not been a status symbol or part of the counter culture for YEARS.


Yet their revenue and profit have been increasing... So even if you were correct, it clearly doesn't matter.


Garbage collectors are relatively well paid (~$15-25 per hour).


They really needed an iPhone 5 here. Every non-tech I know was waiting for it, now they're just confused. Would have been nice to make a big bang to break in the new guy.

For Siri, how many of us will actually use it day to day in the office or commute?

Personally, I was hoping for a new iPad 3 with Retina. Would have been a nice gift for the kids pre-Christmas. As it is, I'll be sticking with my iPhone 4 and iPad 1.


>For Siri, how many of us will actually use it day to day in the office or commute?

My wife has been using voice input on her low-end Android device ($150 with no contract) for months now, and she uses it pretty much every time she needs to navigate somewhere -- it's that good. I've also seen her use it for search -- general search and "Mexican restaurants near me" map search. It's fast, and it's pretty good at understanding what you're saying.

It's a cool feature, but the most compelling use cases are search related -- and Android has those nailed.


My 85-year-old relative can't hit the virtual keys very well on his Android phone, but he presses the speech button and asks for what he wants. I've seen him do it and it works pretty well, although sometimes he needs to repeat himself. I have worked with the Android recognizer API and it is very easy to implement in 3rd party apps.


What would an iPhone 5 be, though? Same new innards, different case? The iPhone 4 is a sharp looking piece of hardware. It's not like they need to throw it away just because a few tech nerds want different cases.


It's a matter of signaling/conspicuous consumption. People with brand new iPhones have no way of distinguishing themselves from people with the old model.

It's also another way that Apple drives upgrades in the high end: by making the old model look old and dated.

And it's a pretty old model. I agree with you that it's a beautiful piece of hardware, and I think it looks better than any of the other models, and better than any of the 'teardrop' mockups. But it's already been in the market for, what, 16 months? If they don't have another refresh until next year, that's a pretty long time to have one chassis.


> It's also another way that Apple drives upgrades in the high end: by making the old model look old and dated.

I also object to that remark attributed to Apple, the computer or smartphone company with the longest model design life cycles.

The vigorous aftermarket for Apple's supposedly "old and dated" models would also suggest demand for Apple's products isn't based on design obsolescence or design cues.

There will be people who upgrade as a signal, but I attribute that to those buyers, not to Apple, whose design innovations seem largely in pursuit of more usable, durable, or lasting product.


That's a fair point.

But I wasn't making a value judgment. I think it's a perfectly legitimate way to drive desire/demand.

I also think that long model design life cycles are carefully calculated to create desire as well as create a great product.


I think you overestimate how much Apple relies on "making the old model look old and dated".

My 2011 MacBook Pro 15.4 looks remarkably similar to my old 2006 PowerPC PowerBook. The 3GS is one of the best selling smartphones on the market even though the form factor was introduced in 2008. There was some analysis a few months ago that made a compelling case that iPhone 4 sales were constrained by the ability to manufacture them.


It's possible I overestimate it. And I don't mean to say that it's their driving motivation or anything. I do think it's some significant part of the calculation.

Laptops are a different market. You have your phone on you all the time so it works much better as a signaling device, and it's only ~$200 to upgrade.

The 3GS sells well at least in part because it's less expensive. The ones you're targeting with design obsolescence are the high end who will pay again for the high end.

I've already seen comments around here to the effect of: "Hm, just an internal spec refresh. Guess I can wait until the iPhone 5."


Apple doesn’t do that (all that often). If they like a design they are willing to stick to it.

MacBook Pro. Mac Pro.


> But it's already been in the market for, what, 16 months? If they don't have another refresh until next year, that's a pretty long time to have one chassis.

"Next year" is only 8 months this time around. The 3G chassis lasted exactly the same duration, and they're using the 2 year operator contract renewal as their clock: tick for design, tock for specs.


Lucky. In Canada we have 3-year operator contracts. We're always behind in new phone releases.

Apple is very good at making you feel like you are using an obsolete piece of equipment.

On the other hand, they release iOS5 around the time the 3-year contracts for the 3G expire, so it's a good bit of planning (or coincidence) on their part.


> It's a matter of signaling/conspicuous consumption... it's already been in the market for, what, 16 months?

Counterpoint: Movado Museum Watch[1]. Latest fashion goes out of style. Great design becomes timeless[2] because it's not "simply an adjective to place in front of a product’s name to somehow artificially enhance its value."[3]

1. http://www.amazon.com/Movado-606085-Museum-Black-Leather/dp/...

2. http://garry.posterous.com/dieter-rams-his-first-braun-desig...

3. http://www.telegraph.co.uk/technology/apple/8555503/Dieter-R...


I don't think it works that way with tech products, because an outdated smartphone is considerably less functional than a cutting-edge one. I.e. there's a functional aspect to the desire that goes beyond fashion, or maybe fuses the two. Or are you implying that it will ever be fashionable to use a technologically outdated phone because the design is nice? (I could see it in some tiny contrarian subset of the population, but not as a general phenomenon.)


> I could see it in some tiny contrarian subset of the population

Tongue firmly in cheek example:

http://www.dailymail.co.uk/tvshowbiz/article-2039931/Jamie-L...

You're right about the internals, but this discussion was about external design cues. Even so, to the casual user (the majority), software use on iPhone 4 will not be "considerably less functional" than on the iPhone 4S so in this case an external cue would not signal any particularly important functional change.


I don't think that is Apple's marketing strategy at all. Apple products are no longer about exclusivity, people buy them because they are good. They've marketed the 'Just works' and it has reached the ears of many generations.

My father bought an iPad 2 the other day, for no real reason apart from that they 'are good for browsing email'.

If you notice Apple's design timeline they keep the same design for a number of iterations on a product. If they were still about the exclusivity marketing game they would be releasing new colours and names and designs every year.


> But it's already been in the market for, what, 16 months? If they don't have another refresh until next year, that's a pretty long time to have one chassis.

You should see the PowerMac/Mac Pros...


Yep, the name change alone to iPhone 5 would have satisfied them. 4S sounds like a slight upgrade, a bit of extra speed (we know it's more, but that's the perception from those I've spoken to). Anyway, the future will tell us how well it does.

I still think they missed a trick on the iPad 3. They should have done that instead of the iPod touch.


The iPod is refreshed in the fall, the iPad in the spring. Each model lasts at least a year. It's like clockwork.


> They really needed an iPhone 5 here.

The 4S thing didn't surprise me too much. This is consistent with what they did with the 3G -> 3GS transition. For someone like me, who has an iPhone 4, I don't feel the need to upgrade out of contract, but will likely upgrade when the iPhone 5 comes out next year. Doing a major revision every two years makes sense given the way mobile contracts work.

> For Siri, how many of us will actually use it day to day in the office or commute?

I would definitely use it while driving. It's not hard to take a quick glance and read something while driving (I know, it's still bad) but it's significantly harder to try and type something in, which requires a lot more concentration and is much more dangerous. The flip side is that I'd probably use my phone more when driving, which is not good. Maybe it's like the seat belt thing, where the seat belt gave people enough sense of security to drive more recklessly than they had when seat belts were not a requirement for manufacturers.

> Personally, I was hoping for a new ipad 3 with retina.

Given the title of the event, expecting a new iPad was a bit of a stretch!


I don't understand your point. Siri is going to fail because Apple is a walled-garden? That makes sense for iMessage, but none for Siri. Are you saying it is going to fail because it doesn't support all languages?

Your edit ends with "They are toast". Seems like obvious flamebait, and is only going to attract more downvotes.


iChat is XMPP.


To expand on this: XMPP is the opposite of a walled-garden. I've used iChat with Google Talk, I've used it with an internal Jabber server, and the 'normal' iChat is AOL IM.

Even the Bonjour chat is open - just XMPP with ZeroConf discovery.

To say that iChat is 'totally walled garden' betrays a deep lack of understanding of the software being discussed.


WHAT?? You mean Google copied Apple already? What next, notifications?

[Ha-ha, just some good-natured ribbing, folks, please go easy with the downvotes.]


I know that I shouldn't be surprised at this point, but making it a 4S exclusive is one of the more blatant bits of planned obsolescence I've seen. I was running Siri just fine on a 2nd-generation iPod touch over two years ago, and looking at the demo, it doesn't seem to be much different, just able to hook into Apple's internal APIs.


I wrote a post last night (http://news.ycombinator.com/item?id=3069745) analyzing the original 2008 Siri keynote by Tom Gruber, Siri's co-founder, CTO, and VP of Design.

The post got pretty popular on HN and I had 5 key questions/predictions, feel free to read all the details (and lots of quotes I transcribed straight from Gruber's video) but now I'll summarize my thoughts quickly below:

1. Very few languages for Siri (where's Spanish?)

2. No API announced for developers to add tasks to Siri

3. It's still named...Siri? What happened to Assistant? I guess it is quicker to say in practice...

4. Siri's in BETA? Is this the first time Apple's released a major iPhone feature with a beta sticker?

5. No payments integration with Siri mentioned. Can it buy stuff for me? as Gruber talked about in 2008?

6. No Facebook partnership for social knowledge on Siri, or even iPad app.


I don't think Siri is in beta. I think only the 'voice dictation for arbitrary text input' feature is in beta.


So I guess you're not that fascinated any more, Ian? Maybe now you understand my cynical reactions to your original post :-P


I'm sure the jail breakers will solve this problem.


They shouldn't have to. If there are no technical reasons why this can't work it should.


This has been standard operating procedure on iOS devices since...ever.

There's also no reason that Safari can't upload photos out of the browser...except the fact that it would allow devs to write web-based apps that use the camera, meaning less mindshare going to iOS dev. Right now (well, as of about a month ago, the last time I looked), that's impossible.


Agreed about Safari, but iCab for iOS can upload photos from the browser. I have uploaded avatars to Twitter etc using iCab and those sites' regular upload forms.


I believe the voice recognition accuracy of Siri is far superior to Google Voice Actions and requires the dual core CPU.

Google has taken a different approach, where your voice sample is uploaded to a Google server, processed, and downloaded back to the device. This takes less CPU power but is also far less accurate, as the voice sample must be very low quality to have a quick response time from Google's server.

Apple/Siri are taking the approach that high quality voice recognition must be done on device in order to provide the level of performance and accuracy that voice recognition requires. I think we will find that Siri actually works and doesn't have as many errors as Google's voicemail transcription.

This is the reason for requiring iPhone 4S.


Why would it be less accurate? I've worked with voice quite a bit at my last job and the codecs and bitrates for voice don't need to be this heavy-duty CD-quality stuff. Your big limitation is going to be those cheesy mics and background noise, wind, etc.

Voice compresses nicely. Turns out we humans aren't capable of making such varied sound that it can't compress. Our mouth holes are ancient technology.

In practice, I'm in love with Google's voice capabilities. It seems to understand context. It's crazy how accurate it is. I often tease my iPhone friends with it. I'm also highly skeptical that an application on a phone can outdo Google's massive libraries and server infrastructure. If anything, I'd expect the Apple voice to be worse. Regardless, I can't wait to see this stuff in action. A war for the best voice recognition would be great right now, as it's been a patent-blocked and ignored field for the most part.
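
A back-of-the-envelope check on the bitrate point. Every number below is an assumption picked for illustration (a 4-second request, 16 kHz 16-bit mono capture, a 24 kbit/s speech codec, a 500 kbit/s uplink); none of them come from Google or Apple.

```python
def pcm_bytes(seconds, sample_rate_hz, bits_per_sample, channels=1):
    """Size of uncompressed PCM audio in bytes."""
    return seconds * sample_rate_hz * bits_per_sample * channels // 8


def coded_bytes(seconds, bitrate_bps):
    """Size of audio encoded at a constant bitrate, in bytes."""
    return seconds * bitrate_bps // 8


def upload_seconds(size_bytes, uplink_bps):
    """Time to send size_bytes over a link of uplink_bps, ignoring latency."""
    return size_bytes * 8 / uplink_bps


UTTERANCE_S = 4        # assumed length of a spoken request, in seconds
UPLINK_BPS = 500_000   # assumed mobile uplink, in bits per second

raw = pcm_bytes(UTTERANCE_S, 16_000, 16)   # assumed wideband PCM capture
coded = coded_bytes(UTTERANCE_S, 24_000)   # assumed speech-codec bitrate

print("raw bytes:", raw, "upload s:", upload_seconds(raw, UPLINK_BPS))
print("coded bytes:", coded, "upload s:", upload_seconds(coded, UPLINK_BPS))
```

Under these assumed numbers the uncompressed clip is 128,000 bytes and the coded one is 12,000 bytes, so the upload itself is unlikely to be what forces low-quality audio. Whether a particular codec hurts recognition accuracy is a separate question this arithmetic doesn't answer.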


You've got it backwards. Modern speech recognizers have a vocabulary of a million words and multi-gigabyte models. It's generally much more accurate to do speech recognition in the cloud, since you have more processing power and more RAM to hold large statistical models.

The rumor is that Apple is sending the audio to Nuance servers, i.e., they're doing cloud-based speech recognition.


False, they're doing on device speech recognition. Servers are only used if you request information from the Internet. Look at the demos and read the hands on reports. On device recognition makes it much more usable than Google Voice Actions.


+1 on the cloud advantage.

I tried Google Voice Actions on my Nexus One quite a while ago, but it was optimised for the US market. The accuracy for me was so bad that I didn't bother with it.

Then recently, a Google blog on RSS said the latest app had been optimised for my locale. Now, of course, it's spookily accurate.

You can't beat your algorithms in the cloud being bombarded with sample data round the clock.


    This takes less CPU power but is also far less accurate,
    as the voice sample must be very low quality to have a
    quick response time from Google's server.
Has there been a comparison of the accuracy of the two services? This claim seems unsupported.


Yeah but there is a monetary reason why this shouldn't work, and that is reason enough for most companies.


They probably don't want to worry about backwards compatibility.

By focusing exclusively on the newest model, they can maximize the power of the feature.


I think the main issue is they don't want to roll out a beta product to 100M devices all at once. The secret to quality voice recognition is a great training set. The experience will get better over time. My guess is it will be rolled out to all devices after a beta period.


It might be that the Siri functionality is tied to the A5 chip. The A5 design is a lot larger than other dual core ARM A9 designs (even after accounting for the larger GPU) and thus has lots of spare silicon to use for specialized circuits. It is possible that Apple added a custom DSP to aid speech recognition.


It seems the actual speech recognition happens on a remote server, not in the device itself.


I was just checking the hardware specs on ifixit. Looks like the iPhone4 has 3 microphones in it. So I am guessing they are sending audio from more than 1 mic. I found no such hardware info on the teardown of 3GS and below. Maybe I didn't look hard enough but there does seem to be a hardware-related reason for doing this.

EDIT : I believe > 1 src is needed for fighting noise so you sample the same audio from 2 locations and can distinguish between multiple audio sources in the input signal. This seemed to be one of the selling points of the mic array in the kinect


We won't know this for a little while, but there might have been changes made to the software that Apple felt required the extra processing power of the A5. Of course, we won't know for sure until someone jailbreaks and tries it on older hardware.


For that matter there may be custom signal processing hardware inside to support Apple's implementation.

Particularly as they did just advertise that they added a custom ISP for the camera. I'd be surprised if they wouldn't go to similar lengths for the voice assistant.


It would've been great to see Siri being adopted on multiple platforms. Such a waste!


Sounds like it's an evolution over Android's voice stuff, and better integrated. But the tech isn't what stops me from using Android's voice commands, generally. It's that I'm not often enough in a situation to use it without feeling embarrassed (not going to do it at work or out and about, if I'm at home, I'm usually in front of the computer), and I usually forget about the function. Same goes with location reminders. I have an Android app that does them, but I never remember to actually set the reminders, because I don't have frequent enough opportunity to use it.

The car is probably the only place I use voice commands frequently, but I don't drive that much.

And really, aren't the bluetooth headsets a big enough scourge? Do we really need a bunch of people walking around talking to their phones without even having someone on the other end?


  Do we really need a bunch of people walking around talking to their phones without even having someone on the other end?
Actually, walking down the street is the speech-to-text situation for me. Sending a quick text or Gtalk reply with voice is a hell of a lot easier and safer than trying to pound one out when I'm in the crosswalk.


Maybe I'll give it a shot.

[Edit: I just tried it on my lunch break. Felt very awkward. Different strokes, I guess.]


Did you pretend like you were talking to a real person? I could see the conversational style that Siri supports breaking down some of the awkwardness of this.

I would feel less awkward saying "Hey Siri, what's the weather going to be like tomorrow?" into my phone than "WEATHER. TOMORROW."


Yeah, it's a weird issue. It sort of depends on the context, but I think I would feel silly/awkward either way, but for different reasons.

If I'm at home, I feel less silly giving commands like "WEATHER. TOMORROW." I guess because I don't want to feel like I'm talking to something that's not human as if it's human? I'm not sure.

If I'm out and about, I want to use natural conversation so I don't look like an idiot, but would still feel internally strange, I guess for the reason above.

It may just be a matter of socializing the behavior enough for it to be acceptable. Maybe I won't feel as weird if everyone does it all the time.


"I don't use it, so nobody will use it."

Slippery slope, man. Easy to forget there are a billion others you haven't met yet.


Oh, I wasn't trying to imply that it will fail. Just relating my experience that social norms and robopsychology are bigger hurdles than the technology, imo.


I imagine of the ~7 billion people alive today, perhaps 6 billion are not potential Apple customers at all due to price. Quite a few billion are potential low-end Android phone users though. As I said in another comment in this thread it costs to put luxury seats into a cheap car to attempt to catch up to a BMW, it doesn't cost to put a great many features into budget Android phones though.

This is why Apple is starting to use patents, because they actually don't have much when you boil it down. Even an Apple fan would be hard pressed to say Apple is 20% "better" than the latest Android. They will mostly point to a few UI features and say 10% better. I would say Android is easily better than iOS, as do many others, it seems to be largely a matter of taste. The tidal-wave of Android phones will swamp Apple in the end unless they can sue tons of people.


> I imagine of the ~7 billion people alive today, perhaps 6 billion are not potential Apple customers at all due to price. Quite a few billion are potential low-end Android phone users though.

The 3GS is now free with a contract, just like the low-end Android phones.


"Free with contract" is still a roughly $1500 commitment. For the real low end, you need to look at what it costs for budget prepaid carriers, and in the iPhone's case I believe that you still simply can't get one there.


And you're still restricting yourself to the US market. Globally, your second sentence is a lot more representative, I think, because handsets aren't bundled with contracts. So the real prices of the handsets are reflected more accurately.


Who buys a smartphone for 2 months? If you're someone who wants a smartphone odds are even if it was off contract you'd end up paying the $1500 anyway. You might switch carriers more often but honestly all 4 carriers give the same "I wish it was somewhat better but I'm fine with any of them" service where I live.


It's not for serial upgraders, it's for prepaid/regional carrier accounts. I have a friend who bought a Motorola Triumph recently and pays something like $30/mo. for unlimited voice and data. There are also people who want a smartphone with just a voice plan because they only plan on using data while on wifi.


Virgin Mobile updated their rates this month: "$30/mo" is now "$35/mo" and "unlimited" is now "unlimited with throttling at 2.5 GB".

It's great that small carriers are offering some great deals to lure customers, and it's great that there are some cracks in the system for the tiny minority that doesn't want a data plan (or whatever), but that doesn't change the basic truth that nearly everyone is going to be paying for phone and data every month anyway.

These "total cost of ownership is obscene" things are always disingenuous to me. No, that's the price of the phone and phone service and a data plan for 2 years.


$250 (handset) + $35/mo * 12 mo = $670. Approximately half the cost of ownership.

I think this thread is about the cost of ownership of a "free with contract" iPhone 3GS vs. alternatives, targeted at the low end. I think that's a significant difference for this market. You disagree?
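For anyone who wants to check the arithmetic, here is a minimal sketch using only the rough figures quoted in this subthread (a $250 off-contract handset, a $35/mo prepaid plan, and a roughly $1500 two-year contract commitment). The constant and function names are made up for illustration, and real plan prices will differ.

```python
# Figures quoted in this subthread; treat them as rough assumptions.
HANDSET_PRICE = 250     # off-contract Android handset, USD
PREPAID_MONTHLY = 35    # prepaid plan, USD per month
CONTRACT_TOTAL = 1500   # "free with contract" two-year commitment, USD


def prepaid_cost(months: int) -> int:
    """Total prepaid cost of ownership over the given number of months."""
    return HANDSET_PRICE + PREPAID_MONTHLY * months


if __name__ == "__main__":
    for months in (12, 24):
        total = prepaid_cost(months)
        share = total / CONTRACT_TOTAL
        print(f"{months} months prepaid: ${total} ({share:.0%} of ${CONTRACT_TOTAL})")
```

With these inputs, one year of prepaid comes to $670 and two years to $1090, so the size of the gap depends heavily on whether you compare against the full 24 months of the contract.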


Are you saying this Virgin mobile plan is representative of what the average low end Android buyer gets? And can't you get an unlocked iphone and put it on the same sorts of plans?

I would guess the median low end Android buyer signs up with Verizon and gets the same data plan choices they would if they have an iphone.


Ah, I see where our disagreement is. I agree with you that the current low-end of Android is what you describe, and reading one node higher in the thread, I understand what you're arguing, that the iPhone 3GS is now competing directly with current "free with contract" Android phones. I lost track of that at some point.

I guess to me, the way bundling works, referring to the $1500 cost of ownership as the low-end as opposed to the $1700 cost of ownership seems wrong. That's what the carriers want you to think, of course, that the only choice is between this "free" phone with the expensive plan and this "high-end" phone with the expensive plan.

But the real low-end is to get away from the expensive plan. And now that there are Android phones that are $250 off contract that are actually quite good, I think this true low-end becomes a viable option for a lot of people, esp. that portion of the market that hasn't yet converted to smart phones.

Could certainly be wrong.


You could buy one on eBay unlocked for around that much, or breaking contract with ATT is only 325. 75 isn't much difference.


My 32GB 3GS was free with a contract 27 months ago. I would expect the 4S to be the same on the 14th.


It actually looks like it has similar capability to Vlingo for Android. Vlingo can do deeper integration into Android due to the intents and permission system making it comparable to Siri.


This is a Segway feature -- amazing technology, incredibly embarrassing to use in public.


Should be an awesome feature when combined with a handsfree car kit. The existing iPhone 4 voice recognition is very limited, but it still saves me from getting $200 tickets when I need to make calls on the road.


I am sure at some point wearing headphones was incredibly embarrassing in public.


I think it still comes across as rude or offputting.


Which point?


I don't know about headphones, but using a cell phone in public was at one point considered rude/embarrassing: http://en.wikipedia.org/wiki/The_Finale_(Seinfeld)


I wouldn't go out wearing a pair of those fuzzy-padding headband ones. Monochromatic earbuds? Absolutely.


Have you been out in public lately? I don't think anybody is embarrassed by anything anymore.

With a phone, voice control is a real improvement over thumbing a wee li'l keyboard.

And exposing your personal affairs to everyone within earshot is now just another way to demonstrate your importance.


Took 24 years to ship, but the Apple Knowledge Navigator is finally out ;-)

http://www.youtube.com/watch?v=HGYFEI6uLy0


It's good. But it's not that mind-blowing- I expect Google Voice Search to catch up pretty quickly. In the context of Android vs iPhone it's far from a killer blow.

I'm actually more excited about the iPod Nano as a workout device.


I think that its most impressive feature is how well integrated into the device it is. The overall tech isn't very compelling. It's interesting that it isn't running on the device itself for language processing so there is going to be some delay on 3G networks.


I agree. I think it is a pretty cool feature, but far from a killer one. And the specs of the device are pretty close to those of year-old Android devices like the Atrix. I was hoping for much more.


> I'm actually more excited about the iPod Nano as a workout device.

Indeed, and that the iPod Touch also gets iMessage. Which is nice for messaging with family members that do not have an iPhone.


Apple is doing more than just voice search. It is parsing what the user is saying and bringing back exact results or completing those actions. Just like Siri did when you wanted to book a hotel or restaurant.


That already happens for a lot of different verbs on android. "Call" "navigate to" and probably more but those are the only ones I use.


this is exactly the same as voice search + "instant actions." there's no line between the two.


The "jogging guy" portion of the video shows Siri juggling context to a surprising degree. I think there's a pretty clear line between that and voice search + instant actions.

It's not unlike the line between what Google does and what Wolfram Alpha does.


I interpreted the parent's comment as being about things like "great Greek restaurant in Cupertino", which are essentially instant search.

but yeah, the reminders stuff in particular is a cool new realm for voice control.


I have a handicapped friend, and we have been trying to get a good smartphone solution working for him. He's currently got an HTC EVO, and as many people have said, Android's voice capabilities are pretty amazing (although Android has a separate app called "Voice Dialer" which is absolutely terrible for some reason). One problem he's had, however, is touchscreen confirmation. Siri looks like it will fix a lot of problems, and allow him much more freedom.

Also, I'd like to give a shoutout to the folks developing Tecla ( http://scyp.idrc.ocad.ca/projects/tekla ) for Android. It's an open source wheelchair integration for Android that uses the Arduino to interface with whatever controls the user currently has. I found it in my search for a solution, but my friend wanted to hold off until he gets a new chair. Regardless, Tecla is easily the coolest mobile open source project I've discovered, and I just wanted to give those guys credit. The project page is https://launchpad.net/meadl


For your friend who is handicapped it looks like http://www.vlingo.com/apps/android Vlingo for Android might be able to help him a little bit more. The incar demo has talkback and voice confirmation. http://www.vlingo.com/demo/videos


Do look at what iOS provides for accessibility. http://www.apple.com/accessibility/iphone/physical.html There are already a lot of built in features (which Siri no doubt augments greatly) for people with disabilities.


While iOS 4 provides great accessibility features, it mostly excels in providing support for visually impaired users. As a touch interface, it has certain mobility requirements that Jimmy didn't like. He would have had to mount the phone inches from his face (where he keeps his non driving hand). Siri seems to greatly decrease those mobility requirements, and he is excited.


I guess it depends on the handicap, but presumably the iPhone is really useful for the blind: http://behindthecurtain.us/2010/06/12/my-first-week-with-the...


The "Siri" app on the App Store does a lot of this already (Apple bought the company and integrated the product) for current phones running iOS 4. It's really impressive.


It's been removed from the App Store.


Wow, that was quick. I installed it this AM.


Try it now, servers are shut down.

Site was up at start of 4S intro, down by the time I'd clicked download, redirected to apple.com as soon as I refreshed!


I should probably register a site: siribloopers.com. Am sure we are gonna hear lots of stories about misadventures with Siri :)


Someone just registered damnyousiri.com: http://twitter.com/cjno/status/121293035739955201


I got a better name. Are You Sirious? areyousirious.com I thought about registering it but didn't. Feel free to if someone here wants it. If it becomes big you're welcome, remember me :)


How about whysosirious.com


Damn.. someone beat me to it while checking out.


dang it actually let me buy it but just got a failed registration notice. shouldn't have used a coupon code :)


Same here my frugality got the best of me this time.


whysosiri.us is taken already :(


sirisly?.com (: srsly?)


And I can't even imagine what Siri and Wolfram Alpha will cook up together.


http://damyousiri.com/ slight misspelling of damn in my excitement, but hey ho, it's good to go and for people to contribute to!


Wow. Now I might get a lot of naysayers here, but how long do you think before the keyboard becomes a niche product, much like the pen and pencil are becoming? Natural Language is a field that is going to explode.


See the comment above by Joe. There is a certain 'weirdness' about giving voice commands to a device in public, it feels so strange that people don't do it, and being forced to radically alter their speech tempo/patterns makes it even more uncomfortable for them.

That being said, one 'fix' here is to call your virtual assistant and talk to them on the phone. We do that now with these irritating voice mail type menu trees but at least you are talking to 'someone.' When we get to 'Ironman' level of interactivity it gets more compelling.


I don't expect the 'weirdness' to be that much of a hang-up. Things that feel awkward take remarkably little time to stop feeling awkward if you just do them a few times[1][2]. And if the pay-off is big enough, as great voice control would be, I think people would be willing to overcome the initial awkward hurdle.

Also, kids are much less set in their ways, and would grow up thinking talking to computers is completely normal and not give it a second thought.

[1] Try this: The next time you're standing somewhere with nothing in your hands (maybe in line, or waiting for a bus), just stand up straight, feet parallel, arms hanging by your side. I started doing this and it felt so strange; I badly wanted to put my hands in my pockets, or cross my arms, or shift onto one foot. I have no idea why. But after a day or two of it, now it feels normal, and, in fact, better than it felt to stand with hands in pockets before.

[2] Or try this: If you order something from Starbucks or McDonald's or whatever, mix up your standard phrase. I used to always order, "Can I get a..." and now say "I would like ... please". That, too, was very strange to me, but now feels normal.

So basically, little things can feel awkward, but that goes away with remarkably little experience pushing through the awkwardness.


> See the comment above by Joe. There is a certain 'weirdness' about giving voice commands to a device in public, it feels so strange that people don't do it

Especially so if you happen to be Scottish and stuck in an elevator: http://www.youtube.com/watch?v=5FFRoYhTJQQ


> "That being said, one 'fix' here is to call your virtual assistant and talk to them on the phone. "

I'm sure you can hold the iPhone to your ear and pretend that you've dialed up Siri and are talking to them "on the phone".


I'm not trying to be funny, but there's nothing weird about characters in Star Trek speaking to the ship's computer.

"Computer, make me a coffee." See? It's okay.


So at Google they stuffed people 4 to a cube. Lots of people have open floor plans. Most people turn off their speakers (and miss meetings because of it) because the noise is distracting. Imagine them all saying 'computer, start the debugger on that core file.'

Like you I have seen a lot of Star Trek, it took a world class UX designer to point out to me that every time someone gave voice commands to the computer everyone else in the area had to be silent. There was even a little cultural thing going on, the captain says "Computer" in a loud voice, every one shuts up, and then he gives it some command. I did not pick up that insight on my own, it was pointed out to me and now when I see it in action I chuckle.


I dunno if you are old enough to remember this, but there was a certain weirdness to using a cell phone when it first came out.


Do you code? I can't imagine coding by voice.

And even for ordinary writing, the keyboard is the most efficient thought-to-written-language medium invented. Especially when paired with even the most basic text editing features of a computer. Speaking is far more tiring, especially when you're sick, and makes it far less convenient to edit and rewrite than traditional text editing does.


I am reminded of this old programming-with-voice gem: http://www.youtube.com/watch?v=KyLqUf4cdwc


Naah. You still need to type in public places and don't want everyone to know what you are doing with your phone.


That would be the missing link that would make touchscreen devices productive tools. That, or the society becoming less textual.


There is an inherent friction in using any software or gadget. A piece of technology or service that removes/reduces this friction is going to win BIG!

Some successful examples:

1. DropBox

2. Swipely

3. Kayak

4. Instapaper

5. GPS Voice navigation

6. Multi-touch interfaces

....

I believe Siri belongs to the same class of technologies.

HCI has still not reached 1.0


This essentially exists already with Google voice search. It isn't exactly game changing.


Execution. Execution. Execution.

iDisk offered essentially what Dropbox offers. iDisk sucked, Dropbox doesn’t.

I’m not saying the execution is great (I simply cannot know that at this point in time), I’m saying that execution is extremely important when it comes to whether something is game changing or not.


Has it been mentioned if Siri would be open for Devs?

If not are there alternatives to Siri that app developers can use that behaves the same but open for hacking?


I foresee a new bit of jargon entering everyday conversation: just "tell Siri" if you need anything. Great job, Apple, as always.


I can see this starting to take over simple tasks. Love the idea of being able to dictate a reminder, or ask the phone to wake me up at 6AM (7AM?!).

How much quicker is it to ask those questions, rather than navigate to the app and input the info? Lots!


Think about the value of the aggregated data that comes when everyone (in every country) chooses to use Siri (in the US) as a primary input device. Privacy concerns, anyone?


Surprised that "but the specs are close to ____" arguments are still found around here.

Let's compare the iPhone 4's 5MP camera to the "8MP!!!" Androids.


The Galaxy S II's 8 MP camera is better than the iPhone 4's, for one.


Siri was only for US when it launched, and the screen shots of the current implementation don't look any more favorable.


Not terribly impressive. I wonder how realtime usage will be since all data processing is done in the "cloud".


Android's voice search also does things in the cloud. It's not bad, actually. I can say, "Email Alex. Subject Hello. Body Hi Alex, how are you [question mark]" and my Xoom will open the gmail app and fill out all the fields in about 3 or 4 seconds.

I imagine Apple will be approximately the same.
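To make the "fill out all the fields" step concrete, here is a toy sketch of how a transcript in that dictated format could be split into recipient, subject, and body once the speech has already been turned into text. It is purely illustrative: the regex, the spoken-punctuation table, and the `parse_email_command` function are hypothetical, and nothing here reflects how Google's voice actions are actually implemented.

```python
import re
from typing import Optional

# Hypothetical grammar: "Email <to>. Subject <subject>. Body <body>"
PATTERN = re.compile(
    r"^email\s+(?P<to>.+?)\.\s*subject\s+(?P<subject>.+?)\.\s*body\s+(?P<body>.+)$",
    re.IGNORECASE | re.DOTALL,
)

# Spoken punctuation tokens mapped to characters (illustrative subset).
SPOKEN_PUNCTUATION = {"[question mark]": "?", "[period]": ".", "[comma]": ","}


def parse_email_command(transcript: str) -> Optional[dict]:
    """Split a dictated email command into fields, or return None if it does not match."""
    match = PATTERN.match(transcript.strip())
    if match is None:
        return None
    fields = match.groupdict()
    body = fields["body"]
    for spoken, symbol in SPOKEN_PUNCTUATION.items():
        # Drop the space before the token so "you [question mark]" becomes "you?".
        body = body.replace(" " + spoken, symbol).replace(spoken, symbol)
    fields["body"] = body
    return fields


if __name__ == "__main__":
    print(parse_email_command(
        "Email Alex. Subject Hello. Body Hi Alex, how are you [question mark]"
    ))
```

A fixed keyword grammar like this is the rigid end of the spectrum; the conversational parsing people are describing for Siri would need something far more forgiving than a single regex.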


A lot of the rumors seem to indicate that it's an iPhone 4S feature because of processing power requirements. It sounds as though there is a lot of processing happening on the device. It goes to the "cloud" for looking things up that aren't on the device itself. Wikipedia, Wolfram Alpha, Weather, etc.


I'd imagine they down-sample the audio like crazy. They certainly don't need CD quality sound for it.


the standard is 8kHz-16kHz, 8-bit PCM. higher frequencies are of limited utility in recognizing human speech.
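For a sense of scale, here is a small sketch comparing the raw (uncompressed) PCM bitrate at those speech settings with CD audio. The speech numbers are the ones from this comment; the CD figures (44.1 kHz, 16-bit, stereo) are the standard audio CD format. The function name is made up for illustration, and real systems would usually apply a speech codec on top, so actual upload sizes would be smaller still.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> float:
    """Raw PCM bitrate in kilobits per second (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


if __name__ == "__main__":
    formats = {
        "speech, 8 kHz 8-bit mono": (8_000, 8, 1),
        "speech, 16 kHz 8-bit mono": (16_000, 8, 1),
        "CD audio, 44.1 kHz 16-bit stereo": (44_100, 16, 2),
    }
    for name, (rate, bits, channels) in formats.items():
        print(f"{name}: {pcm_bitrate_kbps(rate, bits, channels):.1f} kbps")
```

That works out to 64 kbps and 128 kbps for the two speech settings versus about 1411 kbps for CD audio, which is why nobody needs to ship CD-quality samples to a server.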


I find Google's version of this pretty zippy and useful, so I'd expect it will be at least as good as that. It's usually a ~5 second turnaround, I'd estimate; I typically use it if I'm walking to catch a bus or suchlike and don't want to slow down or have an accident by typing at the same time.


I think it depends on the carrier as well. Sometimes on my Droid 1 on Verizon it can take a good 10 seconds to get a response or 10 seconds to make the initial connection to the server to allow me to start talking.


Is it done in the cloud? Why would it require an iPhone 4S then?


Marketing reasons?


It's like Ask Jeeves, but for my phone in the cloud!!!!


To me it felt like Apple were clutching at straws by touting Siri as a "feature" of iPhone 4S. What was more interesting to me were the thoughts that came to my head. My first thought was of Clippy. My second one was of the McCain/Palin campaign insisting that Alaska's proximity to Russia gives Palin international politics experience. My third thought was that Apple must be smarter than this, or are they?



