Amazon Echo Dot

hi · on March 3, 2016

The transition from primarily visual UX towards an auditorial UX is really powerful.

Looking at screens to get key information distracts me from my surroundings and seems archaic.

My wife is a sound designer who has opened my eyes to the importance of sounds both in film and in the world. It's not that I was unaware of sounds, but I didn't realize how important they are to centering me in this world and the made up worlds of films and games. Try watching a scary movie with the sound turned off, it turns into a comedy.

I think it's unexplored territory that has huge potential to impact the way we interact with the real world, even more so than Glass or HoloLens.

When I listen to music as I walk down the street, I change: my mood, my posture and the way I look at the world. The music augments the reality around me in a way that visual UX never can, because visual UX is a lens between my eyes and the world.

ghaff · on March 3, 2016

The problem is that voice interfaces break down pretty quickly once you try to do anything complicated. The Echo has pretty solid voice recognition--far better than anything else I've ever used--but it's still hard to get it to do anything useful once you get beyond a pretty narrow script. (e.g. what's the weather forecast, play this artist, etc.)

nostrademons · on March 3, 2016

I've found that the voice recognition on Android phones works well enough to be useful in a wide variety of circumstances. Navigating, getting directions, setting alarms, taking notes, sending text messages, sending emails, searching for things, and many more. When I was still using my Moto X I did the majority of every-day tasks with voice recognition.

The iPhone is catching up fast too...my wife's taken to sending emails via Siri (to avoid strain on her hands), and most of the time it gets things perfectly.

The biggest problem is privacy. One of the nice things about touchscreens is that you have a personal dialog with the device that can't be overheard by anyone nearby. That doesn't apply to voice recognition systems, and it can be pretty awkward to dictate an e-mail to a phone in a crowded place.

xigency · on March 3, 2016

Being overheard isn't the only privacy concern. Most of these solutions offload the speech recognition and language parsing functions to corporate servers. I like texting with Siri but I'm not exactly keen on having Apple record everything. It also seems limiting in that I can't use voice commands without a network.

It would be nice for voice recognition platforms to start being built in. I know there's training data that's needed, but there's some convenience afforded.

josh2600 · on March 3, 2016

I think the processing requirements for handling on-device Siri would destroy battery life.

gervase · on March 4, 2016

This actually doesn't seem to be the case. Take a look at Google Translate's offline voice recognition AND translation - it's really amazing, considering it's all happening on your device.

SapphireSun · on March 4, 2016

I forget where it was, but they published something about training a very small very fast neural network that could fit comfortably in the phone's memory. Tricky tricky. :D

euyyn · on March 3, 2016

Plus the only way to train these things at scale is to upload the recordings once you have some usage.

kalleboo · on March 4, 2016

Worse for battery life than firing up the radio?

awqrre · on March 4, 2016

And devices that listen to you 100% of the time is yet another privacy concern... even if they don't send everything to a remote server.

qznc · on March 4, 2016

If you have a human assistant who does that job, he also listens 100% of the time.

_nodg · on March 4, 2016

But he or she is less vulnerable to being automatically hacked by a three letter agency, foreign government, and/or hacker gathering data for identity theft.

The privacy concern _isn't_ necessarily about having something to hide. It's about the consistent hacking of major systems, and exposure of personal data.

kbutler · on March 4, 2016

And you don't think there are privacy concerns with that? It is a /very/ intimate relationship, and generally requires some ritualized/formalized interaction, and a very high degree of trust.

yomly · on March 3, 2016

Just on the note of hand strain, without knowing anything about your wife's condition, a way that could help alleviate it is to critically analyse hand position/technique. As a pianist, I have been trained to have a very supple hand position when operating any device but I notice this isn't at all the case for many people I observe in their day to day activities.

Historically this probably wasn't much of an issue, but given that most people now spend hours at a desk on a keyboard, it's likely to become more of a problem. Think of it as akin to paying attention to your posture.

saym · on March 3, 2016

The use of Google Now from my bluetooth'd helmet has really improved my motorcycling experience.

Real easy to say: "Okay Google... navigate to California Academy of Sciences."

What's missing for me is spotify/app specific integration.

dragonwriter · on March 3, 2016

> What's missing for me is spotify/app specific integration.

For that to really happen in a robust way, I think Google needs to open up Custom Voice Actions [0].

[0] https://developers.google.com/voice-actions/custom-actions

nl · on March 4, 2016

"Ok Google.... Play <artist> on Spotify" works for me.

I agree discovery of these magic phrases needs work.

dragonwriter · on March 4, 2016

Yeah, there's some that can be done through system actions (which I think that is) and it sounds like custom actions have been implemented by selected partners, I just mean they need to open up custom actions to enable more general app-specific integration.

mehta · on March 3, 2016

I thought this already worked.

"Okay Google... Play music" will start Music app

"Okay Google... Start Radio" will start NPR app

saym · on March 4, 2016

I can say "Open Spotify" and it will open the app. Then I have a button on the helmet that sends the Play command. But I can't do anything robust like playing a specific artist.

Perhaps if I used Google Music the integration would be built out.

nl · on March 4, 2016

On my phone "Play <artist>" uses Google Music. "Play <artist> on Spotify" makes it use Spotify.

khaosrage96 · on March 4, 2016

On my Nexus 6p saying "OK Google play 'artist'" will open Spotify and start playing the top songs of that artist. This does not work to play specific playlists though.

rayiner · on March 3, 2016

Define "work well"? It doesn't work well if you're not connected to the Internet, if you speak quickly, or if you interrupt it, and it can only do limited follow-up.

hoorayimhelping · on March 3, 2016

>The problem is that voice interfaces break down pretty quickly once you try to do anything complicated

I've done a fair bit of interface engineering for the web. Between that and using so much software over the course of my life, I'd say that this applies to GUIs just as much as voice interfaces.

jerf · on March 3, 2016

Yes, but GUIs have two or three dimensions available (up/down, left/right, time) whereas voice just has the one (time). We humans can also full-duplex GUIs much more easily than voice-based interfaces. And GUIs at least can be hooked up to full-powered grammar-based interfaces whereas voice, somewhat ironically considering the nature of human communication, has more trouble with it.

(I'd suggest this is actually a combination of the still-non-trivial nature of NLP, combined with a lack of feedback, combined with the fact that giving instructions is quite hard. Humans overestimate human language's ability to communicate clear directions, as anyone who has done tech support over a phone understands.)

hammock · on March 3, 2016

Just as the mouse input has evolved to include multitouch and 3d touch gestures, voice input can also evolve. The full range of tone, inflection, pitch, etc is available from the human voice.

I wonder if NLP research should have started as our ancestors did, with grunts and hoots and cries. Instead it's focused on recognizing full words and sentences while almost completely ignoring inflection.

Another dimension to add with vocal input is directional. If you have mics in all corners of a room, which direction you speak in can affect whether "turn off" operates your TV, your lights or your oven.

freehunter · on March 3, 2016

Very good points. I can't wait until devices can read my emotions or inflections in my voice. I can voice-to-text most of my short messages, but anything that requires punctuation or, god forbid, emojis still requires manual input. And I don't want to have to say "period" or "exclamation mark" to indicate my desired punctuation. If I say it unusually loudly, insert an exclamation mark. If I pause at the end of a sentence (Word has known a grammatically correct sentence for decades) and don't say "um" or "uh", put a period. If my inflection goes up or there is a question word in the sentence, add a question mark.
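As a rough sketch of those rules in Python: the `Utterance` fields (`loud`, `rising_pitch`, `trailing_pause`) are hypothetical flags that a speech engine would have to supply, and the precedence (question mark, then exclamation mark, then period) is an assumption made here for illustration, not how any existing product is known to work.

```python
from dataclasses import dataclass
from typing import List

# Assumed, deliberately small list of question words.
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how"}
FILLERS = {"um", "uh"}


@dataclass
class Utterance:
    words: List[str]
    loud: bool            # hypothetical: spoken unusually loudly
    rising_pitch: bool    # hypothetical: inflection goes up at the end
    trailing_pause: bool  # hypothetical: speaker paused after the last word


def punctuate(u: Utterance) -> str:
    """Append punctuation inferred from how the sentence was spoken."""
    text = " ".join(u.words)
    if not u.words:
        return text
    lowered = [w.lower() for w in u.words]
    # Rising inflection or any question word: question mark.
    if u.rising_pitch or any(w in QUESTION_WORDS for w in lowered):
        return text + "?"
    # Unusually loud: exclamation mark.
    if u.loud:
        return text + "!"
    # A pause that does not follow "um"/"uh": period.
    if u.trailing_pause and lowered[-1] not in FILLERS:
        return text + "."
    # Otherwise assume the speaker is not finished yet.
    return text
```

Even this toy version shows the catch: "I know what you mean" contains a question word and would get a question mark, so the rules would need real grammar behind them.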

There is a lot of improvement for voice processing in several dimensions of voice.

krinchan · on March 3, 2016

And copy and paste. People seem to always forget the power of it. It's the GUI equivalent of "Search for that on Google" or "Now, SSH to this IP I found digging through AWS." Copy and pasting of text from application to application is the clunky Unix Pipe. It's universal and deeply important.

Taking sections of the last response, or hell, even having every response essentially be wrapped up in some sort of object you can reference in your next query to the interface is what all of these lack.

Even Android's "Search this artist" doesn't quite get there. The lack of context between queries is what murders Siri for me. That and her seemingly random selection of what goes to Google and what goes to Wolfram Alpha. Sometimes even the "wolfram" verb prepended to a query just doesn't go to Wolfram no matter what.

cubano · on March 3, 2016

I've often postulated that copy and paste is perhaps the biggest productivity enhancement in the history of computing.

webXL · on March 3, 2016

I know some software maintainers who might disagree. But I like PopClip (https://pilotmoon.com/popclip/) as an enhancement on top of that one.

chris_st · on March 3, 2016

I second PopClip as a fantastic product, incredibly useful. Their DropShelf[0] tool is also useful, but not nearly as much as PopClip. But definitely worth the money.

0: https://pilotmoon.com/dropshelf/

yoodenvranx · on March 4, 2016

I use KDE Connect to enable seamless copy and paste between my PC and my phones. It's the single best thing I ever installed in the last 1 or 2 years.

Nullabillity · on March 3, 2016

Sure, but the difference is that it's (almost) always obvious what actions are possible in a GUI. With voice interfaces you're back to trial-and-error.

hrktb · on March 3, 2016

There is still a fundamental problem with voice: it has to understand your words.

A text field, in contrast, doesn't need any intelligence, nor do buttons. This is particularly important for people living in non-English-speaking countries but using English in specific contexts (work, gaming, minor hobbies etc.). Switching languages in audio applications is generally a PITA. And even when you do switch between languages every time, the engines still have huge performance gaps between the languages.

Software has become extremely tolerant of multiple languages IMO. Voice recognition interfaces are not so mature yet in my experience.

NDizzle · on March 3, 2016

I'm not so sure about that. Check this out. One of the toughest fights in one of the toughest games performed with only voice commands. https://www.youtube.com/watch?v=5m2a2dLdZ0M

Now, granted, this is a specific use case, but, you know... "explore the space" and all that. (more cowbell!)

s_kilk · on March 3, 2016

> One of the toughest fights in one of the toughest games performed with only voice commands. https://www.youtube.com/watch?v=5m2a2dLdZ0M

After 111 failed attempts :)

Still, it's a hell of an achievement.

EDIT: to be fair, Ornstein & Smough is a very tough fight even with normal controls.

Also notice the voice recognition fails to recognise some words like "item" even though they are spoken clearly. Almost gets the guy killed at one point.

rrrx3 · on March 3, 2016

The "play some good 60s rock" example isn't a VUI breakdown, it's a functionality gap in the backend. One that will probably be fixed pretty quickly, given the way things are headed.

A VUI breakdown would be inability to understand accents, or non-responsiveness to commands. As a user input, Alexa is pretty well buttoned up.

Domenic_S · on March 3, 2016

Sounds like the Enterprise computer:

Geordi: Computer, subdued lighting.

(computer turns the lights off)

Geordi: No, that's... that's too much. I don't want it dark. I want it cozy.

Computer: Please state your request in precise candlepower.

(The scene: https://www.youtube.com/watch?v=OPZnR3Ue1n4)

freehunter · on March 3, 2016

There will certainly be some aspects of the computer training the human, too. Just using this as an example, I don't know how much candlepower I want, but computers don't get bored or annoyed by my requests. I could start with 1 candlepower and move up to 10 if it's not bright enough. 100 might be too bright, so now I know what range I'm looking at. Next time I could just say "computer, 12 candlepower lighting, please".

Computers train users on how to use the computer all the time. It's less ideal than having the computer know everything, but once you know what you can expect from a computer, it's easier to get a good result.

Karunamon · on March 3, 2016

I think that cuts both ways. If the computer can be trained to understand the user's intent, that seems like a better solution than forcing the user to think a different way.

Which would you rather do? Be forced to state your lighting preferences in candlepower, or have the computer learn that when you say "subdued lighting", you mean "12"?

freehunter · on March 3, 2016

Very true, but this is one simple example. Look at what Wolfram Alpha tries to do for even more complicated examples. If I put in "if I am traveling at 60 miles per hour how many hours does it take to go one hundred miles" it gives me an answer of 6000 seconds (1.66 hours). Very intuitive, and it actually ruined my example because I did not expect the site to understand what I was saying.

But if I type in "how fast do I need to go to travel 100 miles in 6000 seconds", now it has no idea what I'm talking about and instead gives me a comparison of time from 6000 seconds to the half life of uranium-241.

Now, when I get that result, I don't usually just give up on trying to figure out the answer. Instead I try to figure out what the computer expects me to say. Through some trial and error, I can shorten the query to "100 miles in 6000 seconds" and boom, I get the answer of 60 miles per hour. Instead of natural language, I'm using the search engine like a calculator.
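Both queries are the same relation, distance = speed × time, solved for a different unknown. A minimal sketch in Python of the two calculations (the function names are made up for illustration, and this says nothing about how Wolfram Alpha actually parses the question):

```python
SECONDS_PER_HOUR = 3600


def travel_time_seconds(distance_miles: float, speed_mph: float) -> float:
    """How long does it take to cover the distance at the given speed?"""
    return distance_miles * SECONDS_PER_HOUR / speed_mph


def required_speed_mph(distance_miles: float, time_seconds: float) -> float:
    """How fast do I need to go to cover the distance in the given time?"""
    return distance_miles * SECONDS_PER_HOUR / time_seconds


print(travel_time_seconds(100, 60))    # 6000.0 seconds
print(required_speed_mph(100, 6000))   # 60.0 miles per hour
```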

The computer has just taught me how to use it. Ideal? No, but we work within the reality we're given. 12 candlepower is dim for you but for someone with decreased vision, that might be completely dark. The computer doesn't know unless it's taught, and we know from looking at history that users would rather the computer train the user than the user having to train the computer.

spydum · on March 4, 2016

You asked: "how fast do I need to go to travel 100 miles in 6000 seconds", which is equivalent to saying "at what rate do I need to go to travel {rate}". It's a nonsense question; you already know the answer. You need to go 100 miles per 6000 seconds.

What you should have asked is: "100 miles per 6000 second to miles per hour", and it will happily convert the rate you gave into the one you really wanted.

I guess what you're saying is it should be able to figure that out, but at some point the old phrase "garbage in, garbage out" surfaces. You never told it to convert the unit.

krinchan · on March 3, 2016

Wolfram is, and has always been, much more inclined to understand you if you work out what exactly you are trying to calculate before hand.

Some phrases exist as a "wow, 1 million people phrase this problem this way, let's throw that in." The fact it can take an easily dictated, albeit strictly phrased problem, and get you your answer is really what I love about it. Now if Siri would just stop sending stuff to Google. -_-

indiv0 · on March 3, 2016

What if you could define the equivalent of Bash aliases via voice control? This would allow users to tailor their experience from the default (possibly complex/unintuitive) commands to their own personalized ones.

Example format: "Computer, define X as Y"

"Computer, define subdued lighting as set lighting to candle power twelve"

Then the VUI just adds a new entry to the voice commands where saying X results in Y.
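A minimal sketch of that alias table in Python, assuming the speech has already been transcribed to text. The `AliasTable` class, the "computer" wake word handling and the returned strings are all hypothetical; no real assistant API is being described.

```python
import re


class AliasTable:
    """Hypothetical user-defined voice aliases, in the spirit of Bash aliases."""

    # Matches "define X as Y"; X stops at the first " as ".
    DEFINE = re.compile(r"^define (?P<name>.+?) as (?P<command>.+)$")

    def __init__(self):
        self.aliases = {}

    def handle(self, utterance: str) -> str:
        """Return the command to execute for a transcribed utterance."""
        text = utterance.strip().lower()
        # Drop the assumed wake word, e.g. "Computer, ...".
        text = re.sub(r"^computer,?\s*", "", text)
        match = self.DEFINE.match(text)
        if match:
            # Saying X from now on results in Y.
            self.aliases[match["name"]] = match["command"]
            return "defined: " + match["name"]
        # Expand a known alias; otherwise pass the text through unchanged.
        return self.aliases.get(text, text)


table = AliasTable()
table.handle("Computer, define subdued lighting as set lighting to candle power twelve")
print(table.handle("Computer, subdued lighting"))
# set lighting to candle power twelve
```

The obvious limitation is that an alias name containing " as " would be split at the wrong place, so a real grammar would need something less naive than a regex.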

ascorbic · on March 3, 2016

So unrealistic. They'd use candelas.

ghaff · on March 3, 2016

You're thinking too much like an engineer :-) It's not a speech recognition breakdown but it's certainly a voice interface breakdown in the sense of I can't get the device to do what I want it to do. As a user, I don't care where in the pipeline my attempts to communicate a desired action break down. I just know that they do.

gffrd · on March 3, 2016

Exactly. We're used to dealing with either humans, who are intuitive and highly adaptive, or technology, which we manipulate and have total control over (so long as the system displays its status, we can find our way). We're not used to systems that expect us to interact with them in natural language, but have very specific criteria around what we ask for.

It still feels a lot like the old text-based RPGs, in that you spend most of your time trying to figure out how to phrase something to accomplish a basic need, while angrily thinking "it would have just been easier/faster to pick up my phone."

It's 2016. How are we still OK with the unreasonable constraints of technology that make us jump through a hoop like a trained poodle to get the treat?

aryamaan · on March 3, 2016

Same can be said for GUI as well. Remove the search engine concept, you are only left with playlist, song/artist name on such sites.

We don't have audio search engine equivalent yet but that day is also not far.

ghaff · on March 3, 2016

That's the thing. It is a use case with voice commands that map to specific actions. In the case of music, I can give Echo the name of a specific artist or maybe a playlist. But it breaks down pretty quickly if I tell it to play "some good 60s rock."

Jordrok · on March 3, 2016

Ok, that is pretty damn cool. I've played Dark Souls so I can appreciate how difficult that must have been. Very impressive.

Devil's advocate though: this seems more like a case of the guy being good enough at the game to win in spite of the voice controls rather than because of them. Compared to a regular controller/keyboard+mouse/whatever there's just no contest in terms of input speed and precision. Not all genres are a good fit for this either. I'd be really interested to see if anyone could make it work with, say, a competitive FPS game.

mikejmoffitt · on March 3, 2016

Never mind that in order to use a voice service, it requires you to speak at a rate slower than many can type, all while demanding that the people in the room hush up so it won't get confused. Repeat if there was a mistake.

firephreek · on March 3, 2016

Try Hound. It's faster than anything I've tried and its context management is just impressive as hell. The Echo's lack of negative clauses is really really frustrating.

ams6110 · on March 4, 2016

I just can't stand talking to a computer. Never liked the idea of it. I loathe voice-controlled telephone menus. I can type faster than I can talk (if you include the inevitable revisions -- even without it's pretty close). I don't even like to leave messages on voicemail. I don't think voice interfaces are anything I will ever use if there's another option.

lyime · on March 3, 2016

That holds true for pretty much all first-generation products of their type. The first "smart phones" couldn't do a whole lot of things. Over time, the Echo will improve and you'll be able to hold conversations with it.

samstave · on March 3, 2016

My children are quite young. The world is going to be an amazingly interesting place when they are my age.

I can recall the first time I ever saw a computer and how primitive they now look.

Now we have little bots that listen to you and reply with info.

When my two-year-old is forty - we will have ghost in the shell.

It's crazy beautiful and scary to me that we all grew up reading cyberpunk fiction and watching anime and not all of us did, but pretty much all of us are actually building that future.

There is a balance between dystopia and utopia though.

We are all working at the Great Game - and the future is going to be interesting, but we can never turn back. So hopefully we keep the balance and get it right.

My worry is that, at this nascent stage of the technology, we fuck it up because we don't fight hard enough for privacy policy.

We need privacy policy that is thinking at least 50 years in advance.

The control of government apparatus is thinking in advance - I personally feel that the tech sector's vision is myopically focused on today's profits and not on the future, where it should be looking, with the exception of this most recent case between Apple and the FBI. At least Cook's comments were salient and forward thinking and truly for the greater good... Let's hope that invigorates the tech industry as a whole to think about where we are headed.

ghaff · on March 3, 2016

Speech recognition has improved dramatically over the past few years through using cloud back-ends. It's actually usable for many tasks.

However, we seem to still be pretty far from natural language interfaces that make sensible inferences about actions you're requesting and perhaps join multiple data sources to answer your query. There have been a lot of advances--don't get me wrong. But it's a very hard problem that's been being worked on for a very long time.

chc · on March 3, 2016

Just like you hold conversations with Siri, Cortana and Google Now?

deegles · on March 3, 2016

Are they not first-gen?

chc · on March 3, 2016

Well, I mean, they aren't fixed artifacts like a piece of hardware. I'm pretty sure they have been updated a few times.

baby · on March 3, 2016

is it better than google voice? Siri is completely useless for me but google voice recognizes everything I say (love my new iphone 6s but I wish I could say "hey siri" and it would actually work).

ehnto · on March 4, 2016

The other issue is it becomes less useful when more than one person is active in the room. Small party? Interface no longer functioning as talking in the background interferes.

atomical · on March 3, 2016

And if you do get beyond a narrow range does the user spend a lot of time thinking about how to craft a question so that the machine can understand it?

pbreit · on March 3, 2016

How complicated is controlling a TV or a radio? And voice is much easier for a variety of tasks than remote controls.

Jack000 · on March 3, 2016

I think the main problem with voice interfaces is that they're not discoverable. You need a good understanding of what the system can and cannot do, its current state etc before even speaking.

CLI has the same issue, but at least you can man xxx, which I imagine works a lot better in text than it does in audio.

criddell · on March 3, 2016

I think the goal is that the system gets to be good enough that nobody worries about discoverability any more.

I think Google is quickly getting there with their search interface. I'm always amazed at what a good job Google does when I ask it a question like "what's the name of the instrument powered by steam" and milliseconds later it's showing me info about calliopes.

drzaiusapelord · on March 3, 2016

I really liked how this was done in the movie 'Her.' There's something especially nice about only having your attention distracted audibly and not visually, especially in public.

I wonder if the smartphone age will go away as quickly as it came. I picture a world where we just have smart wearables like a watch which has a tiny visual interface, but a powerful audio one (speaker, earpiece, put watch up to ear, etc). It seems a lot less intrusive. I imagine as we get better with AI and voice recognition, it'll be as practical as a phone. What I'm able to do with Google Now on my watch is fairly impressive today. We already have the technology to understand things in context like "Navigate to Katz's deli" brings up Google Maps to the deli as opposed to a google search results page about navigating to a cat themed deli, which was the status quo not too long ago with voice search.

I imagine carrying around this big selfie/facebook machine around, constantly charging it, whipping it out all the time, etc will be pretty gauche if wearable-only solutions become competitive.

jkestner · on March 3, 2016

For many functional tasks, I can see an auditory UI being superior. But currently most people use their smartphone to skim content. I don't want the equivalent of listening to voicemail for everything.

Not to say that content can't shift for the medium, just as it always does. What would an audio Facebook sound like?

drzaiusapelord · on March 3, 2016

Well, I do that now sorta on my watch with its small screen. I scroll through notifications, but no, I don't get the full FB web or mobile experience. I'm not sure how many people actually want that; I often hear complaints about how phones and apps aren't simple anymore. I also believe that we really haven't figured out the best way to use these small screens. I'm surprised at how usable my watch is sometimes with its 320x320 screen at 1.8". For reference the original iphone was 3.5" at 320x480 resolution.

For teens and such I can see the big phone never going away but for most adults, having an inconspicuous wearable just seems like a more refined experience. I imagine there's a logical procession here from desktop > traditional laptop > ultrabook laptop/convertible > tablet > mobile > wearable. You lose functionality with every step, but depending on the use case, it doesn't really matter. For people in my peer group, a wearable that could work without a phone would sell like hotcakes.

jallmann · on March 3, 2016

> The transition from primarily visual UX towards an auditorial UX is really powerful.

It's also less accessible. I'm sure auditory UI is useful in many cases, but it also seems to be more cumbersome in others. In any case, I hope that pervasive auditory UI doesn't become any sort of standard without an accompanying visual/physical interface.

> Try watching a scary movie with the sound turned off, it turns into a comedy

Allow me to be pedantic and say it is that being fully immersed in the context of the movie that really matters. You could probably achieve a similar suspenseful effect with silence+subtitles, although I'm sure the experience isn't identical. Otherwise, the deaf could never enjoy scary movies, including me.

drzaiusapelord · on March 3, 2016

>It's also less accessible.

For whom? To the blind this would be a godsend. From a practical medical perspective, audio is superior because we have decades of experience with effective ear implants to help the hard of hearing and the deaf, but the visual equivalent still eludes us.

zeveb · on March 3, 2016

> To the blind this would be a godsend.

Actually, I'd imagine that a good old-fashioned tty is pretty good for a blind person: it's TUIs and GUIs that get progressively more painful.

Source: am blind without my glasses; can imagine preferring ed to emacs, vim, Atom, SublimeText if I had to use an audio interface.

jallmann · on March 3, 2016

> To the blind this would be a godsend

For sure. Different interfaces disadvantage different classes of people. There is no silver bullet; I'm trying to point out that an exclusively audio/voice-driven UI would not be desirable.

> we have decades of experience with effective ear implants

The problem is multi-faceted. Hearing loss, especially from a young age, often leads to difficulty speaking -- it is no use if a voice-driven system can't understand you in the first place.

And while cochlear implant technology has helped a lot of people, it is by no means a cure, and there are many, many others that don't benefit enough from assistive technology to achieve functional equivalence (which is the key phrase when talking about accessibility). I have a cochlear implant and haven't worn it in years, because it really doesn't help.

tlrobinson · on March 3, 2016

> It's also less accessible

Well, I think blind people would disagree with you.

> I hope that pervasive auditory UI doesn't become any sort of standard without an accompanying visual/physical interface.

Any speech interface could be trivially translated to a text interface, right?

jallmann · on March 3, 2016

> Well, I think blind people would disagree with you.

Answered downthread.

> Any speech interface could be trivially translated to a text interface, right?

Pretty much, which is why UIs should not be exclusively auditory, that is, delivered without an accompanying visual interface (text or otherwise). Ordering the Echo Dot verbally is a cute gimmick given its premise, but it would really suck if otherwise useful products and services were only usable through audio.

Hopefully the audio UI trend does not follow the obsession over touch screens: a rapidly adopted, de facto standard driven by tastemakers that leave little consideration for others that might prefer an actual keyboard or other physical affordances.

AndrewUnmuted · on March 3, 2016

> Allow me to be pedantic and say it is that being fully immersed in the context of the movie that really matters.

I hope to not be a super pedantic ass for pointing out that the 'immersive' media in films is the audio, not the visual components.

jallmann · on March 3, 2016

> the 'immersive' media in films is the audio, not the visual components

That's a non-falsifiable opinion, really (even if it does apply to the majority of the population). I'm living proof you can enjoy movies without the audio.

It's the sum of our experience that colors our perception -- almost irrevocably in this case, since I imagine it would be difficult for the typical person to really be able to enjoy something in complete and utter silence.

AndrewUnmuted · on March 3, 2016

> I'm living proof you can enjoy movies without the audio.

I am not looking to equate immersion with enjoyment, and by no means do I intend to disrespect the manner by which you enjoy a type of media. My apologies for coming off that way!

When I refer to 'immersive media' I am referring to the 360-degree omnidirectional dispersion pattern of sounds and our similarly omnidirectional hearing of those sounds. This is 'immersive experience' as opposed to a 2-dimensional or stereoscopic experience, which is what we get with visual media. Television/film screens fire light directly at the eyes; even in IMAX situations the film is never experienced behind us. That isn't immersive, whereas, say, a VR headset can potentially offer this type of immersion. But since this technology is still in its infancy I think it too early to call it fully immersive like audio is.

jallmann · on March 3, 2016

> 'immersive media' I am referring to the 360-degree omnidirectional dispersion pattern

Then that is splitting hairs over a definition of immersion, and quite unrelated to how the word was used in my original comment. Had I instead said "fully engrossed," my point would still hold, and you would not have one.

I understand you were being "super pedantic," but if you're going to do that, then you should be super precise in the pedantry, otherwise you're arguing a strawman.

bruceboughton · on March 3, 2016

>> auditorial

Don't you mean oral or aural?

bckmn · on March 3, 2016

You would probably be interested in what we've been building over at https://www.narro.co.

visarga · on March 4, 2016

It would be nice if it could extract forum discussions, like YC and Reddit. Sometimes I like to hear the text I am reading, it helps with concentration.

amelius · on March 4, 2016

Yes, I'd like to see the possibility to select text, right click and select "read out loud".

kalleboo · on March 4, 2016

I think all the browsers on OS X support that using the system text-to-speech (edit: Safari and Chrome, not Firefox)

amelius · on March 4, 2016

I'm using Linux. It seems that Linux is falling behind in the area of speech input/output. I hope they will catch up.

pbreit · on March 3, 2016

Voice will become an important, if not the primary, interface to home/car audio/video.

jordache · on March 3, 2016

"computer lights on" "dimmer"

no thank you.. i will use my hand

dionidium · on March 3, 2016

A device to change the channel on my TV? No thanks; I'll just use the dial on the TV.

jordache · on March 4, 2016

let me pick up my phone, open the app for light control, dial in some setting, and hope the app doesn't crash.

TV remotes are awesome because they have physical buttons and are fairly dumb... almost no chance of issues.

prawn · on March 3, 2016

And if you're on the couch watching a movie and the light switch is on the other side of the room? Or you want to switch on the porch light for guests. Or switch off outside lights?

jordache · on March 4, 2016

i get my non-lazy ass up.

For the few times where I may need to walk additionally around the house, it's a non issue

jsalit · on March 6, 2016

and what if you weren't so mobile?

samstave · on March 3, 2016

They should announce Amazon echo for the deaf, which would just be a screen.

pbhjpbhj · on March 4, 2016

... with a couple of kinect type devices to monitor one's signs.

Rezo · on March 3, 2016

"If you have more than one Echo or Echo Dot, you can set a different wake word for each".

This is something I've been thinking is becoming more problematic as well as an opportunity for real ubiquity. I have 3 separate devices nearby that are Google Now voice activated (the newer devices support this even if the screen is off), and they will sometimes trigger at the same time accidentally.

Since the processing is cloud based, and they know my identity, why don't the devices recognize this fact and cooperate? Instead of just 7 beam-forming mics in the Echo, if you have two within hearing distance you could have the benefit of 14 and a unified response. Don't tie the request and response to a particular device; instead think of it as a ubiquitous network that moves with you as you walk around the household. You should be able to continue your conversation from one room to the next seamlessly.

ansible · on March 3, 2016

> Since the processing is cloud based, and they know my identity, why don't the devices recognize this fact and cooperate. Instead of just 7 beam forming mics in the Echo, if you have two within hearing distance you could have the benefit of 14 and a unified response.

The echo and noise reduction software that I'm aware of can't really do that in a reasonable fashion.

With current solutions, you've got one DSP that's receiving all the audio streams simultaneously, and they need to be exactly synchronized in time. Then, using basically pattern-matching, it figures out what direction the user's voice is coming from, and combines some/all of the audio streams together to eliminate environmental noise and make the speech as clear as possible.

To do this with separate devices, you'd want extremely precise time synchronization. Which is possible, but I wouldn't want to implement it.

The extra processing and synchronization would take longer, and delay input to the speech recognition engine. I don't think it would enhance the user experience.

Edit: spelling.
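
To make the synchronization point concrete, here is a minimal delay-and-sum sketch. It is a toy illustration, not what the Echo's DSP actually runs: the function names, the integer sample delays, and the assumption that the delays are already known are all made up for the example.

```python
def delayed(signal, d):
    """Simulate a mic that hears `signal` d samples late (hypothetical helper)."""
    return [0] * d + signal[:len(signal) - d]


def delay_and_sum(channels, delays):
    """Shift each channel back by its arrival delay and average the results.

    channels: equal-length lists of samples, one per microphone.
    delays:   per-channel arrival delay in samples (assumed known here).
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        total = 0.0
        count = 0
        for samples, d in zip(channels, delays):
            idx = t + d  # where sample t of the source landed in this channel
            if 0 <= idx < n:
                total += samples[idx]
                count += 1
        out.append(total / count if count else 0.0)
    return out


signal = [0, 1, 2, 3, 4, 0, 0, 0]
mic0 = delayed(signal, 0)
mic1 = delayed(signal, 2)

aligned = delay_and_sum([mic0, mic1], [0, 2])
# aligned == [0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0]

misaligned = delay_and_sum([mic0, mic1], [0, 0])
# With the wrong delays the two copies smear together instead of reinforcing.
```

With correct delays the channels add up coherently; being off by even a couple of samples blurs the result, which is why separate devices with separate clocks make this hard.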

t0mbstone · on March 3, 2016

Just have the Echo that hears the person best be the one that responds. So simple, and easy to implement. I honestly don't understand why Amazon hasn't fixed this yet. It's so fucking obvious.

joekrill · on March 4, 2016

> So simple, and easy to implement.

Ah yes, the rally cry of the person not doing the actual development work... In my experience, rarely is _anything_ "So simple, and easy to implement".

ansible · on March 3, 2016

Agreed. Doing something sensible at a higher level than the actual audio recording would be easily possible.

visarga · on March 4, 2016

> I don't think it would enhance the user experience.

Baidu trains the voice recognizer by adding all kinds of noise to the training data. I think it might be easier to do that than use multiple microphones. The neural net learns to do the difficult process of separation of useful data from noise.
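
As a rough sketch of what "adding noise to the training data" can look like, the snippet below mixes a noise clip into a clean clip at a chosen signal-to-noise ratio. This is only an illustration of the general idea, not Baidu's actual pipeline; `mix_at_snr` and its parameters are hypothetical names.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise RMS ratio equals snr_db, then add it."""
    noise_rms = rms(noise)
    if noise_rms == 0:
        return list(clean)  # silent noise clip: nothing to mix in
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [c + gain * n for c, n in zip(clean, noise)]


random.seed(0)
clean = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
noise = [random.gauss(0, 1) for _ in range(200)]

# One clean utterance becomes several training examples at different noise levels.
augmented = [mix_at_snr(clean, noise, snr) for snr in (20, 10, 0)]
```

The recognizer then sees the same words under many noise conditions and has to learn the separation itself.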

mrbill · on March 3, 2016

I learned to not have the wake word be "Amazon" when I was watching online training for AWS. The Echo went nuts until I finally paused everything and changed the wake word back to "Alexa".

t0mbstone · on March 3, 2016

They really need to make it so that all of the Amazon Echos on the same network use a proximity algorithm to determine which one responds. Simply: The Echo that hears you best should be the one to respond.

I want to have an Echo in every room, and I don't want to have to remember all their different names!
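
A minimal sketch of that "whoever hears you best responds" rule, assuming each device reports a speech level and a wake-word score to some shared service. The `WakeReport` fields and `pick_responder` are hypothetical, not anything Amazon has documented.

```python
from dataclasses import dataclass


@dataclass
class WakeReport:
    device: str
    signal_db: float   # hypothetical: speech level measured at the device
    confidence: float  # hypothetical: wake-word score in [0, 1]


def pick_responder(reports):
    """Return the name of the device that heard the wake word loudest.

    Ties on level are broken by wake-word confidence. Returns None
    when no device reported a wake word.
    """
    if not reports:
        return None
    best = max(reports, key=lambda r: (r.signal_db, r.confidence))
    return best.device


reports = [
    WakeReport("office", signal_db=-45.0, confidence=0.91),
    WakeReport("kitchen", signal_db=-30.0, confidence=0.88),
]
responder = pick_responder(reports)  # "kitchen"
```

The hard part in practice is probably not this comparison but collecting the reports: they arrive at slightly different times, so something has to wait a short window before deciding, and that wait adds latency.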

jimktrains2 · on March 3, 2016

> I have 3 separate devices nearby that are Google Now voice activated (the newer devices support this even if the screen is off), and they will sometimes trigger at the same time accidentally.

> Since the processing is cloud based, and they know my identity,

Interesting, so everything said in that room gets processed and potentially sent to Google for indefinite storage? What a 1984-style luxury.

Karunamon · on March 3, 2016

AFAIK, every one of these devices does nothing until a "wake word" is heard, and only then do they record+send.

Having all of the devices listening all the time would be a bandwidth and power nightmare, if not for the sender, for the receiver.
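
A toy model of that gating, with strings standing in for audio frames. This is only a sketch of the behavior as described here; `frames_to_upload`, `is_wake`, and the fixed `max_frames` cutoff are assumptions for illustration, and a real device would presumably end the upload on silence rather than a frame count.

```python
def frames_to_upload(frames, is_wake, max_frames=3):
    """Return only the frames that follow a locally detected wake word.

    frames:     iterable of audio frames (plain strings in this toy model).
    is_wake:    local detector; returns True when a frame is the wake word.
    max_frames: how many frames to send after each wake (hypothetical cutoff).
    """
    uploaded = []
    remaining = 0
    for frame in frames:
        if remaining > 0:
            uploaded.append(frame)  # device is awake: this frame leaves the house
            remaining -= 1
        elif is_wake(frame):
            remaining = max_frames  # wake detected locally; start sending
        # otherwise the frame is dropped without being sent anywhere
    return uploaded


heard = ["chat", "chat", "alexa", "set", "timer", "chat", "chat"]
sent = frames_to_upload(heard, lambda f: f == "alexa", max_frames=2)
# sent == ["set", "timer"]
```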

mavhc · on March 3, 2016

Correct, of course it can activate accidentally.

https://history.google.com/history/audio has a list of all audio recorded

jimktrains2 · on March 3, 2016

What about accidental triggers of the wake word? What about planted "wake words" to record people discussing "inappropriate" things?

cthalupa · on March 3, 2016

For the Echo, at least, it has to use your home network, so you could pretty easily run a packet capture to see if it's ever sending audio out when you don't want it to.

Harder for things with cellular data, though.
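
Once you have the capture, the check itself could be as simple as the sketch below. It assumes you have already exported the device's outbound packets as (timestamp, size) pairs and noted when you actually said the wake word; the function name, the 10-second window, and the 2000-byte threshold are all arbitrary choices for illustration.

```python
def unexplained_traffic(packets, wake_times, window=10.0, min_bytes=2000):
    """Flag large outbound packets that do not follow a known wake event.

    packets:    list of (timestamp_seconds, payload_bytes) from a capture.
    wake_times: timestamps at which the wake word was deliberately spoken.
    window:     seconds after a wake event during which uploads are expected.
    min_bytes:  ignore small packets such as keep-alives (arbitrary cutoff).
    """
    suspicious = []
    for ts, size in packets:
        explained = any(w <= ts <= w + window for w in wake_times)
        if not explained and size >= min_bytes:
            suspicious.append((ts, size))
    return suspicious


packets = [(1.0, 120), (100.5, 4000), (101.0, 4000), (500.0, 3500)]
flagged = unexplained_traffic(packets, wake_times=[100.0])
# flagged == [(500.0, 3500)]
```

This only catches uploads that happen at unexpected times; it says nothing about what is inside the encrypted traffic that follows a legitimate wake.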

jimktrains2 · on March 3, 2016

It's a cat-and-mouse game: What if it only sent the clandestine information when it picks up the "normal" word? The point is you don't control the device or its software.

gr3yh47 · on March 3, 2016

>What a 1984-style luxury

exactly why I think none of this is worth it (echo, google now, siri, smart tvs etc) - especially given the current applications of the 3rd party doctrine - you are giving up the right to privacy for everything that is said in your home.

masonhipp · on March 3, 2016

I agree with this entirely. I've been waiting patiently for a way to add microphone distance to my Echo and this is perfect for that... except it doesn't work that way.

I am very much hoping they fix it in the future and add a software layer to combine/route commands with one single wake word.

Nullabillity · on March 3, 2016

It's also a bit annoying that the Android Wear version of Now doesn't work the same as the regular Android version. For example, the full-sized one seems much more flexible with wording, and supports listening in several languages at once, while Wear is limited to one language.

Touche · on March 3, 2016

But it's limited to 3 words which is weird. I'd rather say "Office, order socks" than try to remember that the one in my office goes by "Amazon".

caractacus · on March 3, 2016

When did I turn from the enthusiastic kid who dreamed of audio-controlled personal assistants like this to a cranky old man who doesn't want anything remotely spy-possible in his house?

fluxquanta · on March 3, 2016

I think when we were kids we didn't think that the personal assistant would have to communicate with the outside world via the internet in order to perform its function.

If all of "Alexa" was included in a disconnected local database I bet it would still be as appealing.

Rosie on the Jetsons didn't have to "phone home".

arien · on March 3, 2016

Almost. More specifically, we didn't think that the personal assistant would have to communicate with a corporation that wants our info to make (more) money. The government option doesn't sound any better, either.

I think they are rather creepy, because it's so obvious there is (or, could be) a hidden agenda.

fluxquanta · on March 3, 2016

While indeed creepy, I ordered the original Echo as soon as it was made available, but I'm probably a special case. I live by myself and barely even speak out loud at home.

If Amazon can somehow monetize my primary use of Echo as a glorified kitchen timer I will be impressed.

tenpies · on March 3, 2016

> I live by myself and barely even speak out loud at home.

It occurs to me that the background noise in your home actually reveals a whole lot about the self:

- What you're listening to and when

- What you're watching and when

- What type of gentleman's material you enjoy and when

- When you leave home and get home

- When you wake up, when you go to bed

Some of these can be limited by the size of your house, but the trend in urban dwellings has been towards smaller so one unit could presumably capture every sound in your home.

tdkl · on March 3, 2016

And at the end someone pays money for some company to install a device to collect all this.

The tech insanity has really gone far ..

kuschku · on March 3, 2016

Finally the Telescreen is here.

fluxquanta · on March 3, 2016

Your first three points are moot in my case because, as a testament to your mentioned small apartment size, I consume all my entertainment with headphones after some real passive aggressive comments from neighbors a few years back.

When I wake, sleep, leave, and come home could be monitored by Echo, but it's also already being monitored by other devices I own, and it's data I'm not particularly concerned about at the moment.

_snydly · on March 4, 2016

> I live by myself and barely even speak out loud at home.

Not sure how other people feel about talking out loud at home, but as someone who also lives alone (in a 250 sqft apartment) and always wears headphones, I can't really imagine talking out loud. Just seems weird for some reason. I never use Siri either.

Wonder if that's a living alone thing, or a small apartment thing, or ...?

pc86 · on March 3, 2016

$180 for a kitchen timer seems a bit steep.

lukeschlather · on March 3, 2016

I would pay $180 for a voice-controlled kitchen timer which did not need an Internet connection to function and had verifiably secure command log deletion.

I'm less than enthusiastic about a $180 kitchen timer that uploads everything I say to the cloud for analysis, even if I understand that the analysis is to some degree necessary to improve the voice recognition.

sib · on March 3, 2016

While I hear what you are saying (no pun intended), it's important to be clear that it is not uploading everything you say to the cloud. It's uploading what you say once it wakes up by detecting the wake word, which is done completely locally.

fluxquanta · on March 3, 2016

It was $99 (there was a special offer when it was first announced at the end of 2014).

pc86 · on March 3, 2016

$99 for a kitchen timer seems a bit steep.

at-fates-hands · on March 3, 2016

Considering most smartphones already have this app on them - I'm going to agree.

FREE vs. $99? No contest there my friend

fluxquanta · on March 3, 2016

We all spend our money how we want, and cell phones most certainly aren't free, either.

Aside from that, I didn't purchase the Echo with the intent of it being primarily a kitchen timer. It just so happens that after owning it for over a year my usage of it is mostly limited to that.

My usage is probably around 85% timers and alarms, 10% streaming music, 4% shopping lists, and 1% everything else.

at-fates-hands · on March 3, 2016

I'd be interested to find out how much you still use it a year from now.

Do you think you've used it like you thought you would, or did you have ideas about how you might use it that didn't pan out, or for which the device didn't work very well?

fluxquanta · on March 3, 2016

I ordered it originally purely on the "Oh, cool gadget!" factor, and I was willing to part with $99 for it.

I really didn't have a particular use case in mind at the start, but I was (and still am) impressed by the sound quality from such a small speaker. It's nice to be looking in the fridge and say "Alexa add X to my shopping list" or when my hands are covered with flour say "Alexa set a timer for 30 minutes" or whatever. And for those things it's worth the cost to me.

Most of the features that have rolled out just seem gimmicky, though. Take the news briefing: It either provides too little info to be useful, or it drones on and I get annoyed by the voice which, while it sounds natural compared to Microsoft Sam, still feels cold and artificial. In general I like having more control over my internet actions. I'll never use it to order a pizza or anything from Amazon because I don't know what happens if it misinterprets me or I make a mistake. And the third party apps are clunky ("Alexa, ask X to do Y").

To sum it up, aside from the very basic features I've used since day one it just feels like a toy.

api · on March 4, 2016

Basically everything B2C today is a data play. Customers want everything to be cheap or free, so the only way to make money in B2C is to turn the customer into the product.

It's a deflationary race to the bottom. The bottom is a hell where everything watches you and sells absolutely everything about you to whomever can afford to buy the data.

brooksc · on March 3, 2016

Whenever I read these, I can't tell if the group is paranoid or prescient. But anyway I ordered one via my alexa. Amazon probably already knew I would.

jdminhbg · on March 3, 2016

> I think when we were kids we didn't think that the personal assistant would have to communicate with the outside world via the internet in order to perform its function.

Human personal assistants were connected to the outside world -- how else would they make appointments and reservations, book flights, find out what the weather would be, etc.? The whole point is to be connected to the outside world, automatic or no.

gherkin0 · on March 3, 2016

> Human personal assistants were connected to the outside world...

There's a difference between the "always on" communication these devices have and communication the user specifically requests.

When I want to make an airline reservation, I'm requesting the device to send the booking information to the airline. I'm not asking it to send a recording to the mothership of everything that happened in my home for the last 5 hours, which a human assistant would never do.

fluxquanta · on March 3, 2016

Hah, it'd be like hiring a personal assistant from a staffing agency who is constantly on the phone with the staffing agency parroting what you say.

sib · on March 3, 2016

That's also not what's happening with Echo. You'd literally have a few seconds of audio being sent to Amazon and then some text (the result of the ASR) being sent to the third party ticket search / reservation system.

squeaky-clean · on March 3, 2016

Sure, but I also wouldn't let a human assistant live in my bedroom 24/7 listening to everything I say. I would also choose my words and topic differently when a human assistant is around.

You have to be able to trust that Echo isn't recording everything you say, unless you prefix it with "Alexa", and that this behavior will never change (say this is the behavior for the average user, but with a police warrant, they're able to tap your Echo).

I'm part of the group that thinks the tradeoff is worth it for the convenience, but I understand why many people would disagree.

wmeredith · on March 3, 2016

This is exactly it for me. I'd buy an echo and a dot for every room if it didn't phone home.

kamaal · on March 3, 2016

I wonder what sort of memory-related tech it would take to pack nearly all of the internet in a small space, and have it incrementally update (the internet!) and still fit in the available memory.

Besides any contact with outside world would need communication. So you can't have an entirely standalone gadget.

falcolas · on March 3, 2016

Minus videos and images over a certain size... not all that much. And it would compress pretty well.

I wonder if the internet archive has a record of the size required minus images.

sschueller · on March 3, 2016

Couldn't the FBI legally get a court order to listen in on conversations in a room that has one of these? They already do that with car assistance services. [1]

[1] http://www.cnet.com/news/court-to-fbi-no-spying-on-in-car-co...

tlrobinson · on March 3, 2016

Echo (supposedly) doesn't start sending audio to Amazon until you trigger it with a "wake word", i.e. "Alexa".

Of course:

a) it's not open source so we can't be sure (aside from monitoring network traffic, which is probably encrypted)

b) if the FBI is successful in compelling Apple to develop a backdoor for the iPhone there's nothing stopping them from compelling Amazon to do the same with Echo.

c) better hope you don't say "Alexa" or something Echo mistakes for it.

sib · on March 3, 2016

The traffic is encrypted. But you could certainly watch the network traffic and see that there's no traffic if the Echo doesn't wake and the lights don't turn on. (Of course, you'd have to trust that it isn't time delayed for hours in some sort of intentionally-sneaky way.)

It would also be possible to take a look at the hardware design and determine the linkage between the "mic mute" button light being on and power going to the mics.

The customer can set the device to provide both audio and visual indication when it "wakes up" and begins streaming to the cloud. And, of course, the customer can also press the mic mute button to avoid accidental wake up.

Yes, the FBI could try the same approach with Amazon as they are trying with Apple. For all of our sake, let's hope that Apple wins.

tlrobinson · on March 4, 2016

> It would also be possible to take a look at the hardware design and determine the linkage between the "mic mute" button light being on and power going to the mics.

How would the mics listen for the wake word if they aren't always on?

sib · on March 4, 2016

There is a mic mute button that is able to turn off the mics, which then prevents the device from waking up, as it is not receiving audio signals to process and detect the wake word. When the button is activated (== the mics are off), there is a glowing red light illuminated inside the button.

My point was that you could check to see if the linkage between that red indicator light and the power going to the mics was in software or hardware.

This is analogous to the warning light that many laptops have for when the built-in webcam is on.

e40 · on March 3, 2016

> b) if the FBI is successful in compelling Apple to develop a backdoor for the iPhone there's nothing stopping them from compelling Amazon to do the same with Echo.

No backdoor needed if the information is sent to Amazon. All that is needed is a court order for Amazon to hand it over.

tlrobinson · on March 3, 2016

Sure, but all you'd get are commands you give Alexa ("Alexa, turn off the lights", "Alexa, what's the weather today"), which I suppose could be interesting to law enforcement, but certainly not as interesting as the "full-take" of an always-on wiretap.

I'm suggesting in order for the FBI to use Echo (or any other internet connected device that has a microphone) as a wiretap, the FBI could try to compel the manufacturer to write, sign, and push an update that causes the device to transmit audio to the FBI at any point.

That would have seemed a little far fetched in the past, but the current FBI/Apple situation could set a precedent.

e40 · on March 3, 2016

The answer is obviously 'yes'. If there is a way for Amazon to listen to conversations then a court can compel them to give the FBI access.

ergothus · on March 3, 2016

I'm not terribly worried about various ways companies expose me to govt surveillance that requires a court order.

I do worry about said court orders being rubber stamps, and about surveillance that DOESN'T require a court order.

Otherwise we can make no technological advancement.

ryandrake · on March 3, 2016

I love "smart" devices, but hate "devices that needlessly insist on connecting to the Internet".

One of the worst offenders is Dropcam. They have a super camera, easy to set up and use. Great picture quality. Would be an awesome baby monitor or "closed circuit TV replacement". But why the goddamn hell does it need to connect to the Internet? Why is the only option available to needlessly stream video out of my home network to the cloud, only so that I can then stream it back into my home network for viewing??? WTF? That's both a waste of outbound bandwidth and a waste of inbound bandwidth. I should be able to put it on my network, switch off the cable modem, and still be able to view video locally. How hard is that? I could do that with a webcam and a really long USB cable!

oaktowner · on March 3, 2016

Their business model depends on some percentage of their customers using the subscription service.

My guess is: if they offered the version you describe, they'd need to make it much more expensive. Which many consumers would find odd: the one with fewer features would cost much more. Granted, those consumers wouldn't be looking at the big picture...but I find many consumers don't. Up front costs matter a lot to consumers.

Paul-ish · on March 3, 2016

As dumb as it sounds, it is probably easier that way. Sometimes in LANs it is easier to send data out and then back in. For example, a lot of dorm networks don't support Chromecast devices because Chromecasts try to multicast on the LAN for discovery, but dorms have networking policies that prevent this.

A webcam that sends the data out to the internet then back would avoid the discovery issue by using an external webserver as a rendezvous point.

I don't think people spend a lot of time thinking about their home networking. You could imagine most people just plug in their home routers and it is a crapshoot whether or not the router will support the necessary functionality, whereas a router will always enable communication to the outside world (or people would return it ASAP).

With that said, this seems like a straightforward technical problem that may have technical solutions.

tdkl · on March 3, 2016

Ease of setup for regular Jane/Joe because they know shit all about router configuration. That's why devices just transfer everything over someone else's computer a.k.a. "teh cloud".

mahyarm · on March 3, 2016

If you don't care about recording video or video recognition features, the cheapo Chinese cams on Amazon actually perform pretty well. For $80 you can get 720p video with IR lights, speakers, microphone & it can move around. Usually it doesn't zoom like a Dropcam can.

If you're willing to configure a NAS server somewhere, you can even record the video locally.

hemdawgz · on March 4, 2016

The video quality probably doesn't compare, but I've used an old iPhone with iPCamera (I'm sure Android equivalents exist) installed for this purpose, which simply hosts an MJPEG stream at a local IP address. It should be simple to start or stop recording the stream on any device that's connected.
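
For the recording side, a naive sketch: an MJPEG stream is a sequence of JPEG images, so you can carve frames out of the received bytes by looking for the JPEG start and end markers. This assumes the frames carry no embedded thumbnails (which would contain their own markers), and `split_jpeg_frames` is a made-up helper, not part of any camera app.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def split_jpeg_frames(buffer):
    """Carve complete JPEG frames out of a chunk of MJPEG stream bytes.

    Returns (frames, leftover), where leftover holds any trailing bytes
    that do not yet form a complete frame and should be prepended to the
    next chunk read from the stream.
    """
    frames = []
    pos = 0
    while True:
        start = buffer.find(SOI, pos)
        if start == -1:
            break
        end = buffer.find(EOI, start + 2)
        if end == -1:
            break  # frame is incomplete; wait for more data
        frames.append(buffer[start:end + 2])
        pos = end + 2
    return frames, buffer[pos:]


chunk = (b"--b\r\n" + SOI + b"AAA" + EOI +
         b"\r\n--b\r\n" + SOI + b"BB" + EOI +
         b"\r\n--b\r\n" + SOI + b"C")
frames, leftover = split_jpeg_frames(chunk)
# Two complete frames; the third is still arriving and stays in `leftover`.
```

Starting or stopping a recording is then just a matter of whether you write each carved frame to disk.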

kayoone · on March 3, 2016

Alexa probably uses forms of machine learning and also queries lots of services to find the answers you need. Also it learns from every user and gets better for every user this way. That would be really hard to do with an offline device.

sib · on March 3, 2016

Yes, that is exactly how it works.

If you, as a customer, want to, you can go to Amazon.com and delete all your voice history (or any single interaction).

visakanv · on March 3, 2016

This is probably a function of the amount of bad news you've read over the years about people getting exploited, taken advantage of, spied on, etc. When you're a kid it doesn't even really seem like a thing.

ethbro · on March 3, 2016

When you're a kid, you generally assume people around you are all wonderful.

... Then you gain life experience.

/75% jokingly

danielrm26 · on March 3, 2016

We'll be dead soon. Enjoy the little things.

baby · on March 3, 2016

Nice try NSA.

hyperbovine · on March 3, 2016

Yet I'm guessing you carry a smart phone in your pocket almost everywhere you go.

jegutman · on March 3, 2016

But phones don't have a microphone on them do they? :-)

hyperbovine · on March 4, 2016

Uhm, what?

jegutman · on March 4, 2016

Sorry, I thought the smiley face would've been enough to give away the sarcasm. For some reason the /s felt like it removed the infinitesimal amount of comedy from my post.

bussierem · on March 3, 2016

(should I tell him, guys?)

officemonkey · on March 3, 2016

Don't forget the harried parent of a child with low impulse control.

These things would be a lot less "Big Brother" for me if I had a mic key in my pocket that would only turn the mic on when I squeezed it.

niels_olson · on March 3, 2016

Hey buddy want to bet that Amazon is using this massive collection of voice to text to sell to other companies like Apple and Google?

officemonkey · on March 3, 2016

Riight, because Siri doesn't generate enough voice data for apple.

monkmartinez · on March 3, 2016

The enthusiastic kid would probably get distracted and discouraged when X can not do "What I really want, like Ironman." While the "cranky" old man has been mis-characterized as "cranky" because "cranky" is often confused with wisdom and experience.

wahsd · on March 3, 2016

When you realized that the government was making an all out assault on the most fundamental American rights and the civilian sector did absolutely nothing to assure your privacy and anonymity out of sheer greed and narrow minded foolishness that they would be undermining their own success.

I am sure you would not have a problem using these kinds of systems if it were assured that you could not be tracked or monitored because the devices and systems were secured in overlapping ways.

joshmanders · on March 3, 2016

In hindsight it all sounds amazing, but even ignoring the spy possibilities, it gets old fast. I don't use Siri, even though I can do a lot of this with it. Since I got the first Siri-enabled device, I've used it mostly just for joking around, and my daughter asks it for hockey scores. That's the extent of it.

runjake · on March 3, 2016

Because when we dreamed of this as kids, the thought of the corporations behind these technologies that harvest our data for their gain didn't come up.

edw519 · on March 3, 2016

Exactly when did UnconventionalButTotallyLogical = CrankyOldMan ?

revelation · on March 3, 2016

The moment you clamored for MIT embedded Linux software and the "let's kill all the GPL it's bad for startups" meme came up.

So now this cool audio controlled personal assistant is just another gadget to buy more stuff from Amazon, instead of something you control.

michaelt · on March 3, 2016

Is this voice recognition stuff based on MIT-licensed open source speech recognition? I have a project that would benefit from good quality speech recognition.

revelation · on March 3, 2016

Well, no, that's the other thing: "let's put everything in the cloud so nobody owns anything anymore!".

Voice recognition is done on some Amazon server. If it goes down or changes API in five years, it will render this thing a brick.

danesparza · on March 3, 2016

There is something delightfully ballsy about making this only available to users of Alexa Voice shopping:

"Echo Dot is available in limited quantities and exclusively for Prime members through Alexa Voice Shopping. To order your Echo Dot, use your Amazon Echo or Amazon Fire TV and just ask: "Alexa, order an Echo dot""

Also, this makes me sad. I'd kind of like to try this out, but I have no Alexa voice service currently (I don't think)

xauronx · on March 3, 2016

Even though I own an echo, I wanted to get in early. Here's a link until they remove it: http://www.amazon.com/gp/offer-listing/B00VKTZFB4/

icco · on March 4, 2016

That link still works for ordering :)

mikeash · on March 3, 2016

I don't imagine it will be like that forever. It's just a clever way to limit demand until they can ramp up manufacturing, or work out the bugs, or whatever their motivation is for keeping it in a limited release for now.

GBond · on March 3, 2016

... and also a way to introduce the concept of shopping via Alexa (I would imagine one of AMZN's primary long term goals for the project)

sib · on March 3, 2016

Actually you can already shop via Echo/Alexa today. It's effectively limited to reorders and music for now.

matthewbauer · on March 3, 2016

I think it needs a base Amazon echo to work if I understand correctly.

bovermyer · on March 3, 2016

No, it needs external speakers, unlike the original Echo. However, you only need an Echo to preorder a Dot, you don't need an Echo for a Dot to work.

sp332 · on March 3, 2016

FTA: Includes a built-in speaker so it can work on its own

swores · on March 3, 2016

Built in speaker is for alarms, not media, I think.

wyldfire · on March 3, 2016

it does seem to exclude media.

> Built-in speaker for voice feedback when not connected to external speakers

> Includes a built-in speaker so it can work on its own as a smart alarm clock in the bedroom, an assistant in the kitchen, or anywhere you might want a voice-controlled computer

Touche · on March 4, 2016

That's crazy, why do I want this without a speaker? The bluetooth speakers they recommend are all really expensive; a speaker + Echo Dot is more expensive than a regular Echo... why wouldn't I just get a second Echo?

IshKebab · on March 4, 2016

You can plug it into a hifi system.

balls187 · on March 3, 2016

The speaker is for voice feedback only. Doesn't actually support music, news, audiobooks etc.

nogridbag · on March 3, 2016

Do you have a source for that? I have an Echo in my living room, but I was thinking of picking up one of these for my bedroom. I don't really care about sound quality as I would just be using it for Philips Hue, weather, and news.

balls187 · on March 4, 2016

Sure, this was the link that was emailed to me from Amazon, which also included the following text:

> With its built-in speaker, you can place Dot in the bedroom and use it as a smart alarm clock that can also turn off your lights, or use Dot in the kitchen to easily set timers and add items to your shopping list using just your voice

http://www.amazon.com/b?ie=UTF8&node=14047587011&ref_=pe_184...

See the technical details:

> Built-in speaker for voice feedback when not connected to external speakers

My Echo news is a mix of Text2Speech and audio, so I'm not sure that it would work for News.

leecarraher · on March 3, 2016

Man, that would suck. The computing internals of the Echo are less impressive than a Raspberry Pi Zero. The Dot has Bluetooth and apparently Wi-Fi to communicate with speakers and network devices. The real benefit of the echo over other homemade voice command devices like jasper(github.com/jasperproject) is the more proprietary far-field speaker array.

reitoei · on March 3, 2016

> The real benefit of the echo over other homemade voice command devices like jasper(github.com/jasperproject) is the more proprietary far-field speaker array.

Um, and the insane underlying voice API?

eclipxe · on March 3, 2016

No, it doesn't.

xuhu · on March 3, 2016

I guess it's time to order an Echo.

_1qd4 · on March 3, 2016

Somewhat related, but if I don't subscribe to any of the services listed, this is a pretty useless product for me. I don't listen to internet radio, I don't stream music, I don't order delivery, I don't use Uber, there are already 10 million ways to check the weather, and my life isn't busy enough to need a voice-activated calendar.

Is this the future of tech? Like do I need to have some kind of urban-go-getter lifestyle to find use in any of this? When can I get something useful, rather than "thing I already do, but in a new package"?

publicfig · on March 3, 2016

What would you find useful? You seem upset that a product was designed for a user that is not you, but that doesn't mean it doesn't have a use. Subscribing to music streaming services, ordering delivery, using Uber; these aren't incredibly uncommon things just because you don't use them. It is rare for new and exciting technology to just pop out of nowhere. Almost all new products are reiterations of previous products in new and interesting packages; it's just up to you to decide if it's worth moving to.

_1qd4 · on March 3, 2016

Totally fair point! But would you buy an Echo Dot if you only used Uber and didn't use any of the other services? Or if you used 1 or 2 of the services? How many of these services do you need to use before the functionality of Echo becomes apparent?

I want to be a fly-on-the-wall when someone sets one of these up in their home. I can't picture it fitting in with my lifestyle, so I'm curious to see how others would actually use it. Or would it just gather dust and become a conversation piece?

hirsin · on March 3, 2016

I find it fantastically useful for social gatherings in my small apartment. While cooking we listen to music from the Echo, and have equal control over the music selection (vs "Who has the iPhone? Can you turn it up? Oh, it needs unlocked") and timers for cooking. It could be far more powerful with playlist creation.

After that, it's Uber, schedule, and weather on my way out the door. As I leave I ask it to turn off the lights.

So I use at least 5 of its features (and stream Pandora/NPR on it, so 7?), and find it useful. I don't think I would miss it, but I do find myself wishing for it a bit when I'm at a friend's house that doesn't have one.