I recently vacationed in Japan, and I found that using Google Translate on the phone wasn't practical because the other person would have no clue what was going on or how they were supposed to work with the app. The language auto-detect was abysmal; half the time it thought Japanese was English. They would start responding to the translated statement before the app played the chime, and I'd have to rudely interrupt them and fumble with the app to get it to start listening again. They were never sure when it was "okay" to start talking again.
In the end I tended to communicate everything I really needed by holding up the number of things I wanted on my fingers and pointing and smiling, caveman-style. They would usually know just enough English to say, "Two? Ok." Then the price of whatever I was buying would show up on the register surrounded by mystery Japanese characters, but the numbers were Western Arabic, so that was all I needed to know.
One thing I learned is that you can get by with very little knowledge of the local language to successfully travel and eat.
I had the opposite experience. Everyone I tried to communicate with via Google Translate responded positively except one disgruntled train station attendant.
I would say, "Excuse me, I don't speak Japanese. Can you please help me?" Then I'd show them my phone with the translated version of whatever my query was and press the mic button, and they'd get the point that they could speak now. Worked great.
In my experience the whole thing works better with people typing. Firstly, voice recognition isn’t perfect. The person has to check that what they said was transcribed correctly, and they can’t correct it with a keyboard if they see it’s wrong the first time; they just have to try again.
Typing means people go a bit more slowly and think about what they are writing, sometimes reworking sentences to be clearer.
Also if they are typing in front of your face, the translation constantly updates based on the partial sentence they’ve typed and this can be quite revealing, which is useful if the final translation isn’t crystal clear.
For face-to-face it wasn't great because speech recognition is error-prone. I would usually type, and the person I was interacting with would speak; if they saw the text was wrong, they'd run it again.
For chatting over sms/whatever it was great. Enough to have complex conversations and get to know people that I couldn't possibly communicate with otherwise. It's not perfect, but it's a lot better than nothing if you aren't proficient in the local language.
Also your mileage will vary by language. English-Japanese seems to be a pretty good combination. English-Chinese and English-Russian are a lot more flaky.
This is different. I just tried it and the assistant translation isn't turn based. It auto detects the language after each pause and translates it to the other language. No chimes either. Just talk normally as you would.
Japanese is really difficult to translate. There's so much unstated context, totally different sentence structure, concepts that don't exist in English. I use Google Translate a lot with Japanese, but more to check my work. I'm not sure it could even really work with this kind of interface (sentence a in -> sentence b out).
Check out jisho.org; it will translate and call out sentence structure, and is clearer about assumptions.
I almost wish Google Translate provided a confidence measure.
There is dedicated translator hardware, like Travis Translator[1], which could resolve the awkward input issues. I wonder whether it could stand up to Google’s translation prowess.
A couple of weeks ago I was going to Human Resources on the other side of campus and there was a Chinese family wandering around, obviously lost.
The mother showed me her phone with some Chinese-language map app that I'd never seen before. It indicated that there was a shopping mall where we were standing. Obviously, her map app was wrong since the company has been at this location for 30 years.
But I was able to say to my phone, "Hey, Siri. How do you say 'I'm sorry, there is no shopping center here' in Chinese?" And then I held my phone for her to see while Siri both printed out the translation on the screen and spoke it to her. I said a few other hopefully helpful phrases to her, and she seemed happy with my guidance and did lots of smiling and nodding.
(I assume the article is about the Google version of this. I wasn't able to read the article because Wired popped up so many ads and DIV modals on the screen that there wasn't any actual story text.)
It's useful, but after so many years the state of machine translation and speech recognition in general is still not exactly reliable. It's like it doesn't have context or doesn't know how to apply it. I've heard success stories like this before, and experienced them a few times as well, but most of the time the experience for me is subpar, to the point where it gets so annoying and needs so much manual intervention that I gave up, thinking I'll just try again in 5 or 10 years and see if it's any better.
Maybe my accent or pronunciation sucks, but I tried getting Siri to write down text messages about 10 times. Most of the time it was close, but the words were never 100% correct, and in more than 50% of cases that led to the produced sentences not conveying the original meaning. Same for navigation. Names of cities (in Europe) seem problematic, like confusing Miltenberg (DE) with Milton in Canada or so. Similar for Google Translate. Our Portuguese taxi driver didn't speak English and was worried about getting us to the airport in time. His phone showed us he was worried about the weather. I get that 'tempo' can mean both time and weather, but it's these subtle differences that technology is still lacking.
My wife's name is Nada. I pronounce it nA-da and Siri says nah-da. If I don't use the Siri pronunciation it won't find the contact. Took me a while to figure out that workaround.
I have a similar problem with my car's native voice recognition. But Siri gets both the recognition and pronunciation correct. I wish my car had CarPlay.
Perhaps an issue with her maps app applying GCJ-02 or BD-09? Apparently the "in china" check for the noise function is a simple bounding box, which includes much of the surrounding countries.
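To make the bounding-box point concrete, here is a minimal sketch of that kind of check. The rectangle bounds are assumed values of the sort that circulate in open-source GCJ-02 converters, not something taken from any particular map app, and `in_china_bbox` is a hypothetical name.

```python
# Assumed rectangle for a crude "is this point in China?" test.
# These bounds are illustrative; a real app may use different ones.
LON_MIN, LON_MAX = 72.004, 137.8347
LAT_MIN, LAT_MAX = 0.8293, 55.8271


def in_china_bbox(lat: float, lon: float) -> bool:
    """Return True if the point falls inside the rectangle (not the real border)."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


# Approximate city coordinates as (latitude, longitude).
places = {
    "Beijing": (39.9042, 116.4074),
    "Seoul": (37.5665, 126.9780),
    "Hanoi": (21.0278, 105.8342),
    "London": (51.5074, -0.1278),
}

for name, (lat, lon) in places.items():
    print(name, in_china_bbox(lat, lon))
```

With these assumed bounds, Seoul and Hanoi pass the test even though neither is in China, which is the kind of false positive that would shift map positions in neighboring countries.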
Isn’t the title wrong? The audio is sent to the server so it’s not “on my phone.”
This type of translation was already available “on my phone” completely offline (written, with Google Translate).
I use translation often and I was hoping to finally have an easy way to have a written conversation, but this still doesn’t show the right keyboard when picking the language in “Keyboard” mode.
The title is about expanding this feature to phones (it was previously available only on Home devices), so in that context, it did start working on phones.
Assistant (just like its counterparts Siri and Alexa) always uses a server for processing, and its users are aware of that.
How is this different than picking up the phone and having your convo go through ATT/Verizon networks? or using your ISP? Both parties can "legally" work with authorities to wiretap you?
Are you worried that the (training) algorithms that run on your voice somehow end up leaking your identity? Or are you worried that someone at Google knows your voice?
Also, not sure if Google has this fact in their ToS but if they do, what is the issue?
Phone networks are not primarily advertising agencies, and they're regulated as utilities, so the difference is nontrivial.
And the personalized risk isn't from random strangers knowing your voice -- it's your stalker ex, or other bad actors, who might get way more insight into your life than you want.
> Or are you worried that someone at Google knows your voice?
Google already knows who you are, now they know a bit more about you, including what you talk about, and of course your voice which probably can be used to locate you with all those "smart" speakers around.
Please don't compare tech companies with telecommunications, banks or other sectors. There are hundreds of years of laws and jurisprudence regulating all other sectors and "basically none" regulating tech.
For the record, I don't want to need ad tech support and feel like some nameless someone on the other side knows that I had sex with my wife last night.
Am I asking too much? Is tech ignorance my only refuge?
Will I need to censor myself all the time because I can never know if or when some bad or shaming situation may occur?
Can't people really see how 24/7 digital surveillance isn't healthy?
Haven't we learned already society can't blindly trust corporations?
I saw this video before its public release circa 1993. Autotranslation, virtual agents, realtime video conferencing on handheld devices, virtual reality... all there. I was finishing my postdoc at Bell Labs, where it was shown to us as a glimpse into the company's future plans. I didn't know where the bandwidth would come from, and neither did the presenter when I asked, except to say, "It will have to be built, won't it?" Needless to say, I didn't have the foresight to invest in it, either.
Interestingly, 1993 is also the year a rather large research endeavor began in Europe, paving the way for the kind of on-the-fly speech translation technology that is now built into Google Assistant:
"However, there's always a chance Assistant could accidentally start recording snippets of conversations and therefore potentially sensitive and identifiable information." - This is why security people want a hardware microphone switch.
The Google Translate app has had this feature for years. Near real time audio, and video translation too! It works very well for many languages but not especially well for Chinese (admittedly a harder problem than Spanish).
When I was in Tokyo, a bartender chatted with me for a while using a handheld standalone device that did two-way, near-real-time voice translation (audio without relying on text).
It worked really well, and I was confused that I had never seen one before.
It's interesting that all the positive Google Translate experiences shared here have to do with Chinese and Japanese. In general, my experience with translations of European languages has been abysmal for anything but the simplest expressions. The resulting expressions are often so ungrammatical that they are basically unintelligible.
I remember a Greek taxi driver who picked me up for a long trip and, seeing the distance, initially assumed I was going to the airport. He asked about it, prompting a flurry of "no, no, no"s from me and pointing on a map. He later tried to use Google Translate to explain why he had assumed this (Greek to English), but what came out was so garbled I only understood that he was saying something about distance and airport, prompting another confused flurry of map pointing (he abandoned the hope of explaining the initial confusion and resigned himself to just driving...). It was only minutes later, trying to think about what had happened, that I finally puzzled out what that translation must have meant.
For translations where one side is not English, Google Translate does a horrible job, because it will incorrectly translate to English first, then incorrectly translate from English to the target language. This means you end up with translation errors only comprehensible to someone who speaks all three languages!
But even without that, even for what should be simple translations from English to e.g. Swedish, it makes so many nonsensical errors. Not just misunderstanding context, but fabricating novel and absurd translations of common words.
I think it's gotten worse since they switched to their whole-sentence neural net system. At least in the past, the individual words made some sense, and you could click on them individually to see other (sometimes more accurate) alternatives.
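As a back-of-the-envelope illustration of the pivot-through-English point above: if each hop gets a sentence right with some probability, and the two hops fail independently (a simplifying assumption), the chance that both are right is the product. The numbers and the `pivot_accuracy` name are made up for the example, not measurements of any real system.

```python
def pivot_accuracy(src_to_en: float, en_to_tgt: float) -> float:
    """Chance that both hops of a pivot translation are correct.

    Assumes the two hops fail independently, which is a simplification.
    """
    return src_to_en * en_to_tgt


# Hypothetical per-sentence accuracies for each hop.
greek_to_english = 0.9
english_to_swedish = 0.8

combined = pivot_accuracy(greek_to_english, english_to_swedish)
print(f"combined accuracy: {combined:.2f}")
```

Under these assumptions the two-hop result is worse than either hop alone, which fits the experience of errors compounding when neither side is English.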
Ah, from the title I thought they figured out how to do it locally on your phone, without being connected to Google. Knowing they are only into making Ad products aka tracking, it would be a shame to get used to such a nice free offering. Thanks but no thanks, I guess us suckers will have to take a few hrs to learn a few of the local language phrases.
It would be great if OCR got to the point where you could just point the phone camera at a sign and have it output 1:1 what is in front of you with translated text, like a magic little window frame. Shit still struggles with parsing PDFs, though, so I'm not holding out too much hope.