Still impressive. It works pretty well and without that cloud that Google likes to tell us we really need.
Google is a bit better of course with many common expressions, but I'm sure that could run locally too if they wanted it to. Mozilla just has some catching up to do because they don't monetize our data, so they have less budget to work with.
> Still impressive. It works pretty well and without that cloud that Google likes to tell us we really need.
This is still using Google's cloud to host the models and your browser has to repeatedly download them on demand. We shouldn't need to depend on Google at all, but with Firefox Translations we still do and they're still collecting data about us.
I think this comment is the prime example of Firefox being unable to do an objectively, unqualifiedly Good Thing without a million people showering hate into the comments.
It's not just that I have high expectations of firefox, they claim to have high expectations of themselves. They heavily market themselves as being privacy friendly and often they have been, but they aren't always.
In this case, I agree that this is, largely, a "Good Thing" although not unqualified since some number of users who wouldn't have otherwise will end up repeatedly sending data to Google, probably without even being aware of it. The data they'd give up is (to me at least) small compared to the data they would have been surrendering to online translation services, but that's not really the point.
I just don't understand how they started from "Protect your privacy from sites like translate.google.com by using this add-on to translate webpages locally!" and ended up at "Let's make firefox users connect to Google's servers every time they use this feature!" If you're creating a product designed for people concerned about their privacy, it should be beyond obvious that making your users send data to Google is a problem.
It's not like they couldn't host those files themselves at mozilla.org or (as others have pointed out) just keep them locally and avoid making a bunch of unnecessary connections to a remote host entirely. If they'd done that it would also allow Firefox Translations to work when you aren't connected to the internet.
It's really not hate though. It's love and concern. I love Firefox, and I want it to do better!
>I just don't understand how they started from "Protect your privacy from sites like translate.google.com by using this add-on to translate webpages locally!" and ended up at "Let's make firefox users connect to Google's servers every time they use this feature!" If you're creating a product designed for people concerned about their privacy, it should be beyond obvious that making your users send data to Google is a problem.
Don't you think that, except for PII (which shouldn't be used for training at all), those training datasets can be stored anywhere without making a difference from a privacy point of view? Or am I misinterpreting their purpose...
Models are downloaded only once and then cached, and not repeatedly like the OP mentioned. Source: Me. I've developed it. If you disagree, are seeing a different behavior or have further questions, please reach out in the repo: github.com/mozilla/firefox-translations/
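For anyone wondering what "downloaded once and then cached" looks like in practice, here is a minimal sketch of the general pattern. It is not the add-on's actual code: `get_model`, the `fetch` callback, and the `<pair>.bin` file layout are all hypothetical names made up for illustration.

```python
from pathlib import Path
from typing import Callable


def get_model(pair: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> bytes:
    """Return model bytes for a language pair, downloading only on a cache miss.

    `fetch` is a hypothetical callback that downloads the model for `pair`.
    """
    cached = cache_dir / f"{pair}.bin"  # assumed on-disk layout, one file per pair
    if cached.exists():
        # Cache hit: no network request is made at all.
        return cached.read_bytes()
    # Cache miss: download once, then store for every later call.
    data = fetch(pair)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)
    return data
```

With this shape, the remote host only sees a request the first time a given language pair is used, and again only if the cache is cleared.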
Thanks once for the response, and eleventy times for actually developing a non-cloud translation thingy. As for the caching thing I was really hoping this was the case so I guess that makes it three.
Good to know! I still hope you can find a better place to host the files, but it's nice knowing the problem only happens once per file (so long as the cache remains anyway)
Yeah they should just use another cloud to serve the files. Using your main competitor is really disingenuous, because they can glean all kinds of usage data from it (if not more)
I'm not sure why this is done because this kind of filehosting is easily replaced by something more privacy-friendly.
We retrain models as we get new datasets and only if they improve, which is not common. So far we haven't updated any model. When it's time, then yes, they will be updated, but it's definitely not a frequent process.
> But does Google upload what you translate later?
If you use Google Translate, of course it does because everything is done on their servers
> Would be cool to have Firefox Translations integrated into TOR.
Tor Browser is just a forked firefox so this should not be too difficult. I believe they disable addons by default because they can leak data and they can't check all addons for this. Not sure if you can switch it back on though. I suppose they could validate this one as it's so important. I would recommend submitting a feature request to the tor project.
>> But does Google upload what you translate later?
> If you use Google Translate, of course it does because everything is done on their servers
As mentioned by GGP, the Google Translate app for Android (at least) allows you to download the model for a given language (pair?), after which you no longer need any kind of Internet connection to translate. That implies everything is done locally, not on Google’s servers. GP’s question was whether the app will still save your queries and submit them once a connection becomes available just to scratch that data collection itch.
> GP’s question was whether the app will still save your queries and submit them once a connection becomes available just to scratch that data collection itch.
disclaimer: googler
This can be tested. Translate shows up in your Google 'My Activity' page, so you can do some offline translations, then switch the network back on, and see if the translations show up in My Activity. This assumes you can trust the My Activity page to be complete and accurate (my opinion is you can, but I would say that).
and FTR: I've actually just tried it and offline translations do not show up in my activity so I highly doubt they're being surreptitiously uploaded.
>As mentioned by GGP, the Google Translate app for Android (at least) allows you to download the model for a given language (pair?), after which you no longer need any kind of Internet connection to translate.
This isn't true. Google claims this, but it just doesn't work that way: I've had many, many cases of trying to translate stuff with a bad cellular data connection and it doesn't work, even though I have the language pack downloaded.
I don't think offline translation kicks in automatically when you have a bad (as opposed to no) connection. You can easily verify that it can translate without any connection (both on iOS and Android) by downloading the language and putting the phone in airplane mode. (At least, the basic text translation works fine. The more advanced features, such as speech and image translation, don't.)
Also, Microsoft's Translator app can do the same (offline translation for text), and IME it is about on par with Google.
>Also, Microsoft's Translator app can do the same (offline translation for text), and IME it is about on par with Google.
Interesting, I'll have to try this.
Well, I tried installing the app and using image translate mode on some Japanese and the results were not very good, not nearly as good as Google Translate. I'll try it out later with regular text.
I also looked at the phrasebook feature. That's a pretty neat idea actually. However, for some really strange reason it defaulted to showing me phrases in Spanish. I have no idea why it thinks I would want to speak Spanish (My system language is English, and I live in Japan, so obviously I want to convert to Japanese. No one speaks Spanish here.)
> using image translate mode on some Japanese and the results were not very good,
I think the honest truth is that Japanese is the ultimate challenge for any translation tool.
My Japanese friends tell me that DeepL is about as close as you will ever get to a passable translation quality.
But DeepL does not do image translation.
On a recent trip to Japan I installed six image translation apps on my phone.
None were perfect, but I found Naver Papago to be the most consistently usable.
Interesting observations I made during the extensive testing:
1) The majority of image translation apps don't like Japanese when written vertically, I found they perform best with horizontally written Japanese.
2) All image translation apps *REALLY* don't like hand-written Japanese. Some of them *MIGHT* translate *SOME* of the text. But really all of them only really work consistently with machine-printed text.
The other issue with deepl is that it has limited language pairs. I wonder what limits it. The language I'd like should have enough of a corpus of text.
That’s just bad programming. Turn on Airplane Mode and it will work. A bunch of apps won’t even try to use offline data when they’re “online”, even if the connection is 1 byte/second.
It’s not bad programming if the server has a bigger better model, thus gives better results, and the local model is just a lower quality but smaller portable model.
That said, let me give my HN 2c and say that Google Translate is pretty bad these days. Its community/user adjustments, for example, are guaranteed to be bad. In Spanish, you instantly know you’re looking at a user “correction” because the translation has no accents. “como estas”. It’s bad in 100% of cases, every time I see that “user verified” symbol.
I think the offline model doesn’t have the user adjustments, but the offline model also seems to be lower quality. Back when I translated a lot, I used to know when my internet was offline mid session because of the difference in translation quality.
So I ask for a translation and it fails because it times out, giving me an error. And you call that good programming?
I get it that the server translations are better, but currently I’m not seeing any translation at all. You, Google Translate developer, should catch the error and show the offline translation instead.
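To make the "catch the error and show the offline translation" idea concrete, here is a rough sketch. It says nothing about how Google Translate is actually written; `translate_with_fallback`, `server_translate`, and `local_translate` are hypothetical names, and the server call is assumed to raise `TimeoutError` or `ConnectionError` when the connection is too poor.

```python
from typing import Callable


def translate_with_fallback(
    text: str,
    server_translate: Callable[[str, float], str],  # hypothetical: (text, timeout) -> translation
    local_translate: Callable[[str], str],          # hypothetical: downloaded offline model
    timeout_s: float = 2.0,
) -> str:
    """Try the server first; on a timeout or network error, use the local model."""
    try:
        return server_translate(text, timeout_s)
    except (TimeoutError, ConnectionError):
        # A slow or broken connection should degrade to the offline result
        # instead of surfacing an error to the user.
        return local_translate(text)
```

The timeout value is the main design choice here: too long and the user stares at a spinner, too short and the better server result is thrown away on merely slow connections.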
Oh, I see. By “doesn’t work” I thought they (and you) just meant it still hits the server even though you have a model downloaded.
Yeah, on a spotty mobile connection, most services tend to be optimistic that it’s better to wait than to assume your internet is down. iOS online/offline callback is very optimistic, probably because for most services, trying something in a degraded 20b/s conn is better than giving up and going “sorry, no internet.” (Funnily enough, the iOS App Store gives up way too soon)
So I agree. I think the right thing to do is to do an instant translation with the local model, when available. Maybe a cherry on top is to see if the server has a better translation in the background.
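Roughly, that "local first, server as a cherry on top" approach could look like the sketch below. All names (`translate_local_first`, `on_upgrade`, and the two translate callbacks) are made up for illustration, and it simply assumes that the server result, when it arrives, is the better one.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Tuple


def translate_local_first(
    text: str,
    local_translate: Callable[[str], str],   # hypothetical offline model
    server_translate: Callable[[str], str],  # hypothetical network call
    on_upgrade: Callable[[str], None],       # called if the server returns something different
    executor: ThreadPoolExecutor,
) -> Tuple[str, Future]:
    """Return the local translation immediately and ask the server in the background."""
    local_result = local_translate(text)

    def upgrade() -> None:
        try:
            better = server_translate(text)
        except OSError:
            # Covers TimeoutError and ConnectionError: keep the local result.
            return
        if better != local_result:
            on_upgrade(better)

    # The caller gets the local result right away; the future is only
    # returned so the background attempt can be awaited or cancelled.
    return local_result, executor.submit(upgrade)
```

The user never waits on the network this way, and a dead connection just means the upgrade callback never fires.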
you can see the model sizes here: https://gist.github.com/jelmervdl/1a48816e4c3643ff5d9e1fd682...
they are like 15MB per language pair each way