Almond: An Open, Privacy-Preserving Virtual Assistant (stanford.edu)
181 points by tlrobinson on Nov 29, 2019 | 37 comments



Home Assistant (the excellent open source home automation project) recently integrated with Almond (and Ada for voice): https://www.home-assistant.io/blog/2019/11/20/privacy-focuse...


How do I know which home automation project I should invest my time into? I'm aware of at least 3 FOSS ones.


Home Assistant is the clear winner, in my mind. It's a very robust project (four full-time devs, plus a huge community), very actively developed (https://www.home-assistant.io/blog/categories/release-notes/), and has a very healthy and helpful support community.


They do very similar things, usually even identical things. Get the one that does the things you like (i.e. the one that's ready, with very little configuration, to do the things you want it to do). Maybe also look at the community and see whether help is given freely and positively. That will impact your enjoyment of the software, and I feel that's the most important factor.


There are a lot. The most robust community appears to be Home Assistant's. It's not very point-and-click friendly yet, though, if that matters to you.


Here's a current discussion of Home Assistant: https://news.ycombinator.com/item?id=21665125


Go by community. Home Assistant is absolutely fantastic.


Home Assistant is the best one by a long shot. Largest community, most integrations, etc.

Just one more comment for the pile of 5+ that already replied.


I was more impressed by this project's backbone service, Thingpedia, a common natural-language interface to different web services and internet-connected hardware. I'll definitely find a use for it in the future!

I started using the Almond application for GNOME. It delivers on many fronts, but still lacks polish.

Some typed prompts would never get an answer, and some got misinterpreted: for example, "tell me a dad joke each day at 9 am" would get scheduled at 12 am. The wake word is followed by an audio ping, but there should also be a visual cue.

Still, very cool technology. My favorite skill is "Miscellaneous Interfaces", with queries for a random number, opening a URL, flipping a coin...

I'll try to stick with this for a week, to automate some OS actions and my git workflow.


After a few days of evaluating, I see that the GNOME client is broken in several fundamental ways and is unusable as of now.

    - No way to clear history; it's getting huge now and mostly consists of "Sorry, I don't understand" answers.
    - When going to My Skills and back to chat, it scrolls to the beginning, which is frustrating.
    - Scheduled actions never actually happen.
    - In-app "training" doesn't work; it returns "Sorry, I did not understand that. Use 'help' to learn what I can do for you."
    - Online training _could_ work if one could learn their ThingTalk language. My use case was simple: teach it that "throw dice" is the same as a random number from 1 to 6, but I could not conjure the precise ThingTalk spell.
    - Screenshot, locking the PC, and other OS actions didn't work on XFCE. Not Almond's fault, but less usable for me.
Are the authors dogfooding this? It lacks quite a few quality-of-life improvements.


I came across their NSF research proposal abstract [1] and it is indeed quite interesting. I'm curious about their federation model and how exactly they're enabling privacy.

[1] https://www.nsf.gov/awardsearch/showAward?AWD_ID=1900638&His...


Too bad Chrome stopped supporting the Web Speech speech-to-text API. But I guess it overwhelmed Google's network/servers!? I remember back in Windows XP there was speech recognition, where you first had to read a bunch of text in order to train it. It didn't use an online service AFAIK, and it was decent. With today's hardware it should be possible to do speech-to-text on the client, i.e. without sending the audio to the cloud for processing!


Web speech-to-text is still available in Chrome: https://www.google.com/intl/en/chrome/demos/speech.html
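For anyone who hasn't used it, the API in question takes only a few lines to exercise. This is a minimal sketch (browser only; note that Chrome forwards the captured audio to Google's servers for recognition, which is exactly the privacy issue discussed in this thread). `createRecognizer` is a hypothetical helper name; the standard pieces are the `SpeechRecognition` constructor and Chrome's prefixed `webkitSpeechRecognition`:

```javascript
// Minimal Web Speech API sketch. Falls back to the prefixed
// constructor (webkitSpeechRecognition) that Chrome actually ships.
function createRecognizer(w) {
  const Ctor = w.SpeechRecognition || w.webkitSpeechRecognition;
  if (!Ctor) return null; // API not available in this browser

  const rec = new Ctor();
  rec.lang = 'en-US';
  rec.interimResults = false;
  // Log the top transcript for the first result.
  rec.onresult = (e) => console.log('heard:', e.results[0][0].transcript);
  // The "network" failure people report arrives here as e.error.
  rec.onerror = (e) => console.log('error:', e.error);
  return rec;
}

// In a browser console: createRecognizer(window).start();
```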


I get a cryptic error message that just says "network"; others seem to have the same problem. Does it work for you?

Here's another demo that shows the error message at the bottom: https://mdn.github.io/web-speech-api/speech-color-changer/


When did Chrome stop supporting web speech to text?


It worked for me until recently, so a few months back... For Android apps you need an API key, but there is no way to do that for a web app AFAIK; maybe if you used the Google API instead of the standard web API. But I prefer platform independence and privacy over quality.


Can anyone tell me how this is "privacy-preserving"? How is this different from what the major tech firms like Google, MS, and Apple offer?


From what I can tell it's available as actual software running on your device instead of an online service. Additionally, it seems to be fully free software.


From what I can read in the privacy policy, speech recognition is performed via an API from Microsoft. So it's online, it cannot be considered fully free software (as a major part is performed by a closed API), and I'd hesitate to call it privacy-preserving, at least when using the voice command feature. Written commands should be offline, private, and handled by open source, I think.

Someone please tell me I'm wrong.


From the Privacy Policy:

> The Almond app also makes use of the Microsoft Cognitive Services Speech API to perform speech recognition. This is only activated when you click on the "Listen" button. ... These services are governed by their own privacy policies.

Too bad. I hoped this would have an offline speech recognition engine.

Also, with all the talk about preserving privacy, I find it disappointing that you have to dig through the privacy policy to see that your audio is sent to MS.


Yeah, with that third party's (and whoever else's) policy, Almond is neither "Open" nor "Privacy-Preserving". The title almost seems like clickbait now.

The project itself (Almond) seems very useful though, and I would gladly use it if it could run locally instead.


Can you tell me what you found useful?

None of their examples made sense to me... notify me when the NYTimes adds an article? So, like, constantly? Or filter on a keyword? As if I'm going to know ahead of time all the potential keywords for a given article of interest that hasn't been written yet? And do I need to know it that very second, versus during my regular daily news reading?


Check out Snips [0]. My guess is they could bake this in. I think if "smart" speakers gave users the option to control how the assistant is integrated, consumers would have much more control, as it would force manufacturers to compete where it makes sense (the hardware platform), versus what Amazon and Google do today to lock people in and use their data in ways the majority of us don't want it used.

[0] https://snips.ai/


Snips was recently acquired by Sonos, so it's one less independent solution in that space.


Snips being acquired by Sonos is not the worst outcome; I'm glad Sonos has a strategy to add its own smart assistant to its speakers rather than being beholden to the large players.


I'm doubtful that Sonos will continue to maintain all the open-source projects from Snips [1]. Sonos doesn't even appear to have a public GitHub presence. They'll most likely abandon those projects and integrate the software into their devices.

[1] https://github.com/snipsco


Well, somehow I missed this, and it's really disappointing. I liked Sonos' products until they had to add smart-speaker functionality. I won't buy their speakers that have a microphone, and hopefully more people let them know that as well. At a minimum, Sonos should offer a physical on/off switch to electrically disconnect any microphone. I don't trust any product in this space not to eventually do the wrong thing and collect more data than they've claimed to limit themselves to. The Sonos products seem to be changing for the worse; I think it's inevitable they'll get to the point where you can't use their products without a valid logged-in account. It's truly frustrating: all of these products could operate without Internet phone-home check-ins, yet people complacently buy them and agree to ridiculous ToS. Another one down, I guess.


Sonos recently introduced the One SL speaker, which seems similar to the Sonos One but without Alexa.


Yes, that was a great move; it's basically what the One used to be prior to them adding the mic. While they haven't done it yet, I hope they don't drop the mic-less larger models. We'll see; from what I know, Sonos has a very healthy margin (most speaker companies do), and the need to monetize user data would be viewed by many as over the top.


With all these speech APIs from the likes of Google and Microsoft that operate in similar ways (put audio in, get back text of what it thinks you said), I wonder what the difficulty is in interacting with them via a common interface. That way the user could pick their poison from the choice of speech APIs, or even use a local solution.

Curious whether there are examples of something like this already around?
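The adapter layer itself is straightforward to sketch. Everything here is hypothetical naming (`SpeechBackend`, `LocalBackend`, `recognize` are illustrative, not from any real SDK); the point is just that each vendor or local engine hides behind one `transcribe(audio)` contract:

```javascript
// Sketch of a vendor-neutral speech-to-text interface. Each backend
// (Google, Microsoft, or a local engine) implements the same
// transcribe(audioBytes) contract, so callers can pick their poison
// without changing code.
class SpeechBackend {
  transcribe(audioBytes) {
    throw new Error('not implemented');
  }
}

// Stand-in local backend; a real one would run an offline model
// instead of returning a canned transcript.
class LocalBackend extends SpeechBackend {
  transcribe(audioBytes) {
    return { text: '<decoded locally>', bytes: audioBytes.length };
  }
}

// Caller code depends only on the common interface.
function recognize(backend, audioBytes) {
  return backend.transcribe(audioBytes);
}
```

A cloud backend would be another subclass wrapping the vendor's HTTP API, and swapping it in wouldn't touch any calling code.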


This talks about privacy and data ownership, but I can’t tell from the site if the processing is all local (like Snips or Mycroft) or if data still has to be trusted with a remote party. Maybe I missed something.


The privacy policy answers these questions:

https://almond.stanford.edu/about/privacy


Mycroft is not at all all-local. It also sends audio data to the cloud for processing. It's a disgrace, really.


Oh wow! Well, glad I went with Snips then.




It would be cool to integrate the DeepSpeech engine with Home Assistant. That's one of the big benefits of Leon, in my opinion.



