
To my understanding, these smart speakers only phone home when you say the keyword, right? They aren't storing or sending everything they hear throughout the day. So the number of cases where this could be abused by police has to be small. It seems most likely to be used to verify that someone was home when they said they were, because they asked Alexa a question at that time.

In other words, I don't think Amazon ever receives the "wrongthink" Alexa hears, unless you say it directly to Alexa.



I haven't seen evidence either way and I don't own one. It should be pretty easy to check, I'd think. You can measure the traffic volume while nobody is home, plus 8 hours or so afterward, and while you're having a lively conversation, plus 8 hours or so afterward (in case there's some kind of delayed upload).

I know lack of evidence doesn't mean it doesn't exist, but if that came up empty then it'd be hiding pretty well.

Although honestly, I'd delay transmission until user interaction and then hide in that noise. It'd be the first thing I'd do.

Eh, look at the traffic anyway
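For what it's worth, the comparison could be sketched roughly like this. It assumes you already have per-interval byte counts for the speaker from your router or a packet capture; the function names and the numbers are made up for illustration, not real measurements.

```python
from statistics import mean


def summarize(byte_counts):
    """Return (total bytes, mean bytes per interval) for one capture window."""
    return sum(byte_counts), mean(byte_counts)


def compare_windows(quiet, talking):
    """Compare a 'nobody home' window against a 'lively conversation' window."""
    quiet_total, quiet_mean = summarize(quiet)
    talking_total, talking_mean = summarize(talking)
    # A ratio near 1.0 means conversation made no visible difference in volume.
    ratio = talking_mean / quiet_mean if quiet_mean else float("inf")
    return {
        "quiet_total": quiet_total,
        "talking_total": talking_total,
        "mean_ratio": ratio,
    }


# Hypothetical per-interval upload sizes in bytes (invented for the example).
quiet = [1200, 1100, 1300, 1250]
talking = [1150, 1400, 1280, 1220]

result = compare_windows(quiet, talking)
print(result)
```

A ratio close to 1 wouldn't rule out delayed or hidden uploads, it would only say that nothing obvious shows up in the volume.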


As you figured out, there would be many options to hide the "secret" calls. There is also no need to send these secret calls from all devices; they could be made only by a set of devices owned by "interesting" people. Assuming that is true, a security researcher looking into things and finding nothing no longer proves anything.


Sure but there's probably provable ways to look for hidden stuff as well unless they are sending empty data.

You can do multiple runs of feeding pre-recorded messages into say multiple speakers and do the trial over many days. Then on a series of other speakers you can do a robust sequence of pre-recorded conversations followed by the same pre-recorded messages at the same time and then do statistical analysis on traffic volume.
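One way the statistical part might look is a permutation test on daily upload totals: one group of speakers hears only the pre-recorded messages, the other hears conversations plus the same messages. This is just a sketch; `permutation_test` and its parameters are hypothetical names, and a permutation test is only one of several reasonable choices here.

```python
import random


def permutation_test(control, treated, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in mean traffic volume.

    control: daily byte totals from speakers that heard only the messages.
    treated: daily byte totals from speakers that also heard conversations.
    Returns (observed difference in means, estimated p-value).
    """
    rng = random.Random(seed)
    observed = sum(treated) / len(treated) - sum(control) / len(control)
    pooled = list(control) + list(treated)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        shuffled_treated = pooled[:n_treated]
        shuffled_control = pooled[n_treated:]
        diff = (sum(shuffled_treated) / len(shuffled_treated)
                - sum(shuffled_control) / len(shuffled_control))
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

A small p-value would suggest the conversations changed the traffic volume. A large one shows nothing either way, which is the "hard to prove absence" problem again.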

I just presume these things are listening to everything and recording everything. I think that should be the general assumption if you bring essentially an "internet microphone machine" into your home.

If not by the company who sold it to you then by 3rd party hackers, clever app developers, or some other group. Every marketer wants to know what their customers are saying in the privacy of their own home.

As a tangent I've long wanted to have fun with this ... start a campaign to start collectively talking about a ridiculous product (say a vacuum cleaner with elephant ears that flap in proportion to the amount of dirt it picks up) in private conversations and see if a company releases it by listening in. "There's significant consumer demand for the dumbo-vac!"


> Sure but there's probably provable ways to look for hidden stuff as well unless they are sending empty data.

Isn't this equivalent to the halting problem? Even with source code, there is a chance the compiler was compromised. In practice, these devices are closed source, so you would need to verify all the possible code paths.

Moreover, we know that NSA coerced phone companies into exposing metadata. What is the probability NSA has not requested backdoors of Amazon, Google, and the like?


I don't think it is. However, there is an unprovable element here, in that it's difficult to prove absence. Still, depending on implementation, it may not be very difficult to demonstrate presence! If you run the basic tests and there's clearly a substantial difference between the two, you're done. If there isn't, you need to dig deeper.

This is just the nature of indirect observation. People in the natural sciences deal with this problem all the time.


Apart from targeted spying being a thing, the keyword recording trigger is not so reliable that it acts as a good filter.

https://moniotrlab.ccis.neu.edu/smart-speakers-study-pets20/


Unless these devices have enough performance to do on-device speech recognition, they would have to stream audio on an ongoing basis.

I think the data stream from uploading compressed audio for an extended period would be difficult to hide.
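A back-of-the-envelope estimate, assuming a speech codec running at around 16 kbps (that bitrate is my assumption, not a figure from any vendor):

```python
def audio_upload_megabytes(bitrate_kbps, hours):
    """Megabytes uploaded when streaming audio at a constant bitrate."""
    bits = bitrate_kbps * 1000 * hours * 3600
    return bits / 8 / 1_000_000


# 24 hours of continuous audio at the assumed 16 kbps.
print(audio_upload_megabytes(16, 24))
```

Under that assumption it works out to roughly 173 MB per day, which should stand out against the background chatter of an idle device.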


> To my understanding, these smart speakers only phone home when you say the keyword, right? They aren't storing or sending everything they hear throughout the day.

Yeah, but there are mistriggers as well - I think you should see them in myactivity.google.com with Assistant filter enabled.


At least for Google/Nest devices, you can turn on notification sounds.


The hardware is still there, and all it takes is an exploit or a firmware update to disable notifications and enable recording.


The context is accidental triggers. A mass exploit would have millions of ISPs cry out in terror as their upstream usage suddenly jumps.


E.g., Apple contractors 'regularly hear confidential details' on Siri recordings:

https://www.theguardian.com/technology/2019/jul/26/apple-con...


Not any more.


> To my understanding, these smart speakers only phone home when you say the keyword, right? They aren't storing or sending everything they hear throughout the day.

1) There have been various cases of such devices being triggered incorrectly and uploading chunks of recordings

2) It's all implemented in software. It's extremely easy for the vendor to enable more keywords or record for longer times.

3) It's impossible to prove that 2 is not happening already in limited cases

4) There is a proven long history of very effective global surveillance programs targeting every electronic device (phones, cell towers, carrier-grade routers, PCs, servers).


To echo izacus's comment, they are very sensitive and there are dozens of variations of the keywords that they accept.

We have a Nest Hub, and saying Google twenty times a day was a deal breaker so we all use some other variation that works 99% of the time, and looking at history it accidentally triggers itself a few dozen more times during the day.

It has a real value for us for now, but privacy issues are real in my opinion.


I've taken a more cautious approach to assistants and only tend to use Alexa if it requires a physical keypress first, e.g., a Fire TV 4K rather than having bespoke Echo devices throughout the house. There's _some_ value on the home automation side of the offering until I can make everything closed-circuit around here.

The Google Mini that Spotify sent me, however, went straight into a pile.


>> To my understanding, these smart speakers only phone home when you say the keyword, right? They aren't storing or sending everything they hear throughout the day.

This is the intended behavior.

But - there is a non-zero rate of false positives (when the device detects something that it thinks is the wake word, but is not), in which case, the audio is streamed to the cloud. This audio could (should?) be used for model training to improve the future precision of wake word detection. But, it could also potentially end up being subpoenaed by a law enforcement agency.


The wake-word is different in different regions/languages right? That suggests they either have special silicon for each language (unlikely) or it's programmable. Is it reprogrammable over-the-air? Even if it can't be given more than one wakeword, it could presumably be reprogrammed to use a very common word, like 'the.' When in this surveillance mode it would remain quiet unless you also said the traditional wake word.


Yes, there are multiple different wake words and they can be updated / reprogrammed via firmware updates.


There are some 7000+ languages in the world. Baking recognition for 7000+ words in silicon is not a huge task.


I think it's unlikely that's what they actually did.


The thing is it doesn't need to record anything (to spy on you).

Voice recognition is 'on' all the time as it needs to recognise 'keyword'. All you need then is simple transcription into text.


Yes, but those aren't records that they actually have. The reason you need a wake word is that the wake-word processing is done locally on the device; it's not until you say the wake word that it starts streaming the audio data to the central server for processing.

It's certainly possible for Amazon/Google/whoever to send your device a firmware update that turns it into an always-on microphone, but it doesn't do that by default.


> Voice recognition is 'on' all the time as it needs to recognise 'keyword'

Not to say it's not a concern, but that's not really how these devices work – at least in the Alexa case. They're just matching for a specific hotword, rather than constantly performing speech-to-text (which is computationally expensive and done remotely). Think of it more like Shazam or the other audio fingerprinting services – you don't have to actually transcribe the text to understand if a particular word has been heard.


You think it is phoning home with transcriptions 24/7? My understanding is that this is probably inaccurate, and it only phones home when the on-board electronics recognize the wake word.


How many kb of text do you speak per day? It is just gonna be noise compared to when the JS framework of the week is updated. Amazon and Google can hide that.
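Rough numbers, assuming something like 16,000 spoken words a day and about 6 bytes per word uncompressed (both figures are assumptions for illustration):

```python
def transcript_kilobytes(words_per_day, bytes_per_word=6):
    """Approximate size of a day's speech as plain, uncompressed text."""
    return words_per_day * bytes_per_word / 1000


# One person's assumed daily speech, as text.
print(transcript_kilobytes(16_000))
```

That comes to about 96 KB a day before compression, so text would be far easier to hide than audio, provided the transcription could run on the device in the first place.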


> these smart speakers only phone home when you say the keyword, right?

Without it being open source, there's no guarantee though?


You can get a pretty good idea from power usage, storage, and internet usage. Unless there are multiple hidden revolutionary breakthroughs in speech transcription or compression, nothing unsavory is happening at scale.


...most didn't know about the Intel ME chip. A designated onboard black-box chip for transcriptions, that doesn't rely on a server, would seriously benefit tech corps

Would it really be the first time we were lied to/surveilled?

When will we stop giving the hyper-growth oriented Silicon Valley startup world the benefit of the doubt?


No, someone (Google I think) admitted that they collect more than the audio around recognised keywords because in order to use ML to improve their voice recognition they need the data. More than just false positives. But guess what, they have to have humans to determine that, so they have people sitting around listening to very concerning things (the article said suspected domestic violence) and then having PTSD from it.

And if one of them is doing it, they all are, they all think the same and have the same incentives. This entire play is about the data.


I'm fairly sure there have been multiple stories about smart speakers sending things they hear without the keywords.


Could you please stop creating accounts for every few comments you post? We ban accounts that do that. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

You needn't use your real name, of course, but for HN to be a community, users need some identity for other users to relate to. Otherwise we may as well have no usernames and no community, and that would be a different kind of forum. https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...



