It could also store compressed voice waveforms in such a way that any reproduction from the compressed data would sound horrible but would be at least somewhat intelligible to human listeners.
1200 bits per second is almost enough for toll-quality speech -- and I'm referring to the state of the art a few years ago. Speech codecs are probably better now. But let's stick with 1200 bps. That's 150 bytes per second, which is enough to store a full year of continuous speech from the vicinity of the device in only about 5 GB.
My guess is that if you cared only about intelligibility and not fidelity, you could do the job with 10%-20% of that space.
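For anyone who wants to check the arithmetic, here is a rough sketch. It assumes speech is recorded nonstop for a 365-day year and that "GB" means 10^9 bytes; the function name `storage_bytes` is just something I made up for illustration.

```python
# Back-of-the-envelope storage estimate for continuously recorded speech.
# Assumptions (mine): nonstop recording, 365-day year, 1 GB = 10**9 bytes.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def storage_bytes(bits_per_second, seconds=SECONDS_PER_YEAR):
    """Bytes needed to store a constant-bitrate stream for `seconds` seconds."""
    return bits_per_second * seconds // 8


full_rate = storage_bytes(1200)
print(f"1200 bps for one year: {full_rate / 1e9:.2f} GB")

# The guessed 10%-20% range for an intelligibility-only encoding.
for fraction in (0.10, 0.20):
    print(f"{fraction:.0%} of that: {fraction * full_rate / 1e9:.2f} GB")
```

So the "about 5 GB" figure is a slight round-up of roughly 4.7 GB, and the intelligibility-only guess comes out somewhere between half a gigabyte and one gigabyte per year. Any real device would record less than this, since rooms are silent most of the time.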
So yes: Alexa could easily be collecting and storing a vast amount of data that isn't immediately transmitted or used.