It will be interesting to see whether this is about the app (i.e. implementing similar functionality somewhere in the iPhone OS or even offering it as an extra download) or about the technology (i.e. improving voice control). I bet it's more the latter.
I don't think it is about the technology (improving voice control), because as you can see in the video, the speech recognition is from Nuance.
My guess is that it is about the integration of all these web services and APIs and the agent behind it. There is some research going on at Stanford about this; here is a presentation: http://logic.stanford.edu/talks/Wizard/