One limitation of voice interaction is that it follows a command-response pattern: the system sits idle until the user issues a command.
The next breakthroughs in AI assistants, whether Alexa, Google Assistant, Siri, or others, will come from anticipating users’ commands before they’re issued. Today, this kind of anticipation has to be programmed by users themselves.
People can use IFTTT to program their lights to turn on when their phone connects to their home WiFi. However, this is still a manual process to set up. Ideally, companies working in this space would make their assistants seem smarter by detecting recurring patterns in how people use them, deriving rules from those patterns, and implementing the rules automatically. If I’m always telling Alexa to turn on the lights in my office when I arrive, can it just figure out how to do this without my asking?
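To make the idea concrete, here is a minimal sketch of how a rule could be inferred from usage history rather than set up by hand. It assumes the assistant keeps a log of timestamped events; the `Event` type, the `suggest_rules` function, the event names, and the thresholds are all hypothetical and do not reflect how Alexa or IFTTT actually work.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    time: datetime
    kind: str  # "trigger" (e.g. phone joined WiFi) or "command" (spoken request)
    name: str


def suggest_rules(events, window=timedelta(minutes=5),
                  min_support=5, min_confidence=0.8):
    """Return (trigger, command, confidence) for commands that reliably
    follow a trigger within `window`. All thresholds are illustrative."""
    events = sorted(events, key=lambda e: e.time)
    trigger_counts = Counter()
    followed = defaultdict(Counter)  # trigger name -> Counter of commands

    for i, ev in enumerate(events):
        if ev.kind != "trigger":
            continue
        trigger_counts[ev.name] += 1
        seen = set()  # count each command at most once per trigger occurrence
        for later in events[i + 1:]:
            if later.time - ev.time > window:
                break
            if later.kind == "command" and later.name not in seen:
                seen.add(later.name)
                followed[ev.name][later.name] += 1

    rules = []
    for trigger, total in trigger_counts.items():
        for command, n in followed[trigger].items():
            confidence = n / total
            if n >= min_support and confidence >= min_confidence:
                rules.append((trigger, command, confidence))
    return rules
```

With a log in which "turn on the office lights" follows "phone joined home WiFi" on nine mornings out of ten, this sketch would propose that pairing as a rule, while a command that followed only twice would be ignored. A real product would presumably ask the user to confirm before enabling anything.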
The other area, beyond anticipating and actuating commands, is to use time and request history to bias the ASR and NLU of a voice-interactive system. In the same example, if “Turn on the light” is almost always commanded between 9 and 11 AM on my device, why not weight the STT results more heavily toward that phrase during that window? It’s also possible to increase the likelihood of any intent that matches the command I usually issue at that time.
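One way to picture this biasing is as a rescoring step over the recognizer's candidate transcripts. The sketch below assumes the ASR returns an N-best list of (text, log-score) pairs and that past requests are stored with the hour they were made; the function names, the smoothing scheme, and the `weight` value are illustrative assumptions, not any vendor's API.

```python
import math
from collections import Counter


def hour_prior(history, hour, smoothing=1.0):
    """Build a smoothed estimate of how likely each request is at `hour`.

    `history` is a list of (hour, text) pairs from past requests.
    Unseen texts still get a small non-zero probability.
    """
    counts = Counter(text for h, text in history if h == hour)
    vocab = {text for _, text in history}
    total = sum(counts.values())
    # +1 reserves some mass for requests never seen before
    denom = total + smoothing * (len(vocab) + 1)

    def prob(text):
        return (counts[text] + smoothing) / denom

    return prob


def rescore(nbest, prior, weight=0.5):
    """Combine each ASR log-score with the log of the history prior.

    `nbest` is a list of (text, asr_log_score); higher scores are better.
    Returns the list re-sorted by the combined score.
    """
    scored = [(text, score + weight * math.log(prior(text)))
              for text, score in nbest]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, if the recognizer slightly prefers “turn on the night” over “turn on the light” at 9 AM, a history dominated by the latter at that hour would flip the ranking. The same kind of prior could, in principle, be added to intent scores on the NLU side.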
It’s likely that anticipatory programming will be the start of these devices becoming “scary smart”.