One of the more difficult features to get right in voice interaction is calling contacts. Common names, like David or Beth, are easily picked up by STT engines, but less common names (like mine) can be tricky to decipher. Alternate spellings of names also make it difficult to develop rules around specific contacts.

One way for some engines to get around this is to ingest contact names. Google on Android does this by looking at your phone contacts. The Nuance Dragon app and Siri also do this.

When we were developing the email application on the Ubi, we were using Google’s Android Speech Recognizer. We added a pronunciation field to each contact that would hold the likely output from Google for that name. For “Leor” it would often pick up “New York,” so that’s what we’d put in the pronunciation field.
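To make the pronunciation-field idea concrete, here is a minimal Python sketch. It is not the Ubi code: the `Contact` class, the `build_index` and `resolve_contact` helpers, and the example addresses are all hypothetical names made up for illustration. The assumption is that the recognizer hands back a plain text transcript, which we then look up against both the contact's real name and whatever mis-hearings we have recorded for it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Contact:
    """Hypothetical contact record with a list of recorded mis-hearings."""
    name: str
    email: str
    pronunciations: List[str] = field(default_factory=list)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore casing and spacing."""
    return " ".join(text.lower().split())


def build_index(contacts: List[Contact]) -> Dict[str, Contact]:
    """Map each contact's name and every recorded pronunciation to the contact."""
    index: Dict[str, Contact] = {}
    for contact in contacts:
        for key in [contact.name, *contact.pronunciations]:
            # setdefault keeps the first contact registered for a given key
            index.setdefault(normalize(key), contact)
    return index


def resolve_contact(transcript: str, index: Dict[str, Contact]) -> Optional[Contact]:
    """Return the contact matching the STT transcript, or None if nothing matches."""
    return index.get(normalize(transcript))


# Example data: "New York" is what the recognizer tended to return for "Leor".
contacts = [
    Contact("Leor", "leor@example.com", pronunciations=["New York"]),
    Contact("David", "david@example.com"),
]
index = build_index(contacts)
match = resolve_contact("new york", index)
```

With this data, `resolve_contact("new york", index)` returns the Leor contact, while an unknown transcript returns `None`. The obvious trade-off is that an exact-match alias like this hijacks the phrase: anyone who actually wants to say “New York” in that slot will get Leor instead, so the lookup only makes sense where a contact name is expected.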

As Alexa and Google Assistant get to know us better, the need to intervene in the system like this is lessening. Soon, just by ID’ing our voice, these systems will be able to pull up all of our contacts.

