I focus a lot on English speech recognition. After all, Alexa and Google Home were marketed in the US first, and the US and Canada have a very “US is the center of the universe” viewpoint.
Things get interesting when you move voice interaction to other languages. Chinese has more words than English and more dialects, which means it likely requires more speech samples to achieve the same error rate. Let’s say over 300,000 words vs. ~180,000 for English. The same applies to predictive typing and swipe typing.
One of the things that struck me recently was the high accuracy of Google speech recognition… in Hebrew. I have a non-Israeli accent and make a lot of speaking errors, but the voice transcription required fewer corrections than in English. Predictive typing also seemed much more likely to suggest the next word.