A Voice-first Future Is Coming

November 14, 2018


Digital assistants are slowly integrating themselves into our homes, but they are far from being truly intelligent. Companies are considering using voicebots in their customer support but are afraid that voicebots are still a bit too clumsy to talk with customers. How far off is the future of smart and highly cooperative robots? And how is a voice-first approach going to change our lives?

People started communicating verbally much earlier than they started writing or typing text messages. Voice is still the most effective way to communicate and transfer ideas from person to person. No wonder we’d rather speak and listen than read and write. That is also why video content on social media wins over text articles.
Therefore, a voice-first future would be a logical step forward. For decades, technology was not ready to support it. Until now…
Let's see which scenarios a voice interface could open up in our daily lives.

The Current Landscape

Before we get into the future, let's remind ourselves what speech technology is used for today, at least in large countries where technology is omnipresent.

Voice search. Typing is slooow, especially on mobile devices. No wonder more and more people are using voice as the primary input when searching for information.

Dictation. People use voice not only for search but also to dictate short text messages. Journalists often dictate articles and then just correct the output. Pathologists dictate their autopsy reports during the procedure, which saves time and lets them perform more autopsies.

Speech analytics. A voice carries much more information than the transcription alone. Many of today's contact centers segment their customers automatically based on speech metadata such as age, gender or spoken language. Would you like to know why middle-aged men are calling your contact center today? Or what problems teenage girls solve with your agents? That is no problem at all with business intelligence tools running over speech technology outputs.
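To make the segmentation idea concrete, here is a minimal Python sketch. It assumes a speech analytics pipeline has already produced one record per call; the `CallRecord` fields, the segment labels and the sample data are all hypothetical and only illustrate the kind of query a business intelligence tool might run.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CallRecord:
    """Hypothetical per-call output of a speech analytics pipeline."""
    age_group: str   # estimated from the voice, e.g. "teen", "middle-aged"
    gender: str      # estimated from the voice, e.g. "female", "male"
    language: str    # detected spoken language, e.g. "en"
    topic: str       # call reason derived from the transcription


def top_topics(calls, age_group, gender, n=3):
    """Return the n most common call topics for one customer segment."""
    segment = [c.topic for c in calls
               if c.age_group == age_group and c.gender == gender]
    return Counter(segment).most_common(n)


# Invented sample data, for illustration only.
calls = [
    CallRecord("middle-aged", "male", "en", "billing"),
    CallRecord("middle-aged", "male", "en", "billing"),
    CallRecord("middle-aged", "male", "en", "outage"),
    CallRecord("teen", "female", "en", "data plan"),
]

# Why are middle-aged men calling today?
print(top_topics(calls, "middle-aged", "male"))
```

In practice the grouping would happen inside a reporting tool rather than in hand-written code, but the principle is the same: filter by voice-derived metadata, then count the reasons for calling.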

Voice biometrics. Simple voice commands via personal assistants can be messy if you can’t confirm the identity of the speaker. “Buy a Tesla” could ruin your wallet if it were said by your child, who should not be authorized to shop via a digital assistant. Today, simple voice biometrics is used within assistants, and more robust biometrics is used at the contact centers of progressive financial institutions.
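The “Buy a Tesla” problem can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular assistant works: the `VoiceCommand` fields, the `0.9` threshold, the list of authorized buyers and the keyword-based purchase check are all hypothetical, and a real system would get the speaker identity and match score from a voice biometrics engine.

```python
from dataclasses import dataclass


@dataclass
class VoiceCommand:
    text: str
    speaker_id: str      # identity reported by a (hypothetical) biometrics engine
    match_score: float   # 0.0-1.0 similarity to the enrolled voiceprint


# Assumed policy values, chosen only for the example.
PURCHASE_THRESHOLD = 0.9
AUTHORIZED_BUYERS = {"parent"}


def is_purchase(text):
    """Very rough keyword check standing in for real intent detection."""
    return text.lower().startswith(("buy", "order"))


def authorize(cmd):
    """Allow harmless commands; require a verified, authorized speaker to shop."""
    if not is_purchase(cmd.text):
        return True
    return (cmd.speaker_id in AUTHORIZED_BUYERS
            and cmd.match_score >= PURCHASE_THRESHOLD)


print(authorize(VoiceCommand("Buy a Tesla", "child", 0.97)))   # child may not shop
print(authorize(VoiceCommand("Buy a Tesla", "parent", 0.95)))  # verified parent may
```

The design choice worth noting is that identity alone is not enough: the speaker must be both recognized with high confidence and allowed to perform that specific action.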

The Future

Now, let's investigate the future in greater detail.

Voicebots. Some of the calls at contact centers can already be automated today, and we will see more and more voicebots (voice chatbots) over time. A simple voicebot is just an interactive and more convenient IVR. But one day, talking to a human at a contact center will become a premium service. Voicebots will dominate over text chatbots since we are more used to talking than typing. Even in industry (warehouses, production lines, CNCs) there will be a high level of automation and voicebot conversation. A significant rise in voicebots will be seen in 2020.

Contextual voice commands. Yes, some simple voice commands are available today with current digital assistants, and the number of their skills is growing. They are helpful but not truly "intelligent". What would happen today if you said: "Order me a pizza, I will be at my girlfriend’s house…"? What kind of pizza would be ordered? Where would it be delivered? Does the system take into account that you have a new girlfriend? Intelligent systems of the future will not need a five-minute conversation to clarify all options. They will work even with short utterances, using additional information about the user, adapting based on history, verifying identity, detecting mood, and so on. They will understand us in the same way that our mother does. Many of these approaches are already in use today, but they are expected to reach a decent level by 2021.

Smart augmented reality. By 2022, warehouses, complicated assembly lines and service repairs will largely use augmented reality, thanks to the hands-free interaction that a voice interface makes possible. Augmented reality will finally get smart.

True AI assistance. Remember Jarvis from Iron Man? I would love to have one. Technology will assist people in natural ways, doing more and more things on its own. A true understanding of content will be important for AI to perform well, especially when it supports you in your job in an enterprise environment, where several people can be present at a meeting, each with tons of personalized requests. Your AI will need to cooperate with the AIs of your colleagues. Given the complexity of the problem and the need for infrastructure changes, we won’t see collaborative enterprise digital assistants before 2023.

Biometrics everywhere. With the rise of voice interaction, the need for authorization will increase exponentially. Everything said to a computer will carry a “voiceprint” confirming the identity of the speaker. Thanks to voice biometrics, a lot of interactions will be faster and more convenient for users. We can expect biometrics everywhere by 2023.

AI on device. The need for intelligent homes, cars and office robots will grow significantly in the upcoming years. The number of connected devices (the Internet of Things, or IoT) will explode, and so will traffic over the Internet. For critical environments where you need to respond quickly, cloud computing alone won’t be the right path. More often, we will see a blend of cloud and on-device computing, so-called edge computing. But how do we make the robots "think" on a relatively cheap embedded device, strongly limited in computing power and memory? The "miniaturization" or simplification of the technologies will open up new possibilities and ensure a higher degree of AI adoption. It will be the same thing that happened when mainframe computers turned into PCs back in the 1970s and 1980s. We will interact with technology in the most natural way possible, the way we interact with other people.
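The blend of cloud and on-device computing can be pictured as a simple routing decision. The Python sketch below is only an illustration: the function name, its parameters and the rule itself are assumptions made for the example, and real systems weigh many more factors (privacy, battery, connectivity, model quality).

```python
def choose_backend(latency_budget_ms, cloud_round_trip_ms, fits_on_device):
    """Decide where a speech request runs: "cloud" or "device".

    Assumed rule for this sketch: use the cloud when it can answer within
    the latency budget, fall back to the embedded device when it cannot.
    If the model does not fit on the device, the cloud is the only option.
    """
    cloud_fast_enough = cloud_round_trip_ms <= latency_budget_ms
    if cloud_fast_enough or not fits_on_device:
        return "cloud"
    return "device"


# A car needs an answer within 50 ms, but the network round trip takes 200 ms.
print(choose_backend(50, 200, fits_on_device=True))

# A smart speaker can wait half a second, so the cloud is fine.
print(choose_backend(500, 200, fits_on_device=True))
```

This is where the "miniaturization" matters: the more models fit on a cheap embedded device, the more often the fast local path is available at all.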

“Speech technologies are a key enabler for future interaction with technology”

The Shape of Digital Things to Come

Speech technologies are already commonplace, but that is just the start: they are a keystone and key enabler for the future. A future where AI is omnipresent. A future where speech is the most natural human-computer interface. A voice-first future.
