November 25, 2021
By Pavel Jiřík
As the world’s first vendor to introduce voice biometrics technology powered entirely by deep neural networks, Phonexia has been involved in perfecting machine learning concepts for speaker recognition for quite some time.
In 2020, we developed a revolutionary version of our cutting-edge speaker identification technology, designed specifically for passive voice biometric authentication of clients reaching out to contact centers. The main advantage of the technology is the ability to verify a speaker over the phone with extreme accuracy after only three seconds of free speech.
Its free speech authentication capability means that a person can start talking to a contact center agent about anything straight away and be verified as they speak. This passive voice biometrics approach is an elegant way to enhance both authentication security and the customer experience, all without introducing additional passwords for clients to remember.
It is the most seamless way to reduce your clients’ exposure to voice phishing and identity theft attacks while also improving client experience and promoting self-service over the phone. It is one of those win-win situations for businesses and clients.
As hinted earlier, the accuracy of voice biometrics plays a vital role in its deployment in contact centers. The question is, how do you measure it?
Phonexia Speaker Identification’s Accuracy After 3 Seconds of Free Speech
When we released the new generation of Phonexia Speaker Identification, we measured its accuracy on one of the world’s top scientific test sets: the NIST Speaker Recognition Evaluation 2016 test set. NIST evaluations are where researchers go to test innovative concepts and push their systems as hard as possible. And this is absolutely fine. There has to be a way to test your systems to their limits.
Based on these measurements, the Phonexia Speaker Identification technology was able to identify speakers with over 92% accuracy after three seconds of free speech. Although this is an excellent result in itself, it may not always reflect real-world conditions, especially since a voice verification solution can be fine-tuned to the unique environment of each contact center and its conversational data to improve the accuracy further.
That said, we wanted to find out how Phonexia Speaker Identification would perform on a real contact center’s data without any calibration.
So when we got an opportunity to test the solution against a real bank’s contact center data with no adjustments at all, we happily accepted.
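For readers who want to run a similar check on their own recordings, here is a minimal Python sketch of one common way to turn verification scores into error rates and an accuracy figure. It is an illustration only: this article does not specify how the accuracy numbers were computed, and the function names, the scores, and the threshold below are hypothetical and not part of any Phonexia API.

```python
from typing import Sequence, Tuple


def error_rates(
    genuine: Sequence[float], impostor: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (false rejection rate, false acceptance rate) at a threshold.

    `genuine` holds scores of trials where the caller really is the claimed
    client; `impostor` holds scores of trials where the caller is someone else.
    """
    # A genuine caller is falsely rejected when the score is below the threshold.
    frr = sum(score < threshold for score in genuine) / len(genuine)
    # An impostor is falsely accepted when the score reaches the threshold.
    far = sum(score >= threshold for score in impostor) / len(impostor)
    return frr, far


def accuracy(
    genuine: Sequence[float], impostor: Sequence[float], threshold: float
) -> float:
    """Return the share of all trials that were decided correctly."""
    correct = sum(score >= threshold for score in genuine)
    correct += sum(score < threshold for score in impostor)
    return correct / (len(genuine) + len(impostor))


if __name__ == "__main__":
    # Made-up scores for illustration; real scores would come from the
    # speaker verification engine run on three-second speech segments.
    genuine_scores = [2.1, 1.7, 0.4, 3.0]
    impostor_scores = [-1.2, 0.6, -0.3, -2.5]
    print(error_rates(genuine_scores, impostor_scores, threshold=0.5))
    print(accuracy(genuine_scores, impostor_scores, threshold=0.5))
```

Reporting the false rejection and false acceptance rates alongside a single accuracy figure is usually worthwhile, because a contact center may weigh a wrongly rejected client very differently from a wrongly accepted impostor.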
How Did Phonexia Speaker Identification Perform on a Real Bank’s Contact Center Data Out of the Box?
Below, you can see a graph that represents the accuracy of the successive generations of Phonexia’s speaker identification technology released over the past few years, including the latest “out-of-the-box” test of Phonexia Speaker Identification performed on the bank’s contact center data:
As you can see, the accuracy of the Phonexia voice biometrics technology has been constantly improving over the years, with the biggest improvement achieved between 2015 and 2019 when we, as the first voice biometrics vendor in the world, went full-on with state-of-the-art deep neural networks.
Then, in 2020, we introduced an extremely accurate generation of the Phonexia Speaker Identification technology, powered by highly sophisticated deep neural networks, that can identify a speaker's voice in just three seconds of free speech.
And finally, when we tested our out-of-the-box, uncalibrated speaker identification technology on the real bank’s contact center data, we achieved an outstanding verification accuracy of 96% after three seconds of free speech.
Phonexia Speaker Identification technology offers an incredibly robust 96% voice verification accuracy achievable out of the box after only three seconds of free speech.
The accuracy doesn’t stop there, however: it increases further with speech length, exceeding 97% after five seconds of free speech.
Therefore, it is entirely possible for businesses to strengthen their authentication processes with passive voice biometrics immediately after the deployment of Phonexia Speaker Identification and focus on further fine-tuning later on.
Even though every contact center’s circumstances and infrastructure are unique in many ways, and the final uncalibrated performance may differ, the Phonexia Speaker Identification technology can provide organizations with substantial value right from the outset, and even greater value after calibration.