June 26, 2019
By Pavel Jiřík in Blog
The 5th generation of our Speech to Text and Keyword Spotting technology, our fastest so far, has been out since last summer, and we have been gradually extending the engine to support more and more languages. As we added two more languages to the engine last week, it is a perfect time to explore its performance in more detail.
The 5th generation of Phonexia Speech to Text and Keyword Spotting is the fastest generation to date, providing much quicker transcription and keyword spotting than earlier generations.
It currently supports English, Russian, Czech, Slovak, Polish, and Dutch languages, with more languages coming soon.
The very first language upgraded to the 5th generation was Polish, back in July 2018, allowing Polish speech to be transcribed to text seven times faster than real time.
The keyword spotting achieved 29x FtRT (faster than real-time) performance. In other words, a single CPU core would only take one hour to search 29 hours of Polish speech for specific keywords.
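To make the FtRT arithmetic concrete, here is a minimal sketch in Python. The function name `processing_hours` is hypothetical and is not part of any Phonexia API; it only assumes that an FtRT factor means "hours of audio processed per hour of computation on a single CPU core", as described above.

```python
def processing_hours(audio_hours: float, ftrt: float) -> float:
    """Estimate single-core processing time for a given amount of audio.

    Assumption: an FtRT factor of N means N hours of audio are processed
    per hour of computation on one CPU core.
    """
    if ftrt <= 0:
        raise ValueError("FtRT factor must be positive")
    return audio_hours / ftrt


# Keyword spotting in 29 hours of Polish speech at 29x FtRT
print(processing_hours(29, 29))  # 1.0

# Transcribing the same 29 hours at 7x FtRT
print(round(processing_hours(29, 7), 2))
```

The same division applies to any of the FtRT figures quoted in this post; real-world throughput will also depend on hardware and audio characteristics, which this sketch does not model.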
We then extended the engine with Czech and Dutch in December 2018. Both languages currently achieve 7x FtRT transcription performance and 29x FtRT keyword spotting performance, on par with Polish.
Then, spring 2019 came, and it was time to release another Slavic language—Slovak. Our engine managed to transcribe it even faster than the previously released languages, delivering 9x FtRT transcription performance. The keyword spotting performance was 27x FtRT.
Last week, we added our latest language extensions: Russian and English. Our engine achieves 8x FtRT transcription performance for both languages and, even more impressively, 40x FtRT for Russian and 47x FtRT for English in keyword spotting!
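Check out this summary table:
Language   Speech to Text   Keyword Spotting
Polish     7x FtRT          29x FtRT
Czech      7x FtRT          29x FtRT
Dutch      7x FtRT          29x FtRT
Slovak     9x FtRT          27x FtRT
Russian    8x FtRT          40x FtRT
English    8x FtRT          47x FtRT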
So, what's next?
We are now working on Latin American Spanish and expect to release it this summer. But stay tuned, as more languages are on the way!