April 27, 2022
By Pavel Jirik
In alignment with Phonexia’s core purpose of solving everyday challenges through voice, we are happy to announce that our Speech to Text technology is now available on the Amazon Web Services (AWS) Marketplace.
Until recently, Phonexia Speech to Text was available only as a solution that had to be installed on premises (or deployed manually in a private cloud).
From now on, however, businesses of all sizes can access our cutting-edge Speech to Text technology natively on the AWS Marketplace and build sophisticated voicebots, virtual assistants, and speech analytics with ease.
Starting today, you can delegate the complexities of Speech to Text installation and hardware setup to Amazon Web Services and start using Phonexia Speech to Text (STT) technology in just a few moments. All that from the familiar Amazon Web Services interface.
Instantly Ready for Your Projects
We have made sure our STT is extremely easy to use in your projects:
- Phonexia Speech to Text’s API has been greatly simplified to keep you focused on your project rather than complex API requests.
- There is no need to sign an NDA or contact our sales team.
- The whole process from buying to installation can be finished within a few minutes.
- The STT API can be easily explored via a built-in Swagger interface.
- You can even test the STT without writing a single line of code!
Which Phonexia Speech to Text Languages Are Available on the AWS Marketplace?
We have brought our latest and best Speech to Text (STT), the sixth generation, to the AWS Marketplace. It supports speech transcription in the following 14 languages:
- Arabic (Gulf)
- Arabic (Levantine)
- English (US)
Additionally, the sixth generation of Phonexia STT offers conversational AI-friendly features to enhance Speech to Text accuracy even further.
Preferred Phrases
This feature is especially useful when building voicebots and virtual assistants. It enables developers to define a list of words and phrases that will be preferred over other, similar-sounding words that may appear in the speech being transcribed, whether in real time or in post-processing.
For example, the STT for a voicebot that helps bank clients in emergency situations could be easily customized to favor the word “card” over other words (such as the word “cart”), resulting in the preference of “I lost my card” over “I lost my cart” speech transcription.
An identical approach can be used for enhancing STT performance in speech analytics use cases where the context of audio recordings can be anticipated to some degree (e.g., emergency calls vs. booking requests).
Preferred Phrases allow powerful fine-tuning of Speech to Text accuracy to the specifics of each speech transcription use case, so the STT performs as well as possible for that use case.
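To make the idea more concrete, here is a minimal Python sketch of how preferring phrases can change which transcription wins. It is an illustration only, not Phonexia's implementation or API: the function names, the candidate scores, and the bonus value are all assumptions made up for this example.

```python
from typing import Dict, List, Tuple


def contains_phrase(words: List[str], phrase: List[str]) -> bool:
    """Return True if `phrase` occurs as a contiguous run inside `words`."""
    n = len(phrase)
    return n > 0 and any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def rescore(
    hypotheses: List[Tuple[str, float]],
    preferred: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Add a bonus to each hypothesis for every preferred phrase it contains.

    `hypotheses` holds (text, score) pairs, higher score meaning more likely.
    `preferred` maps a phrase to its (hypothetical) bonus.
    """
    rescored = []
    for text, score in hypotheses:
        words = text.lower().split()
        bonus = sum(
            value
            for phrase, value in preferred.items()
            if contains_phrase(words, phrase.lower().split())
        )
        rescored.append((text, score + bonus))
    # Best hypothesis first.
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Made-up scores: without any preference, "cart" narrowly wins.
hypotheses = [("I lost my cart", -4.1), ("I lost my card", -4.3)]
best_text = rescore(hypotheses, {"card": 0.5})[0][0]
print(best_text)
```

With the made-up scores above, the bonus for “card” is enough to move “I lost my card” ahead of “I lost my cart”, which mirrors the bank voicebot example.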
Real-Time Addition of Unknown Words
The sixth generation of Phonexia STT can also add unknown words in real time.
Akin to Preferred Phrases, developers can define a list of custom words before each Speech to Text request to enhance its accuracy further.
Typically, it can be used to add product names, industry-specific words, and even slang expressions to the default STT dictionary.
However, what makes this feature especially useful is that custom words can be added to the dictionary instantly, at any time, in real time.
Therefore, the developers of conversational AI solutions can customize the Phonexia STT dictionary dynamically on the go and adjust the STT accuracy easily whenever a situation calls for it.
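As a rough sketch of what customizing the dictionary on the go could look like on the application side, the Python snippet below keeps a per-conversation set of custom words and turns it into options for the next transcription request. Everything here is hypothetical: the `SessionVocabulary` class and the `additional_words` field are invented for illustration and are not taken from the Phonexia STT API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class SessionVocabulary:
    """Tracks custom words collected during one conversation (hypothetical helper)."""

    base_words: Set[str]
    custom_words: Set[str] = field(default_factory=set)

    def add_words(self, words: Iterable[str]) -> List[str]:
        """Register words and return only those that were not known before."""
        new_words = {w.lower() for w in words} - self.base_words - self.custom_words
        self.custom_words |= new_words
        return sorted(new_words)

    def request_options(self) -> Dict[str, List[str]]:
        """Build the options for the next request; the field name is an assumption."""
        return {"additional_words": sorted(self.custom_words)}


vocab = SessionVocabulary(base_words={"card", "bank"})
added = vocab.add_words(["Phonexia", "card"])  # "card" is already known
options = vocab.request_options()
print(added, options)
```

The point of the design is that the list is rebuilt before each request, so a product name learned in the middle of a conversation can be included in the very next one.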
Classes in the Czech Speech to Text
As a Czech company, we certainly hold the Czech language close to our hearts. Therefore, whenever a speech recognition advancement is ready for production, Czech usually receives it first.
This is the reason why our Czech Speech to Text is the first to offer this additional accuracy improvement: Classes.
How does it help? Simply put, Phonexia’s Czech STT is enhanced to recognize Czech names, cities, and other words (classes) specific to Czech culture and nationality.
Not only do Classes dramatically improve STT accuracy, but they also simplify working with Preferred Phrases.
Instead of listing multiple variations of a Preferred Phrase, such as “my name is Paul”, “my name is Peter”, and “my name is Charles”, you can use the corresponding class. For example, “My name is ‘first_name’”.
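The saving is easy to see in a small Python sketch that expands one class-based phrase into the full list it stands for. The `{first_name}` placeholder syntax, the function name, and the list of names are assumptions chosen for this illustration; the actual class notation used by Phonexia STT may differ.

```python
import itertools
import re
from typing import Dict, List

# Hypothetical placeholder syntax: a class name wrapped in curly braces.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def expand_template(template: str, classes: Dict[str, List[str]]) -> List[str]:
    """Expand every class placeholder in `template` into all of its members."""
    # Unique placeholder names, in order of first appearance.
    names = list(dict.fromkeys(PLACEHOLDER.findall(template)))
    phrases = []
    for combo in itertools.product(*(classes[name] for name in names)):
        values = dict(zip(names, combo))
        phrases.append(PLACEHOLDER.sub(lambda m: values[m.group(1)], template))
    return phrases


classes = {"first_name": ["Paul", "Peter", "Charles"]}
phrases = expand_template("my name is {first_name}", classes)
print(phrases)
```

One template line replaces three explicit phrases here, and the gap grows quickly once a class holds hundreds of names or cities.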
Let’s Go Transcribe Some Speech
Here you have it. Our most advanced STT technology is now easily accessible on the Amazon Web Services Marketplace and ready to transcribe human speech.
We look forward to having it used by businesses and developers for their conversational AI and speech analytics solutions!