VRX - The Verbyx ASR
All Verbyx speech recognition products are built on the VRX ASR platform. VRX is a modern, high-performance ASR that supports multiple operating modes: statistical language models (SLM), context-free grammars (CFG), and keyphrase spotting (VRX SKIP).
VRX is a W3C standards-based ASR supporting both SISR and SRGS (XML and ABNF). We have no desire to lock our clients into a proprietary infrastructure. Using standards-based technology is the most effective way to protect your investment.
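For readers unfamiliar with the standards named above, here is a minimal sketch of an SRGS grammar in its XML form with SISR semantic tags. The rule names and phrases are invented for illustration and do not come from any VRX sample. The Python wrapper only checks that the grammar is well-formed XML and lists its rules. It does not call VRX.

```python
import xml.etree.ElementTree as ET

# A small SRGS (XML form) grammar with SISR tags (tag-format "semantics/1.0").
# Rule ids and phrases are hypothetical examples.
SRGS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice"
         root="command" tag-format="semantics/1.0">
  <rule id="command" scope="public">
    <item>call</item>
    <ruleref uri="#department"/>
    <tag>out.dept = rules.department;</tag>
  </rule>
  <rule id="department">
    <one-of>
      <item>sales <tag>out = "SALES";</tag></item>
      <item>support <tag>out = "SUPPORT";</tag></item>
    </one-of>
  </rule>
</grammar>
"""

# Namespace used by SRGS XML grammars.
NS = {"srgs": "http://www.w3.org/2001/06/grammar"}


def list_rules(grammar_xml):
    """Parse the grammar and return the ids of its top-level rules."""
    root = ET.fromstring(grammar_xml.encode("utf-8"))
    return [rule.get("id") for rule in root.findall("srgs:rule", NS)]


if __name__ == "__main__":
    print(list_rules(SRGS_XML))
```

Because the grammar is plain W3C SRGS, the same file should be usable with any SRGS-compliant recognizer, which is the portability argument made above.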
A Multi-Function ASR
Context Free Grammar
The VRX Context Free Grammar (CFG) implementation is a Verbyx design, providing superior accuracy and speed. The VRX Dynamic Constrained Grammar System (DCGS) uses a number of features to achieve this performance. VRX Auto Flatten Grammar determines the resources available and dynamically flattens the grammar for best performance. Triphone-level cross-word handling across sub-grammar boundaries minimizes the need for application-specific tuning. VRX has no artificial grammar constraints and is used by a client with one of the largest deployed CFG applications to date, with over 75,000 supported phrases.
Verbyx developed SKIP to satisfy the demands of a customer whose specific application requirements could not be met by their existing ASR systems. SKIP is best suited for fast-time detection of words and phrases, for example providing actionable information in call-center applications or producing a searchable index from large quantities of audio. SKIP can process audio at up to 100x real time using a single Intel Core i7 processor core and can be scaled to run multiple instances on a single AMD or Intel processor. Verbyx designed SKIP to be simple to use: it does not require complex statistical language models.
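As a rough planning aid, the throughput figure above can be turned into a wall-clock estimate. The sketch below assumes the best-case 100x real-time factor and ideal linear scaling across instances. Both are assumptions, since real throughput depends on hardware, audio, and keyphrase lists. The function name is illustrative and not part of any VRX API.

```python
def skip_processing_hours(audio_hours, speed_factor=100.0, instances=1):
    """Estimate wall-clock hours needed to scan `audio_hours` of audio.

    speed_factor: multiple of real time per instance (100.0 is the
                  "up to" figure quoted in the text, so a best case).
    instances:    number of parallel instances, assumed to scale linearly.
    """
    if audio_hours < 0 or speed_factor <= 0 or instances < 1:
        raise ValueError("invalid workload or capacity values")
    return audio_hours / (speed_factor * instances)


if __name__ == "__main__":
    # 10,000 hours of call recordings on one instance, then on eight.
    print(skip_processing_hours(10_000))               # 100.0
    print(skip_processing_hours(10_000, instances=8))  # 12.5
```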
CFG provides the best approach for accuracy, but it demands that supported sentences be pre-defined. This can be a lengthy task in applications that require large grammars. The statistical language model is a more appropriate solution for applications that require a less rigid approach to sentence recognition. Because SLM does not require pre-defined sentences, it is often quicker to implement than CFG. However, significant quantities of subject-relevant text are required to train effective language models. Statistical language models are best suited for natural language applications, such as video subtitling and call analytics.
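To make the contrast with a pre-defined grammar concrete, here is a toy bigram language model trained from raw text. It is a generic textbook sketch, not the VRX SLM implementation, and the three-sentence corpus is invented. It shows why an SLM needs training text rather than hand-written rules, and why more subject-relevant text gives better estimates.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count word-pair occurrences; <s> and </s> mark sentence boundaries."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def bigram_probability(counts, prev, word):
    """Maximum-likelihood estimate of P(word | prev), no smoothing."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return following[word] / total if total else 0.0


if __name__ == "__main__":
    corpus = ["check my balance", "check my account", "pay my bill"]
    model = train_bigram_model(corpus)
    print(bigram_probability(model, "check", "my"))    # seen in training
    print(bigram_probability(model, "my", "balance"))  # one of three options
    print(bigram_probability(model, "my", "mortgage")) # unseen, so 0.0
```

The zero probability for the unseen pair is the reason production models rely on large corpora and smoothing.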
Verbyx VRX - Speed, Accuracy, Scalability!
Key VRX Benefits
Acoustic Models Adaptation
Rapid Model Adaptation Process
VRX adaptation can provide meaningful accuracy improvements with as little as 15 minutes of data. VRX adaptation is particularly effective when adapting for strong accents or individual speaker characteristics.
Dynamic Auto Flatten
Automatic Resource Optimization
Running with flat grammars can improve the performance of your application. Unfortunately, in large, complex grammars, flattening can exhaust available memory. VRX Dynamic Auto Flatten flattens the grammar automatically to optimize performance while making the best use of available memory resources.
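The trade-off described above can be illustrated with a small sketch. Flattening replaces every rule reference with its expansion, so the recognizer searches a flat phrase list, but the number of phrases grows multiplicatively. The grammar format (tokens starting with "$" are rule references), the phrase budget, and all function names below are assumptions for illustration. This is not how VRX represents grammars or measures memory, and the sketch only handles non-recursive grammars.

```python
from itertools import product

# Hypothetical grammar: each rule maps to a list of alternatives,
# each alternative is a sequence of words or "$rule" references.
GRAMMAR = {
    "command": [["$action", "$object"], ["help"]],
    "action": [["open"], ["close"]],
    "object": [["the", "door"], ["the", "window"], ["account", "$digit"]],
    "digit": [["one"], ["two"], ["three"]],
}


def count_phrases(grammar, rule):
    """Count the phrases a rule expands to, without building them."""
    total = 0
    for alternative in grammar[rule]:
        n = 1
        for token in alternative:
            if token.startswith("$"):
                n *= count_phrases(grammar, token[1:])
        total += n
    return total


def flatten(grammar, rule):
    """Expand a rule into the full list of phrases it accepts."""
    phrases = []
    for alternative in grammar[rule]:
        parts = [
            flatten(grammar, tok[1:]) if tok.startswith("$") else [tok]
            for tok in alternative
        ]
        phrases.extend(" ".join(combo) for combo in product(*parts))
    return phrases


def maybe_flatten(grammar, rule, max_phrases):
    """Flatten only if the result fits the (assumed) phrase budget."""
    if count_phrases(grammar, rule) <= max_phrases:
        return flatten(grammar, rule)
    return None  # keep the structured grammar instead


if __name__ == "__main__":
    print(count_phrases(GRAMMAR, "command"))
    print(maybe_flatten(GRAMMAR, "command", max_phrases=100))
```

Counting before expanding is the key design point: the cost of flattening can be estimated cheaply, so the decision can be made per grammar at load time.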
Definition of Undefined Words
Auto pronunciation is a useful tool for quickly generating a list of possible pronunciations to be included in the ASR dictionary. For best performance, auto-generated entries should be evaluated and adjusted by an expert before they are included.
Accurate Out-of-the-Box Performance
Full cross-word modeling is one of the most important contributors to speech recognition accuracy. VRX cross-word triphone modeling works with both flat and non-flat grammars and, more importantly, significantly reduces the need for extensive tuning in your speech deployments.
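The difference between word-internal and cross-word context can be sketched as follows. Each phone is labeled `left-center+right`. With cross-word modeling, the context at a word edge comes from the neighboring word, whereas word-internal modeling drops it. The two-word lexicon and its pronunciations are illustrative assumptions, and the labeling scheme is a common textbook convention rather than the VRX internal format.

```python
# Hypothetical pronunciations in an ARPAbet-like phone set.
LEXICON = {
    "call": ["K", "AO", "L"],
    "home": ["HH", "OW", "M"],
}


def triphones(words, lexicon, cross_word=True):
    """Return context-dependent phone labels for a word sequence."""
    # Pair each phone with the index of the word it belongs to.
    phones = [(p, i) for i, w in enumerate(words) for p in lexicon[w]]
    labels = []
    for k, (phone, word_idx) in enumerate(phones):
        left = phones[k - 1] if k > 0 else None
        right = phones[k + 1] if k + 1 < len(phones) else None
        if not cross_word:
            # Word-internal only: discard context from other words.
            if left is not None and left[1] != word_idx:
                left = None
            if right is not None and right[1] != word_idx:
                right = None
        label = phone
        if left is not None:
            label = f"{left[0]}-{label}"
        if right is not None:
            label = f"{label}+{right[0]}"
        labels.append(label)
    return labels


if __name__ == "__main__":
    print(triphones(["call", "home"], LEXICON, cross_word=False))
    print(triphones(["call", "home"], LEXICON, cross_word=True))
```

In the cross-word case the final phone of "call" becomes `AO-L+HH`, so the model accounts for how the following word colors its pronunciation.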
Micro Voice Models
Reduced WER for Digits and Alphabet
Verbyx developed Micro Voice Models to improve accuracy in areas that are traditionally difficult for ASR, such as digits and the alphabet. Micro Voice Models can provide a dramatic reduction in word error rate (WER).
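Word error rate, the metric referred to above, is the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below is the standard definition, not a Verbyx tool, and the digit strings are invented examples.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted digit out of five reference words.
    print(word_error_rate("four one five nine two", "four one nine nine two"))
```

Digit strings are a hard case for this metric because a single misrecognized digit makes the whole number unusable even though the WER looks modest.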
Reduced Data Acoustic Models
Training acoustic models typically requires hundreds, and quite often thousands, of hours of transcribed audio. This volume of training data is often difficult to obtain and very expensive to produce. Verbyx has developed a proprietary technology that produces remarkable performance from as little as 10 hours of data.
Partial Feature List
VRX is, by the nature of the task, a complex software package with a lengthy list of features. Our software development team has produced a highly robust and reliable package.
VRX has dynamic grammar support with very fast run-time compilation and operates in real-time (low-latency) and batch (maximum-throughput) modes. When running multiple concurrent engines, each can use the same or a different configuration. VRX is highly optimized for modern CPUs using the SSE/AVX instruction sets.
VRX can handle huge grammars and has no arbitrary size limitations. Along with W3C compliance for SISR and SRGS, VRX supports the ASCII, UTF-8, UTF-16, UCS-4 and ISO-8859-1 character sets.
- Multiple concurrent engines
- Partial and N-best hypotheses
- Multiple character set support
- Ultra-large grammar support
- Optimized for SSE/AVX instruction sets
- Dynamic grammar with fast compile
- C++ and C API