ASR Language Model
In very basic terms, a language model is the collection of words and phrases that you want your Automatic Speech Recognition (ASR) component to recognize. For simulation systems, this typically takes the form of a constrained grammar model (the pros and cons of different types of language models will be discussed in a future post).
The Downside of Using a Constrained Grammar Language Model
In a Constrained Grammar application, every needed phrase must be added in advance to the language model. For some applications, this can be a monumental task.
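To make the "added in advance" constraint concrete, here is a toy sketch in Python. It is not how any particular ASR engine works (a real engine applies the grammar while decoding audio, not to finished text), and the phrases, `GRAMMAR`, `normalize` and `recognize` are all hypothetical names chosen for illustration. It only shows the consequence: anything outside the predefined set is not recognized.

```python
def normalize(utterance: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(utterance.lower().split())


# Hypothetical grammar: every supported phrase has to be listed up front.
GRAMMAR = {
    normalize(phrase)
    for phrase in [
        "cleared for takeoff",
        "cleared to land",
        "hold short",
    ]
}


def recognize(utterance: str):
    """Return the matched phrase, or None if it is not in the grammar."""
    candidate = normalize(utterance)
    return candidate if candidate in GRAMMAR else None


print(recognize("Cleared for takeoff"))   # in the grammar
print(recognize("you are good to go"))    # same intent, but not in the grammar
```

The second call returns None even though a human listener would understand it, which is exactly why the phrase list tends to grow.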
In an earlier post, an ATC simulator for the United States Air Force was discussed. The contract requirements for this simulator called for the ASR to support phrases from the FAA ATC handbook (FAA 7110.65) plus commonly used terminology. In analyzing these requirements, it was determined that for the tower simulation task, there were approximately 60 commands or phrases that a controller might use. For estimating purposes and to add some buffer for commonly used terminology, it was determined that 120 phrases would be enough.
Today, that same simulator supports tens of thousands of instructions. Given that a controller may string together up to eight phrases in a single transmission, there are billions of possible permutations.
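A rough back-of-the-envelope calculation shows how quickly the numbers grow. The sketch below assumes, purely for illustration, that order matters and that no phrase is repeated within a transmission; the figure of 20 phrases is a made-up example, not a number from the simulator.

```python
from math import perm  # requires Python 3.8 or later


def count_transmissions(num_phrases: int, max_per_transmission: int) -> int:
    """Count ordered sequences of 1..max_per_transmission distinct phrases."""
    return sum(
        perm(num_phrases, k) for k in range(1, max_per_transmission + 1)
    )


# Assumption: only 20 distinct phrases, up to 8 per transmission.
print(count_transmissions(20, 8))
```

Even with only 20 phrases, the count is already above five billion, so a vocabulary of tens of thousands of instructions puts exhaustive enumeration well out of reach.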
An Example of Language Model Implementation
If the analysis showed only 60 phrases, then how did the system evolve to support so many more? Unfortunately for the ASR developer, there are many ways to issue each of those phrases. For taxi instructions alone, there were more than eighteen thousand variations. Additionally, you cannot rely on users sticking to formal terminology, especially if they are students. When users stray into terminology the language model does not cover, the system will behave in ways that undermine training effectiveness.
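The multiplication behind numbers like these can be sketched in a few lines of Python. The slots and wordings below are hypothetical and far simpler than real taxi phraseology; the point is only that the total is the product of the alternatives in each slot, so a handful of small choices compounds quickly.

```python
from itertools import product

# Hypothetical slots for one taxi instruction. Each slot lists interchangeable
# wordings; an empty string marks a slot the speaker may leave out.
TAXI_SLOTS = [
    ["taxi", "taxi to", "proceed to"],
    ["runway two seven", "runway two seven left", "runway two seven right"],
    ["", "via alpha", "via bravo", "via alpha bravo"],
    ["", "hold short of runway three six"],
]


def expand(slots):
    """Yield every sentence formed by picking one wording from each slot."""
    for choice in product(*slots):
        yield " ".join(part for part in choice if part)


variations = list(expand(TAXI_SLOTS))
print(len(variations))  # 3 * 3 * 4 * 2 = 72
print(variations[0])    # taxi runway two seven
```

Four small slots already yield 72 sentences. Add slots for callsigns, more taxiways, and informal wordings, and a single instruction can reach into the thousands.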
Specifying Language Model Phrase Support
Can you possibly define in advance all the sentences to be supported by the language model? Quite simply, you cannot! Therefore, when defining system requirements for a competitive tender process, a different approach is needed. This approach will be described later in the series.
Free Download
Please download our free guide, which discusses the complexities of defining speech recognition requirements. The guide provides more detail on the types of language models and the benefits and drawbacks of each type.