Speech Recognition in Noise

The accuracy of speech recognition systems has increased to the point where they are now a practical alternative to typing text into a computer. The majority of commercially available speech recognition systems employ Hidden Markov Models (HMMs) to model the components of speech. Although HMMs model phonemes and higher-level components of speech well, the recognition accuracy of HMM-based systems degrades readily in the presence of background noise.

While the acoustic conditions of PC-based speech input can be easily controlled by placing the microphone close to the speaker or by using a noise-cancelling microphone, there are other scenarios in which it is harder to keep noise out of the speech recognition system. Such scenarios include speech recognition over telephone lines, in aircraft cockpits and in industrial environments. Under these conditions standard HMM-based speech recognition systems perform poorly.

The commercial exploitation of speech recognition systems depends on expanding the application base to include use in noisy conditions. A promising method of increasing robustness to noise is to model the noise itself in addition to the speech. Such systems are able to determine whether a specific component of the acoustic sample is speech or noise, and therefore do not attempt to classify noise as a component of speech. As a result, the accuracy of speech recognition in noise is greatly enhanced.
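
The idea of modelling noise alongside speech can be illustrated with a minimal sketch. It is not the method of any particular commercial system. It assumes each acoustic frame is reduced to a single log-energy value, that speech and noise are each described by one Gaussian, and that a two-state HMM with Viterbi decoding assigns a label to every frame. All parameter values and the names `MODELS`, `label_frames` and `speech_only` are hypothetical and chosen only for illustration.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


# Hypothetical models: (mean, variance) of frame log-energy for each class.
MODELS = {"noise": (1.0, 0.5), "speech": (5.0, 1.0)}
STATES = list(MODELS)

# Hypothetical transition probabilities; the high self-transition values
# discourage rapid flipping between the two labels.
LOG_TRANS = {
    ("noise", "noise"): math.log(0.9),
    ("noise", "speech"): math.log(0.1),
    ("speech", "speech"): math.log(0.9),
    ("speech", "noise"): math.log(0.1),
}
LOG_START = {"noise": math.log(0.5), "speech": math.log(0.5)}


def label_frames(frames):
    """Viterbi decoding: the most likely speech/noise label for each frame."""
    if not frames:
        return []
    # score[s] is the best log-probability of any path ending in state s.
    score = {
        s: LOG_START[s] + gaussian_log_pdf(frames[0], *MODELS[s]) for s in STATES
    }
    backpointers = []
    for x in frames[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + LOG_TRANS[(p, s)])
            pointers[s] = best_prev
            new_score[s] = (
                score[best_prev]
                + LOG_TRANS[(best_prev, s)]
                + gaussian_log_pdf(x, *MODELS[s])
            )
        score = new_score
        backpointers.append(pointers)
    # Trace the best path back from the most likely final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def speech_only(frames):
    """Keep only the frames labelled as speech for further recognition."""
    labels = label_frames(frames)
    return [x for x, label in zip(frames, labels) if label == "speech"]
```

With these assumed parameters, calling `speech_only([1.0, 0.8, 1.2, 5.1, 4.8, 5.3, 1.1, 0.9])` keeps only the three high-energy frames, so a downstream recogniser never tries to interpret the low-energy frames as speech. A real system would use multi-dimensional spectral features and noise models trained on data from the target environment.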