Inventi Impact: Acoustics

Articles

Inventi:eas/17405/15

Advanced Acoustic Modelling Techniques in MP3 Speech Recognition

Research 2015 : October - December

Michal Borsky, Petr Pollak, Petr Mizera

The automatic recognition of MP3 compressed speech presents a challenge to the current systems due to the lossy nature of compression which causes irreversible degradation of the speech wave. This article evaluates the performance of a recognition system optimized for MP3 compressed speech with current state-of-the-art acoustic modelling techniques and one specific front-end compensation method. The article concentrates on acoustic model adaptation, discriminative training, and additional dithering as prominent means of compensating for the described distortion in the task of phoneme and large vocabulary continuous speech recognition (LVCSR). The experiments presented on the phoneme task show a dramatic increase of the recognition error for unvoiced speech units as a direct result of compression. The application of acoustic model adaptation has proved to yield the highest relative contribution while the gain of discriminative training diminished with decreasing bit-rate. The application of additional dithering yielded a consistent improvement only for the MFCC features, but the overall results were still worse than those for the PLP features.
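As a rough illustration of the front-end compensation mentioned above, additional dithering can be understood as adding low-level noise to the decoded waveform before feature extraction, so that stretches of exact zeros no longer occur in the signal. The sketch below is only an assumption about the general idea, not the authors' implementation: the function name `add_dither`, the uniform noise distribution, and the `amplitude` and `seed` parameters are all hypothetical choices made for this example.

```python
import random


def add_dither(samples, amplitude=1.0, seed=0):
    """Return a copy of `samples` with uniform white noise added.

    Hypothetical helper: the noise is drawn from [-amplitude, amplitude],
    and a fixed seed keeps the result reproducible.
    """
    rng = random.Random(seed)
    return [s + rng.uniform(-amplitude, amplitude) for s in samples]


# A stretch of exact zeros, standing in for a fully silenced segment.
silent = [0.0] * 8
dithered = add_dither(silent, amplitude=1.0)
```

In practice the noise level would be chosen relative to the sample scale (for example, around one quantisation step for 16-bit audio), which is a tuning decision this sketch does not address.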

How to Cite this Article
CC Compliant Citation: Borsky et al., Advanced acoustic modelling techniques in MP3 speech recognition, EURASIP Journal on Audio, Speech, and Music Processing (2015) 2015:20, DOI 10.1186/s13636-015-0064-7, (http://creativecommons.org/licenses/by/4.0).
