A typical speech recognition system is push-button operated (push-to-talk), which requires hand movement and hence a mixed multimodal interface. However, for disabled patients and for users of hands-busy applications (e.g., where the user has objects to manipulate or a device to control while asking for assistance from another device), movement may be restricted or impossible. One alternative is a speech-only interface. The method proposed here is called Wake-Up-Word Speech Recognition (WUW-SR). A WUW-SR system allows the user to operate (activate) many systems (cell phone, computer, elevator, etc.) with speech commands instead of hand movements. This paper introduces a new front-end paradigm for Wake-Up-Word Speech Recognition. The state-of-the-art WUW-SR system is based on three different sets of features: (1) Mel-frequency Cepstral Coefficients (MFCC), (2) Linear Predictive Coding Coefficients (LPC), and (3) Enhanced Mel-frequency Cepstral Coefficients (ENH_MFCC); these features are decoded with corresponding Hidden Markov Models (HMMs) in the back-end stage of the WUW-SR system. We present an experimental FPGA design and implementation of a novel architecture for a real-time feature extraction processor that generates MFCC, LPC, and ENH_MFCC features simultaneously. In the WUW-SR system, the recognizer front-end is located at the terminal, which is typically connected over a data network to a remote back-end recognizer (e.g., a server). The three sets of speech features (MFCC, LPC, and ENH_MFCC) are extracted at the front-end. The extracted features are then compressed and transmitted to the server via a dedicated channel, where they are subsequently decoded. Our front-end can be added to any hand-held electronic device compatible with WUW-SR, allowing the device to be commanded (activated) by voice alone, with no push-to-talk as is presently done.
Our front-end is designed, simulated, and implemented on an Altera DSP development kit with a Cyclone III FPGA as a portable system acting as a processor that is capable of computing three different sets of features at a much faster rate than software. It is cost-effective, consumes very little power, and does not depend on a general-purpose computer, so it can be used on any portable device.
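As a rough software illustration of one of the three feature sets, the sketch below computes LPC coefficients for each frame of a signal using the autocorrelation method and the Levinson-Durbin recursion. It is not the FPGA architecture described here: the function names, the Hamming window, the frame length, the hop size, and the prediction order are all assumptions chosen for the example.

```python
import math


def split_frames(signal, frame_len, hop):
    """Cut a signal into overlapping frames (frame_len and hop are assumed values)."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(n):
    """Hamming window of length n (n > 1); the window choice is an assumption."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, err) where A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order
    is the prediction-error filter and err is the residual energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


def lpc_frame(frame, order):
    """Window one frame and return its LPC coefficients a[0..order]."""
    w = hamming(len(frame))
    x = [s * wi for s, wi in zip(frame, w)]
    r = autocorr(x, order)
    if r[0] == 0.0:                         # silent frame: nothing to predict
        return [1.0] + [0.0] * order
    a, _ = levinson_durbin(r, order)
    return a


def lpc_features(signal, frame_len=200, hop=80, order=10):
    """LPC coefficient vector for every frame of the signal."""
    return [lpc_frame(f, order) for f in split_frames(signal, frame_len, hop)]
```

In a front-end like the one described, the MFCC and ENH_MFCC computations would run on the same frames in parallel with this step; that parallelism is what a hardware implementation can exploit and what this sequential sketch does not show.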