Environmental sound recognition is an important function of robots and intelligent computer systems. In this research, we use a multistage perceptron neural network system for environmental sound recognition. The input data is a combination of the time-variance pattern of instantaneous powers and the frequency-variance pattern of the instantaneous spectrum at the power peak, referred to as a time-frequency intersection pattern. Spectra of many environmental sounds change more slowly than those of speech or voice, so the intersectional time-frequency pattern preserves the major features of environmental sounds with drastically reduced data requirements. Two experiments were conducted using an original database and an open database created by the RWCP project. The recognition rate for 20 kinds of environmental sounds was 92%. The recognition rate of the new method was about 12% higher than that of methods using only an instantaneous spectrum. The results are also comparable with those of HMM-based methods, although those methods need to treat the time variance of an input vector series with more complicated computations.
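As a rough illustration of the input representation described above, the following Python sketch builds one possible time-frequency intersection pattern: a frame-wise power envelope concatenated with the magnitude spectrum of the frame where the power peaks. The function name `intersection_pattern`, the frame length, hop size, Hann window, resampling to fixed lengths, and peak normalization are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def intersection_pattern(signal, frame_len=256, hop=128, n_time=32, n_freq=32):
    """Hypothetical sketch: power envelope plus spectrum at the power peak.

    All parameter values are illustrative assumptions, not the authors' settings.
    Requires len(signal) >= frame_len.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply a Hann window.
    frames = np.stack([signal[i * hop:i * hop + frame_len] for i in range(n_frames)])
    frames = frames * np.hanning(frame_len)

    # Time axis: instantaneous (frame-wise) power.
    power = np.mean(frames ** 2, axis=1)
    peak = int(np.argmax(power))

    # Frequency axis: magnitude spectrum of the single frame at the power peak.
    spectrum = np.abs(np.fft.rfft(frames[peak]))

    # Resample both patterns to fixed lengths so the network input size is constant.
    time_pat = np.interp(np.linspace(0, n_frames - 1, n_time),
                         np.arange(n_frames), power)
    freq_pat = np.interp(np.linspace(0, len(spectrum) - 1, n_freq),
                         np.arange(len(spectrum)), spectrum)

    # Normalize each pattern by its own maximum (small constant avoids division by zero).
    time_pat = time_pat / (time_pat.max() + 1e-12)
    freq_pat = freq_pat / (freq_pat.max() + 1e-12)
    return np.concatenate([time_pat, freq_pat]), peak

# Usage: a 1 kHz tone burst centered at 0.5 s, sampled at 8 kHz (synthetic test signal).
sr = 8000
t = np.arange(sr) / sr
burst = np.sin(2 * np.pi * 1000 * t) * np.exp(-((t - 0.5) / 0.05) ** 2)
pattern, peak_frame = intersection_pattern(burst)
```

With these assumed settings the one-second burst yields 61 frames of 129 spectral bins each, whereas the intersection pattern keeps only 64 values. This is the kind of data reduction the abstract refers to; the resulting vector could then be passed to a perceptron-type classifier.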