Current Issue: April - June | Volume: 2013 | Issue Number: 2 | Articles: 4
Interaction with human musicians is a challenging task for robots, as it involves online perception and precise synchronization. In this paper, we present a consistent and theoretically sound framework for combining perception and control for accurate musical timing. For perception, we develop a hierarchical hidden Markov model that combines event detection and tempo tracking. The robot performance is formulated as a linear quadratic control problem that is able to generate surprisingly complex timing behavior when adapting the tempo. We provide results with both simulated and real data. In our experiments, a simple Lego robot percussionist accompanied the music by detecting the tempo and position of clave patterns in polyphonic music. The robot successfully synchronized itself with the music by quickly adapting to changes in the tempo...
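To make the linear quadratic formulation concrete, here is a minimal sketch of a scalar LQ regulator that drives a tempo error to zero. It is not the authors' model: the one-dimensional state, the dynamics coefficients `a` and `b`, the cost weights `q` and `r`, and the function names are all assumptions chosen for illustration.

```python
def lqr_gain(a, b, q, r, iterations=200):
    """Return the steady-state feedback gain for x[k+1] = a*x[k] + b*u[k]
    with stage cost q*x^2 + r*u^2, found by iterating the scalar
    discrete-time Riccati equation (hypothetical toy model)."""
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


def simulate(initial_error, gain, a=1.0, b=1.0, steps=20):
    """Apply the control law u = -gain * x and return the error trajectory."""
    errors = [initial_error]
    for _ in range(steps):
        x = errors[-1]
        u = -gain * x
        errors.append(a * x + b * u)
    return errors


gain = lqr_gain(a=1.0, b=1.0, q=1.0, r=1.0)
trajectory = simulate(initial_error=0.1, gain=gain)
```

Raising `r` relative to `q` penalizes abrupt corrections, so the simulated error decays more slowly; this trade-off between fast and smooth adaptation is what the cost weights control in this toy setting.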
The application of Missing Data Techniques (MDT) to increase the noise robustness of HMM/GMM-based large vocabulary speech recognizers is hampered by a large computational burden. The likelihood evaluations imply solving many constrained least squares (CLSQ) optimization problems. As an alternative, researchers have proposed frontend MDT or have made oversimplifying independence assumptions for the backend acoustic model. In this article, we propose a fast Multi-Candidate (MC) approach that solves the per-Gaussian CLSQ problems approximately by selecting the best from a small set of candidate solutions, which are generated as the MDT solutions on a reduced set of cluster Gaussians. Experiments show that the MC MDT runs as fast as the uncompensated recognizer while achieving the accuracy of the full backend optimization approach. The experiments also show that exploiting the more accurate acoustic model of the backend does pay off in terms of accuracy when compared to frontend MDT....
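The selection step of a multi-candidate scheme can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the candidates are assumed to be already feasible (the generation of candidates from cluster Gaussians is not shown), and "best" is taken to mean the candidate with the highest likelihood under one Gaussian, which is the same as the smallest squared Mahalanobis distance. The function names are hypothetical.

```python
def mahalanobis_sq(x, mean, precision):
    """Squared Mahalanobis distance (x - mean)^T P (x - mean),
    with the precision matrix P given as a list of rows."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    n = len(d)
    return sum(d[i] * precision[i][j] * d[j] for i in range(n) for j in range(n))


def best_candidate(candidates, mean, precision):
    """Pick the candidate with the highest likelihood under one Gaussian,
    i.e. the smallest squared Mahalanobis distance to its mean."""
    return min(candidates, key=lambda c: mahalanobis_sq(c, mean, precision))


candidates = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
choice = best_candidate(candidates, mean=[1.2, 0.9], precision=identity)
```

Evaluating a handful of candidates costs one quadratic form each, which is far cheaper than running an iterative constrained solver for every Gaussian.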
In this article, we present the evaluation results for the task of speaker diarization of broadcast news, which was part of the Albayzin 2010 evaluation campaign of language and speech technologies. The evaluation data consists of a subset of the Catalan broadcast news database recorded from the 3/24 TV channel. Five systems submitted by five different research labs are described, highlighting their common as well as their distinctive features. The diarization performance is analyzed in terms of the diarization error rate, the number of detected speakers, and the acoustic background conditions. An effort is also made to relate the achieved results to the particular system design features....
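For readers unfamiliar with the metric, the diarization error rate is conventionally the sum of missed speech, false alarm speech, and speaker confusion time, divided by the total reference speech time. The sketch below assumes that standard definition; the function name and the example durations are hypothetical and are not taken from the evaluation.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction, with all arguments given in seconds.

    missed:       reference speech that the system left unlabeled
    false_alarm:  non-speech that the system labeled as speech
    confusion:    speech attributed to the wrong speaker
    total_speech: total duration of reference speech
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical durations, in seconds, for illustration only.
der = diarization_error_rate(missed=30.0, false_alarm=20.0,
                             confusion=50.0, total_speech=1000.0)
```

Because false alarm time is counted in the numerator but not the denominator, the value can exceed 1.0 for a system that labels large amounts of non-speech as speech.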
A new method to secure speech communication using the discrete wavelet transform (DWT) and the fast Fourier transform is presented in this article. In the first phase of the hiding technique, we separate the high-frequency components of the speech from the low-frequency components using the DWT. In the second phase, we exploit the low-pass spectral properties of the speech spectrum to hide another secret speech signal in the low-amplitude high-frequency regions of the cover speech signal. The proposed method allows hiding a large amount of secret information while making steganalysis more complex. Experimental results demonstrate the efficiency of the proposed hiding technique: the stego signals are perceptually indistinguishable from the equivalent cover signals, and the secret speech message can be recovered with only slight degradation in quality....
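The first phase, splitting a signal into low-frequency and high-frequency parts with a DWT, can be illustrated with a one-level Haar transform. The abstract does not say which wavelet the authors use, so the Haar choice, the function names, and the toy signal below are assumptions; the hiding step itself is not shown.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT: return (approximation, detail) coefficients.
    The approximation holds the low-frequency content and the detail
    holds the high-frequency content. Requires an even-length input."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


cover = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
approx, detail = haar_dwt(cover)
restored = haar_idwt(approx, detail)
```

For a slowly varying signal like this one, the detail coefficients are small compared with the approximation coefficients, which is the kind of low-amplitude high-frequency region the abstract refers to.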