Current Issue: July - September | Volume: 2018 | Issue Number: 3 | Articles: 5
Automatic extraction of acoustic regions of interest from recordings captured in realistic clinical environments is a necessary preprocessing step in any cry analysis system. In this study, we propose a hidden Markov model (HMM)-based audio segmentation method to identify the relevant acoustic parts of the cry signal (i.e., expiratory and inspiratory phases) from recordings made in natural environments with various interfering acoustic sources. We examine and optimize the performance of the system by using different audio features and HMM topologies. In particular, we propose using fundamental frequency and aperiodicity features. We also propose a method for adapting the segmentation system trained on acoustic material captured in a particular acoustic environment to a different acoustic environment by using feature normalization and semi-supervised learning (SSL). The performance of the system was evaluated by analyzing a total of 3 h and 10 min of audio material from 109 infants, captured in a variety of recording conditions in hospital wards and clinics. The proposed system yields frame-based accuracy up to 89.2%. We conclude that the proposed system offers a solution for automated segmentation of cry signals in cry analysis applications.
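As a rough illustration of the kind of segmentation the abstract describes (not the authors' implementation), the sketch below fits a single Gaussian HMM over frame-level fundamental-frequency, voicing, and energy features using librosa and hmmlearn. The paper itself trains class-specific HMMs on labelled data and adds feature normalization and semi-supervised adaptation, which are omitted here; the feature set and state mapping are illustrative assumptions.

```python
# Simplified sketch of HMM-based cry segmentation (not the paper's exact system).
import numpy as np
import librosa
from hmmlearn import hmm

def segment_cry(path, n_states=3):
    y, sr = librosa.load(path, sr=16000)
    # Frame-level fundamental frequency and a voicing probability as a crude
    # stand-in for the F0 / aperiodicity features named in the abstract.
    f0, voiced_flag, voiced_prob = librosa.pyin(
        y, fmin=150, fmax=1000, sr=sr, frame_length=1024, hop_length=256)
    rms = librosa.feature.rms(y=y, frame_length=1024, hop_length=256)[0]
    n = min(len(f0), len(voiced_prob), len(rms))
    X = np.column_stack([np.nan_to_num(f0[:n]),   # unvoiced frames -> 0 Hz
                         voiced_prob[:n],
                         rms[:n]])

    # One ergodic Gaussian HMM whose hidden states would be mapped post hoc to
    # expiration / inspiration / background; the study instead uses supervised,
    # class-specific models.
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                            n_iter=50, random_state=0)
    model.fit(X)
    return model.predict(X)                        # per-frame state sequence
```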
Pattern recognition on neural activations from naturalistic music listening has been successful at predicting neural responses of listeners from musical features, and vice versa. Inter-subject differences in decoding accuracy arise partly from musical training, which has widely recognized structural and functional effects on the brain. We propose and evaluate a decoding approach aimed at predicting the musicianship class of an individual listener from the dynamic neural processing of musical features. Whole-brain functional magnetic resonance imaging (fMRI) data were acquired from musicians and nonmusicians while they listened to three musical pieces from different genres. Six musical features, representing low-level (timbre) and high-level (rhythm and tonality) aspects of music perception, were computed from the acoustic signals, and classification into musicians and nonmusicians was performed on the musical feature and parcellated fMRI time series. Cross-validated classification accuracy reached 77% with nine regions, comprising frontal and temporal cortical regions, the caudate nucleus, and the cingulate gyrus. The processing of high-level musical features at the right superior temporal gyrus was most influenced by listeners' musical training. The study demonstrates the feasibility of decoding musicianship from how individual brains listen to music, attaining accuracy comparable to current results from automated clinical diagnosis of neurological and psychological disorders.
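A hedged sketch of the cross-validated classification step only (not the study's full pipeline): musicians versus nonmusicians are classified from precomputed parcellated fMRI time series with scikit-learn. The inputs roi_timeseries and labels are hypothetical placeholders, and the linear-SVM choice is an assumption.

```python
# Illustrative cross-validated musician / non-musician classification.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score, StratifiedKFold

def classify_musicianship(roi_timeseries, labels, n_splits=5):
    """roi_timeseries: (n_subjects, n_rois, n_timepoints) parcellated fMRI data.
    labels: (n_subjects,) with 1 = musician, 0 = non-musician."""
    X = roi_timeseries.reshape(len(labels), -1)        # flatten ROI x time
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = cross_val_score(clf, X, labels, cv=cv)
    return scores.mean()                               # cross-validated accuracy
```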
In music, the perception of pitch is governed largely by its tonal function given the preceding harmonic structure of the music. While behavioral research has advanced our understanding of the perceptual representation of musical pitch, relatively little is known about its representational structure in the brain. Using magnetoencephalography (MEG), we recorded evoked neural responses to different tones presented within a tonal context. Multivariate pattern analysis (MVPA) was applied to "decode" the stimulus that listeners heard based on the underlying neural activity. We then characterized the structure of the brain's representation using decoding accuracy as a proxy for representational distance, and compared this structure to several well-established perceptual and acoustic models. The observed neural representation was best accounted for by a model based on the Standard Tonal Hierarchy, whereby differences in the neural encoding of musical pitches correspond to their differences in perceived stability. By confirming that perceptual differences honor those in the underlying neuronal population coding, our results provide a crucial link in understanding the cognitive foundations of musical pitch across psychological and neural domains.
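A minimal sketch of the representational-similarity step, assuming precomputed single-trial MEG patterns: pairwise decoding accuracies between tones are collected into a neural dissimilarity matrix and correlated with a candidate model matrix (e.g., one derived from the Standard Tonal Hierarchy). All inputs, the classifier, and the correlation measure are illustrative choices, not the study's reported pipeline.

```python
# Pairwise decoding accuracy as a proxy for representational distance.
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def neural_rdm(patterns, labels):
    """patterns: (n_trials, n_features) MEG patterns; labels: tone index per trial."""
    tones = np.unique(labels)
    rdm = np.zeros((len(tones), len(tones)))
    for i, j in combinations(range(len(tones)), 2):
        mask = np.isin(labels, [tones[i], tones[j]])
        acc = cross_val_score(SVC(kernel="linear"),
                              patterns[mask], labels[mask], cv=5).mean()
        rdm[i, j] = rdm[j, i] = acc          # higher accuracy = more distinct codes
    return rdm

def compare_to_model(neural, model):
    """Rank-correlate the upper triangles of neural and model dissimilarity matrices."""
    iu = np.triu_indices_from(neural, k=1)
    return spearmanr(neural[iu], model[iu]).correlation
```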
Conversational spoken dialogue systems that interact with the user rather than merely reading the text can be equipped with hesitations to manage dialogue flow and user attention. Based on a series of empirical studies, we elaborated a hesitation synthesis strategy for dialogue systems, which inserts hesitations of a scalable extent wherever needed in the ongoing utterance. Previously, evaluations of hesitation systems have shown that synthesis quality is affected negatively by hesitations, but that they result in improvements of interaction quality. We argue that due to its conversational nature, hesitation synthesis needs interactive evaluation rather than traditional mean opinion score (MOS)-based questionnaires. To validate this claim, we dually evaluate our system's speech synthesis component, on the one hand, linked to the dialogue system evaluation, and on the other hand, in a traditional MOS way. We are thus able to analyze and discuss differences that arise due to the evaluation methodology. Our results suggest that MOS scales are not sufficient to assess speech synthesis quality, leading to implications for future research that are discussed in this paper. Furthermore, our results indicate that synthetic hesitations are able to increase task performance and that an elaborated hesitation strategy is necessary to avoid likability issues.
Aging is associated with decline in both cognitive and auditory abilities. However, evidence suggests that music perception is relatively spared, despite relying on auditory and cognitive abilities that tend to decline with age. It is therefore likely that older adults engage compensatory mechanisms, which should be evident in the underlying functional neurophysiology related to processing music. In other words, the perception of musical structure would be similar or enhanced in older compared to younger adults, while the underlying functional neurophysiology would be different. The present study aimed to compare the electrophysiological brain responses of younger and older adults to melodic incongruities during a passive and an active listening task. Older and younger adults had a similar ability to detect an out-of-tune incongruity (i.e., non-chromatic), while the amplitudes of the ERAN and P600 were reduced in older adults compared to younger adults. On the other hand, out-of-key incongruities (i.e., non-diatonic) were better detected by older adults compared to younger adults, while the ERAN and P600 were comparable between the two age groups. This pattern of results indicates that perception of tonal structure is preserved in older adults, despite age-related neurophysiological changes in how melodic violations are processed.
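As a hedged sketch of how ERAN and P600 amplitudes might be quantified with MNE-Python, assuming EEG epochs time-locked to the melodic incongruities: the channel selections, time windows, and component definitions below are illustrative assumptions, not the study's exact analysis parameters.

```python
# Extracting mean ERP component amplitudes from epoched EEG (illustrative values).
import mne

def component_amplitude(epochs, picks, tmin, tmax):
    """Mean amplitude (in volts) of the evoked response over channels and a window."""
    evoked = epochs.average().copy().pick(picks).crop(tmin=tmin, tmax=tmax)
    return evoked.data.mean()

def eran_p600(epochs_incongruous):
    # ERAN: early anterior window; P600: later posterior window (assumed settings).
    eran = component_amplitude(epochs_incongruous,
                               picks=["F3", "Fz", "F4"], tmin=0.15, tmax=0.25)
    p600 = component_amplitude(epochs_incongruous,
                               picks=["P3", "Pz", "P4"], tmin=0.50, tmax=0.80)
    return eran, p600
```

Amplitudes computed this way for each age group could then be compared statistically, which is the kind of group contrast the abstract reports.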