With the growth of the marine economy and of marine activities, deep saturation diving has attracted significant attention. Helium speech communication is a critical technology for deep saturation diving: it is the sole means of communication for ensuring that such operations proceed smoothly. This study introduces deep learning into helium speech recognition and proposes a spectrogram-based dual-model recognition method. First, spectrogram features are extracted from the helium speech. Then, a deep fully convolutional neural network is combined with connectionist temporal classification (CTC) to form an acoustic model, which takes the spectrogram features as input and converts the speech signal into a phonetic sequence. Finally, a maximum entropy hidden Markov model (MEMM) is employed as the language model to convert the phonetic sequence into words. This conversion is treated as a dynamic programming problem, and the Viterbi algorithm is used to find the optimal path that decodes the phonetic sequence into a word sequence. Simulation results show that the method recognizes helium speech effectively, with a recognition rate of 97.89% for isolated words and 95.99% for continuous helium speech.
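The final decoding step can be illustrated with a minimal Viterbi sketch in Python. It is not the authors' implementation: the function name `viterbi`, the toy phonetic tokens, the candidate-word table, and the probability tables are all hypothetical, and plain log-probabilities stand in for the scores that the MEMM language model would supply. The sketch also assumes a non-empty phonetic sequence.

```python
import math

def viterbi(observations, candidates, start_logp, trans_logp):
    """Return the highest-scoring word sequence for a phonetic sequence.

    observations: non-empty list of phonetic tokens
    candidates:   dict mapping each phonetic token to its candidate words
    start_logp:   dict word -> log-probability of starting the sequence
    trans_logp:   dict (prev_word, word) -> log-probability of the transition
    """
    floor = math.log(1e-8)  # assumed penalty for events missing from the tables

    # best[w] = (score of the best path ending in word w, that path)
    best = {w: (start_logp.get(w, floor), [w])
            for w in candidates[observations[0]]}

    for token in observations[1:]:
        step = {}
        for w in candidates[token]:
            # Extend every surviving path by w and keep only the best one.
            score, path = max(
                ((s + trans_logp.get((p, w), floor), pth)
                 for p, (s, pth) in best.items()),
                key=lambda item: item[0],
            )
            step[w] = (score, path + [w])
        best = step

    # The optimal path is the best-scoring entry after the last token.
    return max(best.values(), key=lambda item: item[0])[1]

# Toy example (hypothetical data): two phonetic tokens, each with homophones.
candidates = {"ai": ["I", "eye"], "si": ["see", "sea"]}
start_logp = {"I": math.log(0.9), "eye": math.log(0.1)}
trans_logp = {
    ("I", "see"): math.log(0.8),
    ("I", "sea"): math.log(0.2),
    ("eye", "see"): math.log(0.5),
    ("eye", "sea"): math.log(0.5),
}
decoded = viterbi(["ai", "si"], candidates, start_logp, trans_logp)
print(decoded)
```

Because each step keeps only the best path into every candidate word, the cost grows linearly with the length of the phonetic sequence rather than exponentially with the number of possible word combinations.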