With the rapid development of artificial intelligence and digital media technology, modern animation technology has greatly improved the creative efficiency of creators through computer-generated graphics, electronic manual painting, and other means, and its number has also experienced explosive growth. The intelligent completion of emotional expression identification within animation works holds immense significance for both animation production learners and the creation of intelligent animation works. Consequently, emotion recognition has emerged as a focal point of research attention. This paper focuses on the analysis of emotional states in animation works. First, by analyzing the characteristics of emotional expression in animation, the model data foundation for using sound and video information is determined. Subsequently, we perform individual feature extraction for these two types of information using gated recurrent unit (GRU). Finally, we employ a multiattention mechanism to fuse the multimodal information derived from audio and video sources. The experimental outcomes demonstrate that the proposed method framework attains a recognition accuracy exceeding 90% for the three distinct emotional categories. Remarkably, the recognition rate for negative emotions reaches an impressive 94.7%, significantly surpassing the performance of single-modal approaches and other feature fusion methods. This research presents invaluable insights for the training of multimedia animation production professionals, empowering them to better grasp the nuances of emotion transfer within animation and, thereby, realize productions of elevated quality, which will greatly improve the market operational efficiency of animation industry.
Loading....