Human motion prediction aims to predict future poses from the motion dynamics of a sequence of past poses. We present a new hierarchical static-dynamic encoder-decoder structure that predicts human motion with residual CNNs. Specifically, to better capture the underlying patterns of motion, we propose a new residual CNN-based structure, v-CMU, that encodes not only static information but also dynamic information. Building on v-CMU, we propose a hierarchical structure that models the different correlations between each given pose and the predicted pose. Moreover, a new loss function combining static and dynamic information is introduced in the decoder to guide the prediction of future poses. Our framework has two advantages: (1) more effective extraction of motion dynamics, owing to the fusion of pose information with the dynamic information between poses and to the hierarchical structure; (2) better decoding, and hence prediction, performance, thanks to the mid-level supervision introduced by the new loss function, which considers both static and dynamic losses. Extensive experiments show that our algorithm achieves state-of-the-art performance on the challenging G3D and FNTU datasets. The code is available at https://github.com/liujin0/SDnet.
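To make the idea of a loss that combines static and dynamic terms concrete, the following is a minimal sketch under stated assumptions, not the formulation used in this work. It assumes the static term is a mean squared error on poses, the dynamic term is a mean squared error on frame-to-frame pose differences, and the two are summed with a weight. The names `static_dynamic_loss`, `velocities`, `mse`, and `dynamic_weight` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Pose = Sequence[float]  # one pose as a flat vector of joint coordinates


def velocities(poses: Sequence[Pose]) -> List[List[float]]:
    """Frame-to-frame differences between consecutive poses."""
    return [[b - a for a, b in zip(p0, p1)] for p0, p1 in zip(poses, poses[1:])]


def mse(xs: Sequence[Pose], ys: Sequence[Pose]) -> float:
    """Mean squared error over all frames and coordinates."""
    sq = [(x - y) ** 2 for rx, ry in zip(xs, ys) for x, y in zip(rx, ry)]
    return sum(sq) / len(sq)


def static_dynamic_loss(
    pred: Sequence[Pose],
    target: Sequence[Pose],
    last_observed: Pose,
    dynamic_weight: float = 1.0,  # assumed weighting, not taken from the paper
) -> float:
    # Static term (assumed form): error on the poses themselves.
    static = mse(pred, target)
    # Dynamic term (assumed form): error on pose differences, starting
    # from the last observed pose so the first predicted frame counts too.
    pred_vel = velocities([last_observed] + list(pred))
    target_vel = velocities([last_observed] + list(target))
    dynamic = mse(pred_vel, target_vel)
    return static + dynamic_weight * dynamic
```

In this sketch a prediction that is offset from the ground truth by a constant is penalized mainly by the static term, while a prediction with the right average pose but jittery motion is penalized mainly by the dynamic term.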