In this paper, we study the challenge of image-to-video retrieval, which uses a query image to search for relevant frames in a large collection of videos. A novel framework based on convolutional neural networks (CNNs) is proposed to perform large-scale video retrieval with low storage cost and high search efficiency. Our framework consists of a key-frame extraction algorithm and a feature aggregation strategy. Specifically, the key-frame extraction algorithm exploits a clustering approach to remove redundant information from the video data, which greatly reduces storage cost. The feature aggregation strategy adopts average pooling to encode deep local convolutional features, followed by coarse-to-fine retrieval, which enables rapid search in a large-scale video database. Results from extensive experiments on two publicly available datasets demonstrate that the proposed method achieves superior efficiency as well as accuracy over other state-of-the-art visual search methods.
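To make the pipeline concrete, the following is a minimal sketch (not the authors' code) of the two components named above: clustering-based key-frame selection and average-pooling aggregation of local convolutional features, followed by a coarse-to-fine search. It assumes frame-level CNN feature maps have already been extracted; all function names (e.g. select_key_frames) and parameters are illustrative assumptions.

```python
import numpy as np


def aggregate(feature_map):
    """Average-pool a C x H x W convolutional feature map into an
    L2-normalized global descriptor of length C."""
    desc = feature_map.reshape(feature_map.shape[0], -1).mean(axis=1)
    return desc / (np.linalg.norm(desc) + 1e-12)


def select_key_frames(frame_descs, k, n_iter=20, seed=0):
    """Pick k representative frames by k-means on frame descriptors:
    cluster the frames, then keep the frame closest to each centroid."""
    rng = np.random.default_rng(seed)
    centers = frame_descs[rng.choice(len(frame_descs), k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each frame to its nearest centroid.
        dists = ((frame_descs[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        # Recompute centroids (keep the old centre if a cluster is empty).
        for j in range(k):
            members = frame_descs[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    # Return the indices of the frames nearest to each centroid.
    dists = ((frame_descs[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.unique(dists.argmin(axis=0))


def coarse_to_fine_search(query_desc, video_descs, key_frame_descs, top_v=5):
    """Coarse stage: rank videos by a single averaged key-frame descriptor.
    Fine stage: re-rank frames of the top videos by frame-level similarity."""
    coarse_scores = np.array([v @ query_desc for v in video_descs])
    candidates = coarse_scores.argsort()[::-1][:top_v]
    results = []
    for vid in candidates:
        frame_scores = key_frame_descs[vid] @ query_desc  # (num_key_frames,)
        results.append((int(vid), int(frame_scores.argmax()), float(frame_scores.max())))
    return sorted(results, key=lambda r: -r[2])
```

In this sketch, the coarse stage compares the query against one descriptor per video (the mean of its key-frame descriptors), and only the shortlisted videos are searched frame by frame, which is one plausible reading of the coarse-to-fine strategy described in the abstract.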