Background: Vision-based surveillance and monitoring is a potential alternative for early detection of respiratory disease outbreaks in urban areas, complementing molecular diagnostics and alert systems based on hospital and doctor visits. Visible actions that represent typical flu-like symptoms include sneezing and coughing, which are associated with changing patterns of hand-to-head distance, among other cues. The technical difficulties lie in the high complexity and large variation of these actions, as well as in the many similar background actions such as scratching the head, using a cell phone, eating, and drinking.

Results: In this paper, we make a first attempt at the challenging problem of recognizing flu-like symptoms from videos. Since no related dataset was available, we created a new public health dataset for action recognition that includes two major actions related to flu-like symptoms (sneeze and cough) and a number of background actions. We also developed a novel algorithm by introducing two types of Action Matching Kernels, both of which aim to integrate two aspects of local features, namely the space-time layout and the Bag-of-Words representation. In particular, we show that the Pyramid Match Kernel and Spatial Pyramid Matching are both special cases of our proposed kernels. In addition to experiments on a standard testbed, the proposed algorithm is evaluated on the new sneeze and cough set. Empirically, we observe that our approach achieves performance competitive with the state of the art, while recognition on the new public health dataset is shown to be a non-trivial task even with a simple, unobstructed view of a single person.

Conclusions: Our sneeze and cough video dataset and newly developed action recognition algorithm are the first of their kind and aim to kick-start the field of action recognition of flu-like symptoms from videos. It will be challenging but necessary in future developments to consider the more complex real-life scenario of detecting these actions simultaneously from multiple persons in possibly crowded environments.
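
The abstract names Spatial Pyramid Matching as a special case of the proposed kernels. As a rough illustration of that special case only, and not of the Action Matching Kernels themselves (their definition is not given here), the minimal sketch below combines a Bag-of-Words representation with a space-time layout. It makes several assumptions: each local feature is a hypothetical tuple (x, y, t, word) with coordinates normalized to [0, 1], the function names are invented for this example, and the level weights follow the standard Spatial Pyramid Matching formulation, applied here to three axes instead of two.

```python
from collections import Counter


def cell_histogram(features, level):
    """Count (cell, word) pairs at one pyramid level.

    Each feature is (x, y, t, word) with x, y, t normalized to [0, 1].
    Level l splits each of the three axes into 2**l equal cells.
    """
    cells = 2 ** level
    hist = Counter()
    for x, y, t, word in features:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        idx = tuple(min(int(c * cells), cells - 1) for c in (x, y, t))
        hist[(idx, word)] += 1
    return hist


def intersection(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(count, h2[key]) for key, count in h1.items())


def spatial_pyramid_match(feats_a, feats_b, levels=2):
    """Weighted sum of per-level intersections (standard SPM weights).

    Level 0 is a plain Bag-of-Words match; finer levels also require
    the matched words to fall in the same space-time cell.
    """
    inter = [
        intersection(cell_histogram(feats_a, l), cell_histogram(feats_b, l))
        for l in range(levels + 1)
    ]
    score = inter[0] / 2 ** levels
    for l in range(1, levels + 1):
        score += inter[l] / 2 ** (levels - l + 1)
    return score


# Two toy clips with the same visual words but swapped space-time positions.
clip_a = [(0.1, 0.1, 0.1, 0), (0.9, 0.9, 0.9, 1)]
clip_b = [(0.9, 0.9, 0.9, 0), (0.1, 0.1, 0.1, 1)]
same_layout = spatial_pyramid_match(clip_a, clip_a)
swapped_layout = spatial_pyramid_match(clip_a, clip_b)
```

With `levels=0` the function reduces to a plain histogram intersection over visual words, so layout is ignored. With more levels, the two toy clips above score lower against each other than a clip scores against itself, because their words match but their space-time positions do not.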