Joey Tianyi Zhou, Le Zhang*, Zhiwen Fang, Jiawei Du, Xi Peng, Yang Xiao
IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 12, pp. 4639-4647, doi: 10.1109/TCSVT.2019.2962229
Publication year: 2020

Recent video anomaly detection methods focus on reconstructing or predicting frames. Under this umbrella, the long-standing inter-class data-imbalance problem resorts to the imbalance between foreground and stationary background objects in video anomaly detection and this has been less investigated by existing solutions. Naively optimizing the reconstructing loss yields a biased optimization towards background reconstruction rather than the objects of interest in the foreground. To solve this, we proposed a simple yet effective solution, termed attention-driven loss to alleviate the foreground-background imbalance problem in anomaly detection. Specifically, we compute a single mask map that summarizes the frame evolution of moving foreground regions and suppresses the background in the training video clips. After that, we construct an attention map through the combination of the mask map and background to give different weights to the foreground and background region respectively. The proposed attention-driven loss is independent of backbone networks and can be easily augmented in most existing anomaly detection models. Augmented with attention-driven loss, the model is able to achieve AUC 86.0% on Avenue, 83.9% on Ped1, 96% on Ped2 datasets. Extensive experimental results and ablation studies further validate the effectiveness of our model. Project page is available at


邮箱地址不会被公开。 必填项已用*标注