Convolutional Auto-Encoder Patch Learning Based Video Anomaly Event Detection and Localization

doi:10.16337/j.1004-9037.2021.03.007

2021(3): 489-497

Affiliation: School of Computer and Electronic Information/School of Artificial Intelligence, Nanjing Normal University, Nanjing 210023, China
Funding: Supported by the National Natural Science Foundation of China (41971343).

Abstract:

Video anomaly event detection and localization aims to detect abnormal events in a video and to pinpoint where they occur. However, video scenes are complex and diverse, and anomalies occur at random, changing locations, which makes them difficult to localize accurately. This paper proposes a video anomaly event detection and localization method based on convolutional auto-encoder patch learning. First, each video frame is divided uniformly into patches, and optical flow and histogram of oriented gradients (HOG) features are extracted from each patch. Then, a separate convolutional auto-encoder is designed for each patch of the video to learn the features of normal motion patterns. Finally, during detection, the magnitude of the auto-encoder's reconstruction error is used to decide whether an event is anomalous. The method learns features for different regions of the video effectively and improves the accuracy of anomaly event localization. Experiments on three public datasets, UCSD Ped1, UCSD Ped2, and CUHK Avenue, show that the method localizes anomalous events accurately and improves frame-level AUC (area under the curve) by 5.61% on average.
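The pipeline described in the abstract (uniform patch grid, one model per patch, reconstruction error as the anomaly score) can be sketched roughly as follows. This is only an illustration under strong assumptions, not the authors' implementation: it works on raw pixel values rather than optical flow and HOG features, and the hypothetical `MeanPatchModel` is a placeholder that "reconstructs" a patch as the mean of the normal training patches at that grid position, standing in for a trained convolutional auto-encoder. All function names, the grid size, and the synthetic data are invented for the example.

```python
import numpy as np


def split_into_patches(frame, rows, cols):
    """Split a 2-D frame into a uniform rows x cols grid of patches."""
    h, w = frame.shape
    ph, pw = h // rows, w // cols  # remainder pixels at the border are dropped
    return {
        (r, c): frame[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
        for r in range(rows)
        for c in range(cols)
    }


class MeanPatchModel:
    """Placeholder for one per-patch auto-encoder (NOT a real CAE).

    It "reconstructs" every input as the mean of the normal training
    patches seen at the same grid position.
    """

    def fit(self, patches):
        self.mean_ = np.mean(np.stack(patches), axis=0)
        return self

    def reconstruct(self, patch):
        return self.mean_


def train_models(normal_frames, rows, cols):
    """Fit one independent model per grid position, using normal frames only."""
    per_position = {}
    for frame in normal_frames:
        for pos, patch in split_into_patches(frame, rows, cols).items():
            per_position.setdefault(pos, []).append(patch)
    return {pos: MeanPatchModel().fit(ps) for pos, ps in per_position.items()}


def patch_errors(frame, models, rows, cols):
    """Mean squared reconstruction error for each grid position."""
    patches = split_into_patches(frame, rows, cols)
    return {
        pos: float(np.mean((patch - models[pos].reconstruct(patch)) ** 2))
        for pos, patch in patches.items()
    }


# Synthetic data: low-amplitude noise plays the role of "normal" video.
rng = np.random.default_rng(0)
ROWS, COLS = 2, 3
normal_frames = [rng.normal(0.0, 0.1, size=(40, 60)) for _ in range(20)]
models = train_models(normal_frames, ROWS, COLS)

test_frame = rng.normal(0.0, 0.1, size=(40, 60))
test_frame[20:40, 40:60] += 2.0  # synthetic anomaly placed in grid cell (1, 2)

errors = patch_errors(test_frame, models, ROWS, COLS)
anomalous_cell = max(errors, key=errors.get)  # localization: worst-reconstructed patch
frame_score = max(errors.values())            # one possible frame-level score
```

Because each grid position has its own model, a high error identifies both that a frame is anomalous and which region is responsible. Taking the maximum patch error as the frame-level score is one simple choice made here for illustration; the abstract does not specify how patch errors are aggregated.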