Human Abnormal Activity Recognition from Video Using Motion Tracking

PP. 52-63


Author(s)

Manoj Kumar 1,2,*, Anoop Kumar Patel 2, Mantosh Biswas 2, Sandeep Singh Sengar 3

1. JSS Academy of Technical Education, NOIDA, India

2. National Institute of Technology, Kurukshetra, India

3. Department of Computer Science, Cardiff Metropolitan University, Cardiff, United Kingdom, CF5 2YB

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2024.03.05

Received: 18 May 2023 / Revised: 17 Jun. 2023 / Accepted: 12 Aug. 2023 / Published: 8 Jun. 2024

Index Terms

Velocity, Orientation, Classifier, Thresholding, Optical Flow, Suspicious Activity, Adaptive Thresholding, SVM

Abstract

The detection of violent behavior in public environments from video content has become increasingly important in recent years due to the rise in violent incidents and the ease of sharing and disseminating video through social media platforms. Efficient and effective techniques for detecting violent behavior in video can help authorities identify potential hazards, prevent crime, and promote public safety. Violence detection can also help mitigate the psychological harm caused by viewing violent content, particularly in vulnerable populations such as infants and victims of violence. We propose an algorithm that computes new descriptors from the magnitude and orientation of optical flow (MOOF) in a video. Descriptors are extracted from the MOOF as four binary histograms, each obtained by applying a different weighted threshold. These descriptors are used to train a Support Vector Machine (SVM) that classifies a video as violent or nonviolent. The proposed algorithm has been evaluated on the publicly available Hockey Fight and Violent Flows datasets. The results demonstrate that the proposed descriptors outperform state-of-the-art algorithms, achieving accuracies of 91.5% and 78.5% on the Hockey Fight and Violent Flows datasets, respectively.
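The pipeline outlined in the abstract (dense optical flow, per-frame magnitude/orientation statistics thresholded at several weights, and an SVM over the resulting histograms) can be illustrated with a short sketch. This is a minimal illustration, not the paper's implementation: the Farneback flow method, the 16 orientation bins, the four threshold weights, and the linear kernel are all assumptions made for demonstration.

    # Minimal sketch of a MOOF-style descriptor pipeline (illustrative only).
    # Assumed, not from the paper: Farneback dense flow, 16 orientation bins,
    # four hypothetical threshold weights, linear SVM.
    import cv2
    import numpy as np
    from sklearn.svm import SVC

    N_BINS = 16                            # orientation histogram bins (assumed)
    THRESH_WEIGHTS = (0.5, 1.0, 1.5, 2.0)  # hypothetical weighted thresholds

    def moof_descriptor(video_path):
        """Build a descriptor from optical-flow magnitude and orientation."""
        cap = cv2.VideoCapture(video_path)
        ok, frame = cap.read()
        if not ok:
            raise IOError("cannot read " + video_path)
        prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hists = np.zeros((len(THRESH_WEIGHTS), N_BINS))
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
            bins = (ang * N_BINS / (2 * np.pi)).astype(int) % N_BINS
            for i, w in enumerate(THRESH_WEIGHTS):
                # Binary mask: pixels whose motion magnitude exceeds a
                # weighted multiple of this frame's mean magnitude.
                mask = mag > w * mag.mean()
                hists[i] += np.bincount(bins[mask].ravel(), minlength=N_BINS)
            prev = gray
        cap.release()
        # Concatenate the four per-threshold histograms and L1-normalize.
        desc = hists.ravel()
        return desc / (desc.sum() + 1e-9)

    # Training/classification: X holds one descriptor per clip,
    # y the violent/nonviolent labels.
    # clf = SVC(kernel="linear").fit(X_train, y_train)
    # pred = clf.predict([moof_descriptor("clip.avi")])

In use, each training clip would be mapped to such a descriptor and the SVM fitted on the resulting feature matrix, as the commented lines above suggest; the binary masks are what makes the histograms "binary" in spirit, since each pixel either contributes to a threshold level or not.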

Cite This Paper

Manoj Kumar, Anoop Kumar Patel, Mantosh Biswas, Sandeep Singh Sengar, "Human Abnormal Activity Recognition from Video Using Motion Tracking", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 16, No. 3, pp. 52-63, 2024. DOI: 10.5815/ijigsp.2024.03.05
