IJISA Vol. 11, No. 3, 8 Mar. 2019
Keywords: Action recognition, RGB-D sensor, skeleton joint, classification
Most action recognition methods achieve high accuracy only after processing the entire video sequence; for security applications, however, it is essential to detect dangerous behavior as early as possible so that warnings can be issued in time. In this paper, we present a human activity recognition method based on 3D skeleton information recovered by an RGB-D sensor. We propose a new descriptor that models the dynamic relations between the 3D locations of skeleton joints, expressed as Euclidean distances and spherical coordinates between normalized joints. PCA dimension reduction is then applied to remove noisy information, enhancing recognition accuracy while reducing computation and decision time. We also study the accuracy of the proposed descriptor when it is computed on only the first few frames and on a reduced set of skeleton joints, in order to perform early action detection, and we explore several classifiers in this setting. We evaluate the approach on two datasets: MSR Daily Activity 3D and our own dataset, INDACT. Experimental results show that the proposed approach classifies actions robustly, outperforms state-of-the-art methods, and maintains good accuracy even with a limited number of frames and a reduced set of skeleton joints.
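The abstract only sketches the descriptor, so the following is a minimal illustrative sketch of how pairwise Euclidean-distance and spherical-coordinate features over normalized joints, followed by PCA reduction, might be computed. The normalization choice (centering on a root joint and scaling), the function names, and the parameter values (e.g. max_frames=15, n_components=50) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import PCA

def normalize_skeleton(joints, root=0):
    """Center joints on a root joint (hypothetical choice: the hip) and
    scale by the mean distance to it, so skeletons are size-invariant."""
    centered = joints - joints[root]            # joints: (n_joints, 3) array
    scale = np.linalg.norm(centered, axis=1).mean()
    return centered / (scale + 1e-8)

def frame_descriptor(joints):
    """Euclidean distance r and spherical angles (theta, phi) for every
    unordered pair of normalized joints in one frame."""
    n, feats = len(joints), []
    for i in range(n):
        for j in range(i + 1, n):
            d = joints[j] - joints[i]
            r = np.linalg.norm(d)                                     # Euclidean distance
            theta = np.arccos(np.clip(d[2] / (r + 1e-8), -1.0, 1.0))  # polar angle
            phi = np.arctan2(d[1], d[0])                              # azimuth
            feats.extend([r, theta, phi])
    return np.array(feats)

def sequence_descriptor(frames, max_frames=None):
    """Stack per-frame pair features over the first `max_frames` frames,
    mirroring the early-detection setting studied in the paper."""
    frames = frames[:max_frames] if max_frames else frames
    return np.concatenate([frame_descriptor(normalize_skeleton(f)) for f in frames])

# Usage sketch: X holds one descriptor per video; PCA then removes noisy
# dimensions before a classifier (e.g. an SVM) is trained on X_reduced.
# X = np.stack([sequence_descriptor(seq, max_frames=15) for seq in sequences])
# X_reduced = PCA(n_components=50).fit_transform(X)
```

Restricting the descriptor to a joint subset would simply mean passing fewer rows per frame, which shrinks the pairwise feature vector quadratically and speeds up both descriptor computation and classification.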
Adlen Kerboua, Mohamed Batouche, "3D Skeleton Action Recognition for Security Improvement", International Journal of Intelligent Systems and Applications (IJISA), Vol. 11, No. 3, pp. 42-52, 2019. DOI: 10.5815/ijisa.2019.03.05