
Hybrid deep learning model-based human action recognition in indoor environment

Published online by Cambridge University Press: 10 October 2023

Manoj Kumar Sain*
Affiliation:
Department of Electronics and Communication, The LNM Institute of Information Technology, Jaipur, India
Rabul Hussain Laskar
Affiliation:
Department of Electronics and Communication, National Institute of Technology, Silchar, Assam, India
Joyeeta Singha
Affiliation:
Department of Electronics and Communication, The LNM Institute of Information Technology, Jaipur, India
Sandeep Saini
Affiliation:
Department of Electronics and Communication, The LNM Institute of Information Technology, Jaipur, India
*Corresponding author: Manoj Kumar Sain; Email: manoj.sain@lnmiit.ac.in

Abstract

Human activity recognition (HAR) is an emerging research challenge with potential applications in many fields, including healthcare, sports, and security. However, only a few publicly accessible datasets for classifying and recognizing physical activity are available in the literature, and these datasets cover relatively few activities. We therefore created our own dataset and compared it with the available datasets NTU-RGBD, UP-FALL, UR-Fall, WISDM, and UCI HAR. The proposed dataset consists of seven activities: eating, exercise, handshake, sit-ups, vomiting, headache, and walking. The activities were recorded from 20 people between the ages of 25 and 40 years using a Kinect V2 sensor at 30 FPS. For classification, we use deep learning architectures based on the convolutional neural network (CNN) and long short-term memory (LSTM). Additionally, we developed a novel hybrid deep learning model for activity identification by combining a CNN, a bidirectional LSTM unit, and a fully connected layer. The proposed model builds unique guided features from the preprocessed skeleton coordinates and their distinctive geometrical and kinematic aspects. Experimental results are contrasted with the performance of stand-alone CNN, LSTM, and ConvLSTM models. The proposed model's accuracy of 99.5% surpasses that of CNN, LSTM, and ConvLSTM, which achieve 95.76%, 97%, and 98.89%, respectively. The proposed technique is invariant to stance, speed, subject, clothing, etc. A sample of the proposed dataset is accessible to the general public.
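To make the described architecture concrete, the following is a minimal sketch of a CNN + bidirectional LSTM + fully connected classifier in Keras. All dimensions and layer sizes are illustrative assumptions (30-frame windows, 25 Kinect V2 joints with three coordinates each, seven classes) rather than the authors' exact configuration, and the guided geometrical and kinematic features described in the paper would be computed upstream of this model.

```python
# Minimal sketch of a hybrid CNN + BiLSTM + fully connected classifier.
# All sizes below are illustrative assumptions, not the paper's settings.
from tensorflow.keras import layers, models

NUM_FRAMES = 30          # frames per sliding window (assumed)
NUM_FEATURES = 25 * 3    # 25 Kinect V2 joints x (x, y, z) coordinates
NUM_CLASSES = 7          # eating, exercise, handshake, sit-ups,
                         # vomiting, headache, walking

model = models.Sequential([
    layers.Input(shape=(NUM_FRAMES, NUM_FEATURES)),
    # 1D convolution over the frame axis extracts local patterns
    # from the per-frame skeleton feature vectors.
    layers.Conv1D(64, kernel_size=3, padding="same", activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    # The bidirectional LSTM models temporal dependencies in both
    # the forward and backward directions of the sequence.
    layers.Bidirectional(layers.LSTM(128)),
    layers.Dropout(0.5),
    # Fully connected layers map the learned features to activity classes.
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```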

Type: Research Article
Copyright: © The Author(s), 2023. Published by Cambridge University Press


References

Gupta, N., Gupta, S. K., Pathak, R. K., Jain, V., Rashidi, P. and Suri, J. S., "Human activity recognition in artificial intelligence framework: A narrative review," Artif. Intell. Rev. 55(6), 4755–4808 (2022).
Zhang, S., Li, Y., Zhang, S., Shahabi, F., Xia, S., Deng, Y. and Alshurafa, N., "Deep learning in human activity recognition with wearable sensors: A review on advances," Sensors 22(4), 1476 (2022).
Zhang, Y., Zhang, F., Jin, Y., Cen, Y., Voronin, V. and Wan, S., "Local correlation ensemble with GCN based on attention features for cross-domain person re-ID," ACM Trans. Multimed. Comput. Commun. Appl. 19(2), 1–22 (2023).
Gao, Z., Xuan, H. Z., Zhang, H., Wan, S. and Choo, K. K. R., "Adaptive fusion and category-level dictionary learning model for multiview human action recognition," IEEE Internet Things J. 6(6), 9280–9293 (2019).
Nair, N., Thomas, C. and Jayagopi, D. B., "Human Activity Recognition Using Temporal Convolutional Network," In: iWOAR '18: Proceedings of the 5th International Workshop on Sensor-based Activity Recognition and Interaction (Association for Computing Machinery, 2018).
Zeng, M., Nguyen, L. T., Yu, B., Mengshoel, O. J., Zhu, J., Wu, P. and Zhang, J., "Convolutional Neural Networks for Human Activity Recognition Using Mobile Sensors," In: 6th International Conference on Mobile Computing, Applications and Services (2014) pp. 197–205.
Dhanraj, S., De, S. and Dash, D., "Efficient Smartphone-based Human Activity Recognition Using Convolutional Neural Network," In: 2019 International Conference on Information Technology (ICIT) (2019) pp. 307–312.
Wang, J., Chen, Y., Hao, S., Peng, X. and Hu, L., "Deep learning for sensor-based activity recognition: A survey," Pattern Recogn. Lett. 119(1), 3–11 (2019). https://www.sciencedirect.com/science/article/pii/S016786551830045X
Ke, Q., Bennamoun, M., An, S., Sohel, F. and Boussaid, F., "A New Representation of Skeleton Sequences for 3D Action Recognition," In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017). http://dx.doi.org/10.1109/CVPR.2017.486
Liu, H., Tu, J. and Liu, M., "Two-stream 3D convolutional neural network for skeleton-based action recognition," arXiv preprint arXiv:1705.08106 (2017).
Ding, W., Liu, K., Belyaev, E. and Cheng, F., "Tensor-based linear dynamical systems for action recognition from 3D skeletons," Pattern Recogn. 77(1), 75–86 (2018). https://www.sciencedirect.com/science/article/pii/S0031320317304909
Scano, A., Chiavenna, A., Malosio, M., Tosatti, L. M. and Molteni, F., "Kinect V2 implementation and testing of the reaching performance scale for motor evaluation of patients with neurological impairment," Med. Eng. Phys. 56(1), 54–58 (2018). https://www.sciencedirect.com/science/article/pii/S1350453318300596
Bijalwan, V., Semwal, V. B. and Gupta, V., "Wearable sensor-based pattern mining for human activity recognition: Deep learning approach," Ind. Robot 49(1), 21–33 (2022).
Jain, R., Semwal, V. B. and Kaushik, P., "Deep ensemble learning approach for lower extremity activities recognition using wearable sensors," Expert Syst. 39(6), e12743 (2022).
Bijalwan, V., Semwal, V. B., Singh, G. and Mandal, T. K., "HDL-PSR: Modelling spatio-temporal features using hybrid deep learning approach for post-stroke rehabilitation," Neural Process. Lett. 55(1), 279–298 (2023).
Semwal, V. B., Gupta, A. and Lalwani, P., "An optimized hybrid deep learning model using ensemble learning approach for human walking activities recognition," J. Supercomput. 77(11), 12256–12279 (2021).
Dua, N., Singh, S. N. and Semwal, V. B., "Multi-input CNN-GRU based human activity recognition using wearable sensors," Computing 103(7), 1461–1478 (2021).
Yadav, S. K., Tiwari, K., Pandey, H. M. and Akbar, S. A., "Skeleton-based human activity recognition using ConvLSTM and guided feature learning," Soft Comput. 26(2), 877–890 (2022).
Ashwini, K., Amutha, R. and Raj, S. A., "Skeletal Data Based Activity Recognition System," In: 2020 International Conference on Communication and Signal Processing (ICCSP) (2020) pp. 444–447.
Liu, J., Wang, G., Hu, P., Duan, L.-Y. and Kot, A. C., "Global Context-Aware Attention LSTM Networks for 3D Action Recognition," In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017) pp. 3671–3680.
Yan, Y., Liao, T., Zhao, J., Wang, J., Ma, L., Lv, W., Xiong, J. and Wang, L., "Deep transfer learning with graph neural network for sensor-based human activity recognition," arXiv preprint (2022). http://arxiv.org/abs/2203.07910
Jiang, X., Lu, Y., Lu, Z. and Zhou, H., "Smartphone-Based Human Activity Recognition Using CNN in Frequency Domain," In: APWeb-WAIM 2018 International Workshops: MWDA, BAH, KGMA, DMMOOC, DS, Macau, China, July 23-25, 2018, Revised Selected Papers (2018) pp. 101–110.
Gholamrezaii, M. and Almodarresi, S. M. T., "Human Activity Recognition Using 2D Convolutional Neural Networks," In: 2019 27th Iranian Conference on Electrical Engineering (ICEE) (2019) pp. 1682–1686.
Abedin, A., Abbasnejad, E., Shi, Q., Ranasinghe, D. and Rezatofighi, H., "Deep Auto-Set: A Deep Auto-Encoder-Set Network for Activity Recognition Using Wearables," In: MobiQuitous '18: Proceedings of the 15th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (2018) pp. 246–253.
Yu, S. and Qin, L., "Human Activity Recognition with Smartphone Inertial Sensors Using Bidir-LSTM Networks," In: 2018 3rd International Conference on Mechanical, Control and Computer Engineering (ICMCCE) (2018) pp. 219–224.
Zhao, Y., Yang, R., Chevalier, G., Xu, X. and Zhang, Z., "Deep residual Bidir-LSTM for human activity recognition using wearable sensors," Math. Probl. Eng. 2018, 1–13 (2018).
Deep, S. and Zheng, X., "Hybrid Model Featuring CNN and LSTM Architecture for Human Activity Recognition on Smartphone Sensor Data," In: 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT) (2019) pp. 259–264.
Xia, K., Huang, J. and Wang, H., "LSTM-CNN architecture for human activity recognition," IEEE Access 8(1), 56855–56866 (2020).
Chen, H., Wang, G., Xue, J.-H. and He, L., "A novel hierarchical framework for human action recognition," Pattern Recognit. 55(1), 148–159 (2016).
Banos, O., Galvez, J. M., Damas, M., Pomares, H. and Rojas, I., "Window size impact in human activity recognition," Sensors (Switzerland) 14(4), 6474–6499 (2014).
Mekruksavanich, S. and Jitpattanakul, A., "LSTM networks using smartphone data for sensor-based human activity recognition in smart homes," Sensors 21(5), 1–25 (2021).
Khan, I. U., Afzal, S. and Lee, J. W., "Human activity recognition via hybrid deep learning based model," Sensors 22(1), 323 (2022).
Salim, H., Alaziz, M. and Abdalla, T., "Human activity recognition using the human skeleton provided by Kinect," Iraqi J. Electr. Electron. Eng. 17(2), 183–189 (2021).
Khatun, M. A., Yousuf, M. A., Ahmed, S., Uddin, M. Z., Alyami, S. A., Al-Ashhab, S., Akhdar, H. F., Khan, A., Azad, A. and Moni, M. A., "Deep CNN-LSTM with self-attention model for human activity recognition using wearable sensor," IEEE J. Transl. Eng. Health Med. 10(1), 1–16 (2022).
Shahroudy, A., Liu, J., Ng, T.-T. and Wang, G., "NTU RGB+D: A large scale dataset for 3D human activity analysis," In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA (2016) pp. 1010–1019.
Martínez-Villaseñor, L., Ponce, H., Brieva, J., Moya-Albor, E., Núñez-Martínez, J. and Peñafort-Asturiano, C., "UP-Fall detection dataset: A multimodal approach," Sensors (Switzerland) 19(9), 1988 (2019).
Lotfi, A., Albawendi, S., Powell, H., Appiah, K. and Langensiepen, C., "Supporting independent living for older adults; employing a visual based fall detection through analysing the motion and shape of the human body," IEEE Access 6(1), 70272–70282 (2018).
Varshney, N., Bakariya, B., Kushwaha, A. K. S. and Khare, M., "Human activity recognition by combining external features with accelerometer sensor data using deep learning network model," Multimed. Tools Appl. 81(24), 34633–34652 (2022). doi: 10.1007/s11042-021-11313-0.
Mim, T. R., Amatullah, M., Afreen, S., Yousuf, M. A., Uddin, S., Alyami, S. A., Hasan, K. F. and Moni, M. A., "GRU-INC: An inception-attention based approach using GRU for human activity recognition," Expert Syst. Appl. 216(2), 119419 (2023).
Dua, N., Singh, S. N., Semwal, V. B. and Challa, S. K., "Inception inspired CNN-GRU hybrid network for human activity recognition," Multimed. Tools Appl. 82(4), 5369–5403 (2023).
Usmani, A., Siddiqui, N. and Islam, S., "Skeleton joint trajectories based human activity recognition using deep RNN," Multimed. Tools Appl. (2023). doi: 10.1007/s11042-023-15024-6.