Real-time fitness action recognition using LSTM/GRU techniques
DOI: https://doi.org/10.25098/9.1.30

Keywords: Human action recognition, LSTM, GRU, Accuracy, UCF dataset

Abstract
Human action recognition in fitness refers to the ability of technology to identify and track human movements during exercise or physical activity. This may include recognizing specific exercises, such as squats or push-ups, as well as tracking overall movement patterns and providing feedback on form and technique. The technology is often used in fitness apps or wearable devices to help individuals improve their workouts and prevent injury. With the help of Artificial Intelligence (AI), fitness enthusiasts can receive accurate, real-time feedback on their workout routines, which can help them improve their performance and reach their fitness goals faster. However, it is important to ensure that the data collected by these devices are secure and not misused in any way. Overall, AI has the potential to revolutionize the way we approach fitness and to promote a healthier lifestyle for many people. Recurrent neural networks use parameters that are shared across each layer (time step) of the network, which makes them well suited to sequential data such as video. This study introduces two models for recognizing multiple actions, one based on LSTM and the other on GRU. An experiment was carried out on the EUCF Sports action dataset to compare the accuracy of the two models. The results indicate that both the LSTM and GRU models achieved significantly higher accuracy than other state-of-the-art action recognition models, with recorded accuracies of 99.78% and 98.53%, respectively, on the EUCF Sports dataset.
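
The abstract does not include implementation details, so the following is only an assumed illustration (not the authors' code) of how two recurrent action classifiers of this kind might be set up in Keras; the clip length, per-frame feature dimension, and number of action classes are placeholders.

# Minimal sketch (assumptions, not the paper's implementation): an LSTM-based and a
# GRU-based classifier over pre-extracted per-frame feature sequences.
from tensorflow.keras import layers, models

TIMESTEPS = 30        # assumed number of frames sampled per clip
FEATURE_DIM = 2048    # assumed size of the per-frame feature vector (e.g. from a CNN)
NUM_CLASSES = 10      # assumed number of action categories

def build_model(cell="lstm"):
    # The recurrent layer shares its weights across all time steps of the clip.
    Recurrent = layers.LSTM if cell == "lstm" else layers.GRU
    model = models.Sequential([
        layers.Input(shape=(TIMESTEPS, FEATURE_DIM)),
        Recurrent(128),                              # temporal modelling of the frame sequence
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

lstm_model = build_model("lstm")
gru_model = build_model("gru")
# Training would then call, e.g., lstm_model.fit(...) on feature sequences
# extracted from the video clips, with one-hot action labels.
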
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.