Abstract: Automatic detection and recognition of safety helmet wearing based on video analysis is an important means of ensuring production safety, as manual supervision of helmet wearing is inefficient. With the advancement of deep learning, using computer vision to assist in the detection of safety helmet wearing holds significant research and application value. However, complex site environments and variable external factors make accurate detection and recognition of helmet wearing challenging. Helmet-wearing detection methods are generally classified into traditional machine learning methods and deep learning methods. Traditional machine learning methods rely on manually selected or statistical features, resulting in poor model stability. Deep learning-based methods are further divided into "two-stage" and "one-stage" methods: two-stage methods achieve high detection accuracy but cannot run in real time, while one-stage methods are fast but typically less accurate. Achieving both accuracy and real-time performance is therefore an important challenge in the development of video-based helmet detection systems, since accurate and fast detection is essential for effective real-time monitoring of production sites. To address these challenges, this paper proposes DS-YOLOv5, a real-time helmet detection and recognition model based on the YOLOv5 framework. The proposed model targets three main problems: first, the insufficient extraction of global information by convolutional neural network (CNN) models; second, the limited robustness of Deep SORT to multiple targets and occlusion in video scenes; and third, the inadequate feature extraction for multiscale targets. The DS-YOLOv5 model takes advantage of an improved Deep SORT multitarget tracking algorithm to reduce missed detections under multitarget and occlusion conditions and to increase the error tolerance of video detection. Further, a simplified transformer module is integrated into the backbone network to enhance the capture of global image information and thus strengthen feature learning for small targets. Finally, a bidirectional feature pyramid network (BiFPN) is used in the neck to fuse multiscale features, which better adapts the model to target scale changes caused by varying photographic distance. The DS-YOLOv5 model was validated through ablation and comparison experiments on the GDUT-HWD dataset, and the tracking capability of the improved Deep SORT was compared with the original Deep SORT on the public pedestrian tracking dataset MOT. Comparisons against five one-stage methods and four helmet detection and recognition models show that the proposed model handles occlusion and target scale changes better and achieves a mean average precision (mAP) of 95.5%, which is superior to the other helmet detection and recognition methods.
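Since the abstract describes the architecture only at a high level, the following minimal PyTorch sketch illustrates the two architectural changes it names: a simplified transformer block applied to backbone feature maps, and BiFPN-style fast normalized fusion [25] in the neck. All module names, layer sizes, and the omission of positional encoding are illustrative assumptions, not the authors' released implementation:

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Simplified transformer layer over a CNN feature map (illustrative C3TR-style module)."""
    def __init__(self, c, num_heads=4):  # c must be divisible by num_heads
        super().__init__()
        self.ln1 = nn.LayerNorm(c)
        self.attn = nn.MultiheadAttention(c, num_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(c)
        self.mlp = nn.Sequential(nn.Linear(c, 4 * c), nn.GELU(), nn.Linear(4 * c, c))

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)        # (B, H*W, C): each pixel becomes a token
        q = self.ln1(t)
        t = t + self.attn(q, q, q)[0]           # global self-attention across all positions
        t = t + self.mlp(self.ln2(t))           # position-wise feed-forward
        return t.transpose(1, 2).reshape(b, c, h, w)

class WeightedFusion(nn.Module):
    """BiFPN-style fast normalized fusion of same-shape multiscale features [25]."""
    def __init__(self, n_inputs, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))  # one learnable weight per input scale
        self.eps = eps

    def forward(self, feats):                   # feats: list of (B, C, H, W) tensors
        w = torch.relu(self.w)                  # keep weights non-negative
        w = w / (w.sum() + self.eps)            # normalize without a softmax
        return sum(wi * f for wi, f in zip(w, feats))
```

The fast normalized fusion keeps the learned weights bounded while avoiding a softmax over them, which is the efficiency argument made for BiFPN in [25].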
Key words: object detection / helmet wearing / YOLOv5 / Deep SORT / transformer
Figure 5. Comparison of Deep SORT and the improved Deep SORT on the MOT16-09 sequence: (a) detection results of Deep SORT in scenario 1; (b) detection results of Deep SORT in scenario 2; (c) detection results of Deep SORT in scenario 3; (d) detection results of the improved Deep SORT in scenario 1; (e) detection results of the improved Deep SORT in scenario 2; (f) detection results of the improved Deep SORT in scenario 3
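For context on what Figure 5 compares: Deep SORT associates detections across frames by combining Kalman-filter motion prediction with appearance embeddings [20]. Below is a minimal sketch of the appearance-based association step, assuming detections and tracks carry L2-normalized embeddings; the data layout and the distance threshold are illustrative assumptions, not the paper's improved algorithm:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, detections, max_cost=0.4):
    """Match track embeddings to detection embeddings by cosine distance."""
    if not tracks or not detections:
        return [], list(range(len(tracks))), list(range(len(detections)))
    t = np.stack([tr["embedding"] for tr in tracks])        # (T, D) unit vectors
    d = np.stack([det["embedding"] for det in detections])  # (N, D) unit vectors
    cost = 1.0 - t @ d.T                                    # cosine distance matrix
    rows, cols = linear_sum_assignment(cost)                # Hungarian matching
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]
    matched_t = {r for r, _ in matches}
    matched_d = {c for _, c in matches}
    unmatched_tracks = [i for i in range(len(tracks)) if i not in matched_t]
    unmatched_dets = [j for j in range(len(detections)) if j not in matched_d]
    return matches, unmatched_tracks, unmatched_dets
```

Occlusion tolerance comes from keeping unmatched tracks alive for several frames, so a target that reappears can be re-associated by appearance instead of spawning a new identity.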
Table 1. Results of the ablation experiments based on the GDUT-HWD dataset

Model                                      mAP/%  P/%   R/%   F1/%  Param/10^6  Time spent/min
YOLOv5                                     92.5   92.1  90.6  91.4  7.1         6.3
YOLOv5 + BiFPN                             93.9   92.5  91.6  92.0  7.5         6.1
YOLOv5 + Transformer                       93.7   92.6  97    94.7  8.1         9.1
YOLOv5 + Deep SORT                         93.3   89.3  91.9  90.6  7.1         6.3
YOLOv5 + BiFPN + Transformer               94.4   92.1  99    95.3  8.2         8.8
YOLOv5 + BiFPN + Transformer + Deep SORT   95.5   89.6  98    93.6  8.2         8.8
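The F1 column in Table 1 is consistent with F1 being the harmonic mean of precision and recall; for example, the YOLOv5 + BiFPN row works out as:

$$ \mathrm{F1} = \frac{2PR}{P + R} = \frac{2 \times 92.5 \times 91.6}{92.5 + 91.6} \approx 92.0\% $$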
Table 2. Comparison of experimental results before and after the Deep SORT improvement
Model                MOTA↑  MOTP↑  IDs↓
Deep SORT            48.5   77.1   48
Improved Deep SORT   51.2   77.6   40
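The headers in Table 2 follow the standard CLEAR MOT definitions (stated here as a reminder, not as formulas introduced by the paper): MOTA aggregates false negatives, false positives, and identity switches over all frames; MOTP averages the localization error of matched track-detection pairs; and IDs counts identity switches, so lower is better:

$$ \mathrm{MOTA} = 1 - \frac{\sum_t \left( \mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t \right)}{\sum_t \mathrm{GT}_t}, \qquad \mathrm{MOTP} = \frac{\sum_{t,i} d_{t,i}}{\sum_t c_t} $$

where $d_{t,i}$ is the localization error of matched pair $i$ in frame $t$ and $c_t$ is the number of matches in frame $t$.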
Table 3. Results of the comparison experiments based on the GDUT-HWD dataset
Model              PNone/%  PRed/%  PWhite/%  PYellow/%  PBlue/%  P/%   R/%   mAP/%  Time/ms  Weight/MB  fps
SSD512 [6]         74.8     78.8    79.5      86.3       80.8     83.5  79.2  81.6   36.8     34.6       —
YOLOv3 [10]        82.4     92.4    75.7      81.9       94.4     86.9  80.4  86.2   14.5     84.3       2
YOLOv3-tiny [10]   74.1     82.8    75.3      80.8       85.0     84.5  76.3  79.6   6.4      28.2       12
YOLOv4 [11]        83.4     94.2    86.1      92.0       95.7     92.4  81.1  90.3   14.2     237.4      —
YOLOv5 [12]        90.2     93.6    91.4      92.6       93.4     92.1  90.6  92.5   2.0      14.2       25
DS-YOLOv5          92.5     95.3    96.2      98.6       95.0     89.6  98.0  95.5   2.2      15.7       18
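The mAP column averages per-class average precision over the five prediction classes in Table 3 (none, red, white, yellow, blue), with each AP taken as the area under that class's precision-recall curve (the standard definition, assumed here to be used unchanged):

$$ \mathrm{AP}_c = \int_0^1 p_c(r)\, dr, \qquad \mathrm{mAP} = \frac{1}{5} \sum_{c=1}^{5} \mathrm{AP}_c $$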
Table 4. Results of the comparison of different safety helmet detection models
References
[1] Liu X H, Ye X N. Skin color detection and Hu moments in helmet recognition research. J East China Univ Sci Technol (Nat Sci Ed), 2014, 40(3): 365
[2] Li K, Zhao X G, Bian J, et al. Automatic safety helmet wearing detection // 2017 IEEE 7th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER). Honolulu, 2017: 617
[3] Zhang L Y, Wu W H, Niu H M, et al. Summary of application research on helmet detection algorithm based on deep learning. Comput Eng Appl, 2022, 58(16): 1
[4] Yogameena B, Menaka K, Saravana Perumaal S. Deep learning-based helmet wear analysis of a motorcycle rider for intelligent surveillance system. IET Intell Transp Syst, 2019, 13(7): 1190 doi: 10.1049/iet-its.2018.5241
[5] Ferdous M, Masudul Ahsan S M. Multi-scale safety hardhat wearing detection using deep learning: A top-down and bottom-up module // 2021 International Conference on Electrical, Communication, and Computer Engineering (ICECCE). Kuala Lumpur, 2021: 1
[6] Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector // European Conference on Computer Vision. Amsterdam, 2016: 21
[7] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection // 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, 2016: 779
[8] Wu J X, Cai N, Chen W J, et al. Automatic detection of hardhats worn by construction personnel: A deep learning approach and benchmark dataset. Autom Constr, 2019, 106: 102894 doi: 10.1016/j.autcon.2019.102894
[9] Xu X F, Zhao W F, Zou H Q, et al. Detection algorithm of safety helmet wear based on MobileNet-SSD. Comput Eng, 2021, 47(10): 298
[10] Redmon J, Farhadi A. YOLOv3: An incremental improvement [J/OL]. arXiv (2018-4-8) [2022-11-11]. https://arxiv.org/abs/1804.02767
[11] Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4: Optimal speed and accuracy of object detection [J/OL]. arXiv (2020-4-23) [2022-11-11]. https://arxiv.org/abs/2004.10934
[12] Ultralytics. yolov5 [DB/OL]. Ultralytics (2023) [2022-11-11]. https://github.com/ultralytics/yolov5
[13] Zhao R, Liu H, Liu P L, et al. Research on safety helmet detection algorithm based on improved YOLOv5s. J B Univ Aeronaut Astronaut, 2021: 1
[14] Yue H, Huang X M, Lin M H, et al. Helmet-wearing detection based on improved YOLOv5. Comput Mod, 2022(6): 104
[15] Huang L Q, Jiang L W, Gao X F. Improved algorithm of YOLOv3 for real-time helmet wear detection in videos. Mod Comput, 2020(30): 32
[16] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need // Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, 2017: 6000
[17] Zhu X K, Lyu S C, Wang X, et al. TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios [J/OL]. arXiv (2021-8-26) [2022-11-11]. https://arxiv.org/abs/2108.11539
[18] He K M, Zhang X Y, Ren S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell, 2015, 37(9): 1904 doi: 10.1109/TPAMI.2015.2389824
[19] Liu S, Qi L, Qin H F, et al. Path aggregation network for instance segmentation // 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, 2018: 8759
[20] Wojke N, Bewley A, Paulus D. Simple online and realtime tracking with a deep association metric // 2017 IEEE International Conference on Image Processing (ICIP). Beijing, 2017: 3645
[21] Kalman R E. A new approach to linear filtering and prediction problems. J Basic Eng, 1960, 82(1): 35 doi: 10.1115/1.3662552
[22] Zheng Z H, Wang P, Liu W, et al. Distance-IoU loss: Faster and better learning for bounding box regression. Proc AAAI Conf Artif Intell, 2020, 34(7): 12993
[23] Carion N, Massa F, Synnaeve G, et al. End-to-end object detection with transformers // European Conference on Computer Vision. Glasgow, 2020: 213
[24] Dosovitskiy A, Beyer L, Kolesnikov A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [J/OL]. arXiv (2021-6-3) [2022-11-11]. https://arxiv.org/abs/2010.11929
[25] Tan M X, Pang R M, Le Q V. EfficientDet: Scalable and efficient object detection // 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, 2020: 10778
[26] Zheng L, Shen L Y, Tian L, et al. Scalable person re-identification: A benchmark // 2015 IEEE International Conference on Computer Vision (ICCV). Santiago, 2015: 1116
[27] Milan A, Leal-Taixé L, Reid I, et al. MOT16: A benchmark for multi-object tracking [J/OL]. arXiv (2016-5-3) [2022-11-11]. https://ui.adsabs.harvard.edu/abs/2016arXiv160300831M
[28] Lv Z X, Wei X, Ma Z G. Improved lightweight safety helmet detection method based on YOLOX. Comput Eng Appl, 2022, 59(1): 61
[29] Wang J B, Wu Y X. Helmet wearing detection algorithm based on improved YOLOv4-tiny. Comput Eng Appl, 2023, 59(4): 183