Blocked-traffic-sign detection algorithm based on improved RT-DETR

于天河; 楊壯壯; 胡金帥; 常夢瑤; 王文龍

doi:10.13374/j.issn2095-9389.2025.06.11.001

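Summary: Traffic sign detection suffers from small target sizes and low detection accuracy; in particular, traditional detection algorithms often fail to recognize traffic signs accurately when they are photographed from a long distance or are severely occluded. This paper proposes a traffic sign detection algorithm based on an improved RT-DETR. First, given the scarcity of datasets covering occluded traffic signs, we build our own dataset of traffic signs under occlusion. Then, a dilated reparameterization block is introduced into the inverted residual mobile block to construct a lightweight composite dilated residual block, which replaces the BasicBlock in the original backbone feature-extraction network and strengthens the model's feature extraction capability. Finally, the loss function of the RT-DETR model is optimized: a DS-IoU joint loss function is proposed to accelerate model convergence. Experimental results show that the improved algorithm achieves an mAP of 94.2% on the self-built dataset, an increase of 4.7% over the original algorithm. On the public datasets TT100K and CCTSDB2021 the mAP is 92.8% and 91.7%, increases of 3.1% and 2.4% over the original algorithm, respectively. Params and GFLOPs are reduced by 26.0% and 12.5%, respectively, relative to the original algorithm. The proposed improvements substantially reduce the computational cost and the number of parameters while effectively improving detection accuracy for occluded traffic signs.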

Abstract: Accurate traffic-sign detection is a foundational capability for intelligent transportation systems and autonomous driving technologies; however, it remains a formidable challenge in real-world environments characterized by small target scales, severe occlusions, highly variable lighting conditions, and complex backgrounds. Owing to inherent limitations in feature extraction, scale sensitivity, and model robustness, traditional convolutional neural network (CNN)-based detectors often struggle to maintain reliable performance when traffic signs appear at long distances or are partially hidden by vehicles, foliage, or roadside infrastructure. To overcome these limitations, this paper presents an enhanced RT-DETR-based approach specifically tailored for occluded-traffic-sign detection under resource-constrained conditions. First, recognizing the scarcity of publicly available data that accurately reflect occlusion scenarios, we curated the traffic sign dataset under occlusion conditions (TSDOC), which comprises 4698 high-resolution images annotated across eight common traffic sign categories (including prohibitory, warning, and indicative signs), with 3572 images allocated for training and 1126 for testing. TSDOC systematically simulates real driving environments by incorporating diverse occlusion types, such as partial masking by other vehicles, foreign object attachment, dynamic shadows, and varying degrees of weather-induced visibility reduction. This enables a rigorous evaluation of detection methods under complex, safety-critical scenarios that closely mirror roadside conditions. Second, to improve the representation of small and occluded objects without incurring excessive computational overhead, we redesigned the RT-DETR backbone by replacing the standard ResNet-18 BasicBlock with a novel composite dilated residual block (CDRB).
Each CDRB integrates a dilated reparameterization block (DRB) into an inverted residual mobile block (iRMB). It thereby combines multi-scale dilated convolutions, which capture the long-range pixel dependencies essential for reconstructing partially visible sign features, with structural reparameterization techniques that streamline the inference graph for reduced latency. Consequently, the modified backbone achieves a 26.0% reduction in parameter count and a 12.5% decrease in giga floating-point operations (GFLOPs) compared to the baseline RT-DETR-R18, while maintaining or improving feature discrimination for occluded targets. Third, for faster convergence and enhanced localization precision, particularly for small and partially occluded signs, we introduce the dynamic scaled IoU loss (DS-IoU), a novel joint loss function that integrates Inner-IoU's auxiliary bounding box strategy with a dynamically adjustable scaling factor, Ratio, and incorporates the minimal point distance metric from MPDIoU. This adaptive loss formulation emphasizes interior region overlap and geometric consistency during training, effectively replacing the conventional GIoU loss and enabling the model to focus on the most informative spatial regions under challenging conditions. Comprehensive experiments demonstrate the effectiveness of the proposed approach. On the TSDOC, TT100K, and CCTSDB2021 benchmarks, the proposed model achieved a mean average precision (mAP) of 94.2%, 92.8%, and 91.7%, respectively (gains of 4.7%, 3.1%, and 2.4% over RT-DETR). The real-time inference speed reached 112.8 s⁻¹, an 18.5% improvement over RT-DETR. Ablation studies show that replacing the backbone with CDRB yields a 2.8% mAP increase, while DS-IoU further boosts recall under occlusion by 3.7%. This lightweight architecture and optimized loss function deliver higher detection accuracy and efficiency in occluded-traffic-sign scenarios, making the model well suited for deployment in resource-constrained embedded systems.
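
The abstract names the two ingredients of DS-IoU (Inner-IoU auxiliary boxes scaled by Ratio, and the corner-distance penalty of MPDIoU) but does not give the formula or the rule that adjusts Ratio during training. The following is therefore only a minimal sketch of how those two published components might be combined, not the authors' DS-IoU. The function names, the (x1, y1, x2, y2) box layout, and the fixed `ratio` argument are all assumptions made for illustration.

```python
def scaled_box(box, ratio):
    """Scale an (x1, y1, x2, y2) box about its centre by `ratio` (Inner-IoU auxiliary box)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * ratio / 2.0
    half_h = (y2 - y1) * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def iou(a, b):
    """Plain intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def inner_mpd_loss(pred, target, img_w, img_h, ratio=1.0):
    """Hypothetical joint loss: 1 - IoU of the auxiliary boxes + MPDIoU corner penalty.

    `ratio` is held fixed here; the abstract says it is adjusted dynamically
    but does not describe the schedule, so none is implemented.
    """
    inner_iou = iou(scaled_box(pred, ratio), scaled_box(target, ratio))
    # MPDIoU penalty: squared distances between matching top-left and
    # bottom-right corners, normalised by the squared image diagonal.
    diag_sq = img_w ** 2 + img_h ** 2
    d_top_left = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d_bottom_right = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    return 1.0 - inner_iou + d_top_left / diag_sq + d_bottom_right / diag_sq
```

In this sketch a ratio above 1 enlarges both auxiliary boxes, so a poorly overlapping pair still yields a non-zero overlap term, while a ratio below 1 shrinks them and rewards only tight alignment. This is the general motivation behind the Inner-IoU idea; how DS-IoU chooses between these regimes is not stated in the abstract.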
