  • Rotated object detection using adaptive angle classification and dynamic sample matching

    • Abstract: Rotated object detection aims to accurately identify objects distributed in arbitrary orientations and is commonly applied in complex scenarios such as remote sensing and industrial character recognition. To address challenges such as discontinuous angle regression and unstable sample matching, this paper proposes a new detection framework based on angle classification. Building on YOLOv8, the method makes two key improvements: it designs a shape-aware adaptive angle smoothing label (SA-ASL) that incorporates target geometry, turning angle prediction from a regression problem into an adaptive label-classification problem and improving the accuracy and stability of angle prediction; and it introduces a progressive dynamic positive/negative sample matching mechanism that fuses horizontal and rotated IoU to improve the quality of positive-sample selection during training. The method achieves an mAP of 0.786 on the public DOTA dataset and 0.924 on an industrial character dataset, demonstrating strong generalization and robustness and confirming its practical value for rotated object detection tasks.

       

      Abstract: Rotated object detection (ROD) is a critical subtask in computer vision, particularly in real-world applications such as aerial remote sensing and industrial character detection, where objects frequently appear in arbitrary orientations with diverse aspect ratios. Unlike standard object detection, which assumes axis-aligned bounding boxes, ROD requires precise estimation of both object location and orientation. Conventional rotation regression methods suffer from angle periodicity and discontinuity, resulting in unstable training and inaccurate predictions. In addition, densely packed scenes with complex backgrounds make positive and negative sample assignment highly sensitive, often leading to suboptimal convergence. To address these challenges, this study proposes a rotation-aware object detection approach based on YOLOv8, enhanced through two key components: a shape-aware adaptive angle classification strategy and a progressive dynamic matching mechanism. The angle classification strategy replaces traditional continuous angle regression with discrete angle classification. Angle annotations are transformed into soft label vectors using a circular Gaussian window function to preserve angle periodicity. A novel feature of this design is the incorporation of target shape information, where the smoothing parameter of the label distribution is adaptively adjusted based on the object’s aspect ratio. Specifically, for elongated targets such as ships or text lines, a narrow window enforces sharp classification around the true angle, enabling fine-grained orientation discrimination. Conversely, for square-like or low-aspect-ratio objects, a wider window accommodates angular ambiguity and stabilizes training across diverse target geometries. This shape-aware mechanism mitigates angular discontinuities and enhances classification accuracy in multi-oriented detection tasks. 
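The shape-aware soft-label construction described above can be sketched as follows. This is a minimal illustration of the idea, not the paper's implementation: the bin count (180), the sigma range, and the mapping from aspect ratio to smoothing width are all illustrative assumptions.

```python
import numpy as np

def circular_gaussian_label(angle_deg, aspect_ratio, num_bins=180,
                            sigma_min=2.0, sigma_max=6.0):
    """Build a soft angle-classification label (sketch of the SA-ASL idea).

    A circular Gaussian window centred on the ground-truth angle bin
    preserves angle periodicity: bin 0 and bin num_bins-1 are neighbours.
    Elongated targets (high aspect ratio) get a narrow window for sharp
    orientation discrimination; near-square targets get a wider one to
    tolerate angular ambiguity.
    """
    # Normalise the aspect ratio to >= 1, then map it to a smoothing
    # width: the more elongated the box, the smaller sigma becomes.
    ar = max(aspect_ratio, 1.0 / aspect_ratio)
    sigma = sigma_max - (sigma_max - sigma_min) * min((ar - 1.0) / 4.0, 1.0)

    bins = np.arange(num_bins)
    center = angle_deg % num_bins
    # Circular (wrap-around) distance from each bin to the ground-truth bin.
    d = np.minimum(np.abs(bins - center), num_bins - np.abs(bins - center))
    label = np.exp(-0.5 * (d / sigma) ** 2)
    return label / label.sum()  # normalise to a probability distribution
```

The circular distance is what keeps 1° and 179° adjacent in label space, which is exactly the discontinuity that plain one-hot or linear-distance smoothing fails to handle.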
To complement the angle classification, a progressive dynamic sample matching mechanism is introduced to improve the quality of positive sample selection during training. Instead of relying solely on rotated IoU (rIoU)—which is unreliable in early training when angle predictions are inaccurate—the method begins with horizontal IoU (hIoU) and gradually incorporates rIoU through linear interpolation as training proceeds. The final matching score integrates three components: classification confidence, IoU-based localization quality, and a cosine-based angle-consistency term. This unified metric guides the selection of top-K positive samples for each ground truth object, emphasizing high-quality matches while suppressing low-quality or ambiguous ones. This progressive transition improves training stability, accelerates convergence, and enhances rotation alignment between predictions and ground truth. Extensive experiments are conducted on two datasets. On the DOTA dataset, which includes multiple object classes with diverse orientations and aspect ratios, the proposed method achieves a mean Average Precision (mAP) of 0.786, with notable improvements in high-aspect-ratio categories such as ships, vehicles, and containers. On a custom industrial character dataset consisting of densely arranged, multi-oriented alphanumeric components captured under complex conditions, the method achieves a mAP of 0.924, demonstrating strong generalization to scene-text-like tasks. Ablation studies isolate the contribution of each component: the shape-aware classification yields a 4.3% improvement in angle-sensitive categories, while the dynamic matching strategy produces smoother loss curves and more concentrated attention heatmaps. The method preserves the anchor-free structure and real-time inference capability of YOLOv8 while substantially improving performance in rotation-sensitive contexts. 
All modifications are lightweight and easily integrable into existing pipelines without structural changes to the backbone.
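The progressive matching score described above can be sketched as below. The linear warm-up schedule, the exponent weights on the confidence and IoU terms, and the exact form of the cosine angle-consistency term are illustrative assumptions; only the hIoU-to-rIoU interpolation and the three-component score follow the description.

```python
import math

def matching_score(cls_conf, h_iou, r_iou, angle_pred_deg, angle_gt_deg,
                   step, total_steps, alpha=1.0, beta=3.0):
    """Sketch of the progressive dynamic matching score.

    Early in training the localisation term leans on horizontal IoU,
    which stays reliable while angle predictions are still poor; it
    shifts linearly towards rotated IoU as training proceeds.
    """
    t = min(step / total_steps, 1.0)        # linear interpolation factor
    iou = (1.0 - t) * h_iou + t * r_iou     # hIoU -> rIoU transition
    # Cosine angle-consistency term in [0, 1]; period 180 deg, since a
    # rotated box is unchanged by a half-turn.
    d = math.radians(angle_pred_deg - angle_gt_deg)
    angle_term = 0.5 * (1.0 + math.cos(2.0 * d))
    return (cls_conf ** alpha) * (iou ** beta) * angle_term
```

In a pipeline of this kind, these scores would be computed for every candidate prediction against each ground-truth box, and the top-K highest-scoring candidates kept as positives; the value of K is not specified in the abstract.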

       
