Main article

Mehmet Yılmaz
Department of Electrical and Electronics Engineering, Middle East Technical University, Ankara, Türkiye
Ayşe Kaya Demir
Department of Computer Engineering, Istanbul Technical University, Istanbul, Türkiye
Burak Arslan*
Department of Civil Engineering, Bilkent University, Ankara, Türkiye
burak.arslan@bilkent.edu.tr

Abstract

Pavement-crack detection under variable illumination remains a hard problem for real-time one-stage detectors: strong shadows and over-exposed highlights corrupt mid-level feature maps, while blurred crack edges make bounding-box regression unreliable. This paper presents a comparative analytical study of channel–spatial attention mechanisms paired with dynamic Intersection-over-Union (IoU) regression losses, organised around the hypothesis that the two components couple in a closed loop and yield a super-additive improvement in tight-localisation performance, that is, a joint gain larger than the sum of the gains each component delivers alone. We evaluate eighteen attention configurations (six attention mechanisms at three insertion positions) crossed with four bounding-box regression losses on a geographically stratified dataset of 1,250 pavement images collected from Turkish secondary and tertiary roads and split into low-light, standard, and high-light illumination sub-populations. The best configuration, a Convolutional Block Attention Module inserted at the end of the detector neck and paired with the Wise-IoU v3 dynamic focusing loss, improves aggregate mean average precision at IoU 0.5 by 9.0 percentage points over a strong one-stage baseline, improves mean average precision over the IoU range 0.5–0.95 by a super-additive 8.3 points, and sustains a 60 FPS inference rate with only 3.16 million parameters. The improvements are largest on the low-light sub-population (+15.5% relative), confirming that attention-driven feature purification is specifically valuable where raw features are most corrupted. A Pareto analysis against nine one-stage baselines places the proposed configuration strictly above every alternative on the parameter–accuracy frontier, demonstrating that the accuracy gain arises not from additional representational capacity but from a reorganisation of capacity around the localisation objective.
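
The size of the experimental grid described in the abstract can be checked with a short enumeration. The sketch below is illustrative only: it assumes the cross between attention configurations and losses is full, and every label other than CBAM, the neck-end position, and Wise-IoU v3 is a hypothetical placeholder, since the abstract does not name the remaining mechanisms, positions, or losses.

```python
from itertools import product

# Only "CBAM", "neck_end", and "WIoU_v3" correspond to items named in the
# abstract; all other labels are hypothetical placeholders.
ATTENTION = ["CBAM"] + [f"attention_{i}" for i in range(2, 7)]     # six mechanisms
POSITIONS = ["neck_end"] + [f"position_{i}" for i in range(2, 4)]  # three insertion positions
LOSSES = ["WIoU_v3"] + [f"loss_{i}" for i in range(2, 5)]          # four regression losses

# Six mechanisms at three positions give the eighteen attention configurations.
attention_configs = list(product(ATTENTION, POSITIONS))

# Crossing those with the four losses gives the full grid (assumed to be complete).
grid = [(attn, pos, loss) for (attn, pos), loss in product(attention_configs, LOSSES)]

print(len(attention_configs))  # 18
print(len(grid))               # 72
```

Under these assumptions the study spans 72 attention–loss combinations, of which the CBAM, neck-end, Wise-IoU v3 combination is the one reported as best.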

Article details

How to Cite

Yılmaz, M., Demir, A. K., & Arslan, B. (2024). Attention-Driven Feature Purification in One-Stage Detectors: A Comparative Analytical Study of Channel–Spatial Mechanisms Under Illumination-Variant Conditions. Journal of AI Analytics and Applications, 2(3), 1-26. https://doi.org/10.63646/jaiaa.2024.020301