MGSANet: a multiscale graph spatial alignment network for weakly aligned RGB-thermal object detection

Wang, Qingwang, Sun, Yuxuan, Shen, Tao, Al-Antary, Mohammad, Alasmary, Hisham and Waqas, Muhammad (ORCID: https://orcid.org/0000-0003-0814-7544) (2025) MGSANet: a multiscale graph spatial alignment network for weakly aligned RGB-thermal object detection. IEEE Transactions on Geoscience and Remote Sensing, 64:5000318. ISSN 0196-2892 (Print), 1558-0644 (Online) (doi:10.1109/TGRS.2025.3647051)

PDF (Author's Accepted Manuscript)
52399 WAQAS_MGSANet_A_Multiscale_Graph_Spatial_Alignment_Network_(AAM)_2025.pdf - Accepted Version
Available under License Creative Commons Attribution.

Download (77MB) | Preview

Abstract

Most current research on color-thermal (i.e., RGB-T) object detection assumes that the RGB and thermal images are strictly aligned. In practice, however, insufficient spatiotemporal synchronization, stereo disparity arising from the cameras' mounting positions, and errors in image-pair registration mean that the same object does not fully overlap in the RGB and thermal images. This position shift can cause distortion, trailing, and blurring during image fusion, which reduces detection accuracy. To address this challenge, we propose a novel multiscale graph spatial alignment network (MGSANet) that effectively alleviates the negative effects of cross-modal image misalignment. Specifically, we represent the feature maps that a backbone network extracts from the RGB and thermal images as graphs, and use a graph attention network (GAT) to model the spatial position deviation between them. Because objects appear at multiple scales, we build these graphs at multiple scales. We then align the RGB and thermal feature maps in a latent feature space according to the learned deviation relationship and use the aligned features for object detection. In addition, given the scarcity of RGB-T datasets captured from unmanned aerial vehicles (UAVs), and to verify detection performance on different platforms, we construct KUSTDrone, an RGB-T object detection dataset collected from a UAV platform. We conduct experiments on datasets collected from both vehicle and UAV platforms. The experimental results demonstrate that MGSANet outperforms competing methods for weakly aligned RGB-T object detection. The dataset will be available under a license at https://github.com/KustTeamWQW/KUSTDrone, and the code will be available at https://github.com/KustTeamWQW/MGSANet.
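
The idea of treating feature maps as graphs and using GAT-style attention to realign one modality against the other can be illustrated with a minimal sketch. This is not the authors' implementation: the 8-connected grid graph, the identity projection in place of a learned weight matrix, the 2x2 average pooling used to obtain a coarser scale, and every function and parameter name below (`grid_neighbors`, `avg_pool2`, `gat_weights`, `align_thermal`, `a_src`, `a_dst`) are assumptions made for illustration only.

```python
import math

def grid_neighbors(h, w, i, j):
    """Cell (i, j) plus its 8-connected neighbours inside an h x w grid."""
    return [(i + di, j + dj)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if 0 <= i + di < h and 0 <= j + dj < w]

def avg_pool2(fmap):
    """Coarser scale via 2x2 average pooling (h and w are assumed even)."""
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[[sum(fmap[2 * i + di][2 * j + dj][k]
                  for di in (0, 1) for dj in (0, 1)) / 4.0
              for k in range(c)]
             for j in range(w // 2)]
            for i in range(h // 2)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gat_weights(query, keys, a_src, a_dst, slope=0.2):
    """GAT-style coefficients: softmax over j of LeakyReLU(a_src.query + a_dst.key_j).
    The shared linear projection of a full GAT layer is taken as identity here."""
    scores = []
    for key in keys:
        e = dot(a_src, query) + dot(a_dst, key)
        scores.append(e if e > 0 else slope * e)
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def align_thermal(rgb, thermal, a_src, a_dst):
    """For each RGB node, mix the thermal nodes in its grid neighbourhood
    using the attention weights (an illustrative alignment step)."""
    h, w, c = len(rgb), len(rgb[0]), len(rgb[0][0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            keys = [thermal[p][q] for p, q in grid_neighbors(h, w, i, j)]
            alpha = gat_weights(rgb[i][j], keys, a_src, a_dst)
            row.append([sum(a * key[k] for a, key in zip(alpha, keys))
                        for k in range(c)])
        out.append(row)
    return out

# Toy 4x4 feature maps with 2 channels; the attention vectors are
# hypothetical stand-ins for learned parameters.
rgb = [[[float(i + j), 1.0] for j in range(4)] for i in range(4)]
thermal = [[[float(i * j), 0.5] for j in range(4)] for i in range(4)]
a_src, a_dst = [0.1, -0.2], [0.3, 0.4]

# Two scales: the original grid and a pooled, coarser grid.
scales = [(rgb, thermal), (avg_pool2(rgb), avg_pool2(thermal))]
aligned = [align_thermal(r, t, a_src, a_dst) for r, t in scales]
```

In this sketch each aligned thermal feature is a convex combination of nearby thermal features, so a small positional shift between the modalities can be absorbed by shifting attention mass toward the neighbour that best matches. A trained model would learn the attention parameters end to end rather than fixing them by hand.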

Item Type: Article
Uncontrolled Keywords: RGB-Thermal benchmark dataset, image misalignment network, multiscale graph construction, graph spatial alignment, RGB-Thermal object detection.
Subjects: Q Science > Q Science (General)
Q Science > QA Mathematics
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Faculty / School / Research Centre / Research Group: Faculty of Engineering & Science
Faculty of Engineering & Science > School of Computing & Mathematical Sciences (CMS)
Last Modified: 05 Feb 2026 12:40
URI: https://gala.gre.ac.uk/id/eprint/52399
