A Cross-Modal Target Matching Method for Optical Image and SAR Images Based on Window Attention Mechanism
YANG Minghui¹, WEI Yali², LU Junyan¹, LI Xinhai¹
1. Hangzhou International Innovation Institute of Beihang University, Hangzhou 311115, Zhejiang, China;
2. Shanghai Electro-Mechanical Engineering Institute, Shanghai 201109, China
Abstract: In remote sensing ship tracking, target matching suffers from low efficiency and accuracy because of the significant modal differences between optical images and synthetic aperture radar (SAR) images and because cross-modal matching technology remains inadequate. To address this, an optical-SAR cross-modal target matching method based on a window attention mechanism is proposed. The method uses a cross-modal dual-branch embedding module to process the two image types separately and extracts modality-agnostic features through a hierarchical window attention mechanism. It fuses a modal information embedding and a ship-size embedding to supplement the semantic and physical attribute information of ships and to enhance the learning of cross-modally aligned features. Experimental results on the HOSS dataset show that the proposed method achieves an overall mean Average Precision (mAP) of 46.0%, a Top-1 (R1) matching accuracy of 60.8%, a Top-5 (R5) matching accuracy of 74.4%, and a Top-10 (R10) matching accuracy of 79.5%. Compared with the state-of-the-art (SOTA) TransOSS model, R5 and R10 improve by 3.4% and 1.1%, respectively. The key matching indicators in both the optical-to-SAR and SAR-to-optical directions are superior to those of the current best model. These results indicate that the proposed method outperforms the SOTA model and can provide technical support for continuous ship tracking in scenarios such as maritime search and rescue and shipping supervision.
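For readers unfamiliar with the reported metrics, the following is a minimal sketch of how mAP and Top-k (R1, R5, R10) accuracy are commonly computed in retrieval-style matching. It assumes each query (for example, an optical ship image) yields a gallery list (for example, SAR images) sorted by descending similarity, with each entry flagged as a true match or not. The function names and the toy data are hypothetical, and the exact evaluation protocol used for the HOSS dataset is not stated in the abstract.

```python
from typing import Sequence


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """AP for one query: mean of the precision values at each rank
    where a true match occurs in the similarity-sorted gallery."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked: Sequence[Sequence[bool]]) -> float:
    """mAP: average of the per-query AP values."""
    return sum(average_precision(r) for r in all_ranked) / len(all_ranked)


def rank_k_accuracy(all_ranked: Sequence[Sequence[bool]], k: int) -> float:
    """Top-k accuracy: fraction of queries with at least one true match
    among the first k gallery results."""
    found = sum(1 for ranked in all_ranked if any(ranked[:k]))
    return found / len(all_ranked)


# Toy data (illustrative only): two queries, four gallery items each,
# already sorted by descending similarity to the query.
toy = [
    [True, False, True, False],   # AP = (1/1 + 2/3) / 2 = 5/6
    [False, False, True, False],  # AP = (1/3) / 1 = 1/3
]

map_score = mean_average_precision(toy)  # (5/6 + 1/3) / 2 = 7/12
r1 = rank_k_accuracy(toy, 1)             # only the first query hits at rank 1
r3 = rank_k_accuracy(toy, 3)             # both queries hit within the top 3
```

Under this definition, Top-k accuracy rewards only the first correct match, while mAP also accounts for where every remaining correct match appears in the ranking, which is why the two can diverge.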
Received: 10 November 2025
Published: 11 March 2026