Abstract: Visible, synthetic aperture radar (SAR), and infrared images offer distinct advantages: high resolution, robust resistance to environmental interference, and clear visualisation of thermal objects, respectively. By combining these three modalities, the key components of a warship can be detected accurately under all-day, all-weather conditions. However, existing publicly available warship datasets consist only of single-modality images, so no dataset provides the three modalities aligned in both time and space. To resolve this issue, this paper proposes a method for constructing a multi-modal warship image dataset using a virtual engine and a multi-modal data generation model. The dataset comprises 5 055 images (1 685 visible, 1 685 infrared, and 1 685 SAR), each with a resolution of 640×640 pixels. To detect the key components of a warship accurately, this paper also introduces a detection method based on adaptive region localisation. Specifically, the reconstructed three-modality images are used as input data. A multi-scale feature extraction neck module acquires features by exploiting the complementary information of the modalities at different scales, and a regional adaptive localisation module is designed to better locate the object regions. The key parts of the warship are then detected from these localised features. Experimental results on the constructed multi-modal warship image dataset show that the proposed method increases detection accuracy: compared with the benchmark model, it improves the mean average precision by 5.52% at an IoU threshold of 0.5.
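To make the evaluation criterion concrete, the following minimal sketch shows how a predicted box is typically matched to a ground-truth box at an IoU threshold of 0.5. It is an illustration only, not the paper's evaluation code: the corner format `(x1, y1, x2, y2)` and the function names `iou` and `is_true_positive` are assumptions introduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    The corner format is an assumption for this sketch; the paper does not
    state which box representation it uses.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a match when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    print(is_true_positive((2, 0, 12, 10), gt))  # overlap 80 / union 120
    print(is_true_positive((5, 0, 15, 10), gt))  # overlap 50 / union 150
```

Mean average precision at this threshold then averages, over the part classes, the area under each class's precision-recall curve built from such matches.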
WANG Yili, LI Qiang, SHEN Junyi, YANG Yidong, WANG Qi. Multi-Modal Warship Image Generation and Its Key Part Detection[J]. Air & Space Defense, 2025, 8(1): 77-85.