Chinese Space Science and Technology ›› 2025, Vol. 45 ›› Issue (4): 48-60. doi: 10.16708/j.cnki.1000-758X.2025.0057

A spatial enhanced Transformer based unsupervised pansharpening method

XIONG Zhangxi1, LI Wei1,2,*, YANG Fei3, LIN Hongyang3

  1. School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
  2. The National Key Laboratory of Science and Technology on Space-Born Intelligent Information Processing, Beijing 100081, China
  3. Changchun Champion Optics Co., Ltd., Changchun 130000, China
  • Received: 2024-08-05; Revision received: 2025-01-03; Accepted: 2025-01-20; Online: 2025-07-22; Published: 2025-08-01

Abstract: To address insufficient spatial texture and spectral distortion in the fusion of panchromatic and multispectral images, an unsupervised pansharpening method based on a spatially enhanced Transformer (Pan-SET) is proposed. First, a multi-scale feature extraction module obtains features of the panchromatic and multispectral images at different scales, improving the generalization of the features and the robustness of the model. Second, a high-frequency information extraction module extracts high-frequency information from the panchromatic image. The multi-scale features of the panchromatic and multispectral images are combined by a simple fusion step and then fed, together with the high-frequency information of the panchromatic image, into the spatially enhanced Transformer. This Transformer consists of a self-attention mechanism and a spatial detail enhancement attention mechanism: the self-attention mechanism captures self-similarity and extracts long-range features, while the spatial detail enhancement attention mechanism restricts enhancement to textures, edges, and other detailed regions. Finally, after fusion and enhancement through multiple spatially enhanced Transformer layers, the features are reconstructed into a multispectral image with high spatial resolution. Comparative experiments are conducted on the GF-2 and WV-3 data in the PanCollection dataset, and seven quality evaluation indices are used to objectively assess the fused images produced by each method. The proposed method achieves the best QNR on both datasets, with values of 0.9692 and 0.9327, respectively. The visual results and the quality evaluation indices indicate that the proposed method outperforms the comparison methods both subjectively, in visual perception, and objectively, in quantitative evaluation, effectively reducing the spatial and spectral distortion of the fused images.
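
The high-frequency extraction and detail-restricted enhancement described in the abstract can be illustrated with a minimal sketch. It is not the Pan-SET implementation: the abstract does not specify how the high-frequency module or the spatial detail enhancement attention are built, so this example assumes a simple box-blur high-pass filter and a hand-written, non-learned gate. The names `box_blur`, `high_frequency`, `detail_gate`, and `enhance` are hypothetical and chosen only for illustration.

```python
import numpy as np


def box_blur(img, k=3):
    """Low-pass filter: mean over a k x k window (k odd), edges replicated."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    h, w = img.shape
    # Sum the k*k shifted views of the padded image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def high_frequency(pan, k=3):
    """Assumed high-pass step: the image minus its low-pass version."""
    return pan.astype(np.float64) - box_blur(pan, k)


def detail_gate(hf, eps=1e-8):
    """Assumed stand-in for detail attention: weights in [0, 1] that are
    zero in flat regions and largest at the strongest edges."""
    mag = np.abs(hf)
    return mag / (mag.max() + eps)


def enhance(feature, hf):
    """Inject high-frequency detail only where the gate is non-zero."""
    return feature + detail_gate(hf) * hf


# Toy panchromatic patch with a single vertical edge.
pan = np.zeros((8, 8))
pan[:, 4:] = 1.0
hf = high_frequency(pan)
sharpened = enhance(pan, hf)
```

In this toy patch the flat columns far from the edge are left unchanged, while the two columns adjacent to the edge are pushed apart. That mirrors the stated goal of enhancing only textures, edges, and details. In the proposed method the corresponding weights would be learned inside the Transformer rather than computed by a fixed rule.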

Key words: pansharpening, panchromatic, multi-spectral, multi-scale feature extraction, Transformer