Chinese Space Science and Technology ›› 2025, Vol. 45 ›› Issue (4): 48-60. doi: 10.16708/j.cnki.1000-758X.2025.0057

• Special Topic on Intelligent Remote Sensing Applications I •

A spatial enhanced Transformer based unsupervised pansharpening method

XIONG Zhangxi1, LI Wei1,2,*, YANG Fei3, LIN Hongyang3

  1. School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
  2. The National Key Laboratory of Science and Technology on Space-Born Intelligent Information Processing, Beijing 100081, China
  3. Changchun Champion Optics Co., Ltd., Changchun 130000, China
  • Received: 2024-08-05  Revision received: 2025-01-03  Accepted: 2025-01-20  Online: 2025-07-22  Published: 2025-08-01

Abstract: To address insufficient spatial texture and spectral distortion in the fusion of panchromatic and multispectral images, an unsupervised pansharpening method based on a spatially enhanced Transformer (Pan-SET) is proposed. First, a multiscale feature extraction module is designed to obtain features of the panchromatic and multispectral images at different scales, which improves the generalization ability of the features and the robustness of the model. Second, a high-frequency information extraction module is designed to extract high-frequency information from the panchromatic image. After a simple fusion step, the multiscale features of the panchromatic and multispectral images are fed, together with the high-frequency information of the panchromatic image, into the spatially enhanced Transformer. This Transformer consists of a self-attention mechanism and a spatial detail enhancement attention mechanism: the self-attention mechanism captures self-similarity and extracts long-range features, while the spatial detail enhancement attention mechanism ensures that only textures, edges, and details are enhanced. Finally, after fusion and enhancement by multiple spatially enhanced Transformer layers, the features are reconstructed into a multispectral image with high spatial resolution. Comparative experiments are conducted on the GF-2 and WV-3 data in the PanCollection dataset, and seven quality evaluation indices are used to objectively assess the fused images produced by each method. The proposed method achieves the best QNR on both datasets, with values of 0.9692 and 0.9327, respectively. The visual results and quality evaluation indices indicate that the proposed method outperforms the comparison methods in both subjective visual quality and objective evaluation, and effectively reduces the spatial-spectral distortion of the fused images.
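
For readers unfamiliar with the QNR (Quality with No Reference) index reported above: in its commonly used form it combines a spectral distortion index D_λ and a spatial distortion index D_s as QNR = (1 - D_λ)^α (1 - D_s)^β, so that 1 is the ideal value. The sketch below only illustrates this final combination step. The function name `qnr`, the default exponents α = β = 1, and the sample distortion values are assumptions for illustration and are not taken from the paper, which may compute the index differently.

```python
def qnr(d_lambda: float, d_s: float, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Combine spectral (d_lambda) and spatial (d_s) distortion into a QNR score.

    Both distortion indices are expected in [0, 1]; a score of 1.0 means
    no measured distortion. alpha and beta weight the two terms
    (both default to 1 here, which is an assumption, not the paper's setting).
    """
    for name, value in (("d_lambda", d_lambda), ("d_s", d_s)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return (1.0 - d_lambda) ** alpha * (1.0 - d_s) ** beta


# Hypothetical distortion values, chosen only to show the arithmetic.
example_score = qnr(d_lambda=0.02, d_s=0.03)
print(f"QNR = {example_score:.4f}")
```

Because the two factors are multiplied, a method must keep both spectral and spatial distortion low to score close to 1; a small value of either index alone is not enough.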

Key words: pansharpening, panchromatic, multi-spectral, multi-scale feature extraction, Transformer