Scene-Adaptive Granularity-Progressive Multimodal Image Fusion Method

  • Abstract: To address the difficulty of multi-scene visual perception under complex and variable illumination, this paper proposes a scene-adaptive, granularity-progressive multimodal image fusion method. An image encoder is designed to encode a description of the scene and embed it into the fusion network, guiding the network to generate fused images of different styles according to the scene. A feature extraction module based on state-space equations strengthens the network's feature representation, achieving global feature perception at linear complexity. A granularity-progressive fusion module performs global refinement fusion of the multimodal features: the features are serialized to build a cross-modal coordinate attention mechanism that fine-tunes and fuses them. In addition, enhanced images generated from prior knowledge are used as labels, and homologous heterogeneous losses are constructed according to the environment to achieve scene-adaptive multimodal image fusion. Comparative experiments against 10 state-of-the-art algorithms on the three public datasets MSRS, TNO, and LLVIP show that the proposed method achieves better visual quality and quantitative metrics.
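
The state-space feature extraction described in the abstract can be pictured as a discretized linear recurrence, h_t = A h_{t-1} + B x_t, y_t = C h_t, scanned once over the serialized feature map, which is what yields a global receptive field at linear cost. The following is a minimal PyTorch sketch of such a scan; the module name, the diagonal parameterization of A, and all hyperparameters are illustrative assumptions, not the paper's actual design.

```python
# Minimal sketch of linear-complexity global feature perception with a
# discretized state-space recurrence: h_t = A*h_{t-1} + B*x_t, y_t = C*h_t.
# Illustrative simplification only; names and hyperparameters are assumptions.
import torch
import torch.nn as nn

class SimpleSSMBlock(nn.Module):
    def __init__(self, dim: int, state_dim: int = 16):
        super().__init__()
        # Diagonal state matrix, initialized for stable decay (0 < A < 1).
        self.log_a = nn.Parameter(torch.full((dim, state_dim), -0.5))
        self.b = nn.Parameter(torch.randn(dim, state_dim) * 0.1)
        self.c = nn.Parameter(torch.randn(dim, state_dim) * 0.1)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) -- e.g. an H*W-flattened feature map.
        bsz, seq_len, _ = x.shape
        a = torch.exp(self.log_a)                    # (dim, state_dim), positive
        h = x.new_zeros(bsz, *self.b.shape)          # running state per channel
        ys = []
        for t in range(seq_len):                     # one pass: O(seq_len)
            h = a * h + self.b * x[:, t, :, None]    # state update
            ys.append((h * self.c).sum(-1))          # readout y_t = C h_t
        return self.proj(torch.stack(ys, dim=1))
```

In use, a (B, C, H, W) feature map would be flattened to a (B, H*W, C) sequence before the scan; the single pass over that sequence is what keeps the cost linear in the number of pixels.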

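The granularity-progressive fusion step, cross-modal coordinate attention over serialized features, admits a similar sketch: features from both modalities are pooled along each spatial axis, a shared bottleneck produces height-wise and width-wise gates, and the gates fine-tune each modality before fusion. The block below is one plausible reading of that design, assuming aligned infrared and visible feature maps of identical shape; all module and variable names are hypothetical.

```python
# Minimal sketch of a cross-modal coordinate-attention fusion block,
# assuming aligned infrared and visible features of the same shape.
# Names and structure are assumptions, not the paper's code.
import torch
import torch.nn as nn

class CrossModalCoordAttnFusion(nn.Module):
    def __init__(self, dim: int, reduction: int = 8):
        super().__init__()
        hidden = max(dim // reduction, 4)
        # Shared bottleneck over concatenated direction-wise descriptors.
        self.bottleneck = nn.Sequential(
            nn.Conv2d(2 * dim, hidden, kernel_size=1),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
        )
        self.attn_h = nn.Conv2d(hidden, dim, kernel_size=1)  # height-wise gate
        self.attn_w = nn.Conv2d(hidden, dim, kernel_size=1)  # width-wise gate
        self.out = nn.Conv2d(2 * dim, dim, kernel_size=1)

    def forward(self, ir: torch.Tensor, vis: torch.Tensor) -> torch.Tensor:
        # ir, vis: (batch, dim, H, W)
        x = torch.cat([ir, vis], dim=1)           # joint cross-modal features
        _, _, h, w = x.shape
        # Serialize along each spatial axis (coordinate descriptors).
        desc_h = x.mean(dim=3, keepdim=True)      # (b, 2*dim, H, 1)
        desc_w = x.mean(dim=2, keepdim=True)      # (b, 2*dim, 1, W)
        desc = torch.cat([desc_h, desc_w.transpose(2, 3)], dim=2)
        desc = self.bottleneck(desc)
        dh, dw = desc.split([h, w], dim=2)
        gate_h = torch.sigmoid(self.attn_h(dh))                   # (b, dim, H, 1)
        gate_w = torch.sigmoid(self.attn_w(dw.transpose(2, 3)))   # (b, dim, 1, W)
        # Fine-tune each modality with the shared positional gates, then fuse.
        ir_ref = ir * gate_h * gate_w
        vis_ref = vis * gate_h * gate_w
        return self.out(torch.cat([ir_ref, vis_ref], dim=1))
```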

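The scene-adaptive supervision can likewise be illustrated with a toy loss: an enhanced image produced by a prior-based method (for example, histogram equalization) serves as the label, and an illumination estimate re-weights the intensity terms so that bright scenes lean on the enhanced visible label while dark scenes lean on the infrared input. The weighting scheme and constants below are assumptions for illustration, not the paper's exact homologous heterogeneous loss.

```python
# Toy sketch of an environment-adaptive fusion loss with an enhanced label.
# The day/night weighting and constants are illustrative assumptions.
import torch
import torch.nn.functional as F

def sobel_grad(img: torch.Tensor) -> torch.Tensor:
    """Per-channel Sobel gradient magnitude of a (B, C, H, W) image."""
    kx = img.new_tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    c = img.shape[1]
    gx = F.conv2d(img, kx.expand(c, 1, 3, 3), padding=1, groups=c)
    gy = F.conv2d(img, ky.expand(c, 1, 3, 3), padding=1, groups=c)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def scene_adaptive_loss(fused, ir, vis_enhanced, day_prob):
    """day_prob: (B,) estimated illumination level of each scene, in [0, 1]."""
    day_prob = day_prob.view(-1, 1, 1, 1)
    # Intensity: follow the enhanced visible label by day, infrared by night.
    l_int = (day_prob * (fused - vis_enhanced).abs()
             + (1 - day_prob) * (fused - ir).abs()).mean()
    # Gradient: keep the strongest edges from either source in all scenes.
    l_grad = (sobel_grad(fused)
              - torch.maximum(sobel_grad(ir), sobel_grad(vis_enhanced))).abs().mean()
    return l_int + l_grad
```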
