Remote sensing image segmentation method based on dynamic optimized detail-aware network[J]. Chinese Journal of Engineering. DOI: 10.13374/j.issn2095-9389.2025.04.09.002

Remote sensing image segmentation method based on dynamic optimized detail-aware network

  • Semantic segmentation is an important tool in remote sensing image processing and is widely applied. High-resolution remote sensing images are difficult to segment for several reasons: complex background interference, large intra-class differences, and strong inter-class similarities, which together blur target boundaries. In addition, target objects such as buildings, vegetation, and roads vary greatly in scale, which makes the segmentation task harder still. Existing remote sensing segmentation models based on Convolutional Neural Networks (CNNs) and Transformers have achieved great success, but they still struggle to fully preserve the detailed feature maps of the original encoder and to capture global contextual information dynamically. To address this, a novel segmentation method built on a CNN-Transformer hybrid framework, the Dynamic Optimized Detail-Aware Network (DODNet), is proposed. First, ResNeXt-50 is adopted as the encoder backbone, and a multi-subtraction perception module (MSPM) is designed to collect the spatial detail differences between multi-scale feature maps, which efficiently reduces redundant information. Second, a dynamic information fusion block (DIFB) is designed for the decoder; it combines a global bi-level routing self-attention branch with a local attention branch. The global branch first uses a learnable regional routing network to filter out background regions with low association, then performs fine-grained attention within the retained, semantically important windows. This addresses both background interference and computational cost in remote sensing image processing and yields efficient global modeling. The local attention branch uses multi-scale convolutions to supply the local information that the global branch has difficulty capturing. Finally, a new channel-spatial attention module, the unified feature extractor (UFE), is proposed to further capture semantic and contextual information. Quantitative and visual analyses from comparison and ablation experiments on the Vaihingen and Potsdam datasets show that DODNet outperforms eight state-of-the-art segmentation methods in terms of F1 score, overall accuracy (OA), and mean intersection over union (mIoU). In particular, the mIoU reaches 84.96% and 87.64% on the two datasets, which verifies the ability of DODNet to handle segmentation under complex background interference, large intra-class differences, and strong inter-class similarities.
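
The two-stage idea behind the global bi-level routing self-attention branch can be illustrated with a minimal NumPy sketch. This is not the DODNet implementation: the function name `bilevel_routing_attention`, the mean-pooled region descriptors, the single attention head, and the random projection weights are all assumptions made for illustration. The sketch only shows the general pattern the abstract describes: a coarse region-to-region routing step that keeps the `topk` most related regions, followed by token-level attention restricted to those regions.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bilevel_routing_attention(x, wq, wk, wv, region_size, topk):
    """Hypothetical single-head sketch. x: (H, W, C) feature map."""
    H, W, C = x.shape
    s = region_size
    assert H % s == 0 and W % s == 0, "H and W must be divisible by region_size"
    nh, nw = H // s, W // s
    n_regions = nh * nw

    # Split the map into non-overlapping s x s regions: (n_regions, s*s, C).
    tokens = (x.reshape(nh, s, nw, s, C)
               .transpose(0, 2, 1, 3, 4)
               .reshape(n_regions, s * s, C))
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv

    # Coarse level: one descriptor per region (mean pooling is an assumption),
    # then a region-to-region affinity matrix.
    q_r, k_r = q.mean(axis=1), k.mean(axis=1)
    affinity = q_r @ k_r.T                                # (n_regions, n_regions)

    # Routing: each region keeps only its topk most related regions.
    routed = np.argsort(-affinity, axis=1)[:, :topk]      # (n_regions, topk)

    # Fine level: gather keys/values of the routed regions and run
    # ordinary scaled dot-product attention over those tokens only.
    k_g = k[routed].reshape(n_regions, topk * s * s, C)
    v_g = v[routed].reshape(n_regions, topk * s * s, C)
    attn = softmax(q @ k_g.transpose(0, 2, 1) / np.sqrt(C), axis=-1)
    out = attn @ v_g                                      # (n_regions, s*s, C)

    # Undo the region partition to recover the (H, W, C) layout.
    out = (out.reshape(nh, nw, s, s, C)
              .transpose(0, 2, 1, 3, 4)
              .reshape(H, W, C))
    return out, routed


# Toy usage: an 8 x 8 map with 4 channels, split into four 4 x 4 regions.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
wq, wk, wv = (rng.standard_normal((4, 4)) for _ in range(3))
out, routed = bilevel_routing_attention(x, wq, wk, wv, region_size=4, topk=2)
```

With `topk` equal to the number of regions, the sketch reduces to full self-attention over all tokens; smaller values of `topk` shrink the key/value set each query attends to, which is where the computational saving comes from.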