Abstract:
This paper proposes a method based on multi-scale structural constraints that simultaneously generates paired raw images and annotations, augmenting the data available for training medical image segmentation models and thereby alleviating the limitation that data scarcity imposes on their generalization ability. The method uses a label generation network to produce multi-resolution annotation images that guide the generation of raw images, ensuring that the two share multi-scale semantic features. In addition, prior annotation information and a distance loss function are introduced into the image generation network to explicitly constrain structural differences in the lower layers, driving the image generator to focus on the structural consistency between annotations and images during training and thus to produce high-quality paired raw images and annotations. Experimental results show that the raw images and annotations generated by this method exhibit diverse feature representations and highly consistent low-level structures. As a data augmentation technique, it significantly outperforms traditional augmentation methods in the accuracy gain it provides for segmentation tasks. The method reduces the cost of acquiring and annotating medical image data and is shown to be effective on datasets from multiple domains, promoting the practical application of artificial intelligence technology.
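To make the role of the distance loss concrete, the following is a minimal PyTorch-style sketch of one plausible multi-scale structural consistency term. The function name `multiscale_structure_loss`, the mean-channel projection of the feature maps, and the choice of an L1 distance are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def multiscale_structure_loss(image_feats, annotation):
    """Hypothetical distance loss: average L1 distance between a
    1-channel projection of each lower-layer generator feature map
    and the annotation mask resized to the same spatial resolution."""
    loss = 0.0
    for feat in image_feats:
        # Resize the annotation to match this feature map's spatial size.
        target = F.interpolate(annotation, size=feat.shape[-2:], mode="nearest")
        # Crude structural projection: channel-wise mean of the features.
        loss = loss + F.l1_loss(feat.mean(dim=1, keepdim=True), target)
    return loss / len(image_feats)

# Toy usage: two generator feature maps at decreasing resolutions
# and a binary annotation mask at full resolution.
feats = [torch.randn(1, 64, 64, 64), torch.randn(1, 128, 32, 32)]
mask = torch.randint(0, 2, (1, 1, 64, 64)).float()
print(multiscale_structure_loss(feats, mask))
```

In a full implementation, the projection would more likely be a learned layer and the loss would be one term alongside the usual adversarial objective; this sketch only illustrates how a distance term can tie lower-layer structure to the annotation at several scales.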