# SIDM

This is the official implementation of 'Semantic Interpolative Diffusion Model: Bridging the Interpolation to Masks and Colonoscopy Image Synthesis for Robust Generalization', to be published at MICCAI 2025.
- Requirements
- Dataset Preparation
- Training Your Own SIDM
- Sampling with SIDM
- Inference with SIDM
- Acknowledgement
- Citations
## Requirements

```shell
conda create -n SIDM python=3.8.10
conda activate SIDM
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
```
## Dataset Preparation

The proposed framework processes medical video data and snapshot (still-image) data differently, so the two must be stored separately.
Please organize the dataset with the following structure:
```
├── ${data_root}
│   ├── ${train_dataset_dir}
│   │   ├── images_video
│   │   │   ├── ***.png
│   │   ├── images
│   │   │   ├── ***.png
│   │   ├── masks_video
│   │   │   ├── ***.png
│   │   ├── masks
│   │   │   ├── ***.png
```
Details on the processing of the proposed background semantic labels can be found in `datasets_label.log`.
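Before training, it can help to verify that the directory tree above is in place. The sketch below checks the four expected subdirectories and assumes that each image has a mask with the same filename; that pairing rule is an assumption for illustration, not taken from the official code.

```python
# Sanity-check the expected SIDM dataset layout before training.
# Assumption: an image and its mask share the same filename (not
# confirmed by the official code).
from pathlib import Path

REQUIRED = ("images_video", "images", "masks_video", "masks")

def check_dataset(train_dir):
    root = Path(train_dir)
    for sub in REQUIRED:
        if not (root / sub).is_dir():
            raise FileNotFoundError(f"missing directory: {root / sub}")
    # Every image is expected to have a mask with the same filename.
    for img_dir, mask_dir in (("images_video", "masks_video"),
                              ("images", "masks")):
        imgs = {p.name for p in (root / img_dir).glob("*.png")}
        masks = {p.name for p in (root / mask_dir).glob("*.png")}
        if imgs != masks:
            raise ValueError(
                f"image/mask mismatch between {img_dir} and {mask_dir}")
```

Run `check_dataset("${data_root}/${train_dataset_dir}")` once after organizing the data; it raises immediately if a directory is missing or an image lacks its mask.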
## Training Your Own SIDM

To train your own SIDM, follow these steps:
- If the training dataset is a video dataset, verify that it has been properly separated.
- Verify the data processing procedure by referring to `polyp.py`.
- Run the following command:
```shell
python train.py --data_path ./TrainDataset \
                --save_dir 'your_path' \
                --image_size 256 \
                --n_epoch 5000 \
                --n_T 1000 \
                --batch_size 2
```
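The `--n_T` flag sets the number of diffusion timesteps. As a point of reference, the sketch below shows a standard DDPM-style linear beta schedule over `n_T` steps; SIDM's actual noise schedule may differ, so treat this only as an illustration of what the flag controls.

```python
# Standard DDPM-style linear beta schedule, shown only to illustrate
# what --n_T controls; SIDM's real schedule may differ.

def linear_beta_schedule(n_T, beta_1=1e-4, beta_T=0.02):
    """Linearly interpolate noise variances from beta_1 to beta_T."""
    return [beta_1 + (beta_T - beta_1) * t / (n_T - 1) for t in range(n_T)]

betas = linear_beta_schedule(1000)  # matches --n_T 1000 above

# Cumulative signal retention: after n_T steps, almost no signal remains.
alpha_bar = 1.0
for b in betas:
    alpha_bar *= 1.0 - b
```

With `n_T = 1000`, `alpha_bar` collapses to nearly zero, i.e. the forward process ends at (approximately) pure noise, which is what makes reverse-process sampling from noise possible.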
## Sampling with SIDM

To sample with SIDM, run the following command:

```shell
python sampling.py
```
You can configure the interpolation ratio within the code to control the sampling process. By default, a 1:1 interpolation ratio is used.
> **Note**
> Make sure to set `save_dir` correctly to avoid file-saving issues.
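To make the interpolation ratio concrete, the sketch below blends two semantic masks with a configurable weight, where `ratio=0.5` corresponds to the default 1:1 mix. The function name and the linear blend are assumptions for illustration; they are not taken from `sampling.py`.

```python
# Illustrative mask interpolation with a configurable ratio.
# The linear blend below is an assumption, not SIDM's actual scheme.
import numpy as np

def interpolate(a, b, ratio=0.5):
    """Blend two label maps; ratio=0.5 is the default 1:1 mix."""
    return (1.0 - ratio) * a + ratio * b

mask_a = np.zeros((4, 4))
mask_b = np.ones((4, 4))
mixed = interpolate(mask_a, mask_b)  # 1:1 blend
```

Sweeping `ratio` from 0 to 1 moves the generated sample from one source's semantics toward the other's, which is the knob the paragraph above refers to.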
## Inference with SIDM

The inference code applies interpolation between any two desired data samples. We provide the `LabeledDataset` used in this study. To perform inference using this dataset, please refer to `inference.ipynb`.
## Acknowledgement

This repository is based on LDM, guided-diffusion, ArSDM, CFG, and SDM. We sincerely thank the original authors for their valuable contributions and outstanding work.
## Citations

The paper is to be published in Sep. 2025; the citation will be added upon publication.