The previous version, HCP-Diffusion V1, is available on the main branch.
HCP-Diffusion is a diffusion model toolbox built on top of the 🐱 RainbowNeko Engine.
It features a clean code structure and a flexible Python-based configuration file, making it easier to conduct and manage complex experiments. It includes a wide variety of training components, and compared to existing frameworks, it's more extensible, flexible, and user-friendly.
HCP-Diffusion allows you to use a single .py config file to unify training workflows across popular methods and model architectures, including Prompt-tuning (Textual Inversion), DreamArtist, Fine-tuning, DreamBooth, LoRA, ControlNet, and more.
Different techniques can also be freely combined.
This framework also implements DreamArtist++, an upgraded version of DreamArtist based on LoRA. It enables high generalization and controllability with just a single image for training.
Compared to the original DreamArtist, it offers better stability, image quality, controllability, and faster training.
Install PyTorch.
Install via pip:
pip install hcpdiff
# Initialize configuration
hcpinit

Install from source:
git clone https://github.com/7eu7d7/HCP-Diffusion.git
cd HCP-Diffusion
pip install -e .
# Initialize configuration
hcpinit

Use xFormers to reduce memory usage and accelerate training:
# Choose the appropriate xformers version for your PyTorch version
pip install xformers==?

RainbowNeko Engine supports configuration files written in a Python-like syntax. This allows users to call functions and classes directly within the configuration file, with function parameters inheritable from parent configuration files. The framework automatically handles the formatting of these configuration files.
For example, consider the following configuration file:
dict(
    layer=Linear(in_features=4, out_features=4)
)

During parsing, this will be automatically compiled into:
dict(
    layer=dict(_target_=Linear, in_features=4, out_features=4)
)

After parsing, the framework will instantiate the components accordingly. This means users can write configuration files using familiar Python syntax.
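To make the compile-then-instantiate idea concrete, here is a minimal, dependency-free sketch. It is not the RainbowNeko Engine implementation: the `instantiate` helper and the stand-in `Linear` class are hypothetical names introduced only for illustration, and the real parser may behave differently.

```python
from typing import Any


class Linear:
    """Hypothetical stand-in for a layer class such as torch.nn.Linear."""

    def __init__(self, in_features: int, out_features: int):
        self.in_features = in_features
        self.out_features = out_features


def instantiate(node: Any) -> Any:
    """Recursively build objects from dicts that carry a `_target_` callable.

    Assumption: this mirrors the general `_target_` convention only;
    it is not the framework's actual code.
    """
    if isinstance(node, dict):
        # Build nested values first so arguments are ready before the call.
        built = {key: instantiate(value) for key, value in node.items()}
        target = built.pop("_target_", None)
        return target(**built) if target is not None else built
    if isinstance(node, (list, tuple)):
        return type(node)(instantiate(item) for item in node)
    return node


# The compiled form from the example above.
compiled = dict(layer=dict(_target_=Linear, in_features=4, out_features=4))
objects = instantiate(compiled)
print(type(objects["layer"]).__name__, objects["layer"].in_features)  # prints: Linear 4
```

Keeping the compiled config as plain nested dicts means it can be inspected or modified before any object is constructed, which is one plausible reason for a deferred-instantiation design.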
Features
| Model Name | Status |
|---|---|
| Stable Diffusion 1.5 | ✅ Supported |
| Stable Diffusion XL (SDXL) | ✅ Supported |
| PixArt | ✅ Supported |
| FLUX | ✅ Supported |
| Stable Diffusion 3 (SD3) | 🚧 In Development |

| Feature | Description/Support |
|---|---|
| LoRA Layer-wise Configuration | ✅ Supported (including Conv2d) |
| Layer-wise Fine-Tuning | ✅ Supported |
| Multi-token Prompt-Tuning | ✅ Supported |
| Layer-wise Model Merging | ✅ Supported |
| Custom Optimizers | ✅ Supported (Lion, DAdaptation, pytorch-optimizer, etc.) |
| Custom LR Schedulers | ✅ Supported |

| Method | Status |
|---|---|
| ControlNet (including training) | ✅ Supported |
| DreamArtist / DreamArtist++ | ✅ Supported |
| Token Attention Adjustment | ✅ Supported |
| Max Sentence Length Extension | ✅ Supported |
| Textual Inversion (Custom Tokens) | ✅ Supported |
| CLIP Skip | ✅ Supported |

| Tool/Library | Supported Modules |
|---|---|
| 🤗 Accelerate | ✅ Supported |
| Colossal-AI | ✅ Supported |
| xFormers | ✅ Supported (UNet and text encoder) |

| Feature | Description |
|---|---|
| Aspect Ratio Bucket (ARB) | ✅ Auto-clustering supported |
| Multi-source / Multi-dataset | ✅ Supported |
| LMDB | ✅ Supported |
| webdataset | ✅ Supported |
| Local Attention Enhancement | ✅ Supported |
| Tag Shuffling & Dropout | ✅ Multiple tag editing strategies |

| Loss Type | Description |
|---|---|
| Min-SNR | ✅ Supported |
| SSIM | ✅ Supported |
| GWLoss | ✅ Supported |

| Strategy Type | Status |
|---|---|
| DDPM | ✅ Supported |
| EDM | ✅ Supported |
| Flow Matching | ✅ Supported |

| Feature | Description/Status |
|---|---|
| Image Preview | ✅ Supported (workflow preview) |
| FID | 🚧 In Development |
| CLIP Score | 🚧 In Development |
| CCIP Score | 🚧 In Development |
| Corrupt Score | 🚧 In Development |

| Feature | Description/Support |
|---|---|
| Batch Generation | ✅ Supported |
| Generate from Prompt Dataset | ✅ Supported |
| Image to Image | ✅ Supported |
| Inpaint | ✅ Supported |
| Token Weight | ✅ Supported |
HCP-Diffusion provides training scripts based on 🤗 Accelerate.
# Multi-GPU training, configure GPUs in cfgs/launcher/multi.yaml
hcp_train --cfg cfgs/train/py/your_config.py
# Single-GPU training, configure GPU in cfgs/launcher/single.yaml
hcp_train_1gpu --cfg cfgs/train/py/your_config.py

You can also override config items via command line:
# Override base model path
hcp_train --cfg cfgs/train/py/your_config.py model.wrapper.models.ckpt_path=pretrained_model_path

Use the workflow defined in the Python config to generate images:
hcp_run --cfg cfgs/workflow/text2img.py

Or override parameters via command line:
hcp_run --cfg cfgs/workflow/text2img_cli.py \
pretrained_model=pretrained_model_path \
prompt='positive_prompt' \
negative_prompt='negative_prompt' \
seed=42

- Model Training Guide
- LoRA Training Tutorial
- Image Generation Guide
- Configuration File Explanation
- Model Format Explanation
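The command-line overrides shown in the training and generation commands above use dotted keys such as `model.wrapper.models.ckpt_path=...`. As a rough illustration of how such `key=value` pairs could map onto a nested configuration, here is a small sketch. The `apply_overrides` and `parse_value` helpers are hypothetical and are not part of HCP-Diffusion or RainbowNeko Engine; the actual override handling may differ.

```python
import ast
from typing import Any, Iterable


def parse_value(text: str) -> Any:
    """Interpret a CLI value as a Python literal when possible, else keep the string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(cfg: dict, overrides: Iterable[str]) -> dict:
    """Apply `dotted.key=value` overrides to a nested dict in place (illustrative only)."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for name in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw_value)
    return cfg


# A toy config; the key layout is assumed from the override example above.
cfg = {"model": {"wrapper": {"models": {"ckpt_path": "default_path"}}}, "seed": 0}
apply_overrides(cfg, ["model.wrapper.models.ckpt_path=pretrained_model_path", "seed=42"])
print(cfg["model"]["wrapper"]["models"]["ckpt_path"], cfg["seed"])  # prints: pretrained_model_path 42
```

In this sketch, values that parse as Python literals (such as `42`) keep their type, while anything else (such as a path) stays a string.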
We welcome contributions to support more models and features.
Maintained by HCP-Lab at Sun Yat-sen University.
@article{DBLP:journals/corr/abs-2211-11337,
author = {Ziyi Dong and
Pengxu Wei and
Liang Lin},
title = {DreamArtist: Towards Controllable One-Shot Text-to-Image Generation
via Positive-Negative Prompt-Tuning},
journal = {CoRR},
volume = {abs/2211.11337},
year = {2022},
doi = {10.48550/arXiv.2211.11337},
eprinttype = {arXiv},
eprint = {2211.11337},
}