B4M: Breaking Low-Rank Adapter for Making Content-Style Customization
Yu Xu1,2, Fan Tang1, Juan Cao1, Yuxin Zhang3, Oliver Deussen4, Weiming Dong3, Jintao Li1, Tong-Yee Lee5
1Institute of Computing Technology, Chinese Academy of Sciences, 2University of Chinese Academy of Sciences, 3Institute of Automation, Chinese Academy of Sciences, 4University of Konstanz, 5National Cheng Kung University
Abstract:
Personalized generation paradigms empower designers to customize visual intellectual properties with the help of textual descriptions by adapting pre-trained text-to-image models on a few images. Recent studies focus on simultaneously customizing content and detailed visual style in images but often struggle to disentangle the two. In this study, we reconsider the customization of content and style concepts from the perspective of parameter space construction. Unlike existing methods that utilize a shared parameter space for content and style learning, we propose a novel framework that separates the parameter space to facilitate individual learning of content and style by introducing "partly learnable projection" (PLP) matrices that divide the original adapters into separate sub-parameter spaces. A "break-for-make" customization learning pipeline based on PLP is proposed: we first "break" the original adapters into "up projection" and "down projection" matrices for the content and style concepts under an orthogonal prior, and then "make" the entire parameter space by reconstructing the content and style PLP matrices, using Riemannian preconditioning to adaptively balance content and style learning. Experiments on various styles, including textures, materials, and artistic styles, show that our method outperforms state-of-the-art single/multiple concept learning pipelines regarding content-style-prompt alignment.
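For intuition, here is a rough conceptual sketch of the "partly learnable projection" idea described above: the rank dimension of a low-rank adapter is split ("broken") into a content sub-space and a style sub-space that can be trained separately and later recombined ("made"). This is not the repository's implementation; all class names, shapes, and hyper-parameters below are illustrative assumptions, and the orthogonal prior and Riemannian preconditioning from the paper are omitted here.

```python
# Conceptual sketch only (not the B4M implementation): a low-rank adapter whose
# rank dimension is split into a content block and a style block, so the two
# concepts are learned in separate sub-parameter spaces and recombined later.
# All names and hyper-parameters are illustrative assumptions.
import torch
import torch.nn as nn


class PartlyLearnableLoRA(nn.Module):
    def __init__(self, d_in, d_out, rank_content=4, rank_style=4):
        super().__init__()
        # Separate "down"/"up" projections for the content and style sub-spaces.
        self.down_c = nn.Parameter(torch.randn(rank_content, d_in) * 0.01)
        self.up_c = nn.Parameter(torch.zeros(d_out, rank_content))
        self.down_s = nn.Parameter(torch.randn(rank_style, d_in) * 0.01)
        self.up_s = nn.Parameter(torch.zeros(d_out, rank_style))

    def delta(self, use_content=True, use_style=True):
        # Low-rank residual weight assembled from the selected sub-spaces.
        d_out, d_in = self.up_c.shape[0], self.down_c.shape[1]
        d = torch.zeros(d_out, d_in, device=self.up_c.device, dtype=self.up_c.dtype)
        if use_content:
            d = d + self.up_c @ self.down_c
        if use_style:
            d = d + self.up_s @ self.down_s
        return d

    def forward(self, x, use_content=True, use_style=True):
        # Residual update; the frozen base projection of the diffusion model is omitted.
        return x @ self.delta(use_content, use_style).T


# "Break" stage (content): train only the content sub-space on content images;
# the style stage mirrors this by freezing up_c/down_c instead.
layer = PartlyLearnableLoRA(d_in=768, d_out=768)
for p in (layer.down_s, layer.up_s):
    p.requires_grad_(False)
```

In practice, such adapters would be attached to the attention projections of the diffusion UNet; the repository's training scripts handle this via Diffusers.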
Our code is built on Hugging Face Diffusers (0.22.0); please follow sdxl for environment setup.
First clone this repo, and then:
- `cd B4M`
- `pip install -e .`
The training process includes two stages. In the first stage, train the content model and the style model separately:
- Train the content model: run `bash code/train_content.sh`
- Train the style model: run `bash code/train_style.sh`
Note: In both scripts, please replace any dataset paths, output directories, and other file paths with your own.
After completing the first stage, run the following script to start the second-stage training:
`bash code/train_second_stage.sh`
As before, make sure to update the paths in the script to fit your environment.
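The abstract above describes this second stage as reconstructing the content and style PLP matrices with Riemannian preconditioning to adaptively balance content and style learning. For intuition only, the sketch below shows one standard form of a Riemannian-preconditioned (scaled-gradient) update for low-rank factors, in which each factor's gradient is preconditioned by the damped Gram matrix of the other factor. It is not the repository's optimizer, and all names are assumptions; the actual optimization is implemented by the training scripts.

```python
# Illustrative only: a scaled-gradient ("Riemannian preconditioned") update for
# a low-rank factorization delta_W = up @ down. Each factor's gradient is
# preconditioned by the damped Gram matrix of the other factor, which balances
# the effective step size between the two factors. Names are assumptions.
import torch


def preconditioned_step(up, down, lr=1e-4, damping=1e-8):
    """One manual update for leaf tensors `up` (d_out x r) and `down` (r x d_in),
    assuming their .grad fields were populated by a preceding loss.backward()."""
    r = up.shape[1]
    eye = torch.eye(r, device=up.device, dtype=up.dtype)
    with torch.no_grad():
        # Precondition grad(down) with (up^T up + damping I)^-1 ...
        g_down = torch.linalg.solve(up.T @ up + damping * eye, down.grad)
        # ... and grad(up) with (down down^T + damping I)^-1 (applied from the right).
        g_up = torch.linalg.solve(down @ down.T + damping * eye, up.grad.T).T
        down -= lr * g_down
        up -= lr * g_up
```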
After training is complete, you can run inference with `python infer.py`.
Make sure to configure the model path and input settings inside the script as needed.
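If you prefer to script inference directly, the following is a minimal sketch using the Diffusers SDXL pipeline. It assumes the trained adapters are exported in the standard Diffusers LoRA format; `infer.py` remains the supported entry point, and the checkpoint path below is a placeholder.

```python
# Minimal sketch, not a drop-in replacement for infer.py: load an SDXL pipeline
# and a trained LoRA checkpoint with Diffusers, then sample with a combined
# content-style prompt. The checkpoint path is a placeholder and assumes the
# adapters are stored in the standard Diffusers LoRA format.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("path/to/teddybear_paper_checkpoint")  # placeholder path

prompt = "an image of snq teddybear made from paper cutout art style"
image = pipe(prompt, num_inference_steps=30).images[0]
image.save("teddybear_paper.png")
```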
Some checkpoints are hosted on Google Drive due to file size limitations on GitHub. You can download them here:
Download Checkpoints from Google Drive
| Content Reference | Style Reference | Prompt | Checkpoint |
|---|---|---|---|
| teddybear.jpg | paper.jpg | "an image of snq teddybear made from paper cutout art style" | teddybear_paper |
TODO:
- Open-source the code
- Upload the checkpoints used in the paper examples
If you make use of our work, please cite our paper:
@article{xu2025b4m,
title={B4M: Breaking Low-Rank Adapter for Making Content-Style Customization},
author={Xu, Yu and Tang, Fan and Cao, Juan and Zhang, Yuxin and Deussen, Oliver and Dong, Weiming and Li, Jintao and Lee, Tong-Yee},
journal={ACM Transactions on Graphics},
volume={44},
number={2},
pages={1--17},
year={2025},
publisher={ACM New York, NY}
}