


Custom Diffusion is a method for fine-tuning text-to-image diffusion models such as Stable Diffusion with only a few images (4-20) of a new concept. It achieves fast training (about 6 minutes on 2 A100 GPUs) by updating only a small subset of model parameters: the key and value projection matrices in the cross-attention layers. As a result, each new concept adds only about 75MB of storage. The method can combine multiple concepts, such as a new object and a new artistic style, and uses regularization images to prevent overfitting. The repository provides scripts for single- and multi-concept fine-tuning and for merging fine-tuned models, supports training and inference through the Diffusers library, and includes a dataset of 101 concepts with evaluation prompts.
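Because only the cross-attention key/value projections are updated, the trainable set can be selected purely by parameter name. A minimal sketch of that selection logic, assuming diffusers-style UNet parameter naming (where `attn1` denotes self-attention and `attn2` cross-attention); the parameter names below are illustrative, not exhaustive:

```python
def trainable_in_custom_diffusion(param_name: str) -> bool:
    """True only for cross-attention key/value projection weights."""
    return ".attn2." in param_name and (
        param_name.endswith("to_k.weight") or param_name.endswith("to_v.weight")
    )

# Illustrative parameter names following diffusers' UNet naming convention.
names = [
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_q.weight",
    "mid_block.attentions.0.transformer_blocks.0.attn2.to_out.0.weight",
]
trainable = [n for n in names if trainable_in_custom_diffusion(n)]
# In a real training loop one would call
# p.requires_grad_(trainable_in_custom_diffusion(n))
# for each (n, p) in unet.named_parameters().
```

Only the `to_k`/`to_v` entries under `attn2` pass the filter; self-attention and output projections stay frozen.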
- Fine-tunes only the key and value projection matrices in the cross-attention layers, which significantly reduces training time.
- Each additional concept requires only about 75MB of extra storage thanks to the parameter-efficient fine-tuning approach.
- Supports combining multiple concepts, such as a new object plus a new artistic style, multiple new objects, or a new object plus a new category.
- Enables merging two separately fine-tuned models into a single model via an optimization-based procedure.
- Supports training and inference through the Diffusers library, providing a user-friendly interface and access to advanced features.
- Uses a small set of regularization images (200) to prevent overfitting during fine-tuning.
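The ~75MB-per-concept figure follows from a back-of-the-envelope parameter count. A rough sketch, assuming the Stable Diffusion v1.x UNet layout (16 cross-attention layers with inner dimensions 320/640/1280 and a 768-dimensional CLIP text-encoder context); the exact size on disk also depends on the checkpoint format and any stored token embeddings:

```python
# Assumed cross-attention inner dimensions in the SD v1.x UNet:
# down blocks use 320/640/1280, the mid block 1280, and up blocks mirror down.
inner_dims = [320, 320, 640, 640, 1280, 1280]                    # down blocks
inner_dims += [1280]                                             # mid block
inner_dims += [1280, 1280, 1280, 640, 640, 640, 320, 320, 320]   # up blocks

context_dim = 768  # CLIP text-encoder hidden size
# Each layer stores one key and one value projection of shape
# (inner_dim, context_dim).
params = sum(d * context_dim * 2 for d in inner_dims)
megabytes = params * 4 / 1e6  # fp32, 4 bytes per parameter
print(f"{params:,} parameters, ~{megabytes:.0f} MB")
```

This comes out to roughly 19.2M parameters, or on the order of 75MB in fp32, consistent with the storage cost stated above.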
1. Clone the Custom Diffusion repository: `git clone https://github.com/adobe-research/custom-diffusion.git`
2. Navigate to the repository: `cd custom-diffusion`
3. Clone the Stable Diffusion repository: `git clone https://github.com/CompVis/stable-diffusion.git`
4. Navigate to the Stable Diffusion directory: `cd stable-diffusion`
5. Create a conda environment: `conda env create -f environment.yaml`
6. Activate the environment: `conda activate ldm`
7. Install clip-retrieval and tqdm: `pip install clip-retrieval tqdm`
8. Download the Stable Diffusion model checkpoint: `wget https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt`
9. For single-concept fine-tuning, download and unzip the dataset: `wget https://www.cs.cmu.edu/~custom-diffusion/assets/data.zip && unzip data.zip`
10. Run the training script: `bash scripts/finetune_real.sh "cat" data/cat real_reg/samples_cat cat finetune_addtoken.yaml <pretrained-model-path>`
11. Save updated model weights: `python src/get_deltas.py --path logs/<folder-name> --newtoken 1`
12. Sample the fine-tuned model: `python sample.py --prompt "<new1> cat playing with a ball" --delta_ckpt logs/<folder-name>/checkpoints/delta_epoch=000004.ckpt --ckpt <pretrained-model-path>`
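As an alternative to the CompVis-based `sample.py` step above, weights trained with the Diffusers Custom Diffusion example script can be loaded directly into a pipeline. A sketch, assuming the weight file names that script produces (`pytorch_custom_diffusion_weights.bin` and `<new1>.bin`) and an illustrative local weights directory:

```python
def custom_prompt(modifier_token: str, subject: str) -> str:
    # Custom Diffusion prompts reference the learned modifier token, e.g. "<new1>".
    return f"{modifier_token} {subject}"

def load_pipeline(weights_dir: str):
    # Heavy model download / GPU work is kept inside this function.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")
    # The attention-processor weights hold the fine-tuned K/V projections.
    pipe.unet.load_attn_procs(
        weights_dir, weight_name="pytorch_custom_diffusion_weights.bin"
    )
    # The learned modifier-token embedding loads as a textual inversion.
    pipe.load_textual_inversion(weights_dir, weight_name="<new1>.bin")
    return pipe

# Example usage (requires a GPU and a model download):
#     pipe = load_pipeline("path/to/model")
#     image = pipe(custom_prompt("<new1>", "cat playing with a ball")).images[0]
#     image.save("new1_cat.png")
```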
