Nitro-E is a family of text-to-image diffusion models focused on highly efficient training. With just 304M parameters, Nitro-E is designed to be resource-friendly for both training and inference. Training takes only 1.5 days on a single node with 8 AMD Instinct™ MI300X GPUs. On the inference side, Nitro-E delivers a throughput of 18.8 samples per second (batch size 32, 512px images) on a single AMD Instinct™ MI300X GPU, and the distilled version further increases the throughput to 39.3 samples per second. On a consumer device with a Strix Halo iGPU, our model can generate a 512px image in only 0.16 seconds.
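As a rough illustration of how such batched throughput numbers can be measured, here is a minimal sketch (not the official benchmark script). It reuses the `init_pipe` helper shown in the inference examples later in this README, and it assumes the pipeline accepts a list of prompts as a batch, which is standard for diffusers-style pipelines but should be verified against this repo:

```python
import time
import torch
from core.tools.inference_pipe import init_pipe

# Rough throughput sketch: generate one batch of 32 images at 512px and
# report samples per second. Assumes a diffusers-style pipeline that takes
# a list of prompts; adjust if the repo's pipeline differs.
device = torch.device('cuda:0')
pipe = init_pipe(device, torch.bfloat16, 512, repo_name="amd/Nitro-E",
                 ckpt_name='Nitro-E-512px.safetensors')

batch_size = 32
prompts = ["A hot air balloon in the shape of a heart grand canyon"] * batch_size

torch.cuda.synchronize()
start = time.time()
images = pipe(prompt=prompts, width=512, height=512,
              num_inference_steps=20, guidance_scale=4.5).images
torch.cuda.synchronize()
print(f"throughput: {batch_size / (time.time() - start):.1f} samples/s")
```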
This repository provides training and data preparation scripts to reproduce our results. We hope this codebase for efficient diffusion model training enables researchers to iterate faster on ideas and lowers the barrier for independent developers to build custom models.
- [2025.10.24]: 🔥 Released the Nitro-E-512px model, the Nitro-E-512px-GRPO post-trained (GRPO) model, the Nitro-E-512px-dist distilled model, and the training and inference code!
When running on AMD Instinct™ GPUs, we recommend using the public ROCm PyTorch Docker images to get optimized performance out of the box.
docker pull rocm/pytorch:rocm6.2.2_ubuntu22.04_py3.10_pytorch_release_2.3.0

pip install diffusers==0.32.2 transformers==4.49.0 accelerate==1.7.0 wandb torchmetrics pycocotools torchmetrics[image] mosaicml-streaming==0.11.0 beautifulsoup4 tabulate timm==0.9.1 pyarrow einops omegaconf sentencepiece==0.2.0 pandas==2.2.3 alive-progress

git clone https://github.com/ROCm/flash-attention.git
cd flash-attention
MAX_JOBS=`nproc` python setup.py install

The E-MMDiT models were trained on a dataset of ~25M images consisting of both real and synthetic data that are openly available on the internet, including Segment-Anything-1B, JourneyDB, and FLUX-generated images using prompts from DiffusionDB and DataComp.
We provide a full pipeline to create the data.
Please go to the datasets folder and run:
cd datasets
bash scripts/get_data_all.sh

to create the complete version of the dataset.
Or run:
cd datasets
bash scripts/get_data_partial.sh

to create a tiny version for testing purposes.
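Since mosaicml-streaming is among the dependencies, the prepared shards can typically be read back with a StreamingDataset for a quick sanity check. A minimal sketch, assuming the pipeline writes MDS shards to a local directory; the path and field names below are hypothetical, so adjust them to the actual output of the get_data scripts:

```python
from streaming import StreamingDataset

# Read-back sketch (hypothetical path and keys): point 'local' at the directory
# of MDS shards produced by the data scripts and inspect one sample.
dataset = StreamingDataset(local="datasets/output/mds", shuffle=False, batch_size=1)

sample = dataset[0]
print(len(dataset))      # number of samples in the prepared dataset
print(sample.keys())     # inspect the stored fields, e.g. image / caption keys
```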
Launch a training session using this script:
bash scripts/train_512.sh

Please modify configs/accelerate.yaml for multi-GPU / multi-node distributed training, torch compile, etc., and the specific yaml files in configs for experiment settings.
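Because omegaconf is among the dependencies, one convenient way to adjust the experiment yaml files is to override a few settings programmatically before launching training. The sketch below uses hypothetical file names and keys, not the repo's actual schema, so check the files in configs for the real structure:

```python
from omegaconf import OmegaConf

# Hypothetical example: load an experiment config, apply a couple of overrides
# from a dotlist, and save a custom copy to train with.
cfg = OmegaConf.load("configs/train_512.yaml")                       # placeholder path
overrides = OmegaConf.from_dotlist(["train.global_batch_size=256",   # placeholder keys
                                    "train.lr=1e-4"])
cfg = OmegaConf.merge(cfg, overrides)
OmegaConf.save(cfg, "configs/train_512_custom.yaml")
print(OmegaConf.to_yaml(cfg))
```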
import torch
from core.tools.inference_pipe import init_pipe
device = torch.device('cuda:0')
dtype = torch.bfloat16
repo_name = "amd/Nitro-E"
resolution = 512
ckpt_name = 'Nitro-E-512px.safetensors'
# for 1024px model
# resolution = 1024
# ckpt_name = 'Nitro-E-1024px.safetensors'
pipe = init_pipe(device, dtype, resolution, repo_name=repo_name, ckpt_name=ckpt_name)
prompt = 'A hot air balloon in the shape of a heart grand canyon'
images = pipe(prompt=prompt, width=resolution, height=resolution, num_inference_steps=20, guidance_scale=4.5).images

For the distilled model Nitro-E-512px-dist, which generates images in 4 steps without classifier-free guidance:

import torch
from core.tools.inference_pipe import init_pipe
device = torch.device('cuda:0')
dtype = torch.bfloat16
resolution = 512
repo_name = "amd/Nitro-E"
ckpt_name = 'Nitro-E-512px-dist.safetensors'
pipe = init_pipe(device, dtype, resolution, repo_name=repo_name, ckpt_name=ckpt_name)
prompt = 'A hot air balloon in the shape of a heart grand canyon'
images = pipe(prompt=prompt, width=resolution, height=resolution, num_inference_steps=4, guidance_scale=0).images

- Nitro-T: Efficient training of diffusion models.
- Nitro-1: One-step distillation of diffusion models.
Copyright (c) 2025 Advanced Micro Devices, Inc. All Rights Reserved.
This project is licensed under the MIT License.
