This demonstration showcases the complete machine learning workflow in Red Hat OpenShift AI, taking you from initial experimentation to production deployment. Using Stable Diffusion for text-to-image generation, you'll learn how to experiment with models, fine-tune them with custom data, create automated pipelines, and deploy models as scalable services.
- Data Science Projects: Creating and managing ML workspaces in OpenShift AI
 - GPU-Accelerated Workbenches: Leveraging NVIDIA GPUs for model training and inference
 - Model Experimentation: Working with pre-trained models from Hugging Face
 - Fine-Tuning: Customizing models with your own data using Dreambooth
 - Pipeline Automation: Building repeatable ML workflows with Data Science Pipelines
 - Custom Runtime Development: Building KServe runtimes
 - Model Serving: Deploying models as REST APIs using KServe with multiple deployment options
 - Production Integration: Connecting served models to applications and MCP servers
 - Multi-Modal AI: Combining text and image generation in unified applications
 
- Red Hat OpenShift cluster (4.12+)
 - Red Hat OpenShift AI installed (2.9+)
- For managed service: Available as add-on for OpenShift Dedicated or ROSA
 - For self-managed: Install from OperatorHub
 
 - GPU node with at least 45GB memory (NVIDIA L40S recommended, A10G minimum for smaller models)
 
- S3-compatible object storage (MinIO, AWS S3, or Ceph)
 - Two buckets configured:
  - `pipeline-artifacts`: For pipeline execution artifacts
  - `models`: For storing trained models
 
- OpenShift AI Dashboard access
 - Ability to create Data Science Projects
 - (Optional) Hugging Face account with API token for model downloads
 
- Access OpenShift AI Dashboard
  - Navigate to your OpenShift console
  - Click the application launcher (9-dot grid)
  - Select "Red Hat OpenShift AI"
 
- Create a Data Science Project
  - Click "Data Science Projects"
  - Create a new project named `image-generation`
- Set Up Storage
  - Import `setup/setup-s3.yaml` to create local S3 storage (for demos)
  - Or configure your own S3-compatible storage connections
 
- Create a Workbench
  - Select PyTorch notebook image
  - Allocate GPU resources
  - Add environment variables (including `HF_TOKEN` if available)
  - Attach data connections
 
- Clone This Repository

  ```bash
  git clone https://github.com/cfchase/text-to-image-demo.git
  cd text-to-image-demo
  ```

- Follow the Notebooks
  - `1_experimentation.ipynb`: Initial model testing
  - `2_fine_tuning.ipynb`: Training with custom data
  - `3_remote_inference.ipynb`: Testing deployed models
 
- Workbenches: Jupyter notebook environments for development
 - Pipelines: Automated ML workflows using Kubeflow
 - Custom Runtime: Diffusers runtime for image generation
 - Model Serving: Deploy models as REST APIs with multiple storage options
 - Storage: S3-compatible object storage, PVC, or HuggingFace Hub integration
 - External Integration: MCP server support for modern AI application development
 
```bash
oc apply -f setup/setup-s3.yaml
```

This creates:
- MinIO deployment for S3-compatible storage
 - Two PVCs for buckets
 - Data connections for workbench and pipeline access
 
Create data connections with your S3 credentials:
- Connection 1: "My Storage" - for workbench access
 - Connection 2: "Pipeline Artifacts" - for pipeline server
 
When creating your workbench:
Notebook Image: Choose based on your needs
- Standard Data Science: Basic Python environment
 - PyTorch: Includes PyTorch, CUDA support (recommended for this demo)
 - TensorFlow: For TensorFlow-based workflows
 - Custom: Use your own image with specific dependencies
 
Resources:
- Small: 2 CPUs, 8Gi memory
 - Medium: 7 CPUs, 24Gi memory
 - Large: 14 CPUs, 56Gi memory
 - GPU: Add 1-2 NVIDIA GPUs (required for this demo)
 
Environment Variables:
```bash
HF_TOKEN=<your-huggingface-token>   # For model downloads
AWS_S3_ENDPOINT=<s3-endpoint-url>   # Auto-configured if using data connections
AWS_ACCESS_KEY_ID=<access-key>      # Auto-configured if using data connections
AWS_SECRET_ACCESS_KEY=<secret-key>  # Auto-configured if using data connections
AWS_S3_BUCKET=<bucket-name>         # Auto-configured if using data connections
```
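Inside the workbench, these variables can be read straight from the environment. A minimal sketch (assuming the `boto3` and `huggingface_hub` packages are available in the notebook image) for verifying the S3 data connection and logging in to Hugging Face:

```python
import os
import boto3

# Credentials and endpoint come from the data connection (or the env vars above)
s3 = boto3.client(
    "s3",
    endpoint_url=os.environ["AWS_S3_ENDPOINT"],
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)
bucket = os.environ["AWS_S3_BUCKET"]
print(s3.list_objects_v2(Bucket=bucket).get("KeyCount", 0), "objects in", bucket)

# Optional: authenticate to Hugging Face for gated model downloads
if os.environ.get("HF_TOKEN"):
    from huggingface_hub import login
    login(token=os.environ["HF_TOKEN"])
```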
- In your Data Science Project, go to "Pipelines" → "Create pipeline server"
 - Select the "Pipeline Artifacts" data connection
 - Wait for the server to be ready (2-3 minutes)
 
After training your model:
- 
Deploy the custom Diffusers runtime:

  ```bash
  cd diffusers-runtime
  make build
  make push
  ```

- 
Choose your deployment template based on model storage:

  ```bash
  # For S3 storage-based models
  oc apply -f templates/redhat-dog.yaml

  # For HuggingFace Hub models (recommended)
  oc apply -f templates/redhat-dog-hf.yaml

  # For PVC-based storage
  oc apply -f templates/redhat-dog-pvc.yaml

  # For testing with lightweight models
  oc apply -f templates/tiny-sd-gpu.yaml
  ```

- 
The runtime includes advanced optimizations (see the sketch after this list):
- Automatic hardware detection (CUDA/MPS/CPU)
 - Intelligent dtype selection with fallback chains
 - Configurable memory optimizations
 - Universal model loading support
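The actual logic lives in `device_manager.py` and `dtype_selector.py`; the snippet below is only an illustrative sketch of the fallback idea, not the module code itself:

```python
import torch

def pick_device() -> str:
    # Prefer CUDA, then Apple MPS, then CPU
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

def pick_dtype(device: str, requested: str = "auto") -> torch.dtype:
    # An explicit DTYPE request wins; otherwise pick a sensible default per device
    explicit = {"bfloat16": torch.bfloat16, "float16": torch.float16, "float32": torch.float32}
    if requested in explicit:
        return explicit[requested]
    if device == "cuda" and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    if device in ("cuda", "mps"):
        return torch.float16
    return torch.float32
```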
 
 
```
text-to-image-demo/
├── README.md                    # This file
├── ARCHITECTURE.md              # Technical architecture details
├── PIPELINES.md                 # Pipeline automation guide
├── SERVING.md                   # Model serving guide
├── DEMO_SCRIPT.md              # Step-by-step demo script
│
├── 1_experimentation.ipynb      # Initial model testing
├── 2_fine_tuning.ipynb         # Custom training workflow
├── 3_remote_inference.ipynb    # Testing served models
│
├── requirements-base.txt        # Base Python dependencies
├── requirements-gpu.txt         # GPU-specific packages
│
├── finetuning_pipeline/        # Kubeflow pipeline components
│   ├── Dreambooth.pipeline     # Pipeline definition
│   ├── get_data.ipynb         # Data preparation step
│   ├── train.ipynb            # Training execution step
│   └── upload.ipynb           # Model upload step
│
├── diffusers-runtime/          # Custom KServe runtime
│   ├── Dockerfile             # Runtime container definition
│   ├── model.py              # Main KServe predictor (refactored)
│   ├── device_manager.py      # Hardware detection and management
│   ├── dtype_selector.py      # Intelligent dtype selection
│   ├── optimization_manager.py # Memory optimization controls
│   ├── pipeline_loader.py     # Universal model loading
│   ├── Makefile              # Build and deployment automation
│   └── templates/            # Kubernetes deployment manifests
│       ├── redhat-dog.yaml        # S3 storage deployment
│       ├── redhat-dog-hf.yaml     # HuggingFace Hub deployment
│       ├── redhat-dog-pvc.yaml    # PVC storage deployment
│       └── tiny-sd-gpu.yaml       # Lightweight test deployment
│
└── setup/                     # Deployment configurations
    └── setup-s3.yaml         # Demo S3 storage setup
```
- Load pre-trained Stable Diffusion model
 - Test basic text-to-image generation
 - Identify limitations with generic models
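In code, this step boils down to a few lines of the Hugging Face `diffusers` API. A minimal sketch (the exact checkpoint and parameters used in `1_experimentation.ipynb` may differ):

```python
import torch
from diffusers import StableDiffusionPipeline

# Example checkpoint; the notebook may pin a different one
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a photo of a teddy bear in Times Square",
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("sample.png")
```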
 
- Prepare custom training data (images of "Teddy")
 - Fine-tune model using Dreambooth technique
 - Save trained weights to S3 storage
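The upload step (`finetuning_pipeline/upload.ipynb`) pushes the trained weights to the `models` bucket. A rough sketch of that idea using boto3 (the local output directory and object key prefix are illustrative):

```python
import os
from pathlib import Path
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url=os.environ["AWS_S3_ENDPOINT"],
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

output_dir = Path("dreambooth-output")  # illustrative local path for trained weights
for path in output_dir.rglob("*"):
    if path.is_file():
        key = f"teddy/{path.relative_to(output_dir)}"  # illustrative key prefix
        s3.upload_file(str(path), "models", key)
```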
 
- Convert notebooks to pipeline steps
 - Create repeatable training workflow
 - Enable parameter tuning and experimentation
 
- Deploy custom KServe runtime
 - Create inference service
 - Expose REST API endpoint
 
- Test model via REST API
 - Integrate with applications
 - Monitor performance
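A sketch of calling the deployed endpoint, in the spirit of `3_remote_inference.ipynb`. The URL, model name, and response schema below are assumptions (a KServe v1-style predict call returning a base64-encoded image); check your InferenceService and the runtime documentation for the exact payload format:

```python
import base64
import requests

# Illustrative values; replace with your InferenceService URL and model name
url = "https://redhat-dog-image-generation.apps.example.com/v1/models/model:predict"
payload = {"instances": [{"prompt": "a photo of Teddy the dog in Times Square"}]}

resp = requests.post(url, json=payload, timeout=300)
resp.raise_for_status()

# Assumes a base64-encoded image in the response; adjust to the actual schema
b64 = resp.json()["predictions"][0]["image"]["b64"]
with open("output.png", "wb") as f:
    f.write(base64.b64decode(b64))
```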
 
- No GPU detected: Ensure your node has GPU support and correct drivers
 - Out of memory: Reduce batch size or use gradient checkpointing
 - CUDA errors: Verify PyTorch and CUDA versions match
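A quick sanity check you can run in a workbench cell to confirm the GPU and driver stack are visible to PyTorch:

```python
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA build:", torch.version.cuda)
    print("Device:", torch.cuda.get_device_name(0))
    free, total = torch.cuda.mem_get_info()
    print(f"GPU memory: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")
```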
 
- S3 connection failed: Check credentials and endpoint URL
 - Permission denied: Verify bucket policies and access keys
 - Upload timeouts: Check network connectivity and proxy settings
 
- Pipeline server not starting: Check data connection configuration
 - Pipeline runs failing: Review logs in pipeline run details
 - Missing artifacts: Verify S3 bucket permissions
 
- Model not loading: Check model path (S3/PVC/HuggingFace) and format
 - Inference errors: Review KServe pod logs, check dtype compatibility
 - Timeout errors: Increase resource limits or timeout values
 - Memory issues: Enable optimizations via environment variables:
  ```yaml
  env:
    - name: DTYPE
      value: "auto"  # or bfloat16, float16, float32
    - name: ENABLE_ATTENTION_SLICING
      value: "true"
    - name: ENABLE_VAE_SLICING
      value: "true"
    - name: ENABLE_CPU_OFFLOAD
      value: "true"
  ```
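These toggles correspond to standard `diffusers` pipeline optimizations. A sketch of how a runtime might apply them (illustrative only, not the actual `model.py`):

```python
import os

def apply_optimizations(pipe):
    # Each env toggle maps onto a standard diffusers optimization
    if os.environ.get("ENABLE_ATTENTION_SLICING", "false").lower() == "true":
        pipe.enable_attention_slicing()      # lower attention memory at some speed cost
    if os.environ.get("ENABLE_VAE_SLICING", "false").lower() == "true":
        pipe.enable_vae_slicing()            # decode images in slices to save VRAM
    if os.environ.get("ENABLE_CPU_OFFLOAD", "false").lower() == "true":
        pipe.enable_model_cpu_offload()      # keep idle submodules on the CPU
    return pipe
```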
 
- Red Hat OpenShift AI Documentation
 - OpenShift AI Learning Resources
 - KServe Documentation
 - Hugging Face Diffusers
 
Contributions are welcome! Please feel free to submit issues or pull requests to improve this demo.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.