feat: Add multilingual prompt optimizer with LangGraph agent support #16894

Open: wants to merge 2 commits into base: master

ChristopheZhao commented Mar 15, 2025

What I'm trying to accomplish

This PR aims to enhance the text-to-image generation workflow by adding intelligent prompt processing capabilities. Specifically, it:

  1. Makes the WebUI more accessible to non-English speakers by automatically detecting and translating prompts
  2. Improves image generation quality through prompt optimization using LangGraph-based agent workflows
  3. Provides a seamless experience that works within the existing txt2img interface

Summary of changes in code

  • Added scripts/txt2img_prompt_optimizer.py - A new script that:

    • Implements a Script class that integrates with the WebUI's txt2img tab
    • Uses LangGraph to create an agent-based workflow for prompt processing
    • Detects non-English text and translates it to English
    • Optimizes prompts to improve generation quality while preserving intent
    • Handles API key management through environment variables
    • Provides graceful fallbacks when optional dependencies are missing
  • Updated requirements.txt to include:

    • python-dotenv for environment variable management
    • langgraph for building the agent workflow
  • Updated requirements_versions.txt with specific versions:

    • Added compatible versions of new dependencies
    • Ensured version compatibility with existing dependencies
  • Updated .gitignore to exclude:

    • .env files containing sensitive API keys
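The translate-then-optimize flow and its fallbacks can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/txt2img_prompt_optimizer.py: plain function chaining stands in for the LangGraph graph, the environment variable name `PROMPT_OPTIMIZER_API_KEY` is hypothetical, the model backend is passed in as an ordinary callable, and the non-ASCII check is only a crude assumption about how detection might work.

```python
import os
from typing import Callable, Dict, Optional

# Hypothetical name; the PR does not say which variable the script reads.
API_KEY_ENV = "PROMPT_OPTIMIZER_API_KEY"

# A model call is modeled as "instruction + text in, text out".
ModelCall = Callable[[str, str], str]


def has_non_ascii_letters(text: str) -> bool:
    """Crude detector: flags text containing any non-ASCII letter.

    Latin-script prompts without diacritics slip through, so a real
    detector would need a language model or a dedicated library.
    """
    return any(ch.isalpha() and not ch.isascii() for ch in text)


def translate_node(state: Dict[str, str], model: ModelCall) -> Dict[str, str]:
    """Translate only when the prompt looks non-English."""
    if has_non_ascii_letters(state["prompt"]):
        translated = model("translate to English", state["prompt"])
        state = {**state, "prompt": translated}
    return state


def optimize_node(state: Dict[str, str], model: ModelCall) -> Dict[str, str]:
    """Ask the model to rewrite the (English) prompt for image generation."""
    optimized = model("optimize for image generation", state["prompt"])
    return {**state, "prompt": optimized}


def process_prompt(prompt: str, model: Optional[ModelCall] = None) -> str:
    """Run translate -> optimize; return the prompt unchanged on any failure."""
    if model is None or not os.environ.get(API_KEY_ENV):
        return prompt  # no backend or no API key: pass the prompt through
    state = {"prompt": prompt}
    try:
        for node in (translate_node, optimize_node):
            state = node(state, model)
    except Exception:
        return prompt  # a failed model call should not block generation
    return state["prompt"]
```

With no key set, or when the model call raises, `process_prompt` returns its input unchanged, so generation can proceed with the user's original prompt.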

Issues fixed

This PR addresses the feature request in Issue #4576, which requested multilingual prompt support but was previously marked as "not planned".

The implementation:

  1. Adds multilingual support through automatic translation of non-English prompts
  2. Goes beyond the original request by also implementing prompt optimization
  3. Integrates seamlessly with the existing txt2img interface without requiring changes to the core pipeline
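One way to picture the "no changes to the core pipeline" point is a hook that rewrites the prompt fields on the processing object before generation starts. The sketch below is an assumption about the shape of that integration, not the PR's code: `StubProcessing` and `PromptOptimizerHook` are hypothetical stand-ins, and the real script would subclass the WebUI's own Script class and receive the WebUI's processing object instead.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StubProcessing:
    """Stand-in for the WebUI processing object; only prompt fields are modeled."""
    prompt: str
    all_prompts: List[str] = field(default_factory=list)


class PromptOptimizerHook:
    """Hypothetical hook shaped like a WebUI script: title() plus process(p)."""

    def __init__(self, transform: Callable[[str], str], enabled: bool = True):
        self.transform = transform  # e.g. a translate-and-optimize function
        self.enabled = enabled

    def title(self) -> str:
        return "Prompt optimizer (sketch)"

    def process(self, p: StubProcessing) -> None:
        """Rewrite prompts in place; the pipeline then runs as usual."""
        if not self.enabled:
            return
        p.prompt = self.transform(p.prompt)
        p.all_prompts = [self.transform(text) for text in p.all_prompts]
```

Because the hook only mutates prompt strings, everything downstream (sampling, decoding, saving) sees an ordinary English prompt and needs no modification.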

Screenshots/videos:

The screenshots below show the backend translation and the resulting frontend output for prompts in several languages. Each test uses the prompt 'a kitten under a pine tree', written in the language shown.

Chinese (Simplified):

  • backend

image

  • frontend

image

Japanese:

  • backend

image

  • frontend

image

French:

  • backend

image

  • frontend

image

Spanish:

  • backend

image

  • frontend

image

Vietnamese:

  • backend

image

  • frontend

image

Kiswahili:

  • backend

image

  • frontend

image

English (English prompts are also optimized automatically):

  • backend

image

  • frontend

image

Checklist:

- Add txt2img_prompt_optimizer.py script for automatic prompt translation and optimization
- Support non-English prompts with automatic translation to English
- Implement prompt optimization using LangGraph workflow
- Add python-dotenv and langgraph dependencies
- Update requirements.txt and requirements_versions.txt with new dependencies