@melisasvr melisasvr commented Mar 20, 2025

Overview

This PR enhances the Presentation Generator Tool by integrating AI-generated images using Imagen 3, as outlined in the instructions for the 12 Mar - Presentation Generator. Images are now generated for each slide based on its title and content, styled for educational presentations, and stored in Google Cloud Storage (GCS) with public URLs attached to the slide objects. This builds on the develop branch (from PR #189) and completes the core image generation feature. Additionally, comprehensive unit tests have been added to ensure functionality and reliability.

Changes Made

Imagen 3 Integration:

  • Added generate_image_with_imagen3 in app/tools/presentation_generator_updated/slide_generator/tools.py.
  • Uses Vertex AI’s imagegeneration@006 model with OAuth authentication via GOOGLE_APPLICATION_CREDENTIALS.
  • Generates images from slide content and uploads them to the melis-presentation GCS bucket.
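The example URLs in this PR follow the pattern `https://storage.googleapis.com/<bucket>/slides/<hex>.jpg`. A minimal sketch of how such an object name and public URL could be built is shown here. The helper names are hypothetical, and using `uuid4().hex` for the 32-character name is an assumption; the PR does not say how the hex string is derived.

```python
import uuid

# Bucket name and "slides/" prefix come from the example URLs in this PR.
# The helper names below are hypothetical, for illustration only.
BUCKET_NAME = "melis-presentation"
GCS_PUBLIC_BASE = "https://storage.googleapis.com"


def new_object_name(prefix: str = "slides", extension: str = "jpg") -> str:
    """Return a random object name such as 'slides/<32 hex chars>.jpg'.

    Assumption: a random UUID is used; the PR may derive the name differently.
    """
    return f"{prefix}/{uuid.uuid4().hex}.{extension}"


def public_url(bucket: str, object_name: str) -> str:
    """Build the public URL for an object stored in a GCS bucket."""
    return f"{GCS_PUBLIC_BASE}/{bucket}/{object_name}"


if __name__ == "__main__":
    name = new_object_name()
    print(public_url(BUCKET_NAME, name))
```

The URL only resolves if the uploaded object is publicly readable, so the bucket or object permissions would have to allow that.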

Slide Enhancement:

  • Updated SlideGenerator.generate_slides() to call generate_image_with_imagen3 for each slide.
  • Extended Slide model to include image_url (e.g., https://storage.googleapis.com/melis-presentation/slides/<hex>.jpg).

Prompt Optimization:

  • Constructs prompts like "An infographic-style illustration for 'Intro to Python'... Minimalist and clean...".
  • Tailors image style to slide layout (e.g., twoColumn → split design, sectionHeader → hero image).
  • Matches suggested dimensions (e.g., 1600x900 for sectionHeader).
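The prompt construction described above could look roughly like the following sketch. Only the `twoColumn` and `sectionHeader` style mappings and the 1600x900 size are taken from this PR; the function names, the default style, the default dimensions, and the exact prompt wording are assumptions. Slide content is flattened first because, as the example output shows, it may be a string, a list, or a dict.

```python
# Layout-to-style hints. Only these two mappings are mentioned in the PR;
# the default style is an assumption.
LAYOUT_STYLES = {
    "twoColumn": "split design",
    "sectionHeader": "hero image",
}
DEFAULT_STYLE = "simple supporting illustration"

# Only the sectionHeader size is given in the PR; the default is assumed.
LAYOUT_DIMENSIONS = {"sectionHeader": (1600, 900)}
DEFAULT_DIMENSIONS = (1024, 1024)


def content_to_text(content) -> str:
    """Flatten slide content (str, list, or dict) into a single string."""
    if isinstance(content, str):
        return content
    if isinstance(content, dict):
        return " ".join(str(value) for value in content.values())
    return " ".join(str(item) for item in content)


def suggested_dimensions(template: str) -> tuple:
    """Return (width, height) for a slide template."""
    return LAYOUT_DIMENSIONS.get(template, DEFAULT_DIMENSIONS)


def build_image_prompt(title: str, content, template: str) -> str:
    """Build a text prompt for the image model (hypothetical wording)."""
    style = LAYOUT_STYLES.get(template, DEFAULT_STYLE)
    summary = content_to_text(content)
    return (
        f"An infographic-style illustration for '{title}'. "
        f"Layout: {style}. Topic details: {summary} "
        "Minimalist and clean, suitable for an educational presentation."
    )
```

Handling all three content shapes in one helper keeps a bulleted or two-column slide from raising a `TypeError` when its content is interpolated into the prompt.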

Unit Tests:

  • Updated app/tools/presentation_generator_updated/slide_generator/tests/test_core.py with tests for:
    • Executor functionality with image URL generation.
    • Validation of slide content (topic coverage, garbage detection).
    • Handling of image generation failures.
    • Pydantic model integrity for Slide and SlidePresentation.
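A test for the image-failure case might be structured like the sketch below. It is not the code in `test_core.py`: `attach_image_urls` is a hypothetical stand-in for the per-slide loop, slides are plain dicts rather than the Pydantic models, and falling back to `None` on failure is an assumed policy.

```python
import unittest
from unittest import mock


def attach_image_urls(slides, generate_image):
    """Add an image_url to each slide dict; use None when generation fails.

    Hypothetical stand-in for the per-slide loop described in this PR.
    """
    for slide in slides:
        try:
            slide["image_url"] = generate_image(slide["title"], slide["content"])
        except Exception:
            # Assumed policy: one failed image should not abort the whole deck.
            slide["image_url"] = None
    return slides


class AttachImageUrlsTest(unittest.TestCase):
    def test_failure_leaves_image_url_empty(self):
        # Simulate the image backend raising, e.g. on an auth error.
        failing = mock.Mock(side_effect=RuntimeError("401 Unauthorized"))
        slides = attach_image_urls([{"title": "Intro", "content": "text"}], failing)
        self.assertIsNone(slides[0]["image_url"])
        failing.assert_called_once_with("Intro", "text")

    def test_success_sets_url(self):
        url = "https://storage.googleapis.com/melis-presentation/slides/abc.jpg"
        working = mock.Mock(return_value=url)
        slides = attach_image_urls([{"title": "Intro", "content": "text"}], working)
        self.assertEqual(slides[0]["image_url"], url)
```

Passing the generator in as an argument lets the test replace it with a `Mock`, so no Vertex AI or GCS call is made. The tests can be run with `python -m unittest`.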

Fixes:

  • Resolved .env loading, TypeErrors in content handling, and Imagen 3 401 Unauthorized errors.

Example Output

Input: ['Intro'], 'Python', 'university', 'en'

Output:

{
  "slides": [
    {
      "title": "Intro to Python",
      "template": "sectionHeader",
      "content": "An overview of Python, a high-level programming language widely used in academia and industry.",
      "image_url": "https://storage.googleapis.com/melis-presentation/slides/233b7178859b2fcf345297d8abe8c113.jpg"
    },
    {
      "title": "Why Python?",
      "template": "titleAndBullets",
      "content": [
        "Versatile: Used in web development, data science, and automation.",
        "Readable: Simple syntax ideal for teaching programming concepts.",
        "Community: Extensive libraries and academic support."
      ],
      "image_url": "https://storage.googleapis.com/melis-presentation/slides/7f537b2210b53dd65980caf5f2ed7a63.jpg"
    },
    {
      "title": "Python in Academia",
      "template": "twoColumn",
      "content": {
        "left": "Research: Used in computational biology and physics simulations.",
        "right": "Education: Core language in many university CS curricula."
      },
      "image_url": "https://storage.googleapis.com/melis-presentation/slides/9a8b6c5d4e3f2g1h0i7j8k9l.jpg"
    }
  ]
}

@buriihenry buriihenry self-assigned this Mar 22, 2025
@buriihenry buriihenry left a comment

Good effort; however, you have not described the entire PoC implementation as required by the contribution guidelines. It would be good to share a Loom video of the workflow.
Could you also share a sample JSON object for testing?
