
Conversation

@sairishi-exe commented Mar 2, 2025

Description

Implements a two-step presentation generation tool built on an LLM (Gemini 1.5 Pro). The tool generates a structured outline and then converts it into detailed slides using parallel processing, with caching to preserve context between the two endpoints.
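
As a rough illustration of the parallel step (the function names and the simulated LLM call below are hypothetical, not the actual implementation):

# Hypothetical sketch of generating slides in parallel with asyncio.
import asyncio


async def generate_slide(outline_item: dict) -> dict:
    # Stand-in for the real LLM call (e.g. Gemini 1.5 Pro).
    await asyncio.sleep(0)  # simulates an I/O-bound LLM request
    return {"title": outline_item["title"], "content": "..."}


async def generate_slides(outline: list[dict]) -> list[dict]:
    # Fire one concurrent request per outline item instead of a sequential loop.
    return await asyncio.gather(*(generate_slide(item) for item in outline))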

Related Issue

Implements presentation generation functionality for Marvel AI platform.

Type of Change

  • New feature: A non-breaking change that adds functionality.
  • Documentation update: Changes or updates to documentation.
  • Performance improvement: Parallel processing for slide generation.

Proposed Solution

Implemented a two-step presentation generation process:

  1. Outline Generation:

    • Takes user inputs (grade, topic, objectives)
    • Generates structured outline using LLM
    • Stores context using a Redis cache service
    • Outline validation with Pydantic models
  2. Slides Generation:

    • Uses the cached context to generate detailed slides
    • Implements parallel processing for efficiency
    • Validates output using Pydantic models (see the sketch below)
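
As an illustration of the Pydantic validation mentioned above, here is a minimal sketch; the model and field names are assumptions, not the actual schema:

# Hypothetical sketch of the Pydantic models used for validation.
from typing import List

from pydantic import BaseModel, Field


class SlideOutline(BaseModel):
    title: str
    bullet_points: List[str] = Field(default_factory=list)


class PresentationOutline(BaseModel):
    topic: str
    grade_level: str
    slides: List[SlideOutline]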

How to Test

  1. First, make sure requirements.txt includes the Redis dependencies:

# Other dependencies...
redis[hiredis]>=5.0.0
aioredis>=2.0.0
  2. Add these variables to the .env file:
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_DB=0
REDIS_PREFIX=marvel_ai:
REDIS_TTL=3600
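
These variables might then be read into the application with something like the following sketch (the settings class is hypothetical; the actual code may load them differently):

# Hypothetical sketch of loading the Redis settings from .env with pydantic-settings.
from pydantic_settings import BaseSettings, SettingsConfigDict


class RedisSettings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    redis_host: str = "redis"
    redis_port: int = 6379
    redis_db: int = 0
    redis_prefix: str = "marvel_ai:"
    redis_ttl: int = 3600
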
  3. Docker setup:
  • Pull the Redis image (the open-source container that provides the Redis service):

    docker pull redis
  • Build and start containers:

    docker-compose up --build -d
  • Verify the Redis connection:

    # Enter Redis CLI
    docker exec -it marvel-ai-backend-redis redis-cli
    
    # Test connection
    ping
    # Should return "PONG"

Note: Redis holds the presentation context in its in-memory cache between the outline and slide generation steps. A sketch of that round trip follows.
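
A rough sketch of that round trip (the key naming and helper functions are assumptions based on this PR's description, not the actual code):

# Hypothetical sketch of storing and retrieving outline context in Redis.
import json
import uuid

import redis.asyncio as redis

r = redis.Redis(host="redis", port=6379, db=0)


async def store_outline(outline: dict, ttl: int = 3600) -> str:
    # Key the outline by a fresh presentation_id so /generate-slides can look it up.
    presentation_id = str(uuid.uuid4())
    await r.set(f"marvel_ai:{presentation_id}", json.dumps(outline), ex=ttl)
    return presentation_id


async def load_outline(presentation_id: str) -> dict | None:
    raw = await r.get(f"marvel_ai:{presentation_id}")
    return json.loads(raw) if raw is not None else None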

  4. Go to localhost:8000/docs to test the endpoints in the Swagger UI using the sample input below, starting with the /generate-outline endpoint.

Sample Input:

{
   "user": {
      "id": "string",
      "fullName": "string",
      "email": "string"
   },
   "type": "tool",
   "tool_data": {
      "tool_id": "presentation-generator",
      "inputs": [
         {
            "name": "grade_level",
            "value": "university"
         },
         {
            "name": "n_slides",
            "value": 9
         },
         {
            "name": "topic",
            "value": "Linear Algebra"
         },
         {
            "name": "objectives",
            "value": ""
         },
         {
            "name": "lang",
            "value": "en"
         }
      ]
   }
}
  5. Then, test the /generate-slides endpoint using the presentation_id returned from /generate-outline:
POST /generate-slides/{presentation_id}
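
For example, from the command line (the paths follow this PR's description; the exact response shape, including the presentation_id field, is an assumption):

# Step 1: generate the outline (save the sample input above as input.json)
curl -X POST http://localhost:8000/generate-outline \
  -H "Content-Type: application/json" \
  -d @input.json

# Step 2: generate the slides using the returned presentation_id
curl -X POST http://localhost:8000/generate-slides/<presentation_id>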

Example Usage

https://www.loom.com/share/54ea6f8fa78a486d96d2712b1c9ece46?sid=ba2d8843-0ff2-46a5-ac24-05503fd899b1

Documentation Updates

  • Yes
  • No

Documentation needed for:
The README.md in the app/tools folder needs its output schema section updated.

Checklist

  • I have performed a self-review of my code
  • I have commented my code
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests
  • New and existing unit tests pass locally
  • Dependencies are properly managed

Additional Information

Check out my Notion document for more information:
Notion Document w/ Detailed Explanation

@sairishi-exe changed the base branch from Develop to ai-squad-001 on March 2, 2025 17:00
@buriihenry (Contributor) commented Mar 7, 2025

Hey @sairishi-exe, could you record a Loom video for this implementation?

@sairishi-exe force-pushed the feature/presentation-generator branch from fa97f5a to bd58027 on March 9, 2025 00:22
@AaronSosaRamos self-requested a review on March 9, 2025 20:11
@AaronSosaRamos (Contributor) left a comment

Very good work! Please solve the given comments. Thank you.



# Dependency injection
async def get_cache_service(request: Request) -> CacheInterface:
@AaronSosaRamos (Contributor) commented:

Why is this required?

@sairishi-exe (Author) replied:
This dependency injection is part of applying SOLID principles (this was an attempt at writing clean code, so hopefully I'm doing it right). Basically, if a dev decides to use any cache service other than Redis, they can simply create a new service class following the CacheService interface in cache_service.py and initialize their chosen service object in main.py, without having to change any of the code in router.py. Similarly, the person working on the endpoints in router.py does not have to worry about the configuration of the cache service, which demonstrates loose coupling. As our codebase gets larger, I thought it would be better to write code with these abstractions in mind, so that open-source contributors can focus on individual units without worrying too much about the dependencies. The tradeoff is extra function calls, which add some overhead, but it will pay off in the long run if we keep our codebase neat. A sketch of the pattern follows.
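
A minimal sketch of that pattern, assuming a CacheInterface with set/get methods and the concrete service attached to app.state in main.py (names beyond those mentioned in this thread are hypothetical):

# Hypothetical sketch of the cache abstraction and its FastAPI dependency injection.
from abc import ABC, abstractmethod
from typing import Optional

from fastapi import Request


class CacheInterface(ABC):
    @abstractmethod
    async def set(self, key: str, value: str, ttl: int) -> None: ...

    @abstractmethod
    async def get(self, key: str) -> Optional[str]: ...


async def get_cache_service(request: Request) -> CacheInterface:
    # Routers depend only on the interface; the concrete service (e.g. RedisService)
    # is created once in main.py and attached to app.state.
    return request.app.state.cache_service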

@AaronSosaRamos (Contributor) replied:

Yeah, although I think this is more related to the Liskov Substitution Principle. I think the tradeoff won't be a problem if we have a correct justification and pathway for integrating Redis, given the cost/opportunity analysis that we have to perform for this feature.

# Handles two-step presentation generation:
# 1. Generate outline with initial inputs
# 2. Generate slides using stored outline and inputs
@router.post("/generate-outline", response_model=Union[ToolResponse, ErrorResponse])
@AaronSosaRamos (Contributor) commented:

Please, no additional endpoints are needed. Use the submit-tool endpoint to dynamically load this tool.

@sairishi-exe (Author) commented Mar 10, 2025:

According to our Gen AI instruction document, we are supposed to create two separate endpoints for generating the outline and the slides. The goal is to emulate a user experience similar to Gamma.ai, where:

  • User first requests an outline generation, [implemented]
  • Then, edits the LLM generated outline to their liking [not implemented in this sprint],
  • This outline is used as context to generate slides, [implemented]
  • Then, slides are edited as per user [also not implemented in this sprint].

In order for the /generate-slides endpoint to have access to the relevant outline context, we need to store the outline in the cache under a unique presentation_id key so it can be retrieved when needed, which is why I used Redis. I used caching so that the data can be accessed instantly, unlike with an actual database. I explain why I specifically chose Redis over other caching strategies in the comment below this one.

@AaronSosaRamos (Contributor) commented Mar 14, 2025:

Thanks for letting us know about this, but no additional endpoint is required. All the tools must be called under the /submit-tool endpoint because it allows dynamic tool registration/implementation, decoupling the integration with the API.

logger.info(f"Initializing Application Startup")
logger.info(f"Successfully Completed Application Startup")

app.state.cache_service = RedisService()
@AaronSosaRamos (Contributor) commented:

Why is Redis needed in this approach? We are not currently using Redis, but if a good justification is provided, we can definitely evaluate its integration.

@sairishi-exe (Author) commented Mar 10, 2025:
So, as I mentioned in the comment above, I decided to use caching to pass context between requests. I did my own research as follows:

  • Using DevTools, I tried to reverse engineer which caching strategy, if any, is used by Gamma. Unfortunately, I couldn't find much (I'm not used to using DevTools, which could be the reason).
  • Later I decided to actively engage Claude 3.5 Sonnet to debate which design to go with, and here is what I found.

There are two strategies for passing context between requests:

  1. Browser/Client side caching
  2. Server Side Caching (w/ Redis)

I learned that browser-side caching is faster, but negligibly so (this still needs to be tested). Server-side caching offers much better persistence (surviving a cleared browser cache or a server failure, and allowing cross-browser and cross-device synchronization if the user switches for any reason) and scalability, which is why I preferred Redis. The persistence and scalability offered by Redis outweigh the negligible latency advantage of browser caching, since our end goal is for teachers to have a stable and accurate assistant rather than merely a fast one.

Beyond that, I still believe some form of caching, whether server-side or client-side, is needed to make our tools more efficient in the long run.

Notion document with more detailed explanation

@AaronSosaRamos (Contributor) replied:

That's a good point, but how can it support us in terms of maximizing the quality of the responses? Latency and caching are good approaches for improving request speed, but since our main business priority is to enhance the context-awareness of the LLM, how can Redis advance that goal?

@AaronSosaRamos (Contributor) commented:

An evaluation for the Redis implementation is needed.

@AaronSosaRamos (Contributor) commented:

Why is this docker-compose file needed?

@sairishi-exe (Author) replied:

The docker-compose.yml file is used so that the open-source Redis container and the backend container can be spun up together and communicate with each other. A sketch of such a file follows.
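
A minimal sketch of what such a compose file might look like (service names, ports, and the backend build context are assumptions, not the file from this PR):

# Hypothetical docker-compose.yml sketch; service names are illustrative.
services:
  redis:
    image: redis:latest
    ports:
      - "6379:6379"
  backend:
    build: .
    ports:
      - "8000:8000"
    environment:
      - REDIS_HOST=redis
      - REDIS_PORT=6379
    depends_on:
      - redis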

@AaronSosaRamos (Contributor) commented:

An evaluation for Redis is needed.

@buriihenry (Contributor) commented:

Hi @sairishi-exe, did you manage to merge the code with that of AI-squad-003?

@AaronSosaRamos (Contributor) left a comment

Please solve the given responses in the comments. Thank you.
