fix: add exponential retry logic to stackit embedder #75
base: deps-main
…edder.py Co-authored-by: Copilot <[email protected]>
Looks good with one suggestion on handling rate limited requests.
libs/rag-core-api/src/rag_core_api/impl/embeddings/stackit_embedder.py
Outdated
…ror handling and add username/password to Redis connection
chore: remove unused KeyDB references and update related configurations
fix: increase maximum retry delay for StackitEmbedder settings
Pull Request Overview
This pull request introduces robust retry logic with exponential backoff to the StackitEmbedder and refactors service configuration management in the RAG infrastructure. The main focus is improving embedding API reliability while modernizing service deployment configurations.
- Adds comprehensive retry logic with exponential backoff to handle embedding API failures
- Introduces a reusable retry decorator framework with rate limit awareness
- Updates service configurations to use Redis instead of KeyDB and standardizes image tags
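For readers who have not opened `retry_decorator.py`, the idea behind a retry decorator with exponential backoff can be sketched roughly as follows. This is a minimal illustration, not the decorator from this PR: the name `retry_with_backoff`, its parameters, and the injectable `sleep` argument are all assumptions made for the example.

```python
import functools
import time


def retry_with_backoff(max_retries=3, base_delay=0.5, max_delay=8.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Retry the wrapped function with exponentially growing delays.

    Hypothetical helper for illustration; not the PR's actual decorator.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # One initial attempt plus up to max_retries retries.
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_retries:
                        raise  # retries exhausted: surface the last error
                    # The delay doubles on each attempt and is capped at max_delay.
                    sleep(min(base_delay * 2 ** attempt, max_delay))
        return wrapper
    return decorator


# Example: a call that fails twice before succeeding.
attempts = {"count": 0}


@retry_with_backoff(max_retries=3, base_delay=0.01, max_delay=0.05,
                    retry_on=(ConnectionError,))
def flaky_embed(text):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("embedding API temporarily unavailable")
    return [0.0, 1.0]
```

Passing `sleep` as a parameter keeps the sketch easy to test without real waiting; a production decorator would likely also add jitter and honor rate limit hints from the server.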
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 5 comments.
File | Description |
---|---|
libs/rag-core-lib/src/rag_core_lib/impl/utils/utils.py | Utility functions for parsing time durations and extracting HTTP headers/status codes |
libs/rag-core-lib/src/rag_core_lib/impl/utils/retry_decorator.py | Reusable retry decorator with exponential backoff and rate limit handling |
libs/rag-core-api/src/rag_core_api/impl/settings/stackit_embedder_settings.py | Adds retry configuration fields to embedder settings |
libs/rag-core-api/src/rag_core_api/impl/embeddings/stackit_embedder.py | Implements retry logic in embedding methods using the new decorator |
libs/admin-api-lib/src/admin_api_lib/impl/summarizer/langchain_summarizer.py | Refactors summarizer to use new retry decorator instead of custom retry logic |
infrastructure/rag/values.yaml | Updates service configurations, image tags, and Redis credentials |
services/frontend/apps/*/Dockerfile | Adds verbose flag to npm install commands |
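As a rough illustration of what the `utils.py` row describes, a duration parser and a rate limit header lookup might look like the sketch below. The function names, the accepted duration formats (`250ms`, `2s`, `1m30s`, or a bare number of seconds), and the choice of the `Retry-After` header are assumptions for this example, not the helpers actually added in the PR.

```python
import re

_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
_NUMBER = re.compile(r"\d+(?:\.\d+)?")
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")


def parse_duration_seconds(text):
    """Convert strings such as '250ms', '2s', or '1m30s' to seconds.

    A bare number is treated as seconds. Returns None if the text is not understood.
    """
    text = text.strip().lower()
    if _NUMBER.fullmatch(text):
        return float(text)
    total, pos = 0.0, 0
    for match in _TOKEN.finditer(text):
        if match.start() != pos:
            return None  # unexpected characters between tokens
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    # Require at least one token and no trailing characters.
    return total if pos > 0 and pos == len(text) else None


def retry_after_seconds(headers, default=None):
    """Look up a Retry-After style header case-insensitively.

    Only numeric and duration values are handled; HTTP-date values fall back to default.
    """
    for name, value in headers.items():
        if name.lower() == "retry-after":
            parsed = parse_duration_seconds(str(value))
            return default if parsed is None else parsed
    return default
```

A retry loop could prefer the server-provided wait from `retry_after_seconds` over its own computed backoff whenever the header is present.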
libs/rag-core-api/src/rag_core_api/impl/embeddings/stackit_embedder.py
Outdated
USECASE_KEYVALUE_HOST: "rag-redis-primary"
USECASE_KEYVALUE_USERNAME: "default"
USECASE_KEYVALUE_PASSWORD: "MOqTsoa22R"
Hardcoded password in configuration file poses a security risk. Consider using Kubernetes secrets or environment variable references instead of plain text passwords in the values file.
Suggested change:
  USECASE_KEYVALUE_HOST: "rag-redis-primary"
  USECASE_KEYVALUE_USERNAME: "default"
- USECASE_KEYVALUE_PASSWORD: "MOqTsoa22R"
+ USECASE_KEYVALUE_PASSWORD: "${USECASE_KEYVALUE_PASSWORD}"
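On the application side, the same idea could look roughly like this: the service reads the Redis credentials from its environment at startup, and the deployment injects the password from a secret. The variable names come from the values file above; the function `redis_credentials` and its fallback defaults are hypothetical and only illustrate the approach.

```python
import os


def redis_credentials(env=None):
    """Read Redis connection settings from environment variables.

    Raises if the password is missing, so a deployment without an injected
    secret fails at startup instead of connecting with a blank password.
    """
    env = os.environ if env is None else env
    password = env.get("USECASE_KEYVALUE_PASSWORD")
    if not password:
        raise RuntimeError("USECASE_KEYVALUE_PASSWORD is not set")
    return {
        # Fallback values below are assumptions for local development.
        "host": env.get("USECASE_KEYVALUE_HOST", "localhost"),
        "username": env.get("USECASE_KEYVALUE_USERNAME", "default"),
        "password": password,
    }
```

Accepting an optional `env` mapping keeps the lookup testable without touching the real process environment.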
  username: "default"
- password: ""
+ password: "MOqTsoa22R"
Hardcoded password in langfuse Redis configuration. This should be moved to a Kubernetes secret or use a secure configuration management approach.
This pull request introduces robust retry logic for embedding requests in the `StackitEmbedder` and refactors configuration management for external services in the RAG infrastructure. The main changes are the addition of exponential backoff for embedding failures, new retry-related settings, and a more organized approach to service configuration in `values.yaml`.
Embedding robustness and configuration enhancements:
Retry logic improvements:
- Adds exponential backoff retry logic to the `embed_documents` method in `StackitEmbedder`, allowing up to a configurable number of retries with delays on transient failures. This improves reliability when the embedding API is temporarily unavailable.
- … `stackit_embedder.py` for better observability.
Embedder settings:
- Adds fields to `StackitEmbedderSettings` for `max_retries`, `retry_base_delay`, and `retry_max_delay`, allowing fine-grained control of retry behavior. [1] [2]
Infrastructure/service configuration refactor:
- Refactors `infrastructure/rag/values.yaml` to move service deployment and connection details (PostgreSQL, Redis, ClickHouse, S3/MinIO) under dedicated configuration sections, clarifying which services are external and improving maintainability. [1] [2]
- … `langfuse` configuration block with explicit image, web, worker, authentication, and environment variable settings for more granular control over deployment.
Embedder deployment configuration:
- Adds retry settings (`STACKIT_EMBEDDER_MAX_RETRIES`, `STACKIT_EMBEDDER_RETRY_BASE_DELAY`, `STACKIT_EMBEDDER_RETRY_MAX_DELAY`) to the `stackitEmbedder` section in `values.yaml` to enable configuration via environment variables.
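To make the link between the environment variables and the retry behavior concrete, here is a small sketch of how such settings could be loaded and turned into a delay schedule. The class `RetrySettings`, its default values, the assumption that delays are in seconds, and the helper `delay_schedule` are illustrative only; the real `StackitEmbedderSettings` may be structured differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrySettings:
    # Default values are assumptions for this sketch, not the PR's defaults.
    max_retries: int = 5
    retry_base_delay: float = 0.5   # assumed to be seconds
    retry_max_delay: float = 600.0  # assumed to be seconds

    @classmethod
    def from_env(cls, env=None):
        """Build settings from STACKIT_EMBEDDER_* variables, falling back to defaults."""
        env = os.environ if env is None else env
        prefix = "STACKIT_EMBEDDER_"
        return cls(
            max_retries=int(env.get(prefix + "MAX_RETRIES", cls.max_retries)),
            retry_base_delay=float(env.get(prefix + "RETRY_BASE_DELAY", cls.retry_base_delay)),
            retry_max_delay=float(env.get(prefix + "RETRY_MAX_DELAY", cls.retry_max_delay)),
        )


def delay_schedule(settings):
    """Return the wait before each retry: base delay doubled per attempt, capped at the max."""
    return [
        min(settings.retry_base_delay * 2 ** attempt, settings.retry_max_delay)
        for attempt in range(settings.max_retries)
    ]
```

Printing `delay_schedule(RetrySettings.from_env())` before deploying is a quick way to check that the configured maximum delay actually bounds the total wait time as intended.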