feat(litellm): align embedding instrumentation with pending spec #2238

codefromthecrypt · 2025-09-28T01:53:26Z

This PR aligns the litellm instrumentation with the embedding specification changes in #2162.

Spec changes from #2162:

- Consistent span name: "CreateEmbeddings"
- Standardized attribute structure: embedding.embeddings.N.embedding.{text|vector}
- Unified invocation parameter tracking: embedding.invocation_parameters
- Proper llm.system attribute for provider identification

Code improvements:

- Full batch embedding support with indexed attributes
- Separated invocation parameters from input data
- Improved handling of token IDs vs text inputs
- Vectors stored as tuples instead of JSON strings

This is the same change as #2210, applied to litellm.
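
For readers who want a concrete picture of the request-side attributes, here is a minimal sketch of how the flattened keys might be built. It is not the PR's code: `embedding_request_attributes` is a hypothetical helper, and both the JSON encoding of the invocation parameters and the choice to keep every keyword argument except `input` are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Mapping


def embedding_request_attributes(kwargs: Mapping[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: flatten embedding call kwargs into span attributes."""
    attributes: Dict[str, Any] = {"llm.system": "litellm"}

    # Invocation parameters are kept apart from the input data.
    # Assumption: everything except `input` is recorded, as a JSON string.
    params = {key: value for key, value in kwargs.items() if key != "input"}
    attributes["embedding.invocation_parameters"] = json.dumps(params)

    data = kwargs.get("input")
    if isinstance(data, str):
        texts = [data]  # a single string is a batch of one
    elif isinstance(data, (list, tuple)) and all(isinstance(x, str) for x in data):
        texts = list(data)  # batch of strings
    else:
        texts = []  # token IDs (or anything else): record no text

    for index, text in enumerate(texts):
        attributes[f"embedding.embeddings.{index}.embedding.text"] = text
    return attributes
```

Token-ID inputs such as `[[1, 2, 3]]` yield no `.text` attributes in this sketch, which mirrors the "token IDs vs text inputs" distinction in the list above.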

Note

Aligns LiteLLM embedding spans to the new spec with "CreateEmbeddings" name, structured per-embedding text/vector, invocation parameter capture, provider tagging, and full batch handling.

Embedding Instrumentation (LiteLLM):
- Rename span to CreateEmbeddings; set llm.system to "litellm".
- Capture invocation params under embedding.invocation_parameters (exclude input).
- Record inputs as embedding.embeddings.N.embedding.text for strings/lists of strings; skip for token IDs.
- Emit vectors as tuples per item at embedding.embeddings.N.embedding.vector.
Result Handling:
- Iterate EmbeddingResponse.data to set per-index vectors instead of a single JSON string.
Tests:
- Add batch and edge-case embedding tests (tests/test_batch_embedding.py).
- Update existing tests to expect new span name, attributes, vectors-as-tuples, and invocation parameters.
- Add setup_litellm_instrumentation fixture.
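
The result-handling change above can be pictured with a small sketch. It is illustrative only: `embedding_response_attributes` is a hypothetical name, and each response item is modeled as a plain mapping with an `"embedding"` key, which is an assumption about the response shape rather than something this PR states.

```python
from typing import Any, Dict, Iterable, Mapping


def embedding_response_attributes(data: Iterable[Mapping[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: one vector attribute per item, stored as a tuple."""
    attributes: Dict[str, Any] = {}
    for index, item in enumerate(data):
        vector = item.get("embedding")
        if vector is None:
            continue  # nothing to record for this item
        # A tuple of floats replaces the former single JSON string.
        attributes[f"embedding.embeddings.{index}.embedding.vector"] = tuple(vector)
    return attributes
```

A test in the spirit of the batch tests could then compare each indexed attribute against a tuple instead of parsing JSON.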

Written by Cursor Bugbot for commit c0c6750. This will update automatically on new commits.

This PR aligns litellm with the specification changes around embeddings in Arize-ai#2162

**Spec changes from 2162**
- Consistent span name: `"CreateEmbeddings"`
- Standardized attribute structure: `embedding.embeddings.N.embedding.{text|vector}`
- Unified invocation parameter tracking: `embedding.invocation_parameters`
- Proper `llm.system` attribute for provider identification

**Code improvements:**
- Full batch embedding support with indexed attributes
- Separated invocation parameters from input data
- Improved handling of token IDs vs text inputs
- Vectors stored as tuples instead of JSON strings

This is the same as Arize-ai#2210, except litellm.

Signed-off-by: Adrian Cole <[email protected]>

codefromthecrypt requested a review from a team as a code owner September 28, 2025 01:53

github-project-automation bot added this to Instrumentation Sep 28, 2025

dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Sep 28, 2025

codefromthecrypt mentioned this pull request Sep 28, 2025

feat: add embedding hiding configuration and align spec with instrumentation #2162

Draft
