fix: MCP authorization parameter implementation #4052

omaryashraf5 · 2025-11-03T23:50:42Z

What does this PR do?

Adds a user-facing authorization parameter to MCP tool definitions so that users can explicitly configure credentials per MCP server. This addresses GitHub Issue #4034 in a secure manner.

Test Plan

tests/integration/responses/test_mcp_authentication.py

bbrowning · 2025-11-04T00:17:21Z

Can you point me to where in the Responses API spec it has this authentication attribute? I only see authorization listed for MCP tools.

omaryashraf5 · 2025-11-04T01:32:05Z

@bbrowning Thanks for your comment! Yes, I changed it to 'authorization'. However, this static approach only helps with MCP credentials that are hardcoded in tool definitions (long-lived tokens). It's not ideal for cases where we need different MCP credentials per user. Automatically forwarding the user's OAuth token to the MCP server is not an option, so would an alternative approach be for the user to explicitly pass their own OAuth token through the client (dynamic, per request)?

bbrowning · 2025-11-04T02:15:49Z

I'm not sure I follow what you're saying. Every inference request passes in the tools available for that request. So, with every inference request, the client can pass in an updated token for any MCP servers that request references. And that means every user also passes in their own credentials. Or, am I misunderstanding how you intend this to work?

omaryashraf5 · 2025-11-04T02:51:17Z

This PR supports the case where authorization tokens change between response creation requests.

For example:

response1 = client.responses.create(
    model="llama3",
    input="What is X?",
    tools=[{"type": "mcp", "authorization": {"token": "user_a_token"}}],
)

response2 = client.responses.create(
    model="llama3",
    input="What is Y?",
    tools=[{"type": "mcp", "authorization": {"token": "user_b_token"}}],  # Different token
)

Within a single response, however, multiple inference iterations happen, and the authorization token cannot be updated between those iterations.

Internally, a single response might look like this:

Inference iteration 1 → calls MCP with "initial_token"
Inference iteration 2 → calls MCP with "initial_token" (same token)
Inference iteration 3 → calls MCP with "initial_token" (same token)

Question: Can the token be refreshed between iterations 1→2→3?
Answer: No.
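The behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeMCPClient` and `run_response` are hypothetical stand-ins, not llama-stack code, and the token is modeled as a plain string. The sketch shows that a token captured when a response is created is reused for every inference iteration inside that response, while a later response can carry a different token.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMCPClient:
    """Hypothetical stand-in that records the token used for each MCP call."""

    calls: list[str] = field(default_factory=list)

    def call_tool(self, token: str) -> None:
        self.calls.append(token)


def run_response(client: FakeMCPClient, authorization: str, iterations: int = 3) -> None:
    """Simulate one response: the token is fixed for all inference iterations."""
    for _ in range(iterations):
        # Every iteration reuses the token supplied when the response was created.
        client.call_tool(authorization)


mcp = FakeMCPClient()
run_response(mcp, authorization="user_a_token")  # response 1
run_response(mcp, authorization="user_b_token")  # response 2, different token
```

After both calls, `mcp.calls` holds three entries for the first token followed by three for the second, which matches "static within a response, dynamic across responses".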

omaryashraf5 · 2025-11-04T02:55:03Z

This approach is static within each individual response but dynamic across responses.

ashwinb · 2025-11-04T03:22:10Z

src/llama_stack/apis/agents/openai_responses.py



+@json_schema_type
+class MCPAuthorization(BaseModel):


We should adopt exactly the type specified here: https://platform.openai.com/docs/api-reference/responses/create#responses_create-tools-mcp_tool-authorization. Otherwise we are forking our Responses API without a good reason. (This is beyond the fact that passing { username, password } is not correct at all.)

@ashwinb thanks for your comment, I was about to handle this point. Removed that type

mattf

remove all the reformatting and make it clear what is being changed.

mergify · 2025-11-04T10:32:48Z

This pull request has merge conflicts that must be resolved before it can be merged. @omaryashraf5 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mattf

this should be simply -

add authz field to OpenAIResponseInputToolMCP
pass authz to mcp client
sanitize response to not include authz field

we should not be providing multiple ways to provide the token.

we must not be passing a token for one service, e.g. inference, to another, e.g. mcp server.

- Add Field(exclude=True) to authorization parameter to prevent token leakage in responses
- Add model validator to reject Authorization header in headers dict
- Users must use dedicated 'authorization' parameter instead of headers
- Headers field is preserved for legitimate non-auth headers (tracing, routing, etc.)

This implements the security requirement that authorization params are never returned in responses, unlike generic headers which may be echoed back.
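As a rough illustration of the `Field(exclude=True)` point in the commit message above, here is a minimal Pydantic v2 sketch. `MCPToolSketch` is a hypothetical, simplified stand-in for `OpenAIResponseInputToolMCP`, and the field set is an assumption. The `repr=False` argument is not mentioned in the PR; it is added here as an extra precaution.

```python
from typing import Optional

from pydantic import BaseModel, Field


class MCPToolSketch(BaseModel):
    """Hypothetical, simplified stand-in for OpenAIResponseInputToolMCP."""

    type: str = "mcp"
    server_label: str
    server_url: str
    headers: Optional[dict[str, str]] = None
    # exclude=True drops the field from model_dump() and model_dump_json().
    # repr=False (an extra precaution, not from the PR) hides it from repr().
    authorization: Optional[str] = Field(default=None, exclude=True, repr=False)


tool = MCPToolSketch(
    server_label="demo",
    server_url="http://localhost:8000/sse",
    headers={"X-Trace-ID": "trace-123"},
    authorization="secret-token",
)

dumped = tool.model_dump()            # no "authorization" key
dumped_json = tool.model_dump_json()  # token does not appear in the JSON
```

Note that `exclude=True` only affects Pydantic serialization. Without `repr=False`, `repr(tool)` would still show the token, so logging the model object directly could expose it.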

Updated test_mcp_authorization_error_when_header_provided to match the new validation error message from the Pydantic validator.

src/llama_stack/apis/agents/openai_responses.py

Per reviewer feedback, API models should be pure data structures without business logic. Moved the Authorization header validation from the Pydantic @model_validator in openai_responses.py to the handler in streaming.py.

- Removed @model_validator from OpenAIResponseInputToolMCP
- Added validation at handler level in _process_mcp_tool()
- Maintains same security check: rejects Authorization in headers dict
- Follows separation of concerns: models are data, handlers have logic

mergify · 2025-11-07T19:05:51Z

This pull request has merge conflicts that must be resolved before it can be merged. @omaryashraf5 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Per reviewer feedback, validation should be in the openai_responses.py handler, not the streaming.py file. Moved validation logic to create_openai_response() method which is the main entry point for response creation.

- Added validation in create_openai_response() before processing
- Removed duplicate validation from _process_mcp_tool() in streaming.py
- Validation runs early and rejects malformed requests immediately
- Maintains same security check: rejects Authorization in headers dict
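The early, handler-level check described above could look roughly like the following. This is a sketch, not the PR's code: `reject_authorization_header` and `validate_tools` are hypothetical names, tools are modeled as plain dicts, and the case-insensitive comparison is an assumption (HTTP header names are case-insensitive, so it seems the safer choice).

```python
from typing import Mapping, Optional


def reject_authorization_header(headers: Optional[Mapping[str, str]]) -> None:
    """Raise ValueError if the headers mapping carries an Authorization entry."""
    if not headers:
        return
    for name in headers:
        # Compare on the lowercased name so "authorization" is caught too.
        if name.lower() == "authorization":
            raise ValueError(
                "Authorization must be passed via the dedicated 'authorization' "
                "parameter, not in 'headers'."
            )


def validate_tools(tools: list[dict]) -> None:
    """Check every MCP tool before any processing starts (fail fast)."""
    for tool in tools:
        if tool.get("type") == "mcp":
            reject_authorization_header(tool.get("headers"))


# Non-auth headers pass validation.
validate_tools([{"type": "mcp", "headers": {"X-Trace-ID": "trace-123"}}])

# An Authorization entry in headers is rejected before any tool is processed.
try:
    validate_tools([{"type": "mcp", "headers": {"authorization": "Bearer abc"}}])
    rejected = False
except ValueError:
    rejected = True
```

Running the check at the entry point means a malformed request fails before any inference or MCP call is made.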

Addresses reviewer concern about token isolation between services. The remote provider now rejects Authorization headers in mcp_headers to prevent accidentally passing inference tokens to MCP servers.

This makes the remote provider consistent with the inline provider:
- Both reject Authorization in headers dict
- Both require dedicated authorization parameter
- Prevents token leakage across service boundaries

Related changes:
- Added validation in get_headers_from_request()
- Throws ValueError if Authorization found in mcp_headers
- Added TODO for dedicated authorization field in provider_data

Completes the TODO for extracting authorization from a dedicated field.

What changed:
- Added mcp_authorization field to MCPProviderDataValidator
- Updated get_headers_from_request() to extract from mcp_authorization
- Authorization is now properly isolated per MCP endpoint

API usage example:

{
  "provider_data": {
    "mcp_headers": {
      "http://mcp-server.com": {
        "X-Trace-ID": "trace-123"
      }
    },
    "mcp_authorization": {
      "http://mcp-server.com": "mcp_token_xyz789"
    }
  }
}

Security guarantees:
- Authorization cannot be in mcp_headers (validation rejects it)
- Each MCP endpoint gets its own dedicated token
- No cross-service token leakage possible
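To make the provider_data layout above concrete, here is a minimal sketch of how per-endpoint headers and the dedicated token might be combined into the headers sent to an MCP server. `build_mcp_headers` is a hypothetical helper, not the PR's `get_headers_from_request()`, and it assumes the token is supplied bare so that the `Bearer ` prefix is added at this point.

```python
def build_mcp_headers(provider_data: dict, endpoint: str) -> dict[str, str]:
    """Hypothetical helper: merge per-endpoint headers with the dedicated token."""
    headers = dict(provider_data.get("mcp_headers", {}).get(endpoint, {}))

    # Authorization must not arrive through mcp_headers.
    if any(name.lower() == "authorization" for name in headers):
        raise ValueError("Authorization is not allowed in mcp_headers; use mcp_authorization")

    token = provider_data.get("mcp_authorization", {}).get(endpoint)
    if token:
        # Assumption: the token is supplied without the "Bearer " prefix.
        headers["Authorization"] = f"Bearer {token}"
    return headers


provider_data = {
    "mcp_headers": {"http://mcp-server.com": {"X-Trace-ID": "trace-123"}},
    "mcp_authorization": {"http://mcp-server.com": "mcp_token_xyz789"},
}

headers = build_mcp_headers(provider_data, "http://mcp-server.com")
other = build_mcp_headers(provider_data, "http://other-server.com")
```

Because both dicts are keyed by endpoint, a token configured for one MCP server is never attached to a request for another; `other` comes back empty here.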

omaryashraf5 · 2025-11-07T20:10:09Z

Thank you for your feedback.

Added the authorization field to OpenAIResponseInputToolMCP
Passed the authorization token to the MCP client
Used Field(exclude=True) on the authorization parameter so that it is never returned in responses, logs, or serialization (sanitizes the response)

Thanks for pointing out that we should have "one way to provide the token". I want to make sure I understand correctly:

When you say we shouldn't provide multiple ways to provide the token, do you mean:

Option A: Remove headers field entirely
Pros: Simplest
Cons: Users can't pass non-auth headers (tracing, routing, etc.)

Option B: Keep headers but strictly enforce no Authorization in headers
Pros: Users can pass custom headers for tracing, routing, etc.
Cons: Need validation to prevent Authorization in headers.

I went with option B but let me know if you disagree.

As for your last point:

I updated the MCPProviderDataValidator data model and get_headers_from_request in model_context_protocol.py

Adds inline documentation to help users understand:
- How to structure provider_data in HTTP requests
- Where to place mcp_headers vs mcp_authorization
- Security requirements (no Authorization in headers)
- Token format requirements (without Bearer prefix)
- Example usage with multiple MCP endpoints

Based on user feedback, improved comments to distinguish between the two security layers:

1. PRIMARY: Line 89 - Architectural prevention
   - get_request_provider_data() only reads from request body
   - Never accesses HTTP Authorization header
   - This is what actually prevents inference token leakage
2. SECONDARY: Lines 97-104 - Validation prevention
   - Rejects Authorization in mcp_headers dict
   - Enforces using dedicated mcp_authorization field
   - Prevents users from misusing the API

Previous comment was misleading by suggesting the validation prevented inference token leakage, when the architecture already ensures that isolation.

Updates integration tests to use the new mcp_authorization field instead of the old method of passing Authorization in mcp_headers.

Changes:
- tests/integration/tool_runtime/test_mcp.py
- tests/integration/inference/test_tools_with_schemas.py
- tests/integration/tool_runtime/test_mcp_json_schema.py (6 occurrences)

All tests now use:
provider_data = {"mcp_authorization": {uri: AUTH_TOKEN}}

Instead of the old rejected format:
provider_data = {"mcp_headers": {uri: {"Authorization": f"Bearer {AUTH_TOKEN}"}}}

This aligns with the security architecture that prevents accidentally leaking inference tokens to MCP servers.
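For callers that still build the old format, the change amounts to moving the bare token out of `mcp_headers` and into `mcp_authorization`. The following is a hedged sketch of such a conversion; `migrate_provider_data` is a hypothetical helper that does not exist in llama-stack, and the endpoint and token values are placeholders.

```python
def migrate_provider_data(old: dict) -> dict:
    """Hypothetical helper: move Bearer tokens from mcp_headers to mcp_authorization."""
    new_headers: dict[str, dict[str, str]] = {}
    new_auth: dict[str, str] = dict(old.get("mcp_authorization", {}))

    for uri, headers in old.get("mcp_headers", {}).items():
        kept = {}
        for name, value in headers.items():
            if name.lower() == "authorization":
                # Strip the scheme; the new field takes the bare token.
                new_auth[uri] = value.removeprefix("Bearer ").strip()
            else:
                kept[name] = value
        if kept:
            new_headers[uri] = kept

    result: dict = {}
    if new_headers:
        result["mcp_headers"] = new_headers
    if new_auth:
        result["mcp_authorization"] = new_auth
    return result


uri = "http://localhost:8000/sse"  # placeholder endpoint
AUTH_TOKEN = "test-token"  # placeholder token
old_format = {"mcp_headers": {uri: {"Authorization": f"Bearer {AUTH_TOKEN}"}}}
new_format = migrate_provider_data(old_format)
```

Non-auth headers stay under `mcp_headers`, so only the credential changes location.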

Fixed incorrect import in test_mcp_authentication.py:
- Changed: from llama_stack import LlamaStackAsLibraryClient
- To: from llama_stack.core.library_client import LlamaStackAsLibraryClient

This aligns with the correct import pattern used in other test files.

Added Field(exclude=True) to mcp_authorization field to ensure tokens are NEVER exposed in:
- API responses (model_dump())
- JSON serialization (model_dump_json())
- Logs
- Any Pydantic serialization

This prevents accidental token leakage through:
- Error messages
- Debug logs
- API response payloads
- Monitoring/telemetry systems

The field is still accessible within the application code but will be automatically excluded from all Pydantic serialization operations.

mattf · 2025-11-10T20:04:35Z

how are tracing, routing, etc related to this change?

omaryashraf5 · 2025-11-10T21:12:28Z

The current change just accepts the authorization field outside the headers. I was only stating general pros for keeping the headers field (apologies for the confusion).

MCP authentication parameter implementation

d0a8878

omaryashraf5 requested review from ashwinb, bbrowning, ehhuang, franciscojavierarceo, hardikjshah, leseb, mattf, raghotham, reluctantfuturist, slekkala1, terrytangyuan and yanxi0830 as code owners November 3, 2025 23:50

meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 3, 2025

omaryashraf5 marked this pull request as draft November 3, 2025 23:50

Omar Abdelwahab added 2 commits November 3, 2025 15:57

Added minor changes

57eb575

precommit

c49fef8

Omar Abdelwahab added 2 commits November 3, 2025 16:55

added a fix

1143db0

minor fix

376f0fc

omaryashraf5 changed the title ~~fix: MCP authentication parameter implementation~~ fix: MCP authotization parameter implementation Nov 4, 2025

omaryashraf5 changed the title ~~fix: MCP authotization parameter implementation~~ fix: MCP authorization parameter implementation Nov 4, 2025

ashwinb reviewed Nov 4, 2025

Removed the MCPAuthorization class relying on bearer token

9dbeeac

mattf requested changes Nov 4, 2025

omaryashraf5 self-assigned this Nov 6, 2025

omaryashraf5 requested review from leseb and mattf November 6, 2025 22:34

mattf requested changes Nov 7, 2025

Omar Abdelwahab added 2 commits November 7, 2025 10:50

test: update error message match for authorization validation

8ce30b7

ashwinb reviewed Nov 7, 2025

mergify bot added the needs-rebase label Nov 7, 2025

Omar Abdelwahab added 3 commits November 7, 2025 11:06

Omar Abdelwahab added 4 commits November 7, 2025 12:14

precommit

ccb870c

formatting

c563d8a

formatting changes

2295a1a

mergify bot removed the needs-rebase label Nov 7, 2025

Omar Abdelwahab and others added 2 commits November 7, 2025 14:05

Merge branch 'main' into add-mcp-authentication-param

1a7ba68

Omar Abdelwahab added 3 commits November 7, 2025 14:46

precommit run

c353873

omaryashraf5 requested a review from mattf November 8, 2025 18:38

Merge branch 'main' into add-mcp-authentication-param

114ab69
