What happened?
I have configured prompt caching for Claude 3.7 Sonnet in LiteLLM.
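A minimal sketch of that kind of proxy config, assuming the `cache_control_injection_points` setting that the hook linked below implements (the model route and injection point shown here are illustrative, not my exact file):

```yaml
model_list:
  - model_name: claude-3-7-sonnet-20250219
    litellm_params:
      model: vertex_ai/claude-3-7-sonnet@20250219
      # Ask LiteLLM to auto-inject Anthropic cache_control markers.
      # The location/role values below are illustrative assumptions.
      cache_control_injection_points:
        - location: message
          role: user
```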
When I send a request, I get the following error:
litellm.BadRequestError: VertexAIException BadRequestError - b'{"type":"error","error":{"type":"invalid_request_error","message":"A maximum of 4 blocks with cache_control may be provided. Found 7."},"request_id":"req_vrtx_011CSvF6dqXwsNBJ8JhXuNBA"}'. Received Model Group=claude-3-7-sonnet-20250219
I think this is because https://github.com/BerriAI/litellm/blob/main/litellm/integrations/anthropic_cache_control_hook.py#L130-L133 adds cache_control to every item in the message's content list. Based on https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#continuing-a-multi-turn-conversation, cache_control should only be added to the last item in the list.
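To illustrate, here is a minimal sketch (not LiteLLM's actual code) of the behavior the Anthropic docs describe: mark only the final content block of the targeted message, rather than every block, so a long multi-turn conversation stays under Anthropic's 4-block cache_control limit. The function name is hypothetical.

```python
from typing import Any, Dict


def inject_cache_control(message: Dict[str, Any]) -> Dict[str, Any]:
    """Add an Anthropic cache breakpoint to a single chat message.

    Sketch of the expected behavior: only the last content block is
    marked, since Anthropic allows at most 4 cache_control blocks per
    request and caching a prefix only needs the final block marked.
    """
    content = message.get("content")
    if isinstance(content, str):
        # Normalize a plain string into a single text block first.
        content = [{"type": "text", "text": content}]
        message["content"] = content
    if isinstance(content, list) and content:
        # Mark only the last block, not every item in the list.
        content[-1]["cache_control"] = {"type": "ephemeral"}
    return message
```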
Relevant log output
Are you an ML Ops Team?
No
What LiteLLM version are you on?
1.75.8
Twitter / LinkedIn details
No response