feat: add claude 4.5 sonnet #816
base: master
Conversation
📝 Walkthrough
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ❌ 1 warning, ✅ 2 passed
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- daras_ai_v2/language_model.py (8 hunks)
- scripts/init_llm_pricing.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
scripts/init_llm_pricing.py (1)
usage_costs/models.py (1)
ModelProvider(54-69)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (3.10.12, 1.8.3)
```python
claude_4_5_sonnet = LLMSpec(
    label="Claude 4.5 Sonnet • Anthropic",
    model_id="claude-sonnet-4-5",
    llm_api=LLMApis.openai,
    context_window=1_000_000,
    max_output_tokens=64_000,
    is_vision_model=True,
    supports_json=True,
    is_thinking_model=True,
)
```
🧩 Analysis chain
Verify model specifications and capabilities.
The new Claude 4.5 Sonnet model definition includes impressive specifications:
- 1M context window
- 64K max output tokens
- Vision, JSON, and thinking capabilities
- Uses `LLMApis.openai` (an OpenAI-compatible API)
Since this is beyond my knowledge cutoff, please verify:
- Model ID "claude-sonnet-4-5" is correct
- Context window is actually 1,000,000 tokens
- Max output tokens is actually 64,000 tokens
- The model supports vision, JSON mode, and extended thinking
- Anthropic provides an OpenAI-compatible API endpoint for Claude models (hence `LLMApis.openai`)
🌐 Web query:
What are the specifications for Claude 4.5 Sonnet including context window, max output tokens, and supported features (vision, JSON, thinking)? Does Anthropic offer an OpenAI-compatible API?
💡 Result:
Short answer:
- Context window: 200,000 tokens (general availability); 1,000,000-token beta available for eligible orgs (use beta header). [1][2][3]
- Max output tokens: 64,000 (standard GA). (Longer-output options exist as beta/extra headers for some Sonnet releases; check your account/headers). [3][4]
- Supported features: vision (image + PDF URL support / vision capabilities), structured/JSON outputs, and extended "thinking" (step‑by‑step / extended reasoning) supported via the Messages API (native) and partially via the OpenAI‑compatibility endpoint (thinking is available but full step‑by‑step thoughts are only returned on the native Anthropic API). [1][2][5]
- OpenAI‑compatible API: Yes — Anthropic provides an OpenAI‑SDK compatibility endpoint so you can use many OpenAI SDK fields (model, max_tokens, function calling, etc.) by changing base URL/key/model name. It has documented limitations and is intended primarily for testing/comparison rather than providing the full native feature set. [1][5]
Sources:
- Anthropic / API release notes and docs (OpenAI compatibility, Messages API features).
- Anthropic context‑windows doc (200K GA, 1M beta; context‑awareness details).
- Google Cloud Vertex AI model card for Claude Sonnet 4.5 (64K output; context length / beta 1M).
- Anthropic models overview / docs referencing long‑output headers and token limits.
- Anthropic OpenAI SDK compatibility documentation (details on "thinking" support and limitations).
If you want, I can paste the exact doc links or show the exact headers/parameters to use in requests.
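To make the compatibility point above concrete, here is a sketch of calling Claude through Anthropic's OpenAI-SDK compatibility layer, as described in the web-query result. It assumes the `openai` package is installed and `ANTHROPIC_API_KEY` is set; the base URL follows Anthropic's documented compatibility endpoint, but treat the exact details as assumptions to verify against their docs.

```python
import os

from openai import OpenAI

# Point the OpenAI SDK at Anthropic's compatibility endpoint
# (base URL per Anthropic's OpenAI SDK compatibility documentation).
client = OpenAI(
    base_url="https://api.anthropic.com/v1/",
    api_key=os.environ["ANTHROPIC_API_KEY"],
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-5",
    max_tokens=64,
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(resp.choices[0].message.content)
```

Note that, per the query result above, full step-by-step "thinking" output is only returned on the native Anthropic Messages API, not through this compatibility layer.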
Refine Claude 4.5 Sonnet configuration
- `context_window` should be 200_000 tokens for GA (1M is beta and requires a special header)
- `max_output_tokens` of 64_000 is correct
- `supports_json` and `is_vision_model` are valid
- full step-by-step "thinking" is only available via the native Anthropic API (use `LLMApis.anthropic`); the OpenAI-compat endpoint offers partial support
- `model_id` "claude-sonnet-4-5" is correct
🤖 Prompt for AI Agents
In daras_ai_v2/language_model.py around lines 780 to 789, update the Claude 4.5
Sonnet LLMSpec: set context_window to 200_000 (GA limit) instead of 1_000_000,
keep max_output_tokens at 64_000 and retain supports_json and is_vision_model,
and change llm_api from LLMApis.openai to LLMApis.anthropic so full step‑by‑step
“thinking” uses the native Anthropic API (leave model_id and is_thinking_model
as-is).
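Applying the reviewer's suggested changes, the spec would read roughly as follows (a sketch only; `LLMSpec` and `LLMApis` are the repo's own types from `daras_ai_v2/language_model.py`):

```python
claude_4_5_sonnet = LLMSpec(
    label="Claude 4.5 Sonnet • Anthropic",
    model_id="claude-sonnet-4-5",
    llm_api=LLMApis.anthropic,  # native API returns full step-by-step thinking
    context_window=200_000,  # GA limit; 1M is beta behind a special header
    max_output_tokens=64_000,
    is_vision_model=True,
    supports_json=True,
    is_thinking_model=True,
)
```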
```diff
  is_deprecated=True,
- redirect_to="claude_3_7_sonnet",
+ redirect_to="claude_4_5_sonnet",
```
Consider the impact of cross-tier redirects.
Several deprecated models redirect to claude_4_5_sonnet across different capability/pricing tiers:
- `claude_3_5_sonnet` → `claude_4_5_sonnet` ✓ (same tier)
- `claude_3_opus` → `claude_4_5_sonnet` (Opus → Sonnet)
- `claude_3_sonnet` → `claude_4_5_sonnet` ✓ (same tier)
- `claude_3_haiku` → `claude_4_5_sonnet` (Haiku → Sonnet)
Potential concerns:
- Cost impact: Users who selected `claude_3_haiku` for its lower cost will be redirected to a more expensive Sonnet model. The pricing difference:
  - Haiku: $0.25 input / $1.25 output per 1M tokens
  - Sonnet 4.5: $3 input / $15 output per 1M tokens (12x more expensive!)
- Use case mismatch: Haiku users may have chosen it for speed/throughput in high-volume scenarios where Sonnet's capabilities aren't needed.
Recommendation: Consider redirecting claude_3_haiku to a newer Haiku model (if one exists) or document this cost increase prominently.
Also applies to: 859-860, 871-872, 883-884
Q/A checklist
How to check import time?
You can visualize this using tuna:
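A typical workflow is to capture CPython's built-in import timings and feed them to `tuna` (which must be pip-installed; the module name `daras_ai_v2` here is just an example):

```shell
# Dump per-module import timings to a log, then open an interactive chart
python -X importtime -c "import daras_ai_v2" 2> import_times.log
tuna import_times.log
```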
To measure import time for a specific library:
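One simple option is timing the import directly with a one-liner (`pandas` below is just an illustrative slow import, not specific to this PR):

```shell
python -c "import time; t0 = time.perf_counter(); import pandas; print(f'{time.perf_counter() - t0:.3f}s')"
```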
To reduce import times, import libraries that take a long time inside the functions that use them instead of at the top of the file:
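A minimal sketch of the deferred-import pattern (`statistics` stands in for any slow-to-import dependency):

```python
def summarize(data):
    # Deferred import: the module is loaded on the first call to this
    # function instead of at application startup.
    import statistics

    return statistics.mean(data)


print(summarize([1, 2, 3]))  # → 2
```

Python caches modules in `sys.modules`, so repeated calls pay the import cost only once.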
Legal Boilerplate
Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.