@nikochiko
Member

Q/A checklist

  • I have tested my UI changes on mobile and they look acceptable
  • I have tested changes to the workflows in both the API and the UI
  • I have done a code review of my changes and looked at each line of the diff + the references of each function I have changed
  • My changes have not increased the import time of the server
How to check import time?

time python -c 'import server'

You can visualize this using tuna:

python3 -X importtime -c 'import server' 2> out.log && tuna out.log

To measure import time for a specific library:

$ time python -c 'import pandas'

________________________________________________________
Executed in    1.15 secs    fish           external
   usr time    2.22 secs   86.00 micros    2.22 secs
   sys time    0.72 secs  613.00 micros    0.72 secs
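The same measurement can be scripted when several libraries need comparing. The following is only a minimal sketch: the helper name `cold_import_seconds` is hypothetical, and the figure it returns includes interpreter startup, so it is best read relative to a baseline such as `import sys`.

```python
import subprocess
import sys
import time


def cold_import_seconds(module_name: str) -> float:
    """Time `python -c 'import <module_name>'` in a fresh interpreter.

    Hypothetical helper: the result includes interpreter startup time,
    so compare it against a cheap baseline rather than reading it alone.
    """
    start = time.perf_counter()
    # A new process guarantees nothing is already cached in sys.modules
    subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        check=True,
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    baseline = cold_import_seconds("sys")  # roughly startup cost only
    measured = cold_import_seconds("json")
    print(f"baseline: {baseline:.3f}s, json: {measured:.3f}s")
```

Running each import in its own process mirrors the `time python -c` command above; `-X importtime` with tuna is still the better tool for a per-module breakdown.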

To reduce import times, import libraries that take a long time inside the functions that use them instead of at the top of the file:

def my_function():
    import pandas as pd
    ...

Legal Boilerplate

Look, I get it. The entity doing business as “Gooey.AI” and/or “Dara.network” was incorporated in the State of Delaware in 2020 as Dara Network Inc. and is gonna need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Dara Network Inc can use, modify, copy, and redistribute my contributions, under its choice of terms.

@coderabbitai

coderabbitai bot commented Oct 29, 2025

📝 Walkthrough

This PR adds a new DeepSeek V3.1 model to the LargeLanguageModels enum with configuration for the Fireworks API, including context window, output token limits, and JSON/thinking capabilities. It marks the existing DeepSeek R1 model as deprecated with a redirect to the new version. Corresponding pricing information for the new model is added to the LLM pricing initialization script.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Homogeneous changes following established patterns (model addition with deprecation redirect)
  • Limited scope affecting only two files with straightforward additions
  • Configuration values (pricing, context window, model IDs) should be verified for accuracy
  • No complex logic or structural changes

Suggested reviewers

  • devxpy

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The PR title "deprecate deepseek r1, add v3.1" clearly and accurately summarizes the main changes in the pull request. The changes include deprecating the deepseek_r1 model entry and adding a new deepseek_v3p1 entry with pricing configuration. The title is concise, specific, and directly reflects these two primary objectives without vague language or unnecessary details. |
| Description Check | ✅ Passed | The PR description matches the required template structure completely, including all key sections: the Q/A checklist with all four items present, the "How to check import time" section with detailed instructions and examples, and the Legal Boilerplate section with the exact required text about rights and permissions. Although the checklist items are unchecked, this is the standard format for PR descriptions at the time of submission and does not indicate incompleteness. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f6172b8 and 57501ad.

📒 Files selected for processing (2)
  • daras_ai_v2/language_model.py (1 hunks)
  • scripts/init_llm_pricing.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
scripts/init_llm_pricing.py (2)
daras_ai_v2/language_model.py (1)
  • LargeLanguageModels (82-1046)
usage_costs/models.py (1)
  • ModelProvider (54-69)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test (3.10.12, 1.8.3)
🔇 Additional comments (3)
scripts/init_llm_pricing.py (1)

1134-1142: LGTM! Pricing values verified as accurate.

The pricing entry is correctly specified at $0.56/$1.68 per million tokens for input/output, matching current Fireworks AI rates.
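To make the per-million rates concrete, here is a rough sketch of how a request's cost follows from them. The constant and function names are hypothetical and are not taken from `scripts/init_llm_pricing.py`; only the $0.56 and $1.68 figures come from the comment above.

```python
from decimal import Decimal

# Rates quoted in the review comment, in USD per million tokens
INPUT_USD_PER_MILLION = Decimal("0.56")
OUTPUT_USD_PER_MILLION = Decimal("1.68")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: USD cost of one request at the quoted rates."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # 10,000 input tokens and 2,000 output tokens
    print(estimate_cost_usd(10_000, 2_000))
```

`Decimal` is used here so that small per-token amounts are not distorted by binary floating point; whether the real script stores rates this way is not stated in this PR.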

daras_ai_v2/language_model.py (2)

446-448: LGTM! Deprecation pattern is correct.

The deprecation of deepseek_r1 with redirection to the new deepseek_v3p1 model follows the established pattern used for other deprecated models in this file.


450-458: All specifications verified as accurate.

The DeepSeek V3.1 configuration is correct:

  • context_window: 163_800 — Matches Fireworks model page listing (163.8k tokens)
  • max_output_tokens: 20_500 — Reasonable and within bounds (Fireworks default is 2K, examples show up to 4096)
  • model_id — Correct Fireworks API format
  • is_thinking_model: True — Confirmed; DeepSeek V3.1 is a hybrid thinking/non-thinking model with reasoning capability

The code follows the established pattern and requires no changes.
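For readers without the diff at hand, the pattern described above might look roughly like the sketch below. It is an illustration under assumptions, not the code in `daras_ai_v2/language_model.py`: the `LLMSpec` dataclass, the `redirect_to` field, the `resolve` method, the labels, and both `model_id` strings are placeholders. Only the member names and the three numeric/boolean settings are taken from the review.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


@dataclass(frozen=True)
class LLMSpec:
    """Hypothetical container for per-model settings."""

    label: str
    model_id: str  # placeholder values below, not real Fireworks IDs
    context_window: int = 0
    max_output_tokens: int = 0
    is_thinking_model: bool = False
    redirect_to: Optional[str] = None  # set only on deprecated entries


class LargeLanguageModels(Enum):
    deepseek_r1 = LLMSpec(
        label="DeepSeek R1 (deprecated)",
        model_id="placeholder/deepseek-r1",
        redirect_to="deepseek_v3p1",
    )
    deepseek_v3p1 = LLMSpec(
        label="DeepSeek V3.1",
        model_id="placeholder/deepseek-v3p1",
        context_window=163_800,
        max_output_tokens=20_500,
        is_thinking_model=True,
    )

    def resolve(self) -> "LargeLanguageModels":
        """Follow redirects until a non-deprecated model is reached."""
        target = self
        while target.value.redirect_to is not None:
            target = LargeLanguageModels[target.value.redirect_to]
        return target


if __name__ == "__main__":
    print(LargeLanguageModels.deepseek_r1.resolve().name)
```

Keeping the deprecated member in the enum means saved workflows that still reference `deepseek_r1` continue to load and are routed to the replacement.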


