Skip to content

Conversation

@lllllyh01
Copy link
Collaborator

Description

Describe your changes in detail (optional if the linked issue already contains a detailed description of the changes).
Fixes #3219
Currently, ChatAgent only records token usage on step level. This PR adds finer-grained token usage tracking for each tool calls in one step.

Checklist

Go over all the following points, and put an x in all the boxes that apply.

  • I have read the CONTRIBUTION guide (required)
  • I have linked this PR to an issue using the Development section on the right sidebar or by adding Fixes #issue-number in the PR description (required)
  • I have checked if any dependencies need to be added or updated in pyproject.toml and uv lock
  • I have updated the tests accordingly (required for a bug fix or a new feature)
  • I have updated the documentation if needed:
  • I have added examples if this is a new feature

If you are unsure about any of these, don't hesitate to ask. We are here to help!

@github-actions github-actions bot added the Review Required PR need to be reviewed label Oct 24, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 24, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch lyh_tool_token

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Member

@Wendong-Fan Wendong-Fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @lllllyh01 for the PR, the calculation logic needs to be fixed

output_tokens = self._count_tokens(output_text)

# Calculate total tokens
total_tokens = profile.base_tokens + input_tokens + output_tokens
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the token calculation might be unreliable. Could you clarify the reasoning for setting the base token count to 100?

self._initialize_default_profiles()
)

def _initialize_default_profiles(self) -> Dict[str, ToolCostProfile]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this part is not necessary and hard to maintain

Comment on lines +187 to +188
# Extract text from common result fields
text_fields = ["content", "text", "message", "result", "output"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this be able to extract all text information?

Comment on lines +201 to +207
try:
return len(self.token_counter.encode(text))
except Exception as e:
logger.error(f"Error counting tokens: {e}")
pass

return len(text.split())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use token_counter.count_tokens_from_messages

Comment on lines +26 to +58
class ToolCategory(Enum):
"""Categories of tools based on their typical token consumption."""

# Low-cost tools (typically < 100 tokens)
SEARCH_API = "search_api" # Google Search, Bing Search, etc.
SIMPLE_UTILITY = (
"simple_utility" # Basic file operations, simple calculations
)

# Medium-cost tools (100-1000 tokens)
CODE_EXECUTION = "code_execution" # Python execution, shell commands
DOCUMENT_PROCESSING = "document_processing" # PDF parsing, text analysis
API_CALLS = "api_calls" # REST API calls, webhooks

# High-cost tools (1000+ tokens)
BROWSER_AUTOMATION = (
"browser_automation" # Browser interactions, screenshots
)
MULTIMODAL_PROCESSING = (
"multimodal_processing" # Image analysis, audio processing
)
LLM_CALLS = "llm_calls" # Sub-agent calls, complex reasoning


@dataclass
class ToolCostProfile:
"""Cost profile for a specific tool type."""

category: ToolCategory
base_tokens: int # Base token consumption


class ToolCostInfo(TypedDict):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think these classes may not necessary

@Wendong-Fan Wendong-Fan added Waiting for Update PR has been reviewed, need to be updated based on review comment and removed Review Required PR need to be reviewed labels Oct 25, 2025
@Wendong-Fan Wendong-Fan modified the milestones: Sprint 41, Sprint 40 Oct 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Waiting for Update PR has been reviewed, need to be updated based on review comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants