
@gitbuda gitbuda commented Aug 30, 2025

  • All metric implementations are currently dummies; make sure each implementation is correct ❌
  • Add cost-centric metrics, such as the number of tokens, the actual LLM cost, and latency ❌
  • Add more complex metrics where the evaluation result depends on the expected result ❌
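
To make the last two items concrete, here is a minimal sketch in plain Python. It is not the code in this PR and does not use DeepEval's API. Every name (`CostMetrics`, `measure_call`, `token_f1`) and the price parameters are hypothetical, and whitespace splitting stands in for a real tokenizer.

```python
import time
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class CostMetrics:
    """Cost-centric measurements for a single LLM call (hypothetical shape)."""
    prompt_tokens: int
    completion_tokens: int
    latency_s: float
    cost_usd: float


def measure_call(
    call: Callable[[str], str],
    prompt: str,
    usd_per_1k_prompt: float,
    usd_per_1k_completion: float,
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> Tuple[str, CostMetrics]:
    """Run `call(prompt)` and record token counts, latency, and estimated cost.

    The default `count_tokens` is a whitespace approximation (an assumption);
    swap in the model's real tokenizer for accurate numbers.
    """
    start = time.perf_counter()
    answer = call(prompt)
    latency = time.perf_counter() - start

    prompt_tokens = count_tokens(prompt)
    completion_tokens = count_tokens(answer)
    cost = (prompt_tokens / 1000) * usd_per_1k_prompt + (
        completion_tokens / 1000
    ) * usd_per_1k_completion
    return answer, CostMetrics(prompt_tokens, completion_tokens, latency, cost)


def token_f1(actual: str, expected: str) -> float:
    """Token-overlap F1: a metric whose result depends on the expected answer."""
    actual_tokens = actual.lower().split()
    expected_tokens = expected.lower().split()
    if not actual_tokens or not expected_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(actual_tokens == expected_tokens)
    overlap = sum((Counter(actual_tokens) & Counter(expected_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(actual_tokens)
    recall = overlap / len(expected_tokens)
    return 2 * precision * recall / (precision + recall)
```

In this sketch `measure_call` wraps any function that maps a prompt to an answer, so it can be pointed at a stub in tests and at a real client later, while `token_f1` is only one of many possible expected-result metrics.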

Evaluated the existing frameworks (HERE); DeepEval seems the best, so going with that.

@gitbuda gitbuda self-assigned this Aug 30, 2025

gitbuda commented Sep 9, 2025

In the short run, DeepEval FTW

@gitbuda gitbuda closed this Sep 9, 2025