Pull requests: EleutherAI/lm-evaluation-harness
Pull requests list
Fix PIL image hashing to use actual bytes instead of object repr
#3331
opened Oct 7, 2025 by
tboerstad
feat: Add support for accelerate-wrapped models in simple_evaluate()
#3313
opened Sep 26, 2025 by
DhruvaKashyap
Support empty response for Completions and ChatCompletions API
#3309
opened Sep 22, 2025 by
tboerstad
Adding New Task SLR-Bench : Scalable Logical Reasoning Benchmark
#3305
opened Sep 20, 2025 by
Ahmad21Omar
Add long-context evaluation benchmarks (LongBench v2, Babilong, InfiniteBench, Phonebook)
#3256
opened Aug 21, 2025 by
Mariani-code
Trim thinking content from model output in IFEval
#3240
opened Aug 14, 2025 by
davideguidobene
Adding support for evaluating with Mistral and Pixtral models
#3235
opened Aug 13, 2025 by
LearnerSXH
Adding support for Structured Generation with XGrammar
#3232
opened Aug 12, 2025 by
ceferisbarov
5 tasks
Fix: respect target_delimiter when using a gen_prefix on multiple-choice tasks
#3220
opened Aug 7, 2025 by
karanikolopoulos
feat: COT trace response handling in evaluator and model classes
#3204
opened Aug 3, 2025 by
hhh2210