Refactor metrics logic to create array of pass@1 metrics for each sample

If we refactor the current metrics logic to compute pass@1 metrics for each sampled file in isolation and keep track of the full array, we can get a proper standard deviation (std) estimate that works for an arbitrary metric class. The current estimate, added in https://github.com/NVIDIA/NeMo-Skills/pull/757, has only partial coverage because the current structure does not provide clean access to all pass@1 metrics.
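
A minimal sketch of the idea, not the actual NeMo-Skills code: the function names (`pass_at_1_per_sample`, `summarize`) and the input layout (one list of per-problem booleans per sampled file) are assumptions made for illustration. It computes one pass@1 value per sampled file and then derives the mean and sample standard deviation from the full array.

```python
import statistics


def pass_at_1_per_sample(sample_results):
    """Compute pass@1 for each sampled file in isolation.

    sample_results: list of samples (hypothetical layout), where each sample
    is a non-empty list of booleans, one correctness flag per problem.
    Returns a list with one pass@1 value per sample.
    """
    return [sum(results) / len(results) for results in sample_results]


def summarize(per_sample_scores):
    """Mean and sample standard deviation over the per-sample pass@1 array."""
    mean = statistics.mean(per_sample_scores)
    # The sample std needs at least two values; fall back to 0.0 otherwise.
    std = statistics.stdev(per_sample_scores) if len(per_sample_scores) > 1 else 0.0
    return {"mean": mean, "std": std}


# Three sampled files, four problems each (made-up data for illustration).
samples = [
    [True, False, True, True],
    [True, True, False, False],
    [True, True, True, False],
]
scores = pass_at_1_per_sample(samples)
summary = summarize(scores)
print(scores, summary)
```

Because the array of per-sample scores is kept explicitly, the same `summarize` step could be applied to any metric class that reports a single number per sampled file.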

Refactor metrics logic to create array of pass@1 metrics for each sample #780
