
Collect the IDs of the predicted methods in the test set #3

@mauricioaniche

Description

Right now, we only collect performance metrics (e.g., precision, recall, accuracy).

We need to collect some examples for future qualitative analysis. In other words, for each of the models we build, we need a collection of [method_id, expected_prediction, model_prediction] tuples.

This way we can later look at code examples of false positives, false negatives, etc.

I suppose the required changes are:

  • _single_run_model should receive X_train, X_test, y_train, and y_test (which will be implemented in Train, validation, and test predicting-refactoring-ml#36); we should also pass X_test_id.
  • _single_run_model then returns, in addition to the performance metrics, a dataframe as suggested above.
  • This should be printed to the logs in a format that is easy to parse later. Suggestion: "PRED,refactoring,model,id_element,expected_prediction,predicted_value", where "PRED" is just a prefix that is easy to find with grep (see the sketch after this list).
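For illustration, a minimal sketch of what this could look like, assuming a scikit-learn-style model and pandas; the exact signature of _single_run_model, the refactoring_name/model_name parameters, and the metric set are assumptions for the sake of the example, not the project's actual API:

```python
import logging

import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score

log = logging.getLogger(__name__)


def _single_run_model(model, refactoring_name, model_name,
                      X_train, y_train, X_test, y_test, X_test_id):
    """Train one model; return its metrics plus a prediction dataframe."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    # One row per test instance: [method_id, expected_prediction, model_prediction].
    predictions = pd.DataFrame({
        "id_element": X_test_id,
        "expected_prediction": y_test,
        "predicted_value": y_pred,
    })

    # Log each row with the grep-friendly "PRED" prefix.
    for row in predictions.itertuples(index=False):
        log.info("PRED,%s,%s,%s,%s,%s",
                 refactoring_name, model_name,
                 row.id_element, row.expected_prediction, row.predicted_value)

    metrics = {
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "accuracy": accuracy_score(y_test, y_pred),
    }
    return metrics, predictions
```

The PRED lines could then be pulled out of the log file with something like `grep "PRED,"` and re-read as CSV for the qualitative analysis.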

I'm using a method as an example, but it can also be a class, a variable, or a field, i.e., anything we predict.
