Right now, we only collect performance metrics (e.g., precision, recall, accuracy).
We need to collect some examples for future qualitative analysis. In other words, for each model we build, we should store a collection of [method_id, expected_prediction, model_prediction]. This way we can later look at code examples of false positives, false negatives, etc.
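A minimal sketch of what that collection could look like, assuming pandas and hypothetical variable names (`X_test_id` for the test identifiers, `y_test` for the expected labels, `y_pred` for the model's predictions); not the project's actual API:

```python
import pandas as pd

def collect_predictions(X_test_id, y_test, y_pred):
    """Build a per-example table of expected vs. predicted labels.

    X_test_id, y_test, and y_pred are assumed to be aligned,
    one entry per test instance (e.g., per method).
    """
    return pd.DataFrame({
        "method_id": X_test_id,
        "expected_prediction": y_test,
        "model_prediction": y_pred,
    })
```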
I suppose all these changes will be:

- `_single_run_model` should receive X_train, X_test, y_train, and y_test (which will be implemented in Train, validation, and test predicting-refactoring-ml#36); we should also pass X_test_id.
- `_single_run_model` then returns, besides the performance metrics, a dataframe as suggested above.
- This should be printed to the logs in a way that is easy to parse later. Suggestion: "PRED,refactoring,model,id_element,expected_prediction,predicted_value". "PRED" is just a prefix that is easy to find with `grep` (see the sketch below).
I'm using method as an example, but it can also be a class or a variable or a field, i.e., everything we predict.
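A hedged sketch of how the "PRED" lines could be emitted; `log_predictions`, `refactoring_name`, `model_name`, and the logging setup are illustrative assumptions, not existing code in the project:

```python
import logging

log = logging.getLogger(__name__)

def log_predictions(predictions_df, refactoring_name, model_name):
    """Write one 'PRED,...' line per test example so results can be
    extracted later with, e.g., `grep '^PRED,' run.log`."""
    for row in predictions_df.itertuples(index=False):
        log.info("PRED,%s,%s,%s,%s,%s",
                 refactoring_name, model_name,
                 row.method_id, row.expected_prediction, row.model_prediction)
```

Under this sketch, `_single_run_model` would return the dataframe alongside the performance metrics, and the caller would pass it to `log_predictions` right after each run.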